* [PATCH 00/18] Introduce an internal API to interact with the fsck machinery
@ 2014-12-08 16:13 Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 01/18] Introduce fsck options Johannes Schindelin
                   ` (18 more replies)
  0 siblings, 19 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:13 UTC (permalink / raw)
  To: gitster; +Cc: git

At the moment, git-fsck's integrity checks are targeted at the end
user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, though, it should be possible to turn some
of those errors into mere warnings, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository has contained a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.


Johannes Schindelin (18):
  Introduce fsck options
  Introduce identifiers for fsck messages
  Provide a function to parse fsck message IDs
  Offer a function to demote fsck errors to warnings
  Allow demoting errors to warnings via receive.fsck.<key> = warn
  fsck: report the ID of the error/warning
  Make fsck_ident() warn-friendly
  Make fsck_commit() warn-friendly
  fsck: handle multiple authors in commits specially
  Make fsck_tag() warn-friendly
  Add a simple test for receive.fsck.*
  Disallow demoting grave fsck errors to warnings
  Optionally ignore specific fsck issues completely
  fsck: allow upgrading fsck warnings to errors
  Document the new receive.fsck.* options.
  fsck: support demoting errors to warnings
  Introduce `git fsck --quick`
  git receive-pack: support excluding objects from fsck'ing

 Documentation/config.txt        |  27 +++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  66 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  36 ++-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 512 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  28 ++-
 t/t1450-fsck.sh                 |  33 +++
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  46 ++++
 11 files changed, 624 insertions(+), 162 deletions(-)

-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH 01/18] Introduce fsck options
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-10 15:33   ` Junio C Hamano
  2014-12-08 16:14 ` [PATCH 02/18] Introduce identifiers for fsck messages Johannes Schindelin
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

We are about to introduce more settings. Just as the diff machinery
does, it therefore makes sense to carry them around as a (pointer to a)
struct containing all of them.
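
To illustrate the pattern outside of Git, here is a minimal,
self-contained sketch. All names in it (check_options, check_mode,
CHECK_OPTIONS_DEFAULT, CHECK_OPTIONS_STRICT, reported) are hypothetical
stand-ins, not the identifiers this patch introduces, and the mode check
is reduced to a handful of cases. It only shows how a callback and a
strictness flag can travel together in one struct:

```c
#include <stdio.h>

#define CHECK_ERROR 1
#define CHECK_WARN 2

/* Hypothetical callback type: receives a severity and a ready-made message. */
typedef int (*check_error_fn)(int type, const char *message);

/* All settings travel together, so adding one later changes no signatures. */
struct check_options {
	check_error_fn error_func;
	unsigned strict:1;
};

/* Number of messages reported so far (only here to make the sketch testable). */
int reported;

int default_error_func(int type, const char *message)
{
	reported++;
	fprintf(stderr, "%s: %s\n",
		type == CHECK_WARN ? "warning" : "error", message);
	return type == CHECK_WARN ? 0 : 1;
}

#define CHECK_OPTIONS_DEFAULT { default_error_func, 0 }
#define CHECK_OPTIONS_STRICT { default_error_func, 1 }

/* Returns whatever the callback returns, or 0 if the mode is acceptable. */
int check_mode(unsigned mode, struct check_options *options)
{
	switch (mode) {
	case 0100644:
	case 0100755:
		return 0;
	case 0100664:
		/* group-writable files are tolerated unless strict is set */
		if (!options->strict)
			return 0;
		/* fall through */
	default:
		return options->error_func(CHECK_WARN, "contains bad file modes");
	}
}
```

A caller would pick its behavior through the initializer, e.g.
"struct check_options opts = CHECK_OPTIONS_STRICT;", and pass &opts
down. The sketch uses an unsigned bit field so that a stored 1 reads
back as 1; that is a choice made for the sketch only.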

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index a27515a..2241e29 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static unsigned char head_sha1[20];
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -630,6 +632,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a369f55..1c17c3f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -74,6 +74,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -191,7 +192,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -782,10 +783,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 855d94b..e9e8bec 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 2fffa43..d6f539f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,7 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -24,9 +24,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -39,7 +39,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -48,14 +48,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -65,14 +65,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -80,11 +80,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -137,7 +137,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -191,7 +191,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -216,30 +216,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -247,7 +247,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -255,36 +255,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -292,30 +292,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -325,39 +325,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -373,64 +373,64 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %s", buffer);
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %s", buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -439,34 +439,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..84a337c 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	int strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH 02/18] Introduce identifiers for fsck messages
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 01/18] Introduce fsck options Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 03/18] Provide a function to parse fsck message IDs Johannes Schindelin
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

Rather than specifying only whether a message by the fsck machinery
constitutes an error or a warning, let's specify an identifier relating
to the concrete problem that was encountered. This is necessary for
upcoming support to be able to demote certain errors to warnings.

In the process, simplify the requirements on the calling code: instead
of having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
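
The two ideas above can be sketched in a few lines of standalone C. This
is an illustration under simplifying assumptions, not the code in this
patch: the identifier list is cut down to four entries, a fixed-size
buffer with vsnprintf() stands in for Git's strbuf, and the names
msg_type, report, record_error, last_type and last_message are
hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

#define MSG_ERROR 1
#define MSG_WARN 2

/* One list of identifiers; errors come first, warnings after them. */
#define FOREACH_MSG_ID(FUNC) \
	/* errors */ \
	FUNC(BAD_DATE) \
	FUNC(MISSING_EMAIL) \
	/* warnings */ \
	FUNC(BAD_FILEMODE) \
	FUNC(HAS_DOTGIT)

#define FIRST_WARNING MSG_BAD_FILEMODE

/* Expand the list into enum constants MSG_BAD_DATE, MSG_MISSING_EMAIL, ... */
#define MSG_ID(x) MSG_##x,
enum msg_id {
	FOREACH_MSG_ID(MSG_ID)
	MSG_MAX
};
#undef MSG_ID

/* The callback no longer sees varargs, only a finished string. */
typedef int (*error_fn)(int type, const char *message);

/* The default severity follows from the identifier's position in the list. */
int msg_type(enum msg_id id)
{
	return id < FIRST_WARNING ? MSG_ERROR : MSG_WARN;
}

/* Format once, centrally, then hand the buffer to the callback. */
int report(error_fn error_func, enum msg_id id, const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return error_func(msg_type(id), buf);
}

/* Example callback: remembers the last message instead of printing it. */
int last_type;
char last_message[256];

int record_error(int type, const char *message)
{
	last_type = type;
	snprintf(last_message, sizeof(last_message), "%s", message);
	return type == MSG_WARN ? 0 : 1;
}
```

Because the severity is looked up from the identifier in one place, a
later change can override that lookup per identifier without touching
any of the call sites.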

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  24 +++-----
 fsck.c         | 185 +++++++++++++++++++++++++++++++++++++++------------------
 fsck.h         |   5 +-
 3 files changed, 137 insertions(+), 77 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2241e29..99d4538 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -47,32 +47,22 @@ static int show_dangling = 1;
 #endif
 
 static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+                      const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+	        severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d6f539f..3cea034 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,83 @@
 #include "fsck.h"
 #include "refs.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE) \
+	FUNC(BAD_EMAIL) \
+	FUNC(BAD_NAME) \
+	FUNC(BAD_PARENT_SHA1) \
+	FUNC(BAD_TIMEZONE) \
+	FUNC(BAD_TREE_SHA1) \
+	FUNC(DATE_OVERFLOW) \
+	FUNC(DUPLICATE_ENTRIES) \
+	FUNC(INVALID_OBJECT_SHA1) \
+	FUNC(INVALID_TAG_OBJECT) \
+	FUNC(INVALID_TREE) \
+	FUNC(INVALID_TYPE) \
+	FUNC(MISSING_AUTHOR) \
+	FUNC(MISSING_COMMITTER) \
+	FUNC(MISSING_EMAIL) \
+	FUNC(MISSING_GRAFT) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL) \
+	FUNC(MISSING_OBJECT) \
+	FUNC(MISSING_PARENT) \
+	FUNC(MISSING_SPACE_BEFORE_DATE) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL) \
+	FUNC(MISSING_TAG) \
+	FUNC(MISSING_TAG_ENTRY) \
+	FUNC(MISSING_TAG_OBJECT) \
+	FUNC(MISSING_TREE) \
+	FUNC(MISSING_TYPE) \
+	FUNC(MISSING_TYPE_ENTRY) \
+	FUNC(NOT_SORTED) \
+	FUNC(NUL_IN_HEADER) \
+	FUNC(TAG_OBJECT_NOT_TAG) \
+	FUNC(UNKNOWN_TYPE) \
+	FUNC(UNTERMINATED_HEADER) \
+	FUNC(ZERO_PADDED_DATE) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE) \
+	FUNC(EMPTY_NAME) \
+	FUNC(FULL_PATHNAME) \
+	FUNC(HAS_DOT) \
+	FUNC(HAS_DOTDOT) \
+	FUNC(HAS_DOTGIT) \
+	FUNC(INVALID_TAG_NAME) \
+	FUNC(MISSING_TAGGER_ENTRY) \
+	FUNC(NULL_SHA1) \
+	FUNC(ZERO_PADDED_FILEMODE)
+
+#define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
+
+#define MSG_ID(x) FSCK_MSG_##x,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+
+int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
+{
+	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_type = fsck_msg_type(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_type, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -216,25 +293,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -247,15 +324,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %ld", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -263,28 +342,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -292,7 +371,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -309,13 +388,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -325,23 +404,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -373,11 +452,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -388,47 +469,47 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %s", buffer);
+		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -444,7 +525,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -453,7 +534,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -466,22 +547,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int type, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 84a337c..a18e9a6 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH 03/18] Provide a function to parse fsck message IDs
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 01/18] Introduce fsck options Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 02/18] Introduce identifiers for fsck messages Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-10 17:56   ` Junio C Hamano
  2014-12-08 16:14 ` [PATCH 04/18] Offer a function to demote fsck errors to warnings Johannes Schindelin
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. It takes an explicit length because the message ID
is usually only part of a larger string: we would like to be able to
parse, say, '--strict=missing-email=warn' command lines.

To make the parsing robust, we generate strings from the enum keys, and
we match lower-case, dash-separated values as well as camelCased ones
(e.g. both "missing-email" and "missingEmail" will match the
"MISSING_EMAIL" key).

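As an illustration of the matching rule described above, here is a
minimal standalone sketch. It is not the code from this patch:
matches_key() is a hypothetical helper, and it is slightly more lenient
than necessary (it also accepts "missing_email" and "missingemail").

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: return 1 if the first
 * `len` bytes of `text` name the upper-case, underscore-separated `key`.
 * A '_' in the key may appear in the text as '-', as '_', or not at all
 * (the camelCase spelling); letters are compared case-insensitively.
 */
static int matches_key(const char *text, size_t len, const char *key)
{
	size_t j = 0;

	while (*key) {
		if (*key == '_') {
			key++;
			/* the separator is optional in the user's text */
			if (j < len && (text[j] == '-' || text[j] == '_'))
				j++;
			continue;
		}
		if (j >= len || toupper((unsigned char)text[j]) != *key)
			return 0;
		j++;
		key++;
	}
	/* both the text and the key must be fully consumed */
	return j == len;
}
```

Passing the length separately means the caller can hand in a pointer
into the middle of an argument such as "missing-email=warn" without
copying the ID out first.
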
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/fsck.c b/fsck.c
index 3cea034..05b146c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,6 +63,38 @@ enum fsck_msg_id {
 	FSCK_MSG_MAX
 };
 
+#define STR(x) #x
+#define MSG_ID_STR(x) STR(x),
+static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID_STR)
+	NULL
+};
+
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_str[i];
+		/* msg_id_str is upper-case, with underscores */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_') {
+				if (isalpha(text[j]))
+					c = *(key++);
+				else if (text[j] != '_')
+					c = '-';
+			}
+			if (toupper(text[j]) != c)
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	die("Unhandled type: %.*s", len, text);
+}
+
 int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 {
 	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (2 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 03/18] Provide a function to parse fsck message IDs Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-10 18:00   ` Junio C Hamano
  2014-12-08 16:14 ` [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn Johannes Schindelin
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The function added by this commit parses a list of settings in the form:

	missing-email=warn,bad-name=warn,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far; other call paths (e.g. git index-pack --strict)
*always* error out, no matter what type was specified. Therefore, we
need to take extra care to default to FSCK_ERROR for all messages in
those cases.
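
To make the settings format concrete, here is a rough standalone sketch
of splitting such a list into key/type pairs. The names
(sketch_setting, parse_settings, SKETCH_*) are made up for this example
and do not appear in the patch; the sketch also assumes that a key
without "=type" means "error".

```c
#include <string.h>

enum sketch_type { SKETCH_ERROR, SKETCH_WARN, SKETCH_UNKNOWN };

struct sketch_setting {
	const char *key;	/* points into the input; not NUL-terminated */
	int key_len;
	enum sketch_type type;
};

/* Compare a length-delimited substring against a C string. */
static int substr_equals(const char *s, int len, const char *match)
{
	return (int)strlen(match) == len && !memcmp(s, match, len);
}

/*
 * Split "key=type,key=type,..." into at most `max` settings and return
 * how many were stored. Entries may be separated by ' ', ',' or '|'.
 */
static int parse_settings(const char *mode, struct sketch_setting *out,
			  int max)
{
	int n = 0;

	while (*mode && n < max) {
		int len = (int)strcspn(mode, " ,|");
		const char *eq;

		if (!len) {	/* skip a separator character */
			mode++;
			continue;
		}
		eq = memchr(mode, '=', len);
		out[n].key = mode;
		out[n].key_len = eq ? (int)(eq - mode) : len;
		out[n].type = SKETCH_ERROR;	/* assumed default */
		if (eq) {
			const char *t = eq + 1;
			int tlen = len - out[n].key_len - 1;

			if (substr_equals(t, tlen, "warn"))
				out[n].type = SKETCH_WARN;
			else if (!substr_equals(t, tlen, "error"))
				out[n].type = SKETCH_UNKNOWN;
		}
		n++;
		mode += len;
	}
	return n;
}
```

A real caller would then map each key to a message ID and reject
SKETCH_UNKNOWN entries instead of storing them.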

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fsck.h |  7 +++++--
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 05b146c..9e6d70f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -97,9 +97,63 @@ static int parse_msg_id(const char *text, int len)
 
 int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 {
+	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+		return options->strict_mode[msg_id];
+	if (options->strict)
+		return FSCK_ERROR;
 	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+void fsck_strict_mode(struct fsck_options *options, const char *mode)
+{
+	int type = FSCK_ERROR;
+
+	if (!options->strict_mode) {
+		int i;
+		int *strict_mode = malloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			strict_mode[i] = fsck_msg_type(i, options);
+		options->strict_mode = strict_mode;
+	}
+
+	while (*mode) {
+		int len = strcspn(mode, " ,|"), equal, msg_id;
+
+		if (!len) {
+			mode++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (mode[equal] == '=')
+				break;
+
+		if (equal < len) {
+			const char *type_str = mode + equal + 1;
+			int type_len = len - equal - 1;
+			if (!substrcmp(type_str, type_len, "error"))
+				type = FSCK_ERROR;
+			else if (!substrcmp(type_str, type_len, "warn"))
+				type = FSCK_WARN;
+			else
+				die("Unknown fsck message type: '%.*s'",
+					len - equal - 1, type_str);
+		}
+
+		msg_id = parse_msg_id(mode, equal);
+		options->strict_mode[msg_id] = type;
+		mode += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -585,6 +639,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int type, const char *message)
 {
+	if (type == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index a18e9a6..9d67ea2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,8 @@
 
 struct fsck_options;
 
+void fsck_strict_mode(struct fsck_options *options, const char *mode);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +27,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	int strict:1;
+	int *strict_mode;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (3 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 04/18] Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-10 17:52   ` Junio C Hamano
  2014-12-08 16:14 ` [PATCH 06/18] fsck: report the ID of the error/warning Johannes Schindelin
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missing-email warn

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.
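
The hand-off can be pictured with the following standalone sketch,
which collects receive.fsck.<key> settings into one comma-separated
string suitable for appending to "--strict=". It uses a plain char
buffer rather than git's strbuf, and collect_fsck_setting() is a
hypothetical name used only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only. If `var` is of the form
 * "receive.fsck.<key>", append "<key>=<value>" to the NUL-terminated
 * string in `buf` (capacity `size`), separated from earlier entries by
 * a comma. A missing value is treated as "error".
 *
 * Returns 1 if the variable was consumed, 0 if it is unrelated, and -1
 * if the entry does not fit (in which case `buf` is left unchanged).
 */
static int collect_fsck_setting(char *buf, size_t size,
				const char *var, const char *value)
{
	static const char prefix[] = "receive.fsck.";
	size_t used = strlen(buf);
	int n;

	if (strncmp(var, prefix, sizeof(prefix) - 1))
		return 0;
	n = snprintf(buf + used, size - used, "%s%s=%s",
		     used ? "," : "", var + sizeof(prefix) - 1,
		     value ? value : "error");
	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 1;
}
```

After feeding it receive.fsck.missing-email=warn and a valueless
receive.fsck.bad-name, the buffer would hold
"missing-email=warn,bad-name=error", which the parent could pass on as
--strict=missing-email=warn,bad-name=error.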

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 27 +++++++++++++++++++++++----
 builtin/unpack-objects.c |  5 +++++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 1c17c3f..34a11b3 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,6 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (starts_with(arg, "--strict=")) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_strict_mode(&fsck_options, arg + 9);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e908d07..111e514 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -35,6 +35,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_strict_mode = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
@@ -109,6 +110,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (starts_with(var, "receive.fsck.")) {
+		if (fsck_strict_mode.len)
+			strbuf_addch(&fsck_strict_mode, ',');
+		strbuf_addf(&fsck_strict_mode,
+			"%s=%s", var + 13, value ? value : "error");
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1266,8 +1275,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
 		if (quiet)
 			argv_array_push(&child.args, "-q");
-		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+		if (fsck_objects) {
+			if (fsck_strict_mode.len)
+				argv_array_pushf(&child.args, "--strict=%s",
+					fsck_strict_mode.buf);
+			else
+				argv_array_push(&child.args, "--strict");
+		}
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1284,8 +1298,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
-		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+		if (fsck_objects) {
+			if (fsck_strict_mode.len)
+				argv_array_pushf(&child.args, "--strict=%s",
+					fsck_strict_mode.buf);
+			else
+				argv_array_push(&child.args, "--strict");
+		}
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e9e8bec..916616f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (starts_with(arg, "--strict=")) {
+				strict = 1;
+				fsck_strict_mode(&fsck_options, arg + 9);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH 06/18] fsck: report the ID of the error/warning
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (4 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 07/18] Make fsck_ident() warn-friendly Johannes Schindelin
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

Some legacy repositories have objects with non-fatal fsck issues. To
enable the user to ignore those issues, let's print out the ID (e.g.
when encountering "missing-email", the user might want to call `git
config receive.fsck.missing-email warn`).
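
For illustration, the conversion from an enum-style key to the ID shown
to the user can be sketched as below. This is a standalone version that
writes into a caller-supplied buffer instead of a strbuf;
format_msg_id() is a hypothetical name, not a function in this patch.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: convert an upper-case,
 * underscore-separated key such as "MISSING_EMAIL" into the
 * user-facing form "missing-email".
 *
 * Returns the number of characters written (excluding the NUL), or -1
 * if `out` is too small.
 */
static int format_msg_id(const char *key, char *out, size_t size)
{
	size_t i;

	if (!size)
		return -1;
	for (i = 0; key[i]; i++) {
		if (i + 1 >= size)	/* keep room for the NUL */
			return -1;
		out[i] = key[i] == '_' ?
			'-' : (char)tolower((unsigned char)key[i]);
	}
	out[i] = '\0';
	return (int)i;
}
```

The output is exactly the spelling that can be pasted after
"receive.fsck." in a git config call, which is the point of printing
it.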

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/fsck.c b/fsck.c
index 9e6d70f..ff50a87 100644
--- a/fsck.c
+++ b/fsck.c
@@ -154,6 +154,23 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c == '_')
+			c = '-';
+		else
+			c = tolower(c);
+		strbuf_addch(sb, c);
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -162,6 +179,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	append_msg_id(&sb, msg_id_str[id]);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_type, sb.buf);
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH 07/18] Make fsck_ident() warn-friendly
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (5 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 06/18] fsck: report the ID of the error/warning Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-08 16:14 ` [PATCH 08/18] Make fsck_commit() warn-friendly Johannes Schindelin
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.
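
The idea, reduced to a standalone sketch: advance the caller's pointer
to the next line first, then validate through a local cursor, so that
every early return leaves the caller in a state it can continue from.
check_line() and its return codes are invented for this example and
only cover two of the checks; strchr() is used here instead of
strchrnul() to keep the sketch portable.

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: check that a line looks
 * like "Name <email> ...". Whatever the outcome, *line is left pointing
 * at the start of the next line (or at the terminating NUL).
 *
 * Returns 0 if the line looks fine, 1 if the email is missing, and 2
 * if the email is not terminated by '>'.
 */
static int check_line(const char **line)
{
	const char *p = *line;
	const char *eol = strchr(p, '\n');

	/* advance the caller's pointer before any check can return */
	*line = eol ? eol + 1 : p + strlen(p);

	p += strcspn(p, "<>\n");
	if (*p != '<')
		return 1;	/* missing email */
	p++;
	p += strcspn(p, "<>\n");
	if (*p != '>')
		return 2;	/* bad email */
	return 0;
}
```

With this shape, a caller that treats a nonzero result as a mere
warning can simply go on to parse the following header line.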

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index ff50a87..256f567 100644
--- a/fsck.c
+++ b/fsck.c
@@ -444,40 +444,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 08/18] Make fsck_commit() warn-friendly
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (6 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 07/18] Make fsck_ident() warn-friendly Johannes Schindelin
@ 2014-12-08 16:14 ` Johannes Schindelin
  2014-12-08 16:15 ` [PATCH 09/18] fsck: handle multiple authors in commits specially Johannes Schindelin
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:14 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too severe to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missing-author error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)
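
For illustration only (not part of the patch): a sketch of the
report-then-continue pattern, assuming a hypothetical report_issue()
that returns non-zero only when the issue is still an error. All names
here are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

enum severity { SEV_ERROR = 1, SEV_WARN = 2 };

/* Hypothetical report(): prints the issue, returns non-zero for errors only. */
static int report_issue(enum severity sev, const char *msg)
{
	fprintf(stderr, "%s: %s\n", sev == SEV_ERROR ? "error" : "warning", msg);
	return sev == SEV_ERROR ? -1 : 0;
}

/*
 * Runs two checks in order. Stops at the first real error, but keeps
 * going when an issue was demoted to a warning. *checks_run records
 * how far the function got.
 */
static int check_two_fields(int tree_ok, int parent_ok,
			    enum severity tree_sev, int *checks_run)
{
	int err;

	assert(checks_run);
	*checks_run = 1;
	if (!tree_ok) {
		err = report_issue(tree_sev, "bad tree sha1");
		if (err)
			return err;
	}
	*checks_run = 2;
	if (!parent_ok) {
		err = report_issue(SEV_ERROR, "bad parent sha1");
		if (err)
			return err;
	}
	return 0;
}
```

With the first issue demoted to a warning, the second check still runs
and can report its own problem.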

diff --git a/fsck.c b/fsck.c
index 256f567..a63654c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -499,12 +499,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -513,11 +519,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 09/18] fsck: handle multiple authors in commits specially
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (7 preceding siblings ...)
  2014-12-08 16:14 ` [PATCH 08/18] Make fsck_commit() warn-friendly Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:04   ` Junio C Hamano
  2014-12-08 16:15 ` [PATCH 10/18] Make fsck_tag() warn-friendly Johannes Schindelin
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
missing-committer = warn, but that could hide missing tree objects in the
same commit because we cannot continue verifying any commit object after
encountering a missing committer line, while we can continue in the case
of multiple author lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 8 ++++++++
 1 file changed, 8 insertions(+)
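
For illustration only (not part of the patch): a standalone sketch of
skipping surplus 'author' lines. skip_extra_authors() is a hypothetical
helper; unlike the loop in the diff, it does not rely on a prior
require_end_of_header() call and simply stops at an unterminated line.

```c
#include <assert.h>
#include <string.h>

/*
 * Skips every leading "author " line in *buffer and returns how many
 * were skipped. The cursor ends up on the first non-author line.
 */
static int skip_extra_authors(const char **buffer)
{
	int extra = 0;

	assert(buffer && *buffer);
	while (!strncmp(*buffer, "author ", 7)) {
		const char *eol = strchr(*buffer, '\n');

		if (!eol)
			break; /* unterminated line: do not run off the end */
		*buffer = eol + 1;
		extra++;
	}
	return extra;
}
```

Called right after the first author line has been consumed, a non-zero
return value corresponds to the multiple-authors condition.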

diff --git a/fsck.c b/fsck.c
index a63654c..21ff35b 100644
--- a/fsck.c
+++ b/fsck.c
@@ -37,6 +37,7 @@
 	FUNC(MISSING_TREE) \
 	FUNC(MISSING_TYPE) \
 	FUNC(MISSING_TYPE_ENTRY) \
+	FUNC(MULTIPLE_AUTHORS) \
 	FUNC(NOT_SORTED) \
 	FUNC(NUL_IN_HEADER) \
 	FUNC(TAG_OBJECT_NOT_TAG) \
@@ -536,6 +537,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		/* require_end_of_header() ensured that there is a newline */
+		buffer = strchr(buffer, '\n') + 1;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 10/18] Make fsck_tag() warn-friendly
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (8 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 09/18] fsck: handle multiple authors in commits specially Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-08 16:15 ` [PATCH 11/18] Add a simple test for receive.fsck.* Johannes Schindelin
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just like fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 21ff35b..c1e7a85 100644
--- a/fsck.c
+++ b/fsck.c
@@ -604,7 +604,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 11/18] Add a simple test for receive.fsck.*
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (9 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 10/18] Make fsck_tag() warn-friendly Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-08 16:15 ` [PATCH 12/18] Disallow demoting grave fsck errors to warnings Johannes Schindelin
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..db79e56 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit << EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.missing-mail = warn' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.missing-email warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missing-email" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 12/18] Disallow demoting grave fsck errors to warnings
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (10 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 11/18] Add a simple test for receive.fsck.* Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:06   ` Junio C Hamano
  2014-12-08 16:15 ` [PATCH 13/18] Optionally ignore specific fsck issues completely Johannes Schindelin
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 8 ++++++--
 t/t5504-fetch-receive-strict.sh | 9 +++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)
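
For illustration only (not part of the patch): a reduced sketch of how
ordering the IDs in the X-macro lets a single comparison decide whether
a message may be demoted. The DEMO_* names and the shortened ID list
are assumptions for this example.

```c
#include <assert.h>

/* IDs are grouped by default severity; the order is what matters. */
#define FOREACH_DEMO_ID(FUNC) \
	/* fatal errors */ \
	FUNC(NUL_IN_HEADER) \
	FUNC(UNTERMINATED_HEADER) \
	/* errors */ \
	FUNC(BAD_DATE) \
	FUNC(MISSING_EMAIL) \
	/* warnings */ \
	FUNC(BAD_FILEMODE)

#define DEMO_ID(x) DEMO_MSG_##x,
enum demo_msg_id {
	FOREACH_DEMO_ID(DEMO_ID)
	DEMO_MSG_MAX
};
#undef DEMO_ID

#define DEMO_FIRST_NON_FATAL DEMO_MSG_BAD_DATE

/* Everything sorted before the first non-fatal ID must stay an error. */
static int can_demote(enum demo_msg_id id)
{
	assert(id < DEMO_MSG_MAX);
	return id >= DEMO_FIRST_NON_FATAL;
}
```

The design trades flexibility for simplicity: moving an ID between
groups in the list is all it takes to change its category.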

diff --git a/fsck.c b/fsck.c
index c1e7a85..f8339af 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,9 @@
 #include "refs.h"
 
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER) \
+	FUNC(UNTERMINATED_HEADER) \
 	/* errors */ \
 	FUNC(BAD_DATE) \
 	FUNC(BAD_EMAIL) \
@@ -39,10 +42,8 @@
 	FUNC(MISSING_TYPE_ENTRY) \
 	FUNC(MULTIPLE_AUTHORS) \
 	FUNC(NOT_SORTED) \
-	FUNC(NUL_IN_HEADER) \
 	FUNC(TAG_OBJECT_NOT_TAG) \
 	FUNC(UNKNOWN_TYPE) \
-	FUNC(UNTERMINATED_HEADER) \
 	FUNC(ZERO_PADDED_DATE) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE) \
@@ -56,6 +57,7 @@
 	FUNC(NULL_SHA1) \
 	FUNC(ZERO_PADDED_FILEMODE)
 
+#define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
 
 #define MSG_ID(x) FSCK_MSG_##x,
@@ -150,6 +152,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 		}
 
 		msg_id = parse_msg_id(mode, equal);
+		if (type != FSCK_ERROR && msg_id < FIRST_NON_FATAL_ERROR)
+			die("Cannot demote %.*s", len, mode);
 		options->strict_mode[msg_id] = type;
 		mode += len;
 	}
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index db79e56..8a47db2 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' '
 	grep "missing-email" act
 '
 
+test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config receive.fsck.unterminated-header warn &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminated-header=warn" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 13/18] Optionally ignore specific fsck issues completely
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (11 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 12/18] Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:07   ` Junio C Hamano
  2014-12-08 16:15 ` [PATCH 14/18] fsck: allow upgrading fsck warnings to errors Johannes Schindelin
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective error to "ignore".

This change "abuses" the missing-email=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 7 ++++++-
 3 files changed, 12 insertions(+), 1 deletion(-)
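
For illustration only (not part of the patch): a sketch of reading one
entry out of a comma-separated "id=type" list that accepts "error",
"warn" and "ignore". The helper names and the return-0-on-unknown
convention are assumptions; the real parser dies on unknown types.

```c
#include <assert.h>
#include <string.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2, DEMO_IGNORE = 3 };

/* Maps a length-delimited word to a severity; returns 0 if unknown. */
static int parse_severity(const char *s, size_t len)
{
	if (len == 5 && !memcmp(s, "error", 5))
		return DEMO_ERROR;
	if (len == 4 && !memcmp(s, "warn", 4))
		return DEMO_WARN;
	if (len == 6 && !memcmp(s, "ignore", 6))
		return DEMO_IGNORE;
	return 0;
}

/* Finds `key` in a "k=v,k=v" list; returns its severity, or 0 if absent. */
static int lookup_severity(const char *list, const char *key)
{
	size_t keylen;

	assert(list && key);
	keylen = strlen(key);
	while (*list) {
		size_t len = strcspn(list, ",");
		const char *eq = memchr(list, '=', len);

		if (eq && (size_t)(eq - list) == keylen &&
		    !memcmp(list, key, keylen))
			return parse_severity(eq + 1, len - keylen - 1);
		list += len;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

For example, with the list "missing-email=ignore,bad-date=warn" the
lookup for "missing-email" yields the ignore value, which the caller
would use to suppress the message entirely.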

diff --git a/fsck.c b/fsck.c
index f8339af..abfd3af 100644
--- a/fsck.c
+++ b/fsck.c
@@ -146,6 +146,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 				type = FSCK_ERROR;
 			else if (!substrcmp(type_str, type_len, "warn"))
 				type = FSCK_WARN;
+			else if (!substrcmp(type_str, type_len, "ignore"))
+				type = FSCK_IGNORE;
 			else
 				die("Unknown fsck message type: '%.*s'",
 					len - equal - 1, type_str);
@@ -184,6 +186,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
 	append_msg_id(&sb, msg_id_str[id]);
 
 	va_start(ap, fmt);
diff --git a/fsck.h b/fsck.h
index 9d67ea2..82bedf9 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 8a47db2..0e521d9 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -132,7 +132,12 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' '
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config receive.fsck.missing-email warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missing-email" act
+	grep "missing-email" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git  --git-dir=dst/.git config receive.fsck.missing-email ignore &&
+	git  --git-dir=dst/.git config receive.fsck.bad-date warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	test_must_fail grep "missing-email" act
 '
 
 test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (12 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 13/18] Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:08   ` Junio C Hamano
  2014-12-08 16:15 ` [PATCH 15/18] Document the new receive.fsck.* options Johannes Schindelin
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by setting receive.fsck.invalid-tag-name and
receive.fsck.missing-tagger-entry to 'error'.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as used to be
the case before this commit).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 ++++++++++++++++--------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 17 insertions(+), 9 deletions(-)
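
For illustration only (not part of the patch): a reduced sketch of the
default-severity lookup once a third "info" tier exists. The D_* names
and the three-entry ID list are assumptions for this example; it models
only the case where no per-message override is configured.

```c
#include <assert.h>

/* One representative ID per tier, in tier order. */
enum demo_id { D_BAD_DATE, D_BAD_FILEMODE, D_MISSING_TAGGER_ENTRY, D_MAX };

#define D_FIRST_WARNING D_BAD_FILEMODE
#define D_FIRST_INFO    D_MISSING_TAGGER_ENTRY

enum { D_ERROR = 1, D_WARN = 2 };

/*
 * Strict mode promotes warnings to errors but leaves the info tier as
 * warnings; non-strict mode keeps the plain error/warning split.
 */
static int default_severity(enum demo_id id, int strict)
{
	assert(id < D_MAX);
	if (strict)
		return id < D_FIRST_INFO ? D_ERROR : D_WARN;
	return id < D_FIRST_WARNING ? D_ERROR : D_WARN;
}
```

This is why the t5302 expectation changes from "error:" to "warning:"
for the missing 'tagger' line under --strict.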

diff --git a/fsck.c b/fsck.c
index abfd3af..154f361 100644
--- a/fsck.c
+++ b/fsck.c
@@ -52,13 +52,15 @@
 	FUNC(HAS_DOT) \
 	FUNC(HAS_DOTDOT) \
 	FUNC(HAS_DOTGIT) \
-	FUNC(INVALID_TAG_NAME) \
-	FUNC(MISSING_TAGGER_ENTRY) \
 	FUNC(NULL_SHA1) \
-	FUNC(ZERO_PADDED_FILEMODE)
+	FUNC(ZERO_PADDED_FILEMODE) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(INVALID_TAG_NAME) \
+	FUNC(MISSING_TAGGER_ENTRY)
 
 #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
+#define FIRST_INFO FSCK_MSG_INVALID_TAG_NAME
 
 #define MSG_ID(x) FSCK_MSG_##x,
 enum fsck_msg_id {
@@ -103,7 +105,7 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
 		return options->strict_mode[msg_id];
 	if (options->strict)
-		return FSCK_ERROR;
+		return msg_id < FIRST_INFO ? FSCK_ERROR : FSCK_WARN;
 	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
@@ -643,13 +645,19 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 15/18] Document the new receive.fsck.* options.
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (13 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 14/18] fsck: allow upgrading fsck warnings to errors Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-08 16:15 ` [PATCH 16/18] fsck: support demoting errors to warnings Johannes Schindelin
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
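
For illustration only (not part of the patch): one way a dashed name
and its camelCased spelling could be treated as the same option. This
is an assumption for the example; the matching actually used by the
series lives in the message-ID parser of an earlier patch, which is
not reproduced here. same_option_name() is a hypothetical helper.

```c
#include <assert.h>
#include <ctype.h>

/* Compares two option names, ignoring case and '-' separators. */
static int same_option_name(const char *a, const char *b)
{
	assert(a && b);
	for (;;) {
		while (*a == '-')
			a++;
		while (*b == '-')
			b++;
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		if (!*a)
			return 1; /* both strings ended together */
		a++;
		b++;
	}
}
```

Under this scheme "missing-email" and "missingEmail" compare equal,
while "bad-name" and "bad-date" do not.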

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 7deae0b..b3276ee 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2109,6 +2109,20 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.*::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by setting e.g. `receive.fsck.bad-name`
+	to `warn` or `error` (or `ignore` to hide those errors
+	completely). For convenience, fsck prefixes the error/warning
+	with the name of the option, e.g. "missing-email: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.missing-email` to `ignore` will hide that issue.
+	For convenience, camelCased options are accepted, too (e.g.
+	`receive.fsck.missingEmail`).
++
+This feature is intended to support working with legacy repositories
+which could not be pushed when `receive.fsckObjects = true`.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (14 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 15/18] Document the new receive.fsck.* options Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:15   ` Junio C Hamano
  2014-12-08 16:15 ` [PATCH 17/18] Introduce `git fsck --quick` Johannes Schindelin
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
missing emails in author, committer and tagger lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 13 +++++++++++++
 builtin/fsck.c           | 15 +++++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 39 insertions(+)
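
For illustration only (not part of the patch): a sketch of turning a
config variable such as fsck.missing-email into the "id=type" string
that the strict-mode parser consumes, using a fixed buffer instead of
a strbuf. fsck_var_to_mode() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "<id>=<type>" into out. A missing value defaults to "error",
 * mirroring the fallback in the diff below. Returns 1 if var was an
 * fsck.* key and the result fit, 0 otherwise.
 */
static int fsck_var_to_mode(const char *var, const char *value,
			    char *out, size_t outlen)
{
	int n;

	assert(var && out);
	if (strncmp(var, "fsck.", 5))
		return 0;
	n = snprintf(out, outlen, "%s=%s", var + 5, value ? value : "error");
	return n >= 0 && (size_t)n < outlen;
}
```

For fsck.missing-email set to ignore, this produces
"missing-email=ignore".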

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b3276ee..fa58c26 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1192,6 +1192,19 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.*::
+	With these options, fsck errors can be switched to warnings and
+	vice versa by setting e.g. `fsck.bad-name` to `warn` or `error`
+	(or `ignore` to hide those errors completely). For convenience,
+	fsck prefixes the error/warning with the name of the option, e.g.
+	"missing-email: invalid author/committer line - missing email"
+	means that setting `fsck.missing-email` to `ignore` will hide that
+	issue.  For convenience, camelCased options are accepted, too (e.g.
+	`fsck.missingEmail`).
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99d4538..2b8faa4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,19 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (starts_with(var, "fsck.")) {
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "%s=%s", var + 5, value ? value : "error");
+		fsck_strict_mode(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
                       const char *err)
 {
@@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 019fddd..d74df19 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -283,6 +283,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.multiple-authors=ignore fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 17/18] Introduce `git fsck --quick`
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (15 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 16/18] fsck: support demoting errors to warnings Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-08 16:15 ` [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
  2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

This option avoids unpacking each and every object, and just verifies the
connectivity. Particularly with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	not unpacking blobs, this speeds up the operation, at the
+	expense of not detecting corrupt blobs.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2b8faa4..dcea9b0 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -184,6 +185,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -618,6 +621,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -654,7 +658,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index d74df19..d389d4a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -407,4 +407,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (16 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 17/18] Introduce `git fsck --quick` Johannes Schindelin
@ 2014-12-08 16:15 ` Johannes Schindelin
  2014-12-10 18:23   ` Junio C Hamano
  2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-08 16:15 UTC (permalink / raw)
  To: gitster; +Cc: git

The optional new config option `receive.fsck.skip-list` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/receive-pack.c          |  9 +++++++
 fsck.c                          | 59 +++++++++++++++++++++++++++++++++++++++--
 fsck.h                          |  2 ++
 t/t5504-fetch-receive-strict.sh | 12 +++++++++
 4 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 111e514..5169f1f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -110,6 +110,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (starts_with(var, "receive.fsck.skip-list")) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		if (fsck_strict_mode.len)
+			strbuf_addch(&fsck_strict_mode, ',');
+		strbuf_addf(&fsck_strict_mode, "skip-list=%s", path);
+		return 0;
+	}
+
 	if (starts_with(var, "receive.fsck.")) {
 		if (fsck_strict_mode.len)
 			strbuf_addch(&fsck_strict_mode, ',');
diff --git a/fsck.c b/fsck.c
index 154f361..00693f2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -7,6 +7,7 @@
 #include "tag.h"
 #include "fsck.h"
 #include "refs.h"
+#include "sha1-array.h"
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -56,7 +57,9 @@
 	FUNC(ZERO_PADDED_FILEMODE) \
 	/* infos (reported as warnings, but ignored by default) */ \
 	FUNC(INVALID_TAG_NAME) \
-	FUNC(MISSING_TAGGER_ENTRY)
+	FUNC(MISSING_TAGGER_ENTRY) \
+	/* special value */ \
+	FUNC(SKIP_LIST)
 
 #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
@@ -109,6 +112,43 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
+static void init_skip_list(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skip_list = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skip_list)
+		sorted = options->skip_list->sorted;
+	else {
+		sorted = 1;
+		options->skip_list = &skip_list;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skip_list, sha1);
+		if (sorted && skip_list.nr > 1 &&
+				hashcmp(skip_list.sha1[skip_list.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skip_list.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -141,6 +181,18 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 			if (mode[equal] == '=')
 				break;
 
+		msg_id = parse_msg_id(mode, equal);
+		if (msg_id == FSCK_MSG_SKIP_LIST) {
+			char *path = xstrndup(mode + equal + 1, len - equal - 1);
+
+			if (equal == len)
+				die("skip-list requires a path");
+			init_skip_list(options, path);
+			free(path);
+			mode += len;
+			continue;
+		}
+
 		if (equal < len) {
 			const char *type_str = mode + equal + 1;
 			int type_len = len - equal - 1;
@@ -155,7 +207,6 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 					len - equal - 1, type_str);
 		}
 
-		msg_id = parse_msg_id(mode, equal);
 		if (type != FSCK_ERROR && msg_id < FIRST_NON_FATAL_ERROR)
 			die("Cannot demote %.*s", len, mode);
 		options->strict_mode[msg_id] = type;
@@ -681,6 +732,10 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
+	if (options->skip_list &&
+			sha1_array_lookup(options->skip_list, obj->sha1) >= 0)
+		return 0;
+
 	if (!obj)
 		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
diff --git a/fsck.h b/fsck.h
index 82bedf9..74d11cd 100644
--- a/fsck.h
+++ b/fsck.h
@@ -29,6 +29,8 @@ struct fsck_options {
 	fsck_error error_func;
 	int strict:1;
 	int *strict_mode;
+	/* TODO: consider reading into a hashmap */
+	struct sha1_array *skip_list;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 0e521d9..cf6cd5d 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skip-list' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skip-list SKIP &&
+	echo $commit > dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.missing-mail = warn' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.0.0.rc3.9669.g840d1f9

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* Re: [PATCH 01/18] Introduce fsck options
  2014-12-08 16:14 ` [PATCH 01/18] Introduce fsck options Johannes Schindelin
@ 2014-12-10 15:33   ` Junio C Hamano
  2014-12-22 17:26     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 15:33 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

This is not limited to this step, but

> Subject: Re: [PATCH 01/18] Introduce fsck options

please make it easier to cluster and spot the series in the eventual
shortlog by giving a common prefix to the patches, e.g.

	fsck: introduce fsck_options struct

> +static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
> +static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;

Is it a good idea to allow walker to be strict but obj verifier to
be not (or vice versa)?  I am wondering why this is not a single
struct with two callback function pointers.

> +struct fsck_options {
> +	fsck_walk_func walk;
> +	fsck_error error_func;
> +	int strict:1;

A signed 1-bit-wide bitfield can hold its sign-bit and nothing else,
no?

    unsigned strict:1;
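
A minimal sketch of the bitfield point above, assuming a two's-complement
target. The struct and function names are hypothetical; only the `strict`
member mirrors the patch. A signed one-bit field can represent 0 and -1,
so a plain truth test such as `if (options->strict)` still behaves, but a
comparison against 1 would not.

```c
#include <assert.h>

/* Hypothetical structs for illustration; only `strict` mirrors the patch. */
struct bf_signed_opts {
	signed int strict:1;	/* one bit: representable values are 0 and -1 */
};

struct bf_unsigned_opts {
	unsigned strict:1;	/* one bit: representable values are 0 and 1 */
};

/* Stores the only nonzero value that fits and returns what reads back. */
int bf_signed_flag_value(void)
{
	struct bf_signed_opts o = { 0 };

	o.strict = -1;
	return o.strict;
}

/* Stores 1 in the unsigned field and returns what reads back. */
unsigned bf_unsigned_flag_value(void)
{
	struct bf_unsigned_opts o = { 0 };

	o.strict = 1;
	return o.strict;
}
```

With the unsigned variant, `options->strict == 1` and `if (options->strict)`
agree, which is the behavior callers usually expect from a flag.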

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn
  2014-12-08 16:14 ` [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn Johannes Schindelin
@ 2014-12-10 17:52   ` Junio C Hamano
  2014-12-22 21:44     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 17:52 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> For example, missing emails in commit and tag objects can be demoted to
> mere warnings with
>
> 	git config receive.fsck.missing-email warn

No punctuation in the first and the last level of configuration
variable names, please.  I.e. s/missing-email/missingEmail/ or
something.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 03/18] Provide a function to parse fsck message IDs
  2014-12-08 16:14 ` [PATCH 03/18] Provide a function to parse fsck message IDs Johannes Schindelin
@ 2014-12-10 17:56   ` Junio C Hamano
  2014-12-22 21:27     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 17:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> This function will be used in the next commits to allow the user to
> ask fsck to handle specific problems differently, e.g. demoting certain
> errors to warnings. It has to handle partial strings because we would
> like to be able to parse, say, '--strict=missing-email=warn' command
> lines.
>
> To make the parsing robust, we generate strings from the enum keys, and we
> will match both lower-case, dash-separated values as well as camelCased
> ones (e.g. both "missing-email" and "missingEmail" will match the
> "MISSING_EMAIL" key).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/fsck.c b/fsck.c
> index 3cea034..05b146c 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -63,6 +63,38 @@ enum fsck_msg_id {
>  	FSCK_MSG_MAX
>  };
>  
> +#define STR(x) #x
> +#define MSG_ID_STR(x) STR(x),
> +static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
> +	FOREACH_MSG_ID(MSG_ID_STR)
> +	NULL
> +};

I wondered what the ugly macro was in the previous step, but as a
way to keep these two lists in sync it makes sense.
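
A minimal sketch of the pattern being discussed, under stated
assumptions: the list is shortened to three keys, all `xm_`/`XM_` names
are hypothetical, and the matching rule (ignore case, skip '-' in the
name and '_' in the key) is a guess at the described behavior, since the
patch's own parse_msg_id() is not quoted in this reply.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, shortened list; the real FOREACH_MSG_ID has many more keys. */
#define XM_FOREACH_ID(FUNC) \
	FUNC(BAD_DATE) \
	FUNC(MISSING_EMAIL) \
	FUNC(ZERO_PADDED_FILEMODE)

/* First expansion: the enum. */
#define XM_ID(x) XM_MSG_##x,
enum xm_msg_id {
	XM_FOREACH_ID(XM_ID)
	XM_MSG_MAX
};

/* Second expansion: the strings, in the same order by construction. */
#define XM_STR(x) #x,
static const char *xm_id_str[XM_MSG_MAX + 1] = {
	XM_FOREACH_ID(XM_STR)
	NULL
};

/*
 * Compare a user-supplied name against an enum key, ignoring case and
 * skipping '_' in the key and '-' in the name, so that "missing-email"
 * and "missingEmail" both match "MISSING_EMAIL".
 */
static int xm_name_matches(const char *name, size_t len, const char *key)
{
	size_t i = 0;

	while (*key) {
		if (*key == '_') {
			key++;
			continue;
		}
		while (i < len && name[i] == '-')
			i++;
		if (i >= len)
			return 0;
		if (toupper((unsigned char)name[i]) != *key)
			return 0;
		i++;
		key++;
	}
	return i == len;
}

/* Returns the matching ID, or -1 when the name is unknown. */
int xm_parse_msg_id(const char *name, size_t len)
{
	int i;

	for (i = 0; i < XM_MSG_MAX; i++)
		if (xm_name_matches(name, len, xm_id_str[i]))
			return i;
	return -1;
}
```

Adding a key to the one list updates both the enum and the string table,
which is the "keep these two lists in sync" property noted above.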

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-08 16:14 ` [PATCH 04/18] Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2014-12-10 18:00   ` Junio C Hamano
  2014-12-22 21:43     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:00 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> There are legacy repositories out there whose older commits and tags
> have issues that prevent pushing them when 'receive.fsckObjects' is set.
> One real-life example is a commit object that has been hand-crafted to
> list two authors.
>
> Often, it is not possible to fix those issues without disrupting the
> work with said repositories, yet it is still desirable to perform checks
> by setting `receive.fsckObjects = true`. This commit is the first step
> to allow demoting specific fsck issues to mere warnings.
>
> The function added by this commit parses a list of settings in the form:
>
> 	missing-email=warn,bad-name=warn,...
>
> Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
> git fsck so far, but other call paths (e.g. git index-pack --strict)
> error out *always* no matter what type was specified. Therefore, we
> need to take extra care to default to all FSCK_ERROR in those cases.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fsck.h |  7 +++++--
>  2 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 05b146c..9e6d70f 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -97,9 +97,63 @@ static int parse_msg_id(const char *text, int len)
>  
>  int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
>  {
> +	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
> +		return options->strict_mode[msg_id];
> +	if (options->strict)
> +		return FSCK_ERROR;
>  	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
>  }

Hmm, if you are later going to allow demoting (hopefully also promoting)
error to warn, etc., would the comparison between msg_id and FIRST_WARNING
make much sense?

In other words, at some point wouldn't we be better off with
something like this

	struct {
        	enum id;
                const char *id_string;
                enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
	} possible_fsck_errors[];
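
One way the table suggested above could be spelled out so that it
compiles. This is a sketch, not what the series does: the `tbl_`/`TBL_`
names, the three sample entries, and the per-run override array are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum tbl_level { TBL_PASS, TBL_WARN, TBL_ERROR };

enum tbl_id { TBL_BAD_DATE, TBL_MISSING_EMAIL, TBL_BAD_FILEMODE, TBL_ID_MAX };

/* Each message carries its own default level instead of relying on enum order. */
struct tbl_msg {
	enum tbl_id id;
	const char *id_string;
	enum tbl_level default_level;
};

static const struct tbl_msg tbl_msgs[TBL_ID_MAX] = {
	{ TBL_BAD_DATE, "badDate", TBL_ERROR },
	{ TBL_MISSING_EMAIL, "missingEmail", TBL_ERROR },
	{ TBL_BAD_FILEMODE, "badFilemode", TBL_WARN },
};

/* Per-run overrides; -1 means "use the default from the table". */
struct tbl_options {
	int override[TBL_ID_MAX];
};

void tbl_options_init(struct tbl_options *o)
{
	int i;

	for (i = 0; i < TBL_ID_MAX; i++)
		o->override[i] = -1;
}

/* Returns the override when one is set, otherwise the table default. */
enum tbl_level tbl_msg_level(const struct tbl_options *o, enum tbl_id id)
{
	assert((int)id >= 0 && id < TBL_ID_MAX);
	if (o->override[id] >= 0)
		return (enum tbl_level)o->override[id];
	return tbl_msgs[id].default_level;
}
```

With per-entry defaults, promoting or demoting a message no longer
depends on where it sits relative to a FIRST_WARNING-style boundary.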

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 09/18] fsck: handle multiple authors in commits specially
  2014-12-08 16:15 ` [PATCH 09/18] fsck: handle multiple authors in commits specially Johannes Schindelin
@ 2014-12-10 18:04   ` Junio C Hamano
  2014-12-22 21:53     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> This problem has been detected in the wild, and is the primary reason
> to introduce an option to demote certain fsck errors to warnings. Let's
> offer to ignore this particular problem specifically.
> ...
> +	while (skip_prefix(buffer, "author ", &buffer)) {
> +		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
> +		if (err)
> +			return err;

If we have an option to demote this to a warning, wouldn't we want
to do the same fsck_ident() on that secondary author line?

> +		/* require_end_of_header() ensured that there is a newline */
> +		buffer = strchr(buffer, '\n') + 1;
> +	}
>  	if (!skip_prefix(buffer, "committer ", &buffer))
>  		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
>  	err = fsck_ident(&buffer, &commit->object, options);

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 12/18] Disallow demoting grave fsck errors to warnings
  2014-12-08 16:15 ` [PATCH 12/18] Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2014-12-10 18:06   ` Junio C Hamano
  2014-12-22 21:56     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Some kinds of errors are intrinsically unrecoverable (e.g. errors while
> uncompressing objects). It does not make sense to allow demoting them to
> mere warnings.

Makes sense, but the patch makes it look as if this is an "oops, we
should have done the list in patch 02/18 in this order from the
beginning".  Can we reorder the patches?

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c                          | 8 ++++++--
>  t/t5504-fetch-receive-strict.sh | 9 +++++++++
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index c1e7a85..f8339af 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -9,6 +9,9 @@
>  #include "refs.h"
>  
>  #define FOREACH_MSG_ID(FUNC) \
> +	/* fatal errors */ \
> +	FUNC(NUL_IN_HEADER) \
> +	FUNC(UNTERMINATED_HEADER) \
>  	/* errors */ \
>  	FUNC(BAD_DATE) \
>  	FUNC(BAD_EMAIL) \
> @@ -39,10 +42,8 @@
>  	FUNC(MISSING_TYPE_ENTRY) \
>  	FUNC(MULTIPLE_AUTHORS) \
>  	FUNC(NOT_SORTED) \
> -	FUNC(NUL_IN_HEADER) \
>  	FUNC(TAG_OBJECT_NOT_TAG) \
>  	FUNC(UNKNOWN_TYPE) \
> -	FUNC(UNTERMINATED_HEADER) \
>  	FUNC(ZERO_PADDED_DATE) \
>  	/* warnings */ \
>  	FUNC(BAD_FILEMODE) \
> @@ -56,6 +57,7 @@
>  	FUNC(NULL_SHA1) \
>  	FUNC(ZERO_PADDED_FILEMODE)
>  
> +#define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
>  #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
>  
>  #define MSG_ID(x) FSCK_MSG_##x,
> @@ -150,6 +152,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
>  		}
>  
>  		msg_id = parse_msg_id(mode, equal);
> +		if (type != FSCK_ERROR && msg_id < FIRST_NON_FATAL_ERROR)
> +			die("Cannot demote %.*s", len, mode);
>  		options->strict_mode[msg_id] = type;
>  		mode += len;
>  	}
> diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
> index db79e56..8a47db2 100755
> --- a/t/t5504-fetch-receive-strict.sh
> +++ b/t/t5504-fetch-receive-strict.sh
> @@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' '
>  	grep "missing-email" act
>  '
>  
> +test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
> +	rm -rf dst &&
> +	git init dst &&
> +	git --git-dir=dst/.git config receive.fsckobjects true &&
> +	git --git-dir=dst/.git config receive.fsck.unterminated-header warn &&
> +	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
> +	grep "Cannot demote unterminated-header=warn" act
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 13/18] Optionally ignore specific fsck issues completely
  2014-12-08 16:15 ` [PATCH 13/18] Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2014-12-10 18:07   ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> An fsck issue in a legacy repository might be so common that one would
> like not to bother the user with mentioning it at all. With this change,
> that is possible by setting the respective error to "ignore".

Makes sense.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
  2014-12-08 16:15 ` [PATCH 14/18] fsck: allow upgrading fsck warnings to errors Johannes Schindelin
@ 2014-12-10 18:08   ` Junio C Hamano
  2014-12-22 22:01     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> The 'invalid tag name' and 'missing tagger entry' warnings can now be
> upgraded to errors by setting receive.fsck.invalid-tag-name and
> receive.fsck.missing-tagger-entry to 'error'.

Hmm, why can't all warnings be promoted to errors, or are the above
two mentioned only as examples?

>
> Incidentally, the missing tagger warning is now really shown as a warning
> (as opposed to being reported with the "error:" prefix, as it used to be
> the case before this commit).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c                | 24 ++++++++++++++++--------
>  t/t5302-pack-index.sh |  2 +-
>  2 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index abfd3af..154f361 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -52,13 +52,15 @@
>  	FUNC(HAS_DOT) \
>  	FUNC(HAS_DOTDOT) \
>  	FUNC(HAS_DOTGIT) \
> -	FUNC(INVALID_TAG_NAME) \
> -	FUNC(MISSING_TAGGER_ENTRY) \
>  	FUNC(NULL_SHA1) \
> -	FUNC(ZERO_PADDED_FILEMODE)
> +	FUNC(ZERO_PADDED_FILEMODE) \
> +	/* infos (reported as warnings, but ignored by default) */ \
> +	FUNC(INVALID_TAG_NAME) \
> +	FUNC(MISSING_TAGGER_ENTRY)
>  
>  #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
>  #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
> +#define FIRST_INFO FSCK_MSG_INVALID_TAG_NAME
>  
>  #define MSG_ID(x) FSCK_MSG_##x,
>  enum fsck_msg_id {
> @@ -103,7 +105,7 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
>  	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
>  		return options->strict_mode[msg_id];
>  	if (options->strict)
> -		return FSCK_ERROR;
> +		return msg_id < FIRST_INFO ? FSCK_ERROR : FSCK_WARN;
>  	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
>  }
>  
> @@ -643,13 +645,19 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
>  		goto done;
>  	}
>  	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
> -	if (check_refname_format(sb.buf, 0))
> -		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
> +	if (check_refname_format(sb.buf, 0)) {
> +		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
> +		if (ret)
> +			goto done;
> +	}
>  	buffer = eol + 1;
>  
> -	if (!skip_prefix(buffer, "tagger ", &buffer))
> +	if (!skip_prefix(buffer, "tagger ", &buffer)) {
>  		/* early tags do not contain 'tagger' lines; warn only */
> -		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
> +		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
> +		if (ret)
> +			goto done;
> +	}
>  	else
>  		ret = fsck_ident(&buffer, &tag->object, options);
>  
> diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
> index 61bc8da..3dc5ec4 100755
> --- a/t/t5302-pack-index.sh
> +++ b/t/t5302-pack-index.sh
> @@ -259,7 +259,7 @@ EOF
>      thirtyeight=${tag#??} &&
>      rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
>      git index-pack --strict tag-test-${pack1}.pack 2>err &&
> -    grep "^error:.* expected .tagger. line" err
> +    grep "^warning:.* expected .tagger. line" err
>  '
>  
>  test_done

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-08 16:15 ` [PATCH 16/18] fsck: support demoting errors to warnings Johannes Schindelin
@ 2014-12-10 18:15   ` Junio C Hamano
  2014-12-22 22:25     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:15 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> We already have support in `git receive-pack` to deal with some legacy
> repositories which have non-fatal issues.
>
> Let's make `git fsck` itself useful with such repositories, too, by
> allowing users to ignore known issues, or at least demote those issues
> to mere warnings.
>
> Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
> missing emails in author, committer and tagger lines.

Hopefully I do not have to repeat myself, but please do not have
punctuation in the first and the last level of configuration
variable names, i.e. fsck.missingEmail, not missing-email.

Should these be tied to receive-pack ones in any way?  E.g. if you
set fsck.missingEmail to ignore, you do not have to do the same for
receive and accept a push with the specific error turned off?

Not a rhetorical question.  I can see it argued both ways.  The
justification to defend the position of not tying these two I would
have is so that I can be more strict to newer breakages (i.e. not
accepting a push that introduces a new breakage by not ignoring with
receive.fsck.*) while allowing breakages that are already present.
The justification for the opposite position is to make it more
convenient to write a consistent policy.  Whichever way is chosen,
we would want to see the reason left in the log message so that
people do not have to wonder what the original motivation was when
they decide if it is a good idea to change this part of the code.

Thanks.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing
  2014-12-08 16:15 ` [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2014-12-10 18:23   ` Junio C Hamano
  2014-12-22 22:19     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> The optional new config option `receive.fsck.skip-list` specifies the path
> to a file listing the names, i.e. SHA-1s, one per line, of objects that
> are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.
>
> This is extremely handy in case of legacy repositories where it would
> cause more pain to change incorrect objects than to live with them
> (e.g. a duplicate 'author' line in an early commit object).
>
> The intended use case is for server administrators to inspect objects
> that are reported by `git push` as being too problematic to enter the
> repository, and to add the objects' SHA-1 to a (preferably sorted) file
> when the objects are legitimate, i.e. when it is determined that those
> problematic objects should be allowed to enter the server.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/receive-pack.c          |  9 +++++++
>  fsck.c                          | 59 +++++++++++++++++++++++++++++++++++++++--
>  fsck.h                          |  2 ++
>  t/t5504-fetch-receive-strict.sh | 12 +++++++++
>  4 files changed, 80 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 111e514..5169f1f 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -110,6 +110,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
>  		return 0;
>  	}
>  
> +	if (starts_with(var, "receive.fsck.skip-list")) {

s/skip-list/skiplist/;

> +		const char *path = is_absolute_path(value) ?
> +			value : git_path("%s", value);
> +		if (fsck_strict_mode.len)
> +			strbuf_addch(&fsck_strict_mode, ',');
> +		strbuf_addf(&fsck_strict_mode, "skip-list=%s", path);
> +		return 0;
> +	}
> +
>  	if (starts_with(var, "receive.fsck.")) {
>  		if (fsck_strict_mode.len)
>  			strbuf_addch(&fsck_strict_mode, ',');
> diff --git a/fsck.c b/fsck.c
> index 154f361..00693f2 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -7,6 +7,7 @@
>  #include "tag.h"
>  #include "fsck.h"
>  #include "refs.h"
> +#include "sha1-array.h"
>  
>  #define FOREACH_MSG_ID(FUNC) \
>  	/* fatal errors */ \
> @@ -56,7 +57,9 @@
>  	FUNC(ZERO_PADDED_FILEMODE) \
>  	/* infos (reported as warnings, but ignored by default) */ \
>  	FUNC(INVALID_TAG_NAME) \
> -	FUNC(MISSING_TAGGER_ENTRY)
> +	FUNC(MISSING_TAGGER_ENTRY) \
> +	/* special value */ \
> +	FUNC(SKIP_LIST)

This feels like a kludge to me without comment on what "special
value" means.  Does it mean "this object has an error (which by
default is ignored) of being on the skip list?"  Should we be able
to optionally warn that an object on the skip-list exists, using the
same mechanism the rest of the series uses to tweak the error level?
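
For reference, the skip list described in the quoted commit message (one
SHA-1 per line, preferably sorted) can be sketched independently of the
message-ID machinery. The code below is an illustration under
assumptions: it keeps the names as hex text and uses the standard
bsearch(), whereas the patch stores binary SHA-1s in a sha1_array; all
`skip_` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKIP_HEXSZ 40

/* Returns 1 when `rec` (41 bytes) is 40 hex digits followed by '\n'. */
int skip_record_is_valid(const char *rec)
{
	int i;

	for (i = 0; i < SKIP_HEXSZ; i++) {
		char c = rec[i];

		if (!((c >= '0' && c <= '9') ||
		      (c >= 'a' && c <= 'f') ||
		      (c >= 'A' && c <= 'F')))
			return 0;
	}
	return rec[SKIP_HEXSZ] == '\n';
}

/* Orders two records by their 40 hex characters. */
static int skip_cmp(const void *a, const void *b)
{
	return memcmp(a, b, SKIP_HEXSZ);
}

/*
 * `records` holds `n` entries of SKIP_HEXSZ + 1 bytes each (40 hex
 * characters plus one terminator byte), sorted ascending.  Returns 1
 * when `hex` is among them.
 */
int skip_list_contains(const char *records, size_t n, const char *hex)
{
	return bsearch(hex, records, n, SKIP_HEXSZ + 1, skip_cmp) != NULL;
}
```

The binary search is only valid on a sorted list, which is why the
commit message asks for a "preferably sorted" file and why the patch
tracks whether the entries arrived in order.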

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 00/18] Introduce an internal API to interact with the fsck machinery
  2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                   ` (17 preceding siblings ...)
  2014-12-08 16:15 ` [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2014-12-10 18:34 ` Junio C Hamano
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                     ` (2 more replies)
  18 siblings, 3 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-10 18:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> At the moment, the git-fsck's integrity checks are targeted toward the
> end user, i.e. the error messages are really just messages, intended for
> human consumption.
>
> Under certain circumstances, some of those errors should be allowed to
> be turned into mere warnings, though, because the cost of fixing the
> issues might well be larger than the cost of carrying those flawed
> objects.

Overall I very much like what this series aims to do.
Thanks for working on this.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 01/18] Introduce fsck options
  2014-12-10 15:33   ` Junio C Hamano
@ 2014-12-22 17:26     ` Johannes Schindelin
  2014-12-22 17:32       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 17:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > Subject: Re: [PATCH 01/18] Introduce fsck options
> 
> please make it easier to cluster and spot the series in the eventual
> shortlog by giving a common prefix to the patches, e.g.
> 
> 	fsck: introduce fsck_options struct

I use the fsck: prefix consistently now.

> > +static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
> > +static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
> 
> Is it a good idea to allow walker to be strict but obj verifier to
> be not (or vice versa)?  I am wondering why this is not a single
> struct with two callback function pointers.

Unfortunately not. There are two different walkers used, and in fact,
fsck_walk_options() is only used to walk the objects, not to fsck them.

Now, I could use only one struct and set the walker, but that is not
thread-safe, and while code is not threaded yet AFAICT, it might be in the
future. That is why I decided to be rather safe than sorry. If you want it
differently, please just say the word, I will make it so.

> > +struct fsck_options {
> > +	fsck_walk_func walk;
> > +	fsck_error error_func;
> > +	int strict:1;
> 
> A signed 1-bit-wide bitfield can hold its sign-bit and nothing else,
> no?
> 
>     unsigned strict:1;

Oops. Right. For some reason, it worked here, though... Fixed!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 01/18] Introduce fsck options
  2014-12-22 17:26     ` Johannes Schindelin
@ 2014-12-22 17:32       ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 17:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Is it a good idea to allow walker to be strict but obj verifier to
>> be not (or vice versa)?  I am wondering why this is not a single
>> struct with two callback function pointers.
>
> Unfortunately not. There are two different walkers used, and in fact,
> fsck_walk_options() is only used to walk the objects, not to fsck them.
>
> Now, I could use only one struct and set the walker, but that is not
> thread-safe, and while code is not threaded yet AFAICT, it might be in the
> future. That is why I decided to be rather safe than sorry. If you want it
> differently, please just say the word, I will make it so.

Thanks for explaining; I just found that the reason behind the
design choice was unclear and wanted to know.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 03/18] Provide a function to parse fsck message IDs
  2014-12-10 17:56   ` Junio C Hamano
@ 2014-12-22 21:27     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > This function will be used in the next commits to allow the user to
> > ask fsck to handle specific problems differently, e.g. demoting certain
> > errors to warnings. It has to handle partial strings because we would
> > like to be able to parse, say, '--strict=missing-email=warn' command
> > lines.
> >
> > To make the parsing robust, we generate strings from the enum keys, and we
> > will match both lower-case, dash-separated values as well as camelCased
> > ones (e.g. both "missing-email" and "missingEmail" will match the
> > "MISSING_EMAIL" key).
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  fsck.c | 32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/fsck.c b/fsck.c
> > index 3cea034..05b146c 100644
> > --- a/fsck.c
> > +++ b/fsck.c
> > @@ -63,6 +63,38 @@ enum fsck_msg_id {
> >  	FSCK_MSG_MAX
> >  };
> >  
> > +#define STR(x) #x
> > +#define MSG_ID_STR(x) STR(x),
> > +static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
> > +	FOREACH_MSG_ID(MSG_ID_STR)
> > +	NULL
> > +};
> 
> I wondered what the ugly macro was in the previous step, but as a
> way to keep these two lists in sync it makes sense.

I added a comment to the commit message.
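
The technique discussed above (one macro list expanded twice, so that the
enum and the string table cannot drift apart) can be sketched as follows.
This is a minimal illustration, not the code from the patch: the ID list is
shortened, and key_matches() and this parse_msg_id() body are assumptions
written only to show how "missing-email" and "missingEmail" could both map
to the MISSING_EMAIL key.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, shortened ID list; the list in fsck.c is longer. */
#define FOREACH_MSG_ID(FUNC) \
	FUNC(BAD_NAME) \
	FUNC(MISSING_EMAIL) \
	FUNC(MISSING_TAGGER_ENTRY)

/* First expansion: the enum members. */
#define MSG_ID(x) FSCK_MSG_##x,
enum fsck_msg_id {
	FOREACH_MSG_ID(MSG_ID)
	FSCK_MSG_MAX
};
#undef MSG_ID

/* Second expansion: the matching strings, in the same order. */
#define STR(x) #x
#define MSG_ID_STR(x) STR(x),
static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
	FOREACH_MSG_ID(MSG_ID_STR)
	NULL
};

/*
 * Compare the first len bytes of text against an upper-case,
 * underscore-separated key. Letters are compared case-insensitively,
 * and a '-' in text is accepted wherever the key has a '_'.
 */
static int key_matches(const char *text, size_t len, const char *key)
{
	size_t i = 0;

	while (*key) {
		if (*key == '_') {
			key++;
			if (i < len && text[i] == '-')
				i++;
			continue;
		}
		if (i >= len || toupper((unsigned char)text[i]) != (unsigned char)*key)
			return 0;
		i++;
		key++;
	}
	return i == len;
}

/* Return the message ID for text[0..len), or -1 if nothing matches. */
int parse_msg_id(const char *text, size_t len)
{
	int i;

	assert(text != NULL);
	for (i = 0; msg_id_str[i]; i++)
		if (key_matches(text, len, msg_id_str[i]))
			return i;
	return -1;
}
```

Passing an explicit length lets a caller parse the "missing-email" part of
"missing-email=warn" without copying the string first.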

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-10 18:00   ` Junio C Hamano
@ 2014-12-22 21:43     ` Johannes Schindelin
  2014-12-22 21:59       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 21:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > There are legacy repositories out there whose older commits and tags
> > have issues that prevent pushing them when 'receive.fsckObjects' is set.
> > One real-life example is a commit object that has been hand-crafted to
> > list two authors.
> >
> > Often, it is not possible to fix those issues without disrupting the
> > work with said repositories, yet it is still desirable to perform checks
> > by setting `receive.fsckObjects = true`. This commit is the first step
> > to allow demoting specific fsck issues to mere warnings.
> >
> > The function added by this commit parses a list of settings in the form:
> >
> > 	missing-email=warn,bad-name=warn,...
> >
> > Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
> > git fsck so far, but other call paths (e.g. git index-pack --strict)
> > error out *always* no matter what type was specified. Therefore, we
> > need to take extra care to default to all FSCK_ERROR in those cases.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  fsck.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  fsck.h |  7 +++++--
> >  2 files changed, 63 insertions(+), 2 deletions(-)
> >
> > diff --git a/fsck.c b/fsck.c
> > index 05b146c..9e6d70f 100644
> > --- a/fsck.c
> > +++ b/fsck.c
> > @@ -97,9 +97,63 @@ static int parse_msg_id(const char *text, int len)
> >  
> >  int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
> >  {
> > +	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
> > +		return options->strict_mode[msg_id];
> > +	if (options->strict)
> > +		return FSCK_ERROR;
> >  	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
> >  }
> 
> Hmm, if you are later going to allow demoting (hopefully also promoting)
> error to warn, etc., would the comparison between msg_id and FIRST_WARNING
> make much sense?

A later patch indeed adds that option. The reason the comparison still
makes sense is that the pure infos do not return at all so far, but all of
the reported warnings are fatal in strict mode (i.e. when
receive.fsckObjects = true). In another later patch it is made possible to
promote even infos (such as 'missing tagger') to warnings or even errors,
and that is when the "return FSCK_ERROR" is changed to "return msg_id <
FIRST_INFO ? FSCK_ERROR : FSCK_WARN".

> In other words, at some point wouldn't we be better off with
> something like this
> 
> 	struct {
>         	enum id;
>                 const char *id_string;
>                 enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
> 	} possible_fsck_errors[];

I considered that, and Michael Haggerty also suggested that in a private
mail. However, I find that there is a clear hierarchy in the default
messages: fatal errors, errors, warnings and infos. This should be
reflected by the order IMHO.

But I guess it would make a lot of sense to insert those levels as special
enum values to make it harder to forget to adjust, say, "#define
FIRST_WARNING FSCK_MSG_BAD_FILEMODE" when introducing another warning that
sorts before said ID alphabetically. In other words, I think that we can
really afford to put something like

	...
        FUNC(UNKNOWN_TYPE) \
        FUNC(ZERO_PADDED_DATE) \
        FUNC(___WARNINGS) \
        FUNC(BAD_FILEMODE) \
        FUNC(EMPTY_NAME) \
	...

at the price of making the parsing a little more complicated and wasting a
slight bit of more space for the msg_id_str array.

What do you think?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn
  2014-12-10 17:52   ` Junio C Hamano
@ 2014-12-22 21:44     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 21:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > For example, missing emails in commit and tag objects can be demoted to
> > mere warnings with
> >
> > 	git config receive.fsck.missing-email warn
> 
> No punctuations in the first and the last level of configuration
> variable names, please.  I.e. s/missing-email/missingEmail/ or
> something.

Fixed.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 09/18] fsck: handle multiple authors in commits specially
  2014-12-10 18:04   ` Junio C Hamano
@ 2014-12-22 21:53     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 21:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > This problem has been detected in the wild, and is the primary reason
> > to introduce an option to demote certain fsck errors to warnings. Let's
> > offer to ignore this particular problem specifically.
> > ...
> > +	while (skip_prefix(buffer, "author ", &buffer)) {
> > +		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
> > +		if (err)
> > +			return err;
> 
> If we have an option to demote this to a warning, wouldn't we want
> to do the same fsck_ident() on that secondary author line?

Good point! I changed the following to use fsck_ident() instead:

> > +		/* require_end_of_header() ensured that there is a newline */
> > +		buffer = strchr(buffer, '\n') + 1;
> > +	}
> >  	if (!skip_prefix(buffer, "committer ", &buffer))
> >  		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
> >  	err = fsck_ident(&buffer, &commit->object, options);

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 12/18] Disallow demoting grave fsck errors to warnings
  2014-12-10 18:06   ` Junio C Hamano
@ 2014-12-22 21:56     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 21:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > Some kinds of errors are intrinsically unrecoverable (e.g. errors while
> > uncompressing objects). It does not make sense to allow demoting them to
> > mere warnings.
> 
> Makes sense, but the patch makes it look as if this is an "oops, we
> should have done the list in patch 02/18 in this order from the
> beginning".  Can we reorder the patches?

I considered this already, but it would be more like a squash than a
reordering. And when I squashed the patches, the story did not read as
clearly to me as it does now. However, if you think this argument is too
weak, I will squash them.

Is it too weak?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 21:43     ` Johannes Schindelin
@ 2014-12-22 21:59       ` Junio C Hamano
  2014-12-22 22:32         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 21:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> In other words, at some point wouldn't we be better off with
>> something like this
>> 
>> 	struct {
>>         	enum id;
>>                 const char *id_string;
>>                 enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
>> 	} possible_fsck_errors[];
>
> I considered that, and Michael Haggerty also suggested that in a private
> mail. However, I find that there is a clear hierarchy in the default
> messages: fatal errors, errors, warnings and infos.

I am glad I am not alone ;-)

These classes are ordered from more severe to less, but I do not
think it makes much sense to force the default view of "if you
customize to demote a questionable Q that is classified as an error
by default as a warning, you must demote all the other ones that we
deem less serious than Q, which come earlier (or later---I do not
remember which) in our predefined list".  So in that sense, I do not
consider that various kinds of questionables fsck can detect are
hierarchical at all.

I do agree that it makes it easier to code the initialization of
such an array to have "up to this point we assign the level 'fatal
error' by default" constants.  Then the initialization can become

	for (i = 0; i < FIRST_WARN; i++)
        	possible_fsck_errors[i].error_level = FSCK_INFO;
	while (i < FIRST_ERROR)
        	possible_fsck_errors[i++].error_level = FSCK_WARN;
	while (i < ARRAY_SIZE(possible_fsck_errors))
        	possible_fsck_errors[i++].error_level = FSCK_ERROR;

or something.  So I am not against the FIRST_WARNING constant at
all, but I find it very questionable in a fully customizable system
to use such a constant anywhere other than the initialization time.
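
A minimal sketch of that arrangement, assuming a shortened, hypothetical ID
list ordered from errors to infos: the boundary constants are consulted only
while filling in the defaults, and every later lookup or override goes
through the table. The names init_default_levels(), set_level() and
get_level() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };

/* Hypothetical IDs: errors first, then warnings, then infos. */
enum msg_id {
	MSG_MISSING_AUTHOR,
	MSG_MULTIPLE_AUTHORS,
	MSG_BAD_FILEMODE,
	MSG_MISSING_TAGGER_ENTRY,
	MSG_MAX
};
#define FIRST_WARNING MSG_BAD_FILEMODE
#define FIRST_INFO MSG_MISSING_TAGGER_ENTRY

struct fsck_msg {
	const char *id_string;
	enum error_level level;
};

static struct fsck_msg possible_fsck_errors[MSG_MAX] = {
	{ "missingAuthor", FSCK_PASS },
	{ "multipleAuthors", FSCK_PASS },
	{ "badFilemode", FSCK_PASS },
	{ "missingTaggerEntry", FSCK_PASS },
};

/* The only place where the position in the list matters. */
void init_default_levels(void)
{
	int i = 0;

	while (i < FIRST_WARNING)
		possible_fsck_errors[i++].level = FSCK_ERROR;
	while (i < FIRST_INFO)
		possible_fsck_errors[i++].level = FSCK_WARN;
	while (i < MSG_MAX)
		possible_fsck_errors[i++].level = FSCK_PASS;
}

/* Any ID can be assigned any level, independently of its neighbours. */
void set_level(enum msg_id id, enum error_level level)
{
	assert((unsigned)id < MSG_MAX);
	possible_fsck_errors[id].level = level;
}

enum error_level get_level(enum msg_id id)
{
	assert((unsigned)id < MSG_MAX);
	return possible_fsck_errors[id].level;
}
```

With this layout, demoting "multipleAuthors" to a warning leaves every other
entry at its default, regardless of where it sits in the list.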

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
  2014-12-10 18:08   ` Junio C Hamano
@ 2014-12-22 22:01     ` Johannes Schindelin
  2014-12-22 22:15       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > The 'invalid tag name' and 'missing tagger entry' warnings can now be
> > upgraded to errors by setting receive.fsck.invalid-tag-name and
> > receive.fsck.missing-tagger-entry to 'error'.
> 
> Hmm, why can't all warnings be promoted to errors, or are the above
> two mentioned only as examples?

Those were the only ones that were always shown as warnings but never
treated as errors.

There is a third one coming, as part of the patches that will let fsck
warn about NTFS-incompatible file names, but I want to get this patch
series integrated into git.git first.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
  2014-12-22 22:01     ` Johannes Schindelin
@ 2014-12-22 22:15       ` Junio C Hamano
  2014-12-22 22:39         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 22:15 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Wed, 10 Dec 2014, Junio C Hamano wrote:
>
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> 
>> > The 'invalid tag name' and 'missing tagger entry' warnings can now be
>> > upgraded to errors by setting receive.fsck.invalid-tag-name and
>> > receive.fsck.missing-tagger-entry to 'error'.
>> 
>> Hmm, why can't all warnings be promoted to errors, or are the above
>> two mentioned only as examples?
>
> Those were the only ones that were always shown as warnings but never
> treated as errors.

Sorry but I don't quite understand this comment; I suspect the root
cause might be that we have different mental models on these
tweakable error severities.

Because I come from the school "To these N kinds of events you can
independently assign different (i.e. info, warn, error) outcomes",
moving the FIRST_{INFO,WARNING,...} position in the array would only
affect what happens by default, never hindering the user's ability
to tweak (in other words, there is no linkage between "now you can
tweak" and the order of events in the list, the latter of which only
would affect what the default severity of each event is).

It appears that your design is from a different mental model and the
order and position in that list has more significance than what the
default severity of each event is but how much the severity can be
tweaked, or something, which I somehow find incomprehensible.

Puzzled...

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing
  2014-12-10 18:23   ` Junio C Hamano
@ 2014-12-22 22:19     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3110 bytes --]

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > The optional new config option `receive.fsck.skip-list` specifies the path
> > to a file listing the names, i.e. SHA-1s, one per line, of objects that
> > are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.
> >
> > This is extremely handy in case of legacy repositories where it would
> > cause more pain to change incorrect objects than to live with them
> > (e.g. a duplicate 'author' line in an early commit object).
> >
> > The intended use case is for server administrators to inspect objects
> > that are reported by `git push` as being too problematic to enter the
> > repository, and to add the objects' SHA-1 to a (preferably sorted) file
> > when the objects are legitimate, i.e. when it is determined that those
> > problematic objects should be allowed to enter the server.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  builtin/receive-pack.c          |  9 +++++++
> >  fsck.c                          | 59 +++++++++++++++++++++++++++++++++++++++--
> >  fsck.h                          |  2 ++
> >  t/t5504-fetch-receive-strict.sh | 12 +++++++++
> >  4 files changed, 80 insertions(+), 2 deletions(-)
> >
> > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> > index 111e514..5169f1f 100644
> > --- a/builtin/receive-pack.c
> > +++ b/builtin/receive-pack.c
> > @@ -110,6 +110,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
> >  		return 0;
> >  	}
> >  
> > +	if (starts_with(var, "receive.fsck.skip-list")) {
> 
> s/skip-list/skiplist/;
> 
> > +		const char *path = is_absolute_path(value) ?
> > +			value : git_path("%s", value);
> > +		if (fsck_strict_mode.len)
> > +			strbuf_addch(&fsck_strict_mode, ',');
> > +		strbuf_addf(&fsck_strict_mode, "skip-list=%s", path);
> > +		return 0;
> > +	}
> > +
> >  	if (starts_with(var, "receive.fsck.")) {
> >  		if (fsck_strict_mode.len)
> >  			strbuf_addch(&fsck_strict_mode, ',');
> > diff --git a/fsck.c b/fsck.c
> > index 154f361..00693f2 100644
> > --- a/fsck.c
> > +++ b/fsck.c
> > @@ -7,6 +7,7 @@
> >  #include "tag.h"
> >  #include "fsck.h"
> >  #include "refs.h"
> > +#include "sha1-array.h"
> >  
> >  #define FOREACH_MSG_ID(FUNC) \
> >  	/* fatal errors */ \
> > @@ -56,7 +57,9 @@
> >  	FUNC(ZERO_PADDED_FILEMODE) \
> >  	/* infos (reported as warnings, but ignored by default) */ \
> >  	FUNC(INVALID_TAG_NAME) \
> > -	FUNC(MISSING_TAGGER_ENTRY)
> > +	FUNC(MISSING_TAGGER_ENTRY) \
> > +	/* special value */ \
> > +	FUNC(SKIP_LIST)
> 
> This feels like a kludge to me without comment on what "special
> value" means.  Does it mean "this object has an error (which by
> default is ignored) of being on the skip list?"  Should we be able
> to optionally warn that an object on the skip-list exists with the same
> mechanism the rest of the series uses to tweak the error level?

I addressed both concerns – I hope... ;-)
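
To illustrate why a sorted list is preferable, here is a small sketch of a
skip-list lookup by binary search. It is not the code from the patch (which
uses sha1-array.h); it assumes the object names are kept as hex strings, and
skip_list_sort() and skip_list_contains() are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two entries of an array of hex object names. */
static int cmp_name(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort the list once after loading it from the file. */
void skip_list_sort(const char **names, size_t nr)
{
	qsort(names, nr, sizeof(*names), cmp_name);
}

/* Return 1 if name is on the (already sorted) list, 0 otherwise. */
int skip_list_contains(const char *const *names, size_t nr, const char *name)
{
	assert(name != NULL);
	return bsearch(&name, names, nr, sizeof(*names), cmp_name) != NULL;
}
```

If the file is already sorted, the sorting step is cheap, and each lookup
during a push costs only a logarithmic number of comparisons.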

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-10 18:15   ` Junio C Hamano
@ 2014-12-22 22:25     ` Johannes Schindelin
  2014-12-22 22:34       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > We already have support in `git receive-pack` to deal with some legacy
> > repositories which have non-fatal issues.
> >
> > Let's make `git fsck` itself useful with such repositories, too, by
> > allowing users to ignore known issues, or at least demote those issues
> > to mere warnings.
> >
> > Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
> > missing emails in author, committer and tagger lines.
> 
> Hopefully I do not have to repeat myself, but please do not have
> punctuations in the first and the last level of configuration variable
> names, i.e. fsck.missingEmail, not missing-email.

I vetted the complete patch series and think I caught them all.

Or do you want the error messages to also use camelCased IDs, i.e.

	warning in tag $tag: missingTaggerEntry: invalid format ...

instead of

	warning in tag $tag: missing-tagger-entry: invalid format ...

? It is easy to do, but looks slightly uglier to this developer's eyes...

> Should these be tied to receive-pack ones in any way?  E.g. if you
> set fsck.missingEmail to ignore, you do not have to do the same for
> receive and accept a push with the specific error turned off?
> 
> Not a rhetorical question.  I can see it argued both ways.  The
> justification to defend the position of not tying these two I would
> have is so that I can be more strict to newer breakages (i.e. not
> accepting a push that introduces a new breakage by not ignoring with
> receive.fsck.*) while allowing breakages that are already present.
> The justification for the opposite position is to make it more
> convenient to write a consistent policy.  Whichever way is chosen,
> we would want to see the reason left in the log message so that
> people do not have to wonder what the original motivation was when
> they decide if it is a good idea to change this part of the code.

Hmm. I really tried very hard to separate the fsck.* from the receive.*
settings because the two code paths already behave differently: many
warnings reported (and ignored) by fsck are fatal errors when pushing with
receive.fsckObjects=true. My understanding was that these differences are
deliberate, and my interpretation was that the fsck and the receive
settings were considered to be fundamentally different, even if the same
fsck machinery performs the validation.

If you agree, I would rephrase this line of argument and add it to the
commit message. Do you agree?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 21:59       ` Junio C Hamano
@ 2014-12-22 22:32         ` Johannes Schindelin
  2014-12-22 22:40           ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2455 bytes --]

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> In other words, at some point wouldn't we be better off with
> >> something like this
> >> 
> >> 	struct {
> >>         	enum id;
> >>                 const char *id_string;
> >>                 enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
> >> 	} possible_fsck_errors[];
> >
> > I considered that, and Michael Haggerty also suggested that in a private
> > mail. However, I find that there is a clear hierarchy in the default
> > messages: fatal errors, errors, warnings and infos.
> 
> I am glad I am not alone ;-)
> 
> These classes are ordered from more severe to less, but I do not
> think it makes much sense to force the default view of "if you
> customize to demote a questionable Q that is classified as an error
> by default as a warning, you must demote all the other ones that we
> deem less serious than Q, which come earlier (or later---I do not
> remember which) in our predefined list".  So in that sense, I do not
> consider that various kinds of questionables fsck can detect are
> hierarchical at all.

Oh, but please understand that this hierarchy only applies to the default
settings. All of these settings can be overridden individually – and the
first override will initialize a full array with the default settings.

So the order really only plays a role for the defaults, no more.

> I do agree that it makes it easier to code the initialization of
> such an array to have "up to this point we assign the level 'fatal
> error' by default" constants.  Then the initialization can become
> 
> 	for (i = 0; i < FIRST_WARN; i++)
>         	possible_fsck_errors[i].error_level = FSCK_INFO;
> 	while (i < FIRST_ERROR)
>         	possible_fsck_errors[i++].error_level = FSCK_WARN;
> 	while (i < ARRAY_SIZE(possible_fsck_errors))
>         	possible_fsck_errors[i++].error_level = FSCK_ERROR;
> 
> or something.  So I am not against the FIRST_WARNING constant at
> all, but I find it very questionable in a fully customizable system
> to use such a constant anywhere other than the initialization time.

This is indeed the case. The code we are discussing comes after the

	if (options->strict_mode)
		return options->strict_mode[msg_id];

In other words, once the overrides are in place, the default settings are
skipped entirely.
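
The behaviour described here can be sketched like this. It is a simplified
illustration under stated assumptions, not the code from the series: the ID
list is shortened, strict_mode is assumed to be a plain int array, and
fsck_override_msg_type() is a hypothetical name for whatever performs the
first override.

```c
#include <assert.h>
#include <stdlib.h>

enum { FSCK_ERROR = 1, FSCK_WARN };

/* Hypothetical, shortened ID list: errors first, then warnings. */
enum fsck_msg_id {
	FSCK_MSG_MISSING_AUTHOR,
	FSCK_MSG_MULTIPLE_AUTHORS,
	FSCK_MSG_BAD_FILEMODE,
	FSCK_MSG_MAX
};
#define FIRST_WARNING FSCK_MSG_BAD_FILEMODE

struct fsck_options {
	int strict;       /* report every issue as an error */
	int *strict_mode; /* per-ID levels; NULL until the first override */
};

/* The ordering of the list is only consulted here. */
static int default_msg_type(enum fsck_msg_id msg_id, int strict)
{
	if (strict)
		return FSCK_ERROR;
	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
}

int fsck_msg_type(enum fsck_msg_id msg_id, const struct fsck_options *options)
{
	assert((unsigned)msg_id < FSCK_MSG_MAX);
	if (options->strict_mode)
		return options->strict_mode[msg_id];
	return default_msg_type(msg_id, options->strict);
}

/* The first override allocates the full array and fills in the defaults. */
int fsck_override_msg_type(struct fsck_options *options,
			   enum fsck_msg_id msg_id, int type)
{
	int i;

	assert((unsigned)msg_id < FSCK_MSG_MAX);
	if (!options->strict_mode) {
		options->strict_mode = malloc(FSCK_MSG_MAX * sizeof(int));
		if (!options->strict_mode)
			return -1;
		for (i = 0; i < FSCK_MSG_MAX; i++)
			options->strict_mode[i] =
				default_msg_type(i, options->strict);
	}
	options->strict_mode[msg_id] = type;
	return 0;
}
```

The conditional in fsck_msg_type() is the price of allocating the array
lazily; initializing the array unconditionally would remove it.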

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 22:25     ` Johannes Schindelin
@ 2014-12-22 22:34       ` Junio C Hamano
  2014-12-22 22:46         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 22:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Or do you want the error messages to also use camelCased IDs, i.e.
>
> 	warning in tag $tag: missingTaggerEntry: invalid format ...
>
> instead of
>
> 	warning in tag $tag: missing-tagger-entry: invalid format ...
>
> ? It is easy to do, but looks slightly uglier to this developer's eyes...

Do you really need to know what I think?  Can I say "The latter is
probably slightly better, but both look ugly to me"?

If we want a human readable message

    "warning: tag object lacks tagger field '$tag'"

would be my preference.

But I personally think it may not be necessary to give a pretty
message that later can go through l10n here.  If we take that
position, it is just a machine-readable error token, so I'd say
whichever way you find more readable is OK.

>> Should these be tied to receive-pack ones in any way?  E.g. if you
>> set fsck.missingEmail to ignore, you do not have to do the same for
>> receive and accept a push with the specific error turned off?
>> 
>> Not a rhetorical question.  I can see it argued both ways.  The
>> justification to defend the position of not tying these two I would
>> have is so that I can be more strict to newer breakages (i.e. not
>> accepting a push that introduces a new breakage by not ignoring with
>> receive.fsck.*) while allowing breakages that are already present.
>> The justification for the opposite position is to make it more
>> convenient to write a consistent policy.  Whichever way is chosen,
>> we would want to see the reason left in the log message so that
>> people do not have to wonder what the original motivation was when
>> they decide if it is a good idea to change this part of the code.
>
> Hmm. I really tried very hard to separate the fsck.* from the receive.*
> settings because the two code paths already behave differently:...
>
> If you agree, I would rephrase this line of argument and add it to the
> commit message. Do you agree?

Yeah, that reasoning sounds sensible.

Thanks.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 14/18] fsck: allow upgrading fsck warnings to errors
  2014-12-22 22:15       ` Junio C Hamano
@ 2014-12-22 22:39         ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2467 bytes --]

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Wed, 10 Dec 2014, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> >> 
> >> > The 'invalid tag name' and 'missing tagger entry' warnings can now be
> >> > upgraded to errors by setting receive.fsck.invalid-tag-name and
> >> > receive.fsck.missing-tagger-entry to 'error'.
> >> 
> >> Hmm, why can't all warnings be promoted to errors, or are the above
> >> two mentioned only as examples?
> >
> > Those were the only ones that were always shown as warnings but never
> > treated as errors.
> 
> Sorry but I don't quite understand this comment; I suspect the root
> cause might be that we have different mental models on these
> tweakable error severities.
> 
> Because I come from the school "To these N kinds of events you can
> independently assign different (i.e. info, warn, error) outcomes",
> moving the FIRST_{INFO,WARNING,...} position in the array would only
> affect what happens by default, never hindering the user's ability
> to tweak (in other words, there is no linkage between "now you can
> tweak" and the order of events in the list, the latter of which only
> would affect what the default severity of each event is).

We agree on this mental model.

The only problem this patch tries to fix is that the warnings about a
missing tagger and about invalid tag names were never leading to an error.
They were purely printed, but then ignored. So what this patch does is to
add "if (err) return err;" handling for those two warnings.

As a consequence, the ordering of message IDs needs to be fixed because
the non-fatal warnings were ordered alphabetically before, but now the
non-fatal warnings are extracted so that we can give them the appropriate
FSCK_WARN by default – even in the git-receive-pack case.

In other words, the value assigned to those two warnings was completely
ignored before, which was the reason why it did not matter that we
assigned them to report FSCK_ERRORs in the git-receive-pack case before:
they were still only printed out and never stopped any tag from entering
the host's repository.

I could change the ordering in the patch that introduces the message IDs,
of course, but it would be even more puzzling if those two messages, of
all, were not ordered alphabetically with the others.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 22:32         ` Johannes Schindelin
@ 2014-12-22 22:40           ` Junio C Hamano
  2014-12-22 22:55             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 22:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Michael Haggerty

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Mon, 22 Dec 2014, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> 
>> >> In other words, at some point wouldn't we be better off with
>> >> something like this
>> >> 
>> >> 	struct {
>> >>         	enum id;
>> >>                 const char *id_string;
>> >>                 enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
>> >> 	} possible_fsck_errors[];
>> >
>> > I considered that, and Michael Haggerty also suggested that in a private
>> > mail. However, I find that there is a clear hierarchy in the default
>> > messages: fatal errors, errors, warnings and infos.
>> 
>> I am glad I am not alone ;-)
>> ...
> Oh, but please understand that this hierarchy only applies to the default
> settings. All of these settings can be overridden individually – and the
> first override will initialize a full array with the default settings.

But that means that the runtime needs to switch between two code
paths, with and without override, no?

> 	if (options->strict_mode)
> 		return options->strict_mode[msg_id];

In other words, I think this is misleading and unnecessary
optimization for the "full array" allocation.  A code that uses an
array of a struct like the above that Michael and I independently
suggested would initialize once with or without an override and then
at the runtime there is no "if the array is there use it"
conditional.

I do not know why Michael suggested the same thing, but the reason
why I prefer that arrangement is because I think it would be easier
to read and maintain.

Thanks.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 22:34       ` Junio C Hamano
@ 2014-12-22 22:46         ` Johannes Schindelin
  2014-12-22 22:50           ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2831 bytes --]

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Or do you want the error messages to also use camelCased IDs, i.e.
> >
> > 	warning in tag $tag: missingTaggerEntry: invalid format ...
> >
> > instead of
> >
> > 	warning in tag $tag: missing-tagger-entry: invalid format ...
> >
> > ? It is easy to do, but looks slightly uglier to this developer's eyes...
> 
> Do you really need to know what I think?

Well, but yes ;-)

> Can I say "The latter is probably slightly better, but both look ugly to
> me"?

Of course you can say that! ;-) The problem these ugly messages try to
solve is to give the user a hint which setting to change if they want to
override the default behavior, though...

> If we want a human readable message
> 
>     "warning: tag object lacks tagger field '$tag'"
> 
> would be my preference.
> 
> But I personally think it may not be necessary to give a pretty
> message that later can go through l10n here.  If we take that
> position, it is just a machine-readable error token, so I'd say
> whichever way you find more readable is OK.

Okay, I will leave it as-is, then.

> >> Should these be tied to receive-pack ones in any way?  E.g. if you
> >> set fsck.missingEmail to ignore, you do not have to do the same for
> >> receive and accept a push with the specific error turned off?
> >> 
> >> Not a rhetorical question.  I can see it argued both ways.  The
> >> justification to defend the position of not tying these two I would
> >> have is so that I can be more strict to newer breakages (i.e. not
> >> accepting a push that introduces a new breakage by not ignoring with
> >> receive.fsck.*) while allowing breakages that are already present.
> >> The justification for the opposite position is to make it more
> >> convenient to write a consistent policy.  Whichever way is chosen,
> >> we would want to see the reason left in the log message so that
> >> people do not have to wonder what the original motivation was when
> >> they decide if it is a good idea to change this part of the code.
> >
> > Hmm. I really tried very hard to separate the fsck.* from the receive.*
> > settings because the two code paths already behave differently:...
> >
> > If you agree, I would rephrase this line of argument and add it to the
> > commit message. Do you agree?
> 
> Yeah, that reasoning sounds sensible.

I added this paragraph:

    In the same spirit that `git receive-pack`'s usage of the fsck machinery
    differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
    are fatal with `git receive-pack` when receive.fsckObjects = true, for
    example – we strictly separate the fsck.* from the receive.fsck.*
    settings.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 22:46         ` Johannes Schindelin
@ 2014-12-22 22:50           ` Junio C Hamano
  2014-12-22 22:57             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 22:50 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Of course you can say that! ;-) The problem these ugly messages try to
> solve is to give the user a hint which setting to change if they want to
> override the default behavior, though...

Ahh, OK, then the dashed form would not work as configuration variable
names, so missingTaggerEntry would be the only usable option.

Thanks.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 22:40           ` Junio C Hamano
@ 2014-12-22 22:55             ` Johannes Schindelin
  2014-12-22 23:15               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Mon, 22 Dec 2014, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >> 
> >> >> In other words, at some point wouldn't we be better off with
> >> >> something like this
> >> >> 
> >> >> 	struct {
> >> >>         	enum id;
> >> >>                 const char *id_string;
> >> >>                 enum error_level { FSCK_PASS, FSCK_WARN, FSCK_ERROR };
> >> >> 	} possible_fsck_errors[];
> >> >
> >> > I considered that, and Michael Haggerty also suggested that in a private
> >> > mail. However, I find that there is a clear hierarchy in the default
> >> > messages: fatal errors, errors, warnings and infos.
> >> 
> >> I am glad I am not alone ;-)
> >> ...
> > Oh, but please understand that this hierarchy only applies to the default
> > settings. All of these settings can be overridden individually – and the
> > first override will initialize a full array with the default settings.
> 
> But that means that the runtime needs to switch between two code
> paths, with and without override, no?
> 
> > 	if (options->strict_mode)
> > 		return options->strict_mode[msg_id];
> 
> In other words, I think this is misleading and unnecessary
> optimization for the "full array" allocation.  A code that uses an
> array of a struct like the above that Michael and I independently
> suggested would initialize once with or without an override and then
> at the runtime there is no "if the array is there use it"
> conditional.
> 
> I do not know why Michael suggested the same thing, but the reason
> why I prefer that arrangement is because I think it would be easier
> to read and maintain.

Well, I disagree that it would be easier to maintain, because it appears
to me that the clear hierarchy keeps things simple. For example if some
clearly fatal error is clustered with non-fatal ones due to alphabetical
ordering, it is much harder to spot when it is marked as a demotable
error by mistake.

For example, try to spot the error here:

	...
	F(ALMOST_HAPPY, INFO) \
	F(CANNOT_RECOVER, ERROR) \
	F(COFFEE_IS_EMPTY, WARN) \
	F(JUST_BEING_CHATTY, INFO) \
	F(LIFE_IS_GOOD, INFO) \
	F(MISSING_SOMETHING_VITAL, FATAL_ERROR) \
	F(NEED_TO_SLEEP, WARN) \
	F(SOMETHING_WENT_WRONG, ERROR) \
	...

Personally, I find it very, very hard to spot that CANNOT_RECOVER is
marked as a mere ERROR instead of a FATAL_ERROR. Even if it is nicely
alphabetically ordered.

I will sleep over this, though. Maybe I can come up with a solution that
makes all three of us happy.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 22:50           ` Junio C Hamano
@ 2014-12-22 22:57             ` Johannes Schindelin
  2014-12-22 23:13               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-22 22:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Of course you can say that! ;-) The problem these ugly messages try to
> > solve is to give the user a hint which setting to change if they want to
> > override the default behavior, though...
> 
> Ahh, OK, then the dashed form would not work as configuration variable
> names, so missingTaggerEntry would be the only usable option.

Except that cunning me has made it so that both missing-tagger-entry *and*
missingTaggerEntry work...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 22:57             ` Johannes Schindelin
@ 2014-12-22 23:13               ` Junio C Hamano
  2014-12-23  9:50                 ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 23:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Mon, 22 Dec 2014, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> 
>> > Of course you can say that! ;-) The problem these ugly messages try to
>> > solve is to give the user a hint which setting to change if they want to
>> > override the default behavior, though...
>> 
>> Ahh, OK, then the dashed form would not work as configuration variable
>> names, so missingTaggerEntry would be the only usable option.
>
> Except that cunning me has made it so that both missing-tagger-entry *and*
> missingTaggerEntry work...

Then the missing-tagger-entry side needs to be dropped.  The naming
does not conform to the way we name our officially supported
configuration variables.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 22:55             ` Johannes Schindelin
@ 2014-12-22 23:15               ` Junio C Hamano
  2014-12-23 10:53                 ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-22 23:15 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Michael Haggerty

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> For example, try to spot the error here:
>
> 	...
> 	F(ALMOST_HAPPY, INFO) \
> 	F(CANNOT_RECOVER, ERROR) \
> 	F(COFFEE_IS_EMPTY, WARN) \
> 	F(JUST_BEING_CHATTY, INFO) \
> 	F(LIFE_IS_GOOD, INFO) \
> 	F(MISSING_SOMETHING_VITAL, FATAL_ERROR) \
> 	F(NEED_TO_SLEEP, WARN) \
> 	F(SOMETHING_WENT_WRONG, ERROR) \
> 	...

But that is not what is being suggested at all.  I already said that
FIRST_SOMETHING is fine as a measure to initialize, didn't I?

I am only saying that if you have a place to store customized level,
you should initialize that part with default levels and always look
it up from that place at runtime.  It is perfectly fine for the
initialization step to take advantage of the ordering and
FIRST_SOMETHING constants.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-22 23:13               ` Junio C Hamano
@ 2014-12-23  9:50                 ` Johannes Schindelin
  2014-12-23 16:32                   ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23  9:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Mon, 22 Dec 2014, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >> 
> >> > Of course you can say that! ;-) The problem these ugly messages try to
> >> > solve is to give the user a hint which setting to change if they want to
> >> > override the default behavior, though...
> >> 
> >> Ahh, OK, then the dashed form would not work as configuration variable
> >> names, so missingTaggerEntry would be the only usable option.
> >
> > Except that cunning me has made it so that both missing-tagger-entry *and*
> > missingTaggerEntry work...
> 
> Then the missing-tagger-entry side needs to be dropped.  The naming
> does not conform to the way we name our officially supported
> configuration variables.

But it does conform with the way we do our command-line parameters, and it
would actually cause *more* work (and more complicated code) to have two
separate parsers, or even to force the parser to accept only one way to
specify settings...

Should I really introduce more complexity just to disallow non-camelCased
config variables?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-22 23:15               ` Junio C Hamano
@ 2014-12-23 10:53                 ` Johannes Schindelin
  2014-12-23 16:18                   ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 10:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > For example, try to spot the error here:
> >
> > 	...
> > 	F(ALMOST_HAPPY, INFO) \
> > 	F(CANNOT_RECOVER, ERROR) \
> > 	F(COFFEE_IS_EMPTY, WARN) \
> > 	F(JUST_BEING_CHATTY, INFO) \
> > 	F(LIFE_IS_GOOD, INFO) \
> > 	F(MISSING_SOMETHING_VITAL, FATAL_ERROR) \
> > 	F(NEED_TO_SLEEP, WARN) \
> > 	F(SOMETHING_WENT_WRONG, ERROR) \
> > 	...
> 
> But that is not what is being suggested at all.  I already said that
> FIRST_SOMETHING is fine as a measure to initialize, didn't I?
> 
> I am only saying that if you have a place to store customized level,
> you should initialize that part with default levels and always look
> it up from that place at runtime.  It is perfectly fine for the
> initialization step to take advantage of the ordering and
> FIRST_SOMETHING constants.

Thanks for clarifying, I was worried that you wanted to encode the
severity levels explicitly (F(ID, LEVEL) instead of F(ID) in the correct
order). The DRY principle also suggests that we should not encode the
severity level in two ways (which would leave the door open for
inconsistencies). That means that we should not initialize a static array
of severity levels, but initialize the array using a loop.

Okay, now that we have established that the initial ordering by severity
makes sense, let's examine the initialization step.

Basically, our approaches differ only in *when* to initialize that array
of severity levels: you want to initialize it always, and I want to
initialize it only when the severity levels are customized by the caller.

Now, let's have a look how the fsck_options are currently initialized. My
code follows the convention established with strbufs, argv_arrays, etc:
there is a preprocessor definition (_DEFAULT, imitating the _INIT
definitions) that allows us to initialize such structs very conveniently.
Please note that no loop is required, and certainly no extra code has to
be called to initialize the struct. We get away with initializing that
array lazily in the fsck_strict_mode() function when we detect that it
needs to be initialized, being populated by the very same function that
provides the default settings before customization. This is a very robust
setup as the knowledge about, say, the size of that array is confined
strictly to fsck.c.
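
The lazy scheme described above could be sketched roughly as follows. This is
only an illustration under stated assumptions: the message IDs, the
`struct options` type and the helper names are hypothetical and are not the
code from the patch series.

```c
#include <stdlib.h>

/* Hypothetical message IDs, ordered by default severity (gravest first). */
enum msg_id { MSG_FATAL_A, MSG_ERROR_A, MSG_ERROR_B, MSG_WARN_A, MSG_MAX };
#define FIRST_ERROR MSG_ERROR_A	/* IDs below this one are fatal */
#define FIRST_WARN MSG_WARN_A	/* IDs below this one are errors */

enum severity { SEV_FATAL, SEV_ERROR, SEV_WARN };

struct options {
	enum severity *levels;	/* NULL until the first customization */
};
#define OPTIONS_DEFAULT { NULL }

/* The default level is derived from the position in the enum alone. */
static enum severity default_severity(enum msg_id id)
{
	if (id < FIRST_ERROR)
		return SEV_FATAL;
	if (id < FIRST_WARN)
		return SEV_ERROR;
	return SEV_WARN;
}

/* Lookup: use the per-caller array if it exists, else the defaults. */
static enum severity get_severity(const struct options *o, enum msg_id id)
{
	return o->levels ? o->levels[id] : default_severity(id);
}

/* First override allocates the array and fills it with the defaults. */
static int set_severity(struct options *o, enum msg_id id, enum severity s)
{
	if (!o->levels) {
		int i;

		o->levels = malloc(MSG_MAX * sizeof(*o->levels));
		if (!o->levels)
			return -1;
		for (i = 0; i < MSG_MAX; i++)
			o->levels[i] = default_severity((enum msg_id)i);
	}
	o->levels[id] = s;
	return 0;
}
```

A caller that never customizes anything only needs the static initializer and
never allocates; the array's size stays private to the file that defines the
enum.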

However, if we had to change the lookup such that it uses an array always,
we would have to introduce a function to initialize the struct, always, in
particular we would have to find a place to call that initialization
function in, say, builtin/fsck.c (actually, in every code path that calls
into the fsck machinery). Arguably, the code would get more complex –
introducing new call paths just to initialize the fsck_options struct –
and I would argue further that there is no gain from an elegance,
readability and maintenance point of view: whether the array is
initialized lazily or not, it will be initialized the exact same way. All
it means is that we have to introduce separate code paths because we would
separate explicitly the initialization from the configuration step.

Therefore, I do not believe that introducing an fsck_options_init() is
what you would really want.

An alternative would be to initialize the array at compile time – we would
have to violate the DRY principle for that, repeating the severity levels
many times over, and we could no longer confine the visibility to the
message IDs to fsck.c because not only the size of the array of severity
levels would have to be known to every user of fsck.h, but the exact
default severity levels themselves, to be able to initialize the struct.
But we could initialize the struct with a known set of settings via the
_DEFAULTS definition that way.

However, you already expressed slight disagreement with the preprocessor
magic needed to initialize both the enum and the array of message ID
strings from the same list in a way that lets the compiler ensure
consistency; I am afraid that if I were to modify _DEFAULTS to populate the
entire severity level array, the resulting code would meet with your utter
contempt.

I believe, therefore, that this is also not what you want.

So that leaves only one alternative: to initialize a global array with the
default severity levels at *some* stage. I have no idea what that stage
would be, therefore we would have to either establish ugly, and
DRY-violating, compile time initialization, or we would have to call a
function before using any of the fsck machinery – but that is fragile: it
is too easy to forget one call path and access an uninitialized array!
Worse, even if we *had* a fully initialized array of default severity
levels, we would still have to have an on-demand copy (i.e. lazy
initialization) of said array lest we modify the global defaults in
fsck_strict_mode()! Essentially, we would just *add* complexity to the
current solution, not replace it with anything simpler.

Therefore, I believe that you cannot be a fan of this alternative, either.

In short, I still find it much more elegant to determine the default
severity levels from the numeric enum values, and initialize the severity
level array only on demand, than to introduce a separate call to
initialize the array always, in particular since we would have to execute
that initialization loop *all* the time in the latter case – even if we do
not customize, let alone look up, any value – or clutter the code with
ugly, ugly preprocessor constructs.

And I hope that my arguments convinced you, too!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 10:53                 ` Johannes Schindelin
@ 2014-12-23 16:18                   ` Junio C Hamano
  2014-12-23 16:30                     ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 16:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Michael Haggerty

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> However, if we had to change the lookup such that it uses an array always,
> we would have to introduce a function to initialize the struct, always, in
> particular we would have to find a place to call that initialization
> function in, say, builtin/fsck.c (actually, in every code path that calls
> into the fsck machinery).

You would need to call a function to "initialize" the table if you
support customization by reading the configuration files anyway.  I
am not sure why you think finding such a place is hard.  Puzzled.

Also I suspect that you can tell the compiler to initialize the
array in place with default values, perhaps like this?

-- >8 --
#include <stdio.h>

/* sorted by the default severity (lowest impact first) */
#define EVENT_LIST(F) \
	F(EVENT_A), \
	F(EVENT_B), \
	F(EVENT_C), \
	F(EVENT_D)

#define ID_(event) ID_ ## event
enum event_id {
	EVENT_LIST(ID_)
};


enum severity_level {
	severity_info, severity_warn, severity_error
};

/* below this one are INFO */
#define FIRST_WARN_EVENT_ID		ID_EVENT_B
/* below this one are WARN */
#define FIRST_ERROR_EVENT_ID		ID_EVENT_C

#define STRING_(s) #s
#define DESC_(event) \
	{ \
		ID_ ## event, \
		STRING_(event), \
		(ID_ ## event < FIRST_WARN_EVENT_ID \
		? severity_info \
		: ID_ ## event < FIRST_ERROR_EVENT_ID \
		? severity_warn \
		: severity_error) \
	}

struct event_config {
	enum event_id id;
	const char * name;
	enum severity_level level;
} event[] = {
	EVENT_LIST(DESC_)
};

int main(void)
{
	size_t i;	/* matches the unsigned type of the sizeof expression */

	for (i = 0; i < sizeof(event) / sizeof(event[0]); i++) {
		printf("%d, %s, %d\n",
		       event[i].id, event[i].name, event[i].level);
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 16:18                   ` Junio C Hamano
@ 2014-12-23 16:30                     ` Johannes Schindelin
  2014-12-23 17:20                       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 16:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > However, if we had to change the lookup such that it uses an array
> > always, we would have to introduce a function to initialize the
> > struct, always, in particular we would have to find a place to call
> > that initialization function in, say, builtin/fsck.c (actually, in
> > every code path that calls into the fsck machinery).
> 
> You would need to call a function to "initialize" the table if you
> support customization by reading the configuration files anyway.

Yes, this is the config machinery. But I need to employ that only if I want
to let the caller customize the severity levels. However, the fsck
machinery is also called from places where such a customization is not
offered. They would now need to be changed, too.

> Also I suspect that you can tell the compiler to initialize the
> array in place with default values, perhaps like this?
> 
> -- >8 --
> #include <stdio.h>
> 
> /* sorted by the default severity (lowest impact first) */
> #define EVENT_LIST(F) \
> 	F(EVENT_A), \
> 	F(EVENT_B), \
> 	F(EVENT_C), \
> 	F(EVENT_D)
> 
> #define ID_(event) ID_ ## event
> enum event_id {
> 	EVENT_LIST(ID_)
> };
> 
> 
> enum severity_level {
> 	severity_info, severity_warn, severity_error
> };
> 
> /* below this one are INFO */
> #define FIRST_WARN_EVENT_ID		ID_EVENT_B
> /* below this one are WARN */
> #define FIRST_ERROR_EVENT_ID		ID_EVENT_C
> 
> #define STRING_(s) #s
> #define DESC_(event) \
> 	{ \
> 		ID_ ## event, \
> 		STRING_(event), \
> 		(ID_ ## event < FIRST_WARN_EVENT_ID \
> 		? severity_info \
> 		: ID_ ## event < FIRST_ERROR_EVENT_ID \
> 		? severity_warn \
> 		: severity_error) \
> 	}

This is exactly the ugly, ugly preprocessor construct I thought you would
meet with contempt. I mean, compared to this, my FUNC() hack is outright
pretty ;-)

And *still*, this is *just* a global table with defaults. I would *still*
need to copy-on-write when the first customization of the severity level
takes place because I cannot allow the global defaults to be modified by
one caller (that would defeat the whole purpose of having per-caller
settings bundled in the fsck_options struct).

You see, I still would need to have a lazy initialization, the complexity
in that part would not be reduced at all.

So I am afraid that this approach really adds complexity rather than
replacing it with something simpler than my current code.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23  9:50                 ` Johannes Schindelin
@ 2014-12-23 16:32                   ` Junio C Hamano
  2014-12-23 16:47                     ` Johannes Schindelin
  2014-12-23 17:07                     ` Junio C Hamano
  0 siblings, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 16:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> >> > Of course you can say that! ;-) The problem these ugly messages try to
>> >> > solve is to give the user a hint which setting to change if they want to
>> >> > override the default behavior, though...
>> >> 
>> >> Ahh, OK, then the dashed form would not work as configuration variable
>> >> names, so missingTaggerEntry would be the only usable option.
>> >
>> > Except that cunning me has made it so that both missing-tagger-entry *and*
>> > missingTaggerEntry work...
>> 
>> Then the missing-tagger-entry side needs to be dropped.  The naming
>> does not conform to the way we name our officially supported
>> configuration variables.
>
> But it does conform with the way we do our command-line parameters,

Hmmm....  What is the expected user interaction?  The way I read the
series was ($MISSING_TAGGER stands for the "token" we choose to show):

    (1) The user runs fsck without customization, and may see a
	warning (or error):

        $ git fsck
        error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

    (2) The user demotes it to warning and runs fsck again:

	$ git config fsck.$MISSING_TAGGER warn
        $ git fsck
        warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

I suspect that it would be much better if the configuration
variables were organized the other way around, e.g.

	$ git config fsck.warn missingTagger,someOtherKindOfError

Then a one-shot override would make sense and be easier to give as a
command line option, e.g.

	$ git fsck --warn=missingTagger,someOtherKindOfError

But the proposed organization to use one variable per questionable
event type (as opposed to one variable per severity level) would
lead to a one-shot override of this form, e.g.

	$ git fsck --missing-tagger=warn --some-other-kind-of-error=warn

which I think is insane to require us to support an unbounded number of
dashed options.

Or are you saying that we allow "git config core.file-mode true"
from the command line to set core.fileMode configuration?

Puzzled...

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 16:32                   ` Junio C Hamano
@ 2014-12-23 16:47                     ` Johannes Schindelin
  2014-12-23 17:14                       ` Junio C Hamano
  2014-12-23 17:07                     ` Junio C Hamano
  1 sibling, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 16:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> >> > Of course you can say that! ;-) The problem these ugly messages
> >> >> > try to solve is to give the user a hint which setting to change
> >> >> > if they want to override the default behavior, though...
> >> >> 
> >> >> Ahh, OK, then the dashed form would not work as configuration
> >> >> variable names, so missingTaggerEntry would be the only usable
> >> >> option.
> >> >
> >> > Except that cunning me has made it so that both
> >> > missing-tagger-entry *and* missingTaggerEntry work...
> >> 
> >> Then the missing-tagger-entry side needs to be dropped.  The naming
> >> does not conform to the way we name our officially supported
> >> configuration variables.
> >
> > But it does conform with the way we do our command-line parameters,
> 
> Hmmm....  What is the expected user interaction?  The way I read the
> series was ($MISSING_TAGGER stands for the "token" we choose to show):
> 
>     (1) The user runs fsck without customization, and may see a
> 	warning (or error):
> 
>         $ git fsck
>         error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER
> 
>     (2) The user demotes it to warning and runs fsck again:
> 
> 	$ git config fsck.$MISSING_TAGGER warn
>         $ git fsck
>         warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

The intended use case is actually when receive.fsckObjects = true and you
call `git push`, seeing 'remote: error: $MULTIPLE_AUTHORS: ...'.

Now, the $MULTIPLE_AUTHORS *config* setting is parsed by `git
receive-pack`, but that is not the command that needs to customize the
fsck call: it is either `git index-pack` or `git unpack-objects`. So what
`git receive-pack` does is to pass the config options as command-line
options to the called command. For consistency with the rest of Git, the
command-line options were *not* camel-cased, but lower-case,
dash-separated.

The parser I wrote actually accepts both versions, allowing me to skip the
tedious step to convert the camelCased config setting into a
lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
only to be parsed by the same parser as `fsck` would use directly.

So I am rather happy with the fact that the parser handles both camelCased
and lower-case-dashed versions.
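
One way a single parser can accept both spellings is to compare names while
skipping dashes and ignoring case. The following is a minimal sketch under
that assumption; `msg_id_matches()` is a hypothetical helper and not the
parser from the patch series.

```c
#include <ctype.h>

/*
 * Hypothetical helper: returns 1 if the two message ID spellings are
 * equal once dashes are dropped and case is ignored, so that
 * "missing-tagger-entry" matches "missingTaggerEntry"; returns 0
 * otherwise.
 */
static int msg_id_matches(const char *a, const char *b)
{
	for (;;) {
		/* Dashes carry no information; skip them on both sides. */
		while (*a == '-')
			a++;
		while (*b == '-')
			b++;
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		if (!*a)
			return 1;	/* both strings ended together */
		a++;
		b++;
	}
}
```

With such a comparison the camelCased config key can be handed to
`index-pack` or `unpack-objects` unchanged, at the price of also accepting
spellings that are never documented.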

> I suspect that it would be much better if the configuration variables
> were organized the other way around, e.g.
> 
> 	$ git config fsck.warn missingTagger,someOtherKindOfError

I had something similar in an earlier version of my patch series, but it
was shot down rightfully: if you want to allow inheriting defaults from
$HOME/.gitconfig, you have to configure the severity levels individually.

(The current solution also sidesteps the problematic situation when both
fsck.warn *and* fsck.error contain, say, missingTagger.)

> Then a one-shot override would make sense and be easier to give as a
> command line option, e.g.
> 
> 	$ git fsck --warn=missingTagger,someOtherKindOfError

Yep, my first implementation actually used
`--strict=missing-tagger,-some-demoted-error`. But as I mentioned above,
that approach is not as flexible as the current one.

> But the proposed organization to use one variable per questionable
> event type (as opposed to one variable per severity level) would
> lead to a one-shot override of this form, e.g.
> 
> 	$ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
> 
> which I think is insane to require us to support an unbounded number of
> dashed options.

The intended use case is actually *not* the command-line, but the config
file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
.git/config to customize the settings.

> Or are you saying that we allow "git config core.file-mode true"
> from the command line to set core.fileMode configuration?

I do not understand this reference. I did not suggest to change `git
config`, did I? If I did, I apologize; it was definitely *not* my
intention to change long-standing customs.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 16:32                   ` Junio C Hamano
  2014-12-23 16:47                     ` Johannes Schindelin
@ 2014-12-23 17:07                     ` Junio C Hamano
  1 sibling, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 17:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> I suspect that it would be much better if the configuration
> variables were organized the other way around, e.g.
>
> 	$ git config fsck.warn missingTagger,someOtherKindOfError

By the way, I think this organization is much better than the
other way around, i.e. "fsck.missingTagger=warn", as we do not want
the namespace under fsck.* for variables that control the behaviour
of fsck that are *NOT* kinds of questionable conditions fsck can
find.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 16:47                     ` Johannes Schindelin
@ 2014-12-23 17:14                       ` Junio C Hamano
  2014-12-23 17:41                         ` Johannes Schindelin
  2015-01-22 15:49                         ` Michael Haggerty
  0 siblings, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 17:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The parser I wrote actually accepts both versions, allowing me to skip the
> tedious step to convert the camelCased config setting into a
> lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
> only to be parsed by the same parser as `fsck` would use directly.
>
> So I am rather happy with the fact that the parser handles both camelCased
> and lower-case-dashed versions.

That is a myopic view of the world that ignores maintainability and
teachability, doing disservice to our user base.

What message does it send to an unsuspecting new user that
fsck.random-error is silently accepted (because we will never
document it) as an alias for fsck.randomError, while most of the
configuration variables will not take such an alias?

>> I suspect that it would be much better if the configuration variables
>> were organized the other way around, e.g.
>> 
>> 	$ git config fsck.warn missingTagger,someOtherKindOfError
>
> I had something similar in an earlier version of my patch series, but it
> was shot down rightfully: if you want to allow inheriting defaults from
> $HOME/.gitconfig, you have to configure the severity levels individually.

Hmmm.  What's wrong with "fsck.warn -missingTagger" that overrides
the earlier one, or even "fsck.info missingTagger" after having
"fsck.warn other,missingTagger,yetanother", with the usual "last one
wins" rule?
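
A rough sketch of how such per-level lists could be applied with a "last one
wins" rule follows. Everything in it is an assumption for illustration: the
three message names, the `apply_level_list()` helper and the severity enum
are hypothetical, and the "-missingTagger" negation syntax is not handled.

```c
#include <string.h>

enum severity { SEV_INFO, SEV_WARN, SEV_ERROR };

/* Hypothetical message names; index i of this table is message i. */
static const char *msg_names[] = { "missingTagger", "other", "yetanother" };
#define MSG_COUNT (sizeof(msg_names) / sizeof(msg_names[0]))

/*
 * Assign `level` to every message named in the comma-separated `list`.
 * Returns 0 on success and -1 on the first unknown (or empty) name;
 * names before the unknown one have already been applied. Because each
 * call simply overwrites the stored level, a later call wins over an
 * earlier one.
 */
static int apply_level_list(enum severity *levels, const char *list,
			    enum severity level)
{
	while (*list) {
		size_t len = strcspn(list, ",");
		size_t i;

		for (i = 0; i < MSG_COUNT; i++)
			if (strlen(msg_names[i]) == len &&
			    !strncmp(msg_names[i], list, len))
				break;
		if (i == MSG_COUNT)
			return -1;
		levels[i] = level;
		list += len;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

Feeding it the value of fsck.warn first and the value of fsck.info second
leaves missingTagger at the info level, since the second assignment
overwrites the first.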

Whoever shot it down "rightfully" is wrong here, I would think.

>> But the proposed organization to use one variable per questionable
>> event type (as opposed to one variable per severity level) would
>> lead to a one-shot override of this form, e.g.
>> 
>> 	$ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
>> 
>> which I think is insane to require us to support an unbounded number of
>> dashed options.
>
> The intended use case is actually *not* the command-line, but the config
> file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
> .git/config to customize the settings.

But we do need to worry about one-shot override from the command
line.  A configuration that sticks without a way to override is a
no-no.

>> Or are you saying that we allow "git config core.file-mode true"
>> from the command line to set core.fileMode configuration?
>
> I do not understand this reference.

I was puzzled by your "command line" and wondering if you meant
"from the command line, aVariable can be spelled a-variable".

> I did not suggest to change `git
> config`, did I? If I did, I apologize; it was definitely *not* my
> intention to change long-standing customs.

Then fsck.missing-tagger is definitely out.  The long-standing custom
is that a multi-word token at the first and the last level is not
dashed-multi-word.
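
The "last one wins" rule mentioned above for fsck.warn / fsck.info can be
sketched as follows. This is only a minimal illustration, assuming a
per-caller severity table indexed by message ID; the enum values, the
`msg_names` table and `apply_setting()` are hypothetical names and are not
taken from the patch series. The "-missingTagger" removal syntax is not
covered.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severity levels and message IDs, for illustration only. */
enum severity { SEV_IGNORE, SEV_INFO, SEV_WARN, SEV_ERROR };
enum msg_id {
	MSG_MISSING_TAGGER, MSG_MISSING_AUTHOR, MSG_ZERO_PAD_TREE_ENTRY,
	MSG_MAX
};

static const char *msg_names[MSG_MAX] = {
	"missingTagger", "missingAuthor", "zeroPadTreeEntry",
};

/*
 * Apply one setting such as ("warn", "missingTagger,missingAuthor").
 * Settings are applied in the order they are read, so a later setting
 * simply overwrites the slot written by an earlier one.
 * Returns 0 on success, -1 if the list contains an unknown token.
 */
static int apply_setting(enum severity *table, enum severity level,
			 const char *list)
{
	assert(table && list);
	while (*list) {
		size_t len = strcspn(list, ",");
		int i, found = 0;

		for (i = 0; i < MSG_MAX; i++) {
			if (strlen(msg_names[i]) == len &&
			    !strncmp(msg_names[i], list, len)) {
				table[i] = level;
				found = 1;
			}
		}
		if (len && !found)
			return -1;
		list += len;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

Applying "warn = missingAuthor,missingTagger" and then "info = missingTagger"
to such a table leaves missingTagger at the info level and missingAuthor at
the warn level, regardless of which config file each line came from.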


* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 16:30                     ` Johannes Schindelin
@ 2014-12-23 17:20                       ` Junio C Hamano
  2014-12-23 17:28                         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 17:20 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Michael Haggerty

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> And *still*, this is *just* a global table with defaults. I would *still*
> need to copy-on-write when the first customization of the severity level
> takes place because I cannot allow the global defaults to be modified by
> one caller (that would defeat the whole purpose of having per-caller
> settings bundled in the fsck_options struct).

If you allocate a per-invocation fsck_options struct, then
initializing the defaults with code is dead easy---you can just do
it immediately after you x[cm]alloc()---no?

What am I missing?


* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 17:20                       ` Junio C Hamano
@ 2014-12-23 17:28                         ` Johannes Schindelin
  2014-12-23 18:14                           ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 17:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > And *still*, this is *just* a global table with defaults. I would *still*
> > need to copy-on-write when the first customization of the severity level
> > takes place because I cannot allow the global defaults to be modified by
> > one caller (that would defeat the whole purpose of having per-caller
> > settings bundled in the fsck_options struct).
> 
> If you allocate a per-invocation fsck_options struct, then
> initializing the defaults with code is dead easy---you can just do
> it immediately after you x[cm]alloc()---no?

There is no alloc. Right now, the initialization reads:

	struct fsck_options options = strict ?
		FSCK_OPTIONS_STRICT : FSCK_OPTIONS_DEFAULT;

Ciao,
Dscho


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 17:14                       ` Junio C Hamano
@ 2014-12-23 17:41                         ` Johannes Schindelin
  2014-12-23 17:56                           ` Junio C Hamano
  2015-01-22 15:49                         ` Michael Haggerty
  1 sibling, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 17:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > The parser I wrote actually accepts both versions, allowing me to skip
> > the tedious step to convert the camelCased config setting into a
> > lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
> > only to be parsed by the same parser as `fsck` would use directly.
> >
> > So I am rather happy with the fact that the parser handles both
> > camelCased and lower-case-dashed versions.
> 
> That is a myopic view of the world that ignores maintainability and
> teachability, doing a disservice to our user base.

Okay, so just to clarify: you want me to

- split the parser into

	- a parser that accepts only camelCased variable names when they
	  come from the config (for use in fsck and receive-pack), and

	- another parser that rejects camelCased variable names and only
	  accepts lower-case-dashed, intended for command-line parsing
	  in fsck, index-pack and unpack-objects, and

- consequently have a converter from the camelCased variable names we
  receive from the config in receive-pack so we can pass lower-case-dashed
  settings to index-pack and unpack-objects.

If you want it this way, I will do it this way.

> What message does it send to an unsuspecting new user that
> fsck.random-error is silently accepted (because we will never document
> it) as an alias for fsck.randomError, while most of the configuration
> variables will not take such an alias?

I will not participate in a discussion about consistency again. There is
nothing that can be done about it; what matters is what you will accept
and what not. I will make the code stricter (and consequently more
complex) if that is what you want.

> >> I suspect that it would be much better if the configuration variables
> >> were organized the other way around, e.g.
> >> 
> >> 	$ git config fsck.warn missingTagger,someOtherKindOfError
> >
> > I had something similar in an earlier version of my patch series, but
> > it was shot down rightfully: if you want to allow inheriting defaults
> > from $HOME/.gitconfig, you have to configure the severity levels
> > individually.
> 
> Hmmm.  What's wrong with "fsck.warn -missingTagger" that overrides
> the earlier one, or even "fsck.info missingTagger" after having
> "fsck.warn other,missingTagger,yetanother", with the usual "last one
> wins" rule?

I will change the code (next year...).

> >> But the proposed organization to use one variable per questionable
> >> event type (as opposed to one variable per severity level) would lead
> >> to a one-shot override of this form, e.g.
> >> 
> >> 	$ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
> >> 
> >> which I think is insane to require us to support unbound number of
> >> dashed options.
> >
> > The intended use case is actually *not* the command-line, but the config
> > file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
> > .git/config to customize the settings.
> 
> But we do need to worry about one-shot override from the command
> line.  A configuration that sticks without a way to override is a
> no-no.

And of course you can, by specifying the config setting via the -c
command-line option. The only inconsistency here is that all other
command-line options are lower-case-dashed, while the config settings are
camelCased.

> >> Or are you saying that we allow "git config core.file-mode true" from
> >> the command line to set core.fileMode configuration?
> >
> > I do not understand this reference.
> 
> I was puzzled by your "command line" and wondering if you meant
> "from the command line, aVariable can be spelled a-variable".

Well, of course, if you call `git -c aVariable command
--option=a-variable` you have a nice accumulation of styles right there
;-)

> > I did not suggest to change `git config`, did I? If I did, I
> > apologize; it was definitely *not* my intention to change
> > long-standing customs.
> 
> Then fsck.missing-tagger is definitely out.  The long-standing custom
> is that a multi-word token at the first and the last level is not
> dashed-multi-word.

But I already changed all of the patches to fsck.missingTagger.

The only thing I did not do yet is to split the parser into two, one
accepting only camelCased, one accepting only lower-case-dashed options,
and a translator to convert from camelCase to lower-case-dashed versions
(because it is a lot of work and additional complexity, as well as
opportunity for bugs to hide because we'll have three code paths). I asked
you above whether you want that, and I will do it if you say that you do.

Ciao,
Dscho
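
The translator from camelCase to lower-case-dashed names discussed above
could look roughly like the sketch below. It is a hypothetical helper
(`camel_to_dashed()` does not appear in the patch series) and assumes plain
ASCII identifiers such as "missingTagger".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Write the lower-case-dashed form of `camel` into `out`, which has room
 * for `outlen` bytes including the terminating NUL.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int camel_to_dashed(const char *camel, char *out, size_t outlen)
{
	size_t n = 0;
	const char *p;

	assert(camel && out);
	for (p = camel; *p; p++) {
		unsigned char c = (unsigned char)*p;

		/* An upper-case letter (except the first) starts a new word. */
		if (isupper(c) && p != camel) {
			if (n + 1 >= outlen)
				return -1;
			out[n++] = '-';
		}
		if (n + 1 >= outlen)
			return -1;
		out[n++] = (char)tolower(c);
	}
	if (n >= outlen)
		return -1;
	out[n] = '\0';
	return 0;
}
```

With this sketch, "missingTagger" becomes "missing-tagger". It also shows
the cost described in the message: one more code path that has to be kept
in sync with both parsers.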


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 17:41                         ` Johannes Schindelin
@ 2014-12-23 17:56                           ` Junio C Hamano
  2014-12-23 18:06                             ` Johannes Schindelin
  2014-12-23 18:09                             ` Junio C Hamano
  0 siblings, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 17:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Okay, so just to clarify: you want me to
>
> - split the parser into
>
> 	- a parser that accepts only camelCased variable names when they
> 	  come from the config (for use in fsck and receive-pack), and

OK.

> 	- another parser that rejects camelCased variable names and only
> 	  accepts lower-case-dashed, intended for command-line parsing
> 	  in fsck, index-pack and unpack-objects, and
>
> - consequently have a converter from the camelCased variable names we
>   receive from the config in receive-pack so we can pass lower-case-dashed
>   settings to index-pack and unpack-objects.

I am not sure about the latter two.  This needs a design discussion
about what the command line options should be.

I think the command line should be like this:

	git cmd --warn=missingTags,missingAuthor

in the first place, i.e. "we may invent tokens to denote new kinds
of errors as we improve fsck", not with "we may add options for new
kinds of errors", i.e. the command line should not look like this:

	git cmd --missing-tags=warn --missing-author=warn

And from that point of view, I see no reason to support the dashed
variant anywhere in the code, neither in the config parser nor in the
command line parser.


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 17:56                           ` Junio C Hamano
@ 2014-12-23 18:06                             ` Johannes Schindelin
  2014-12-23 18:09                             ` Junio C Hamano
  1 sibling, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 18:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Okay, so just to clarify: you want me to
> >
> > - split the parser into
> >
> > 	- a parser that accepts only camelCased variable names when they
> > 	come from the config (for use in fsck and receive-pack), and
> 
> OK.
> 
> > 	- another parser that rejects camelCased variable names and only
> > 	  accepts lower-case-dashed, intended for command-line parsing
> > 	  in fsck, index-pack and unpack-objects, and
> >
> > - consequently have a converter from the camelCased variable names we
> >   receive from the config in receive-pack so we can pass lower-case-dashed
> >   settings to index-pack and unpack-objects.
> 
> I am not sure about the latter two.  This needs a design discussion
> about what the command line options should be.
> 
> I think the command line should be like this:
> 
> 	git cmd --warn=missingTags,missingAuthor

Okay. This contradicts the convention where Git uses lower-case-dashed
command-line option values (e.g. on-demand, error-all, etc) and no
camelCased options were present so far. But your wish is my command.

Ciao,
Dscho
> 
> in the first place, i.e. "we may invent tokens to denote new kinds
> of errors as we improve fsck", not with "we may add options for new
> kinds of errors", i.e. the command line should not look like this:
> 
> 	git cmd --missing-tags=warn --missing-author=warn
> 
> And from that point of view, I see no reason to support the dashed
> variant anywhere in the code, neither in the config parser nor in the
> command line parser.
> 


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 17:56                           ` Junio C Hamano
  2014-12-23 18:06                             ` Johannes Schindelin
@ 2014-12-23 18:09                             ` Junio C Hamano
  2014-12-23 18:14                               ` Johannes Schindelin
  1 sibling, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 18:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> Okay, so just to clarify: you want me to
>>
>> - split the parser into
>>
>> 	- a parser that accepts only camelCased variable names when they
>> 	  come from the config (for use in fsck and receive-pack), and
>
> OK.
>
>> 	- another parser that rejects camelCased variable names and only
>> 	  accepts lower-case-dashed, intended for command-line parsing
>> 	  in fsck, index-pack and unpack-objects, and
>>
>> - consequently have a converter from the camelCased variable names we
>>   receive from the config in receive-pack so we can pass lower-case-dashed
>>   settings to index-pack and unpack-objects.
>
> I am not sure about the latter two.  This needs a design discussion
> about what the command line options should be.
>
> I think the command line should be like this:
>
> 	git cmd --warn=missingTags,missingAuthor
>
> in the first place, i.e. "we may invent tokens to denote new kinds
> of errors as we improve fsck", not with "we may add options for new
> kinds of errors", i.e. the command line should not look like this:
>
> 	git cmd --missing-tags=warn --missing-author=warn
>
> And from that point of view, I see no reason to support the dashed
> variant anywhere in the code, neither in the config parser nor in the
> command line parser.

Having said that, I think "missingTags" etc. should not be
configuration variable names (instead, they should be values).

Because of that, I do not think we need consistency between the way
these "tokens that denote kinds of errors fsck denotes" are spelled
and the way "configuration variable names" are spelled.

In other words, I do not think there is anything that comes from how
our configuration variable names are spelled that gives preference
to one over the other between the two styles:

(1) Tokens are camelCased

	[fsck]
		warn = missingTagger,missingAuthor
		error = zeroPadTreeEntry

	$ git cmd --warn=missingTagger,missingAuthor

(2) Tokens are dashed-multi-words

	[fsck]
		warn = missing-tagger,missing-author
		error = zero-pad-tree-entry

	$ git cmd --warn=missing-tagger,missing-author

Do I have a strong preference between these two?

Not really.  My gut reaction is that (2) may be easier to read, but
I can be persuaded either way.

Who else has/had opinions on this?  Earlier you said that the
configuration the other way, i.e. "[fsck] warn = missingTag", was
shot down---who did shoot it?  Perhaps that person may be able to
point out where in my thinking above I am going in the wrong
direction.

Thanks.

[Footnote]

In either case, I'd recommend that we take [ ,]+ as inter-token
separator to ease the use on the command line and config file, to
allow these:

	[fsck]
		warn = missingTagger missingAuthor
		warn = missingTagger,missingAuthor

	$ git cmd --warn missingTagger,missingAuthor
	$ git cmd --warn "missingTagger missingAuthor"
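
Splitting on [ ,]+ as recommended in the footnote can be sketched with
strspn()/strcspn(). The names `each_token_fn` and `for_each_token()` are
hypothetical and only serve this illustration; the actual parser in the
series may be structured differently.

```c
#include <assert.h>
#include <string.h>

/* Callback invoked once per token; `token` is not NUL-terminated. */
typedef void (*each_token_fn)(const char *token, size_t len, void *data);

/*
 * Call `fn` once per token in `list`, treating any run of spaces and/or
 * commas as a single separator. Leading and trailing separators are
 * skipped. Returns the number of tokens seen.
 */
static int for_each_token(const char *list, each_token_fn fn, void *data)
{
	int count = 0;

	assert(list);
	for (;;) {
		size_t len;

		list += strspn(list, " ,");	/* skip a run of separators */
		if (!*list)
			break;
		len = strcspn(list, " ,");	/* length of the next token */
		if (fn)
			fn(list, len, data);
		list += len;
		count++;
	}
	return count;
}
```

Both "missingTagger,missingAuthor" and "missingTagger missingAuthor" yield
the same two tokens, so the config-file and command-line spellings shown in
the footnote are handled by one loop.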


* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 17:28                         ` Johannes Schindelin
@ 2014-12-23 18:14                           ` Junio C Hamano
  2014-12-23 18:23                             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 18:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Michael Haggerty

On Tue, Dec 23, 2014 at 9:28 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > And *still*, this is *just* a global table with defaults. I would *still*
>> > need to copy-on-write when the first customization of the severity level
>> > takes place because I cannot allow the global defaults to be modified by
>> > one caller (that would defeat the whole purpose of having per-caller
>> > settings bundled in the fsck_options struct).
>
> There is no alloc. Right now, the initialization reads:
>
>         struct fsck_options options = strict ?
>                 FSCK_OPTIONS_STRICT : FSCK_OPTIONS_DEFAULT;

Then it is just the matter of having

   fsck_options_init(&options);
   if (strict)
    options.some_field = make_it_strict;

as the first few statements, no?

I am not sure why it is so difficult....
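
An init function along the lines suggested above could give every caller
its own copy of the defaults, which avoids both the copy-on-write step and
any modification of a shared table. The sketch below is an assumption-laden
illustration: `struct fsck_options_sketch`, its fields, the two message IDs
and the meaning given to "strict" are all hypothetical and do not reflect
the real `struct fsck_options`.

```c
#include <assert.h>
#include <string.h>

enum severity { SEV_IGNORE, SEV_WARN, SEV_ERROR };
enum msg_id { MSG_MISSING_TAGGER, MSG_MISSING_EMAIL, MSG_MAX };

/* Read-only defaults shared by all callers. */
static const enum severity default_severity[MSG_MAX] = {
	SEV_WARN,	/* MSG_MISSING_TAGGER */
	SEV_ERROR,	/* MSG_MISSING_EMAIL */
};

struct fsck_options_sketch {
	int strict;
	enum severity severity[MSG_MAX];	/* private, per-caller copy */
};

static void fsck_options_sketch_init(struct fsck_options_sketch *o, int strict)
{
	int i;

	assert(o);
	memset(o, 0, sizeof(*o));
	o->strict = strict;
	/* Copy the defaults so later changes never touch the shared table. */
	memcpy(o->severity, default_severity, sizeof(o->severity));
	/* Assumed meaning of "strict" here: warnings become errors. */
	if (strict)
		for (i = 0; i < MSG_MAX; i++)
			if (o->severity[i] == SEV_WARN)
				o->severity[i] = SEV_ERROR;
}
```

The trade-off against the static initializers FSCK_OPTIONS_STRICT and
FSCK_OPTIONS_DEFAULT is one extra function call per caller in exchange for
a table that each caller may modify freely.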


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 18:09                             ` Junio C Hamano
@ 2014-12-23 18:14                               ` Johannes Schindelin
  2014-12-23 18:56                                 ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 18:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Having said that, I think "missingTags" etc. should not be configuration
> variable names (instead, they should be values).
> 
> Because of that, I do not think we need consistency between the way
> these "tokens that denote kinds of errors fsck denotes" are spelled and
> the way "configuration variable names" are spelled.

Okay. That makes more sense.

Now I can remove the complexity introduced by teaching the parser to
accept camelCased values, and we're golden.

> In either case, I'd recommend that we take [ ,]+ as inter-token
> separator to ease the use on the command line and config file

And this is indeed the case:

+void fsck_strict_mode(struct fsck_options *options, const char *mode)
+...
+       while (*mode) {
+               int len = strcspn(mode, " ,|"), equal, msg_id;
+...

In other words, I even allowed the pipe symbol as separator.

Ciao,
Dscho


* Re: [PATCH 04/18] Offer a function to demote fsck errors to warnings
  2014-12-23 18:14                           ` Junio C Hamano
@ 2014-12-23 18:23                             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 18:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Michael Haggerty

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> On Tue, Dec 23, 2014 at 9:28 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >>
> >> > And *still*, this is *just* a global table with defaults. I would *still*
> >> > need to copy-on-write when the first customization of the severity level
> >> > takes place because I cannot allow the global defaults to be modified by
> >> > one caller (that would defeat the whole purpose of having per-caller
> >> > settings bundled in the fsck_options struct).
> >
> > There is no alloc. Right now, the initialization reads:
> >
> >         struct fsck_options options = strict ?
> >                 FSCK_OPTIONS_STRICT : FSCK_OPTIONS_DEFAULT;
> 
> Then it is just the matter of having
> 
>    fsck_options_init(&options);
>    if (strict)
>     options.some_field = make_it_strict;
> 
> as the first few statements, no?
> 
> I am not sure why it is so difficult....

It is not difficult. But I try to avoid complexity when I can. Since you
asked specifically, I will introduce it, though. Hopefully still this year
(I'll not be available for a while starting tomorrow).

Ciao,
Dscho


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 18:14                               ` Johannes Schindelin
@ 2014-12-23 18:56                                 ` Junio C Hamano
  2014-12-23 20:12                                   ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 18:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Tue, 23 Dec 2014, Junio C Hamano wrote:
>
>> Having said that, I think "missingTags" etc. should not be configuration
>> variable names (instead, they should be values).
>> 
>> Because of that, I do not think we need consistency between the way
>> these "tokens that denote kinds of errors fsck denotes" are spelled and
>> the way "configuration variable names" are spelled.
>
> Okay. That makes more sense.

I am sorry that I didn't step back and think about it earlier to
notice that we shouldn't be talking about configuration variable
name syntax.  I could have saved us time going back and forth if
I did so earlier.


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 18:56                                 ` Junio C Hamano
@ 2014-12-23 20:12                                   ` Johannes Schindelin
  2014-12-23 21:17                                     ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2014-12-23 20:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Tue, 23 Dec 2014, Junio C Hamano wrote:
> >
> >> Having said that, I think "missingTags" etc. should not be
> >> configuration variable names (instead, they should be values).
> >> 
> >> Because of that, I do not think we need consistency between the way
> >> these "tokens that denote kinds of errors fsck denotes" are spelled
> >> and the way "configuration variable names" are spelled.
> >
> > Okay. That makes more sense.
> 
> I am sorry that I didn't step back and think about it earlier to notice
> that we shouldn't be talking about configuration variable name syntax.
> I could have saved us time going back and forth if I did so earlier.

Do not worry. You were just trying to make this software better, same as
I tried.

Unfortunately, I will not be able to submit v2 of this patch series this
year, but I will do so in the second week of January (including the change
to the global array with the default severity levels because I do want to
see this feature integrated).

Ciao,
Dscho


* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 20:12                                   ` Johannes Schindelin
@ 2014-12-23 21:17                                     ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2014-12-23 21:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Unfortunately, I will not be able to submit v2 of this patch series this
> year, but I will do so in the second week of January (including the change
> to the global array with the default severity levels because I do want to
> see this feature integrated).

Heh, we are not in a hurry.  Enjoy your holidays.

A happy new year to you in advance ;-)


* [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery
  2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
@ 2015-01-19 15:49   ` Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 01/18] fsck: Introduce fsck options Johannes Schindelin
                       ` (18 more replies)
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2 siblings, 19 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:49 UTC (permalink / raw)
  To: gitster; +Cc: git

At the moment, the git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Interdiff vs v1 below the diffstat. Sorry for the size; the comments I
received for v1 made it necessary to change the patch series rather
extensively (I rebased the branch twenty-five times since sending off
the first version of the patch series, which might also serve as an apology
for not getting v2 out sooner).

Johannes Schindelin (18):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.*
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.* options.
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --quick`
  fsck: git receive-pack: support excluding objects from fsck'ing

 Documentation/config.txt        |  38 +++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  66 +++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  35 ++-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 525 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  27 ++-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  46 ++++
 11 files changed, 648 insertions(+), 164 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4f86d3f..0daba8a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1209,14 +1209,14 @@ filter.<driver>.smudge::
 	linkgit:gitattributes[5] for details.
 
 fsck.*::
-	With these options, fsck errors can be switched to warnings and
-	vice versa by setting e.g. `fsck.bad-name` to `warn` or `error`
-	(or `ignore` to hide those errors completely). For convenience,
-	fsck prefixes the error/warning with the name of the option, e.g.
-	"missing-email: invalid author/committer line - missing email"
-	means that setting `fsck.missing-email` to `ignore` will hide that
-	issue.  For convenience, camelCased options are accepted, too (e.g.
-	`fsck.missingEmail`).
+	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
+	comma-separated lists of fsck message IDs which should trigger
+	fsck to error out, to print the message and continue, or to ignore
+	said messages, respectively.
++
+For convenience, fsck prefixes the error/warning with the name of the option,
+e.g.  "missing-email: invalid author/committer line - missing email" means
+that setting `fsck.ignore = missing-email` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
@@ -2144,18 +2144,29 @@ receive.fsckObjects::
 	is used instead.
 
 receive.fsck.*::
-	When `receive.fsckObjects is set to true, errors can be switched
-	to warnings and vice versa by setting e.g. `receive.fsck.bad-name`
-	to `warn` or `error` (or `ignore` to hide those errors
-	completely). For convenience, fsck prefixes the error/warning
-	with the name of the option, e.g. "missing-email: invalid
-	author/committer line - missing email" means that setting
-	`receive.fsck.missing-email` to `ignore` will hide that issue.
-	For convenience, camelCased options are accepted, too (e.g.
-	`receive.fsck.missingEmail`).
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.*`
+	settings. These settings contain comma-separated lists of fsck
+	message IDs. For convenience, fsck prefixes the error/warning with
+	the message ID, e.g. "missing-email: invalid author/committer line
+	- missing email" means that setting `receive.fsck.ignore =
+	missing-email` will hide that issue.
++
+--
+	error::
+		a comma-separated list of fsck message IDs that should
+		trigger fsck to error out.
+	warn::
+		a comma-separated list of fsck message IDs that should be
+		displayed, while fsck continues instead of erroring out.
+	ignore::
+		a comma-separated list of fsck message IDs that should be
+		ignored completely.
 +
 This feature is intended to support working with legacy repositories
-which would not pass pushing when `receive.fsckObjects = true`.
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
 
 receive.unpackLimit::
 	If the number of objects received in a push is below this
diff --git a/builtin/fsck.c b/builtin/fsck.c
index dcea9b0..c767909 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -51,8 +51,8 @@ static int fsck_config(const char *var, const char *value, void *cb)
 {
 	if (starts_with(var, "fsck.")) {
 		struct strbuf sb = STRBUF_INIT;
-		strbuf_addf(&sb, "%s=%s", var + 5, value ? value : "error");
-		fsck_strict_mode(&fsck_obj_options, sb.buf);
+		strbuf_addf(&sb, "%s=%s", var + 5, value);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
 	}
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 2efcb6d..f464ca0 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1568,7 +1568,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (starts_with(arg, "--strict=")) {
 				strict = 1;
 				do_fsck_object = 1;
-				fsck_strict_mode(&fsck_options, arg + 9);
+				fsck_set_severity(&fsck_options, arg + 9);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 86bcda2..40514c2 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -36,7 +36,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
-static struct strbuf fsck_strict_mode = STRBUF_INIT;
+static struct strbuf fsck_severity = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -116,20 +116,19 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (starts_with(var, "receive.fsck.skip-list")) {
+	if (starts_with(var, "receive.fsck.skiplist")) {
 		const char *path = is_absolute_path(value) ?
 			value : git_path("%s", value);
-		if (fsck_strict_mode.len)
-			strbuf_addch(&fsck_strict_mode, ',');
-		strbuf_addf(&fsck_strict_mode, "skip-list=%s", path);
+		if (fsck_severity.len)
+			strbuf_addch(&fsck_severity, ',');
+		strbuf_addf(&fsck_severity, "skiplist=%s", path);
 		return 0;
 	}
 
 	if (starts_with(var, "receive.fsck.")) {
-		if (fsck_strict_mode.len)
-			strbuf_addch(&fsck_strict_mode, ',');
-		strbuf_addf(&fsck_strict_mode,
-			"%s=%s", var + 13, value ? value : "error");
+		if (fsck_severity.len)
+			strbuf_addch(&fsck_severity, ',');
+		strbuf_addf(&fsck_severity, "%s=%s", var + 13, value);
 		return 0;
 	}
 
@@ -1489,9 +1488,9 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects) {
-			if (fsck_strict_mode.len)
+			if (fsck_severity.len)
 				argv_array_pushf(&child.args, "--strict=%s",
-					fsck_strict_mode.buf);
+					fsck_severity.buf);
 			else
 				argv_array_push(&child.args, "--strict");
 		}
@@ -1512,9 +1511,9 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects) {
-			if (fsck_strict_mode.len)
+			if (fsck_severity.len)
 				argv_array_pushf(&child.args, "--strict=%s",
-					fsck_strict_mode.buf);
+					fsck_severity.buf);
 			else
 				argv_array_push(&child.args, "--strict");
 		}
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 179a960..82f2d62 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -532,7 +532,7 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 			}
 			if (starts_with(arg, "--strict=")) {
 				strict = 1;
-				fsck_strict_mode(&fsck_options, arg + 9);
+				fsck_set_severity(&fsck_options, arg + 9);
 				continue;
 			}
 			if (starts_with(arg, "--pack_header=")) {
diff --git a/fsck.c b/fsck.c
index 4f8a754..dbf9fa1 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,121 +10,126 @@
 #include "utf8.h"
 #include "sha1-array.h"
 
+#define FSCK_FATAL -1
+#define FSCK_INFO -2
+
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
-	FUNC(NUL_IN_HEADER) \
-	FUNC(UNTERMINATED_HEADER) \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
-	FUNC(BAD_DATE) \
-	FUNC(BAD_EMAIL) \
-	FUNC(BAD_NAME) \
-	FUNC(BAD_PARENT_SHA1) \
-	FUNC(BAD_TIMEZONE) \
-	FUNC(BAD_TREE_SHA1) \
-	FUNC(DATE_OVERFLOW) \
-	FUNC(DUPLICATE_ENTRIES) \
-	FUNC(INVALID_OBJECT_SHA1) \
-	FUNC(INVALID_TAG_OBJECT) \
-	FUNC(INVALID_TREE) \
-	FUNC(INVALID_TYPE) \
-	FUNC(MISSING_AUTHOR) \
-	FUNC(MISSING_COMMITTER) \
-	FUNC(MISSING_EMAIL) \
-	FUNC(MISSING_GRAFT) \
-	FUNC(MISSING_NAME_BEFORE_EMAIL) \
-	FUNC(MISSING_OBJECT) \
-	FUNC(MISSING_PARENT) \
-	FUNC(MISSING_SPACE_BEFORE_DATE) \
-	FUNC(MISSING_SPACE_BEFORE_EMAIL) \
-	FUNC(MISSING_TAG) \
-	FUNC(MISSING_TAG_ENTRY) \
-	FUNC(MISSING_TAG_OBJECT) \
-	FUNC(MISSING_TREE) \
-	FUNC(MISSING_TYPE) \
-	FUNC(MISSING_TYPE_ENTRY) \
-	FUNC(MULTIPLE_AUTHORS) \
-	FUNC(NOT_SORTED) \
-	FUNC(TAG_OBJECT_NOT_TAG) \
-	FUNC(UNKNOWN_TYPE) \
-	FUNC(ZERO_PADDED_DATE) \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(INVALID_OBJECT_SHA1, ERROR) \
+	FUNC(INVALID_TAG_OBJECT, ERROR) \
+	FUNC(INVALID_TREE, ERROR) \
+	FUNC(INVALID_TYPE, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
+	FUNC(NOT_SORTED, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
-	FUNC(BAD_FILEMODE) \
-	FUNC(EMPTY_NAME) \
-	FUNC(FULL_PATHNAME) \
-	FUNC(HAS_DOT) \
-	FUNC(HAS_DOTDOT) \
-	FUNC(HAS_DOTGIT) \
-	FUNC(NULL_SHA1) \
-	FUNC(ZERO_PADDED_FILEMODE) \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
 	/* infos (reported as warnings, but ignored by default) */ \
-	FUNC(INVALID_TAG_NAME) \
-	FUNC(MISSING_TAGGER_ENTRY) \
-	/* special value */ \
-	FUNC(SKIP_LIST)
-
-#define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
-#define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
-#define FIRST_INFO FSCK_MSG_INVALID_TAG_NAME
+	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
-#define MSG_ID(x) FSCK_MSG_##x,
+#define MSG_ID(id, severity) FSCK_MSG_##id,
 enum fsck_msg_id {
 	FOREACH_MSG_ID(MSG_ID)
 	FSCK_MSG_MAX
 };
+#undef MSG_ID
 
 #define STR(x) #x
-#define MSG_ID_STR(x) STR(x),
-static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
-	FOREACH_MSG_ID(MSG_ID_STR)
-	NULL
+#define MSG_ID(id, severity) { STR(id), FSCK_##severity },
+static struct {
+	const char *id_string;
+	int severity;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ NULL, -1 }
 };
+#undef MSG_ID
 
 static int parse_msg_id(const char *text, int len)
 {
 	int i, j;
 
 	for (i = 0; i < FSCK_MSG_MAX; i++) {
-		const char *key = msg_id_str[i];
-		/* msg_id_str is upper-case, with underscores */
+		const char *key = msg_id_info[i].id_string;
+		/* id_string is upper-case, with underscores */
 		for (j = 0; j < len; j++) {
 			char c = *(key++);
-			if (c == '_') {
-				if (isalpha(text[j]))
-					c = *(key++);
-				else if (text[j] != '_')
-					c = '-';
-			}
-			if (toupper(text[j]) != c)
+			if (c == '_')
+				c = '-';
+			if (text[j] != tolower(c))
 				break;
 		}
 		if (j == len && !*key)
 			return i;
 	}
 
-	die("Unhandled type: %.*s", len, text);
+	die("Unhandled message id: %.*s", len, text);
 }
 
-int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
+static int fsck_msg_severity(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
 {
-	if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
-		return options->strict_mode[msg_id];
-	if (options->strict)
-		return msg_id < FIRST_INFO ? FSCK_ERROR : FSCK_WARN;
-	return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
+	int severity;
+
+	if (options->msg_severity && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+		severity = options->msg_severity[msg_id];
+	else {
+		severity = msg_id_info[msg_id].severity;
+		if (options->strict && severity == FSCK_WARN)
+			severity = FSCK_ERROR;
+	}
+
+	return severity;
 }
 
-static void init_skip_list(struct fsck_options *options, const char *path)
+static void init_skiplist(struct fsck_options *options, const char *path)
 {
-	static struct sha1_array skip_list = SHA1_ARRAY_INIT;
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
 	int sorted, fd;
 	char buffer[41];
 	unsigned char sha1[20];
 
-	if (options->skip_list)
-		sorted = options->skip_list->sorted;
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
 	else {
 		sorted = 1;
-		options->skip_list = &skip_list;
+		options->skiplist = &skiplist;
 	}
 
 	fd = open(path, O_RDONLY);
@@ -138,16 +143,16 @@ static void init_skip_list(struct fsck_options *options, const char *path)
 			break;
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
 			die("Invalid SHA-1: %s", buffer);
-		sha1_array_append(&skip_list, sha1);
-		if (sorted && skip_list.nr > 1 &&
-				hashcmp(skip_list.sha1[skip_list.nr - 2],
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
 					sha1) > 0)
 			sorted = 0;
 	}
 	close(fd);
 
 	if (sorted)
-		skip_list.sorted = 1;
+		skiplist.sorted = 1;
 }
 
 static inline int substrcmp(const char *string, int len, const char *match)
@@ -158,16 +163,16 @@ static inline int substrcmp(const char *string, int len, const char *match)
 	return memcmp(string, match, len);
 }
 
-void fsck_strict_mode(struct fsck_options *options, const char *mode)
+void fsck_set_severity(struct fsck_options *options, const char *mode)
 {
-	int type = FSCK_ERROR;
+	int severity = FSCK_ERROR;
 
-	if (!options->strict_mode) {
+	if (!options->msg_severity) {
 		int i;
-		int *strict_mode = malloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_severity = malloc(sizeof(int) * FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
-			strict_mode[i] = fsck_msg_type(i, options);
-		options->strict_mode = strict_mode;
+			msg_severity[i] = fsck_msg_severity(i, options);
+		options->msg_severity = msg_severity;
 	}
 
 	while (*mode) {
@@ -182,35 +187,36 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
 			if (mode[equal] == '=')
 				break;
 
-		msg_id = parse_msg_id(mode, equal);
-		if (msg_id == FSCK_MSG_SKIP_LIST) {
-			char *path = xstrndup(mode + equal + 1, len - equal - 1);
-
-			if (equal == len)
-				die("skip-list requires a path");
-			init_skip_list(options, path);
-			free(path);
-			mode += len;
-			continue;
-		}
-
 		if (equal < len) {
-			const char *type_str = mode + equal + 1;
-			int type_len = len - equal - 1;
-			if (!substrcmp(type_str, type_len, "error"))
-				type = FSCK_ERROR;
-			else if (!substrcmp(type_str, type_len, "warn"))
-				type = FSCK_WARN;
-			else if (!substrcmp(type_str, type_len, "ignore"))
-				type = FSCK_IGNORE;
+			if (!substrcmp(mode, equal, "error"))
+				severity = FSCK_ERROR;
+			else if (!substrcmp(mode, equal, "warn"))
+				severity = FSCK_WARN;
+			else if (!substrcmp(mode, equal, "ignore"))
+				severity = FSCK_IGNORE;
+			else if (!substrcmp(mode, equal, "skiplist")) {
+				char *path = xstrndup(mode + equal + 1,
+					len - equal - 1);
+
+				if (equal == len)
+					die("skiplist requires a path");
+				init_skiplist(options, path);
+				free(path);
+				mode += len;
+				continue;
+			}
 			else
-				die("Unknown fsck message type: '%.*s'",
-					len - equal - 1, type_str);
+				die("Unknown fsck message severity: '%.*s'",
+					equal, mode);
+			mode += equal + 1;
+			len -= equal + 1;
 		}
 
-		if (type != FSCK_ERROR && msg_id < FIRST_NON_FATAL_ERROR)
+		msg_id = parse_msg_id(mode, len);
+		if (severity != FSCK_ERROR &&
+				msg_id_info[msg_id].severity == FSCK_FATAL)
 			die("Cannot demote %.*s", len, mode);
-		options->strict_mode[msg_id] = type;
+		options->msg_severity[msg_id] = severity;
 		mode += len;
 	}
 }
@@ -238,16 +244,21 @@ static int report(struct fsck_options *options, struct object *object,
 {
 	va_list ap;
 	struct strbuf sb = STRBUF_INIT;
-	int msg_type = fsck_msg_type(id, options), result;
+	int msg_severity = fsck_msg_severity(id, options), result;
 
-	if (msg_type == FSCK_IGNORE)
+	if (msg_severity == FSCK_IGNORE)
 		return 0;
 
-	append_msg_id(&sb, msg_id_str[id]);
+	if (msg_severity == FSCK_FATAL)
+		msg_severity = FSCK_ERROR;
+	else if (msg_severity == FSCK_INFO)
+		msg_severity = FSCK_WARN;
+
+	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-	result = options->error_func(object, msg_type, sb.buf);
+	result = options->error_func(object, msg_severity, sb.buf);
 	strbuf_release(&sb);
 	va_end(ap);
 
@@ -606,8 +617,9 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
 		if (err)
 			return err;
-		/* require_end_of_header() ensured that there is a newline */
-		buffer = strchr(buffer, '\n') + 1;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
 	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
@@ -737,8 +749,8 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
-	if (options->skip_list &&
-			sha1_array_lookup(options->skip_list, obj->sha1) >= 0)
+	if (options->skiplist &&
+			sha1_array_lookup(options->skiplist, obj->sha1) >= 0)
 		return 0;
 
 	if (!obj)
@@ -759,9 +771,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *message)
+int fsck_error_function(struct object *obj, int severity, const char *message)
 {
-	if (type == FSCK_WARN) {
+	if (severity == FSCK_WARN) {
 		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
 		return 0;
 	}
diff --git a/fsck.h b/fsck.h
index 74d11cd..5397fbd 100644
--- a/fsck.h
+++ b/fsck.h
@@ -7,7 +7,7 @@
 
 struct fsck_options;
 
-void fsck_strict_mode(struct fsck_options *options, const char *mode);
+void fsck_set_severity(struct fsck_options *options, const char *mode);
 
 /*
  * callback function for fsck_walk
@@ -27,10 +27,9 @@ int fsck_error_function(struct object *obj, int type, const char *message);
 struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
-	int strict:1;
-	int *strict_mode;
-	/* TODO: consider reading into a hashmap */
-	struct sha1_array *skip_list;
+	unsigned strict:1;
+	int *msg_severity;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 3054113..1c624a3 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalid-tag-name: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missing-tagger-entry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
@@ -295,7 +295,7 @@ test_expect_success 'force fsck to ignore double author' '
 	git update-ref refs/heads/bogus "$new" &&
 	test_when_finished "git update-ref -d refs/heads/bogus" &&
 	test_must_fail git fsck &&
-	git -c fsck.multiple-authors=ignore fsck
+	git -c fsck.ignore=multiple-authors fsck
 '
 
 _bz='\0'
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index cf6cd5d..21fa9c8 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,42 +123,42 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
-test_expect_success 'push with receive.fsck.skip-list' '
+test_expect_success 'push with receive.fsck.skiplist' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
-	git --git-dir=dst/.git config receive.fsck.skip-list SKIP &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
 	echo $commit > dst/.git/SKIP &&
 	git push --porcelain dst bogus
 '
 
-test_expect_success 'push with receive.fsck.missing-mail = warn' '
+test_expect_success 'push with receive.fsck.warn = missing-email' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
-	git --git-dir=dst/.git config receive.fsck.missing-email warn &&
+	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
 	git push --porcelain dst bogus >act 2>&1 &&
 	grep "missing-email" act &&
 	git --git-dir=dst/.git branch -D bogus &&
-	git  --git-dir=dst/.git config receive.fsck.missing-email ignore &&
-	git  --git-dir=dst/.git config receive.fsck.bad-date warn &&
+	git  --git-dir=dst/.git config receive.fsck.ignore missing-email &&
+	git  --git-dir=dst/.git config receive.fsck.warn bad-date &&
 	git push --porcelain dst bogus >act 2>&1 &&
 	test_must_fail grep "missing-email" act
 '
 
-test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
+test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
-	git --git-dir=dst/.git config receive.fsck.unterminated-header warn &&
+	git --git-dir=dst/.git config receive.fsck.warn unterminated-header &&
 	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
-	grep "Cannot demote unterminated-header=warn" act
+	grep "Cannot demote unterminated-header" act
 '
 
 test_done
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH v2 01/18] fsck: Introduce fsck options
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 02/18] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                       ` (17 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

We are about to introduce more fsck settings. As the diff machinery
already does with its options, it makes sense to carry them around as a
(pointer to a) struct containing all of them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
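
For readers following along, here is a minimal sketch of the pattern
described above: the callback and the strict flag travel together in one
options struct instead of being passed as separate parameters. All
`demo_*` names are hypothetical stand-ins for illustration and are not
part of git; the real definitions are in the fsck.h hunk of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT git's definitions. */
enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

typedef int (*demo_error_fn)(const char *message, int severity);

struct demo_options {
	demo_error_fn error_func;	/* how problems get reported */
	unsigned strict:1;		/* reject modes that are tolerated otherwise */
};

/* Default reporter for this sketch: every reported problem counts as one. */
static int demo_report(const char *message, int severity)
{
	(void)message;
	(void)severity;
	return 1;
}

#define DEMO_OPTIONS_DEFAULT { demo_report, 0 }
#define DEMO_OPTIONS_STRICT { demo_report, 1 }

/*
 * One pointer replaces the former (strict, error_func) parameter pair.
 * Returns 0 if the mode is accepted, else whatever error_func returns.
 */
static int demo_check_mode(unsigned mode, struct demo_options *options)
{
	assert(options && options->error_func);

	switch (mode) {
	case 0100644:
	case 0100755:
		return 0;
	case 0100664:
		/* group-writable: tolerated unless strict checking is on */
		if (!options->strict)
			return 0;
		/* fall through */
	default:
		return options->error_func("contains bad file modes",
					   DEMO_WARN);
	}
}
```

A caller would initialize `struct demo_options opts = DEMO_OPTIONS_STRICT;`
once and hand `&opts` to every check. Adding a further setting later then
means adding a struct member rather than touching every function signature
again.
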
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index a27515a..2241e29 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static unsigned char head_sha1[20];
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -630,6 +632,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..925f7b5 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -74,6 +74,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -191,7 +192,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -782,10 +783,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH v2 02/18] fsck: Introduce identifiers for fsck messages
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 01/18] fsck: Introduce fsck options Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 03/18] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                       ` (16 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

Instead of specifying whether a message by the fsck machinery constitutes
an error or a warning, let's specify an identifier relating to the
concrete problem that was encountered. This is necessary for the upcoming
support for demoting certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
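
As a rough, standalone illustration of that simplification (not the code
from this patch): the sketch below formats the message exactly once in a
reporting helper and hands the finished string to the callback. All demo_*
names and the fixed 256-byte buffers are assumptions made for the example;
the patch itself uses a strbuf and options->error_func.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* Hypothetical callback type: takes a finished message, no varargs. */
typedef int (*demo_error_fn)(int severity, const char *message);

/* Last line produced by demo_collect(), kept so callers can inspect it. */
static char demo_last_message[256];

/* Example callback: records the message, returns 0 for warnings. */
static int demo_collect(int severity, const char *message)
{
	snprintf(demo_last_message, sizeof(demo_last_message), "%s: %s",
		 severity == DEMO_WARN ? "warning" : "error", message);
	return severity == DEMO_WARN ? 0 : 1;
}

/* Format once here, then pass the ready-to-use string to the callback. */
static int demo_report(demo_error_fn fn, int severity, const char *fmt, ...)
{
	char buf[256];
	va_list ap;
	int n;

	assert(fn != NULL);
	va_start(ap, fmt);
	n = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	if (n < 0)
		return -1;
	return fn(severity, buf);
}
```

With this shape, a callback such as demo_collect() never touches a va_list;
only the single reporting helper has to deal with the format arguments.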

We could use a simple enum for the message IDs here, but we want to
guarantee that the enum values are associated with the appropriate
severity levels. Besides, we want to introduce a parser in the next commit
that maps the string representation to the enum value, hence we use the
slightly ugly preprocessor construct that is extensible for use with said
parser.
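
For readers unfamiliar with this kind of construct, here is a reduced,
standalone sketch of the idea, assuming only three message IDs. The DEMO_
and demo_ names are made up for the example and are not part of the patch.
A single list macro is expanded twice, once into the enum and once into a
parallel severity table, so the two cannot drift out of sync.

```c
#include <assert.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* One list; each expansion below supplies a different FUNC. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MISSING_EMAIL, ERROR) \
	FUNC(HAS_DOTGIT, WARN)

/* First expansion: the enum values. */
#define MSG_ID(id, severity) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: the default severity, in the same order. */
#define MSG_ID(id, severity) { DEMO_##severity },
static const struct {
	int severity;
} demo_msg_id_info[DEMO_MSG_MAX + 1] = {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	{ -1 }
};
#undef MSG_ID

/* Look up the severity; in strict mode warnings count as errors. */
static int demo_msg_severity(enum demo_msg_id id, int strict)
{
	int severity;

	assert((unsigned)id < DEMO_MSG_MAX);
	severity = demo_msg_id_info[id].severity;
	if (strict && severity == DEMO_WARN)
		severity = DEMO_ERROR;
	return severity;
}
```

Adding a message then means adding one FUNC(...) line to the list; the enum
value and its table entry follow automatically.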

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  24 ++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 153 insertions(+), 77 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2241e29..99d4538 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -47,32 +47,22 @@ static int show_dangling = 1;
 #endif
 
 static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+                      const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+	        severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..30f7a48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(INVALID_OBJECT_SHA1, ERROR) \
+	FUNC(INVALID_TAG_OBJECT, ERROR) \
+	FUNC(INVALID_TREE, ERROR) \
+	FUNC(INVALID_TYPE, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NOT_SORTED, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(INVALID_TAG_NAME, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, severity) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, severity) { FSCK_##severity },
+static struct {
+	int severity;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_severity(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int severity;
+
+	severity = msg_id_info[msg_id].severity;
+	if (options->strict && severity == FSCK_WARN)
+		severity = FSCK_ERROR;
+
+	return severity;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_severity = fsck_msg_severity(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_severity, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %ld", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int severity, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.0.0.rc3.9669.g840d1f9

* [PATCH v2 03/18] fsck: Provide a function to parse fsck message IDs
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 01/18] fsck: Introduce fsck options Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 02/18] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
                       ` (15 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. It has to handle partial strings because we would
like to be able to parse, say, 'missing-email,missing-tagger-entry'
command lines.

To make the parsing robust, we generate strings from the enum keys, and
using these strings, we will map lower-case, dash-separated string values
to the corresponding enum values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
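For readers skimming the thread, here is a minimal standalone sketch of the
matching described above. It is not the code from the diff: the three-entry
table and all demo_* names are hypothetical, and it returns -1 where the
patch calls die().

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical subset of message IDs; the real list lives in fsck.c. */
enum demo_msg_id {
	DEMO_MSG_BAD_DATE,
	DEMO_MSG_MISSING_EMAIL,
	DEMO_MSG_MISSING_TAGGER_ENTRY,
	DEMO_MSG_MAX
};

/* Upper-case, underscore-separated keys, as generated from the enum. */
static const char *demo_id_strings[DEMO_MSG_MAX] = {
	"BAD_DATE",
	"MISSING_EMAIL",
	"MISSING_TAGGER_ENTRY",
};

/*
 * Compare the first `len` bytes of `text` against each key after
 * lower-casing the key and turning '_' into '-'. Returns the index of
 * the matching key, or -1 if no key matches exactly.
 */
static int demo_parse_msg_id(const char *text, size_t len)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++) {
		const char *key = demo_id_strings[i];
		size_t j;

		for (j = 0; j < len && key[j]; j++) {
			char c = key[j] == '_' ? '-' :
				(char)tolower((unsigned char)key[j]);
			if (text[j] != c)
				break;
		}
		/* match only if both the text span and the key are used up */
		if (j == len && !key[j])
			return i;
	}
	return -1;
}
```

Passing an explicit length is what lets a caller hand in one element of a
longer comma-separated string without copying it first.
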
 fsck.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 30f7a48..2d91e28 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,38 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, severity) { FSCK_##severity },
+#define STR(x) #x
+#define MSG_ID(id, severity) { STR(id), FSCK_##severity },
 static struct {
+	const char *id_string;
 	int severity;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_info[i].id_string;
+		/* id_string is upper-case, with underscores */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_')
+				c = '-';
+			if (text[j] != tolower(c))
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	die("Unhandled message id: %.*s", len, text);
+}
+
 static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (2 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 03/18] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-21  8:49       ` Junio C Hamano
  2015-01-19 15:50     ` [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
                       ` (14 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The function added by this commit parses a list of settings in the form:

	warn=missing-email,error=bad-name,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we
need to take extra care to default to all FSCK_ERROR in those cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
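A rough standalone sketch of the list parsing follows, assuming the order
that the diff below actually parses (a severity name before the '=', then a
message ID). The three-entry ID table and the demo_* names are made up for
illustration, and errors are returned instead of calling die().

```c
#include <string.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

/* Hypothetical three-entry table standing in for the real message list. */
static const char *demo_ids[] = { "bad-date", "bad-name", "missing-email" };
#define DEMO_ID_COUNT 3

/*
 * Parse e.g. "warn=missing-email,bad-name,error=bad-date" into `out`.
 * A "<severity>=" prefix stays in effect for the bare IDs that follow.
 * Returns 0 on success, -1 on an unknown severity or ID.
 */
static int demo_set_severity(int out[DEMO_ID_COUNT], const char *mode)
{
	int severity = DEMO_ERROR;

	while (*mode) {
		size_t len = strcspn(mode, " ,|");
		const char *eq;
		int i;

		if (!len) {
			mode++;		/* skip one separator */
			continue;
		}

		eq = memchr(mode, '=', len);
		if (eq) {
			size_t n = (size_t)(eq - mode);

			if (n == 5 && !memcmp(mode, "error", 5))
				severity = DEMO_ERROR;
			else if (n == 4 && !memcmp(mode, "warn", 4))
				severity = DEMO_WARN;
			else
				return -1;
			mode += n + 1;
			len -= n + 1;
		}

		for (i = 0; i < DEMO_ID_COUNT; i++)
			if (strlen(demo_ids[i]) == len &&
			    !memcmp(mode, demo_ids[i], len))
				break;
		if (i == DEMO_ID_COUNT)
			return -1;
		out[i] = severity;
		mode += len;
	}
	return 0;
}
```

The caller is expected to pre-fill `out` with the default severities, so
that only the listed IDs change.
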
 fsck.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fsck.h |  7 +++++--
 2 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 2d91e28..7d4c22c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -100,13 +100,67 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 {
 	int severity;
 
-	severity = msg_id_info[msg_id].severity;
-	if (options->strict && severity == FSCK_WARN)
-		severity = FSCK_ERROR;
+	if (options->msg_severity && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+		severity = options->msg_severity[msg_id];
+	else {
+		severity = msg_id_info[msg_id].severity;
+		if (options->strict && severity == FSCK_WARN)
+			severity = FSCK_ERROR;
+	}
 
 	return severity;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+void fsck_set_severity(struct fsck_options *options, const char *mode)
+{
+	int severity = FSCK_ERROR;
+
+	if (!options->msg_severity) {
+		int i;
+		int *msg_severity = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_severity[i] = fsck_msg_severity(i, options);
+		options->msg_severity = msg_severity;
+	}
+
+	while (*mode) {
+		int len = strcspn(mode, " ,|"), equal, msg_id;
+
+		if (!len) {
+			mode++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (mode[equal] == '=')
+				break;
+
+		if (equal < len) {
+			if (!substrcmp(mode, equal, "error"))
+				severity = FSCK_ERROR;
+			else if (!substrcmp(mode, equal, "warn"))
+				severity = FSCK_WARN;
+			else
+				die("Unknown fsck message severity: '%.*s'",
+					equal, mode);
+			mode += equal + 1;
+			len -= equal + 1;
+		}
+
+		msg_id = parse_msg_id(mode, len);
+		options->msg_severity[msg_id] = severity;
+		mode += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -596,6 +650,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int severity, const char *message)
 {
+	if (severity == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..4349860 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,8 @@
 
 struct fsck_options;
 
+void fsck_set_severity(struct fsck_options *options, const char *mode);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +27,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_severity;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (3 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-21  8:54       ` Junio C Hamano
  2015-01-19 15:50     ` [PATCH v2 06/18] fsck: Report the ID of the error/warning Johannes Schindelin
                       ` (13 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.warn missing-email

The value is actually a comma-separated list, and there is a
corresponding receive.fsck.error setting.

If the same key is listed in multiple receive.fsck.* lines in the
config, the last setting wins.

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
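The hand-off can be pictured with the sketch below: each receive.fsck.<name>
config entry is appended to one comma-separated string, which then becomes
the argument of --strict=. This is an illustration only. demo_collect() is a
hypothetical helper with a fixed-size buffer, whereas the diff uses a strbuf.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append "<name>=<value>" for a "receive.fsck.<name>" key to `buf`,
 * comma-separating the entries. Returns 1 if the key was consumed,
 * 0 if it is unrelated, and -1 if `buf` is too small.
 */
static int demo_collect(char *buf, size_t size, const char *var,
			const char *value)
{
	static const char prefix[] = "receive.fsck.";
	const size_t prefix_len = sizeof(prefix) - 1;
	size_t used = strlen(buf);
	int n;

	if (strncmp(var, prefix, prefix_len))
		return 0;
	n = snprintf(buf + used, size - used, "%s%s=%s",
		     used ? "," : "", var + prefix_len, value);
	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 1;
}
```

Because entries are appended in config order and the child processes them
left to right, a later entry for the same message ID overrides an earlier
one.
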
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 26 ++++++++++++++++++++++----
 builtin/unpack-objects.c |  5 +++++
 3 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 925f7b5..f464ca0 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,6 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (starts_with(arg, "--strict=")) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_severity(&fsck_options, arg + 9);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e0ce78e..da2e019 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -36,6 +36,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_severity = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +116,13 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (starts_with(var, "receive.fsck.")) {
+		if (fsck_severity.len)
+			strbuf_addch(&fsck_severity, ',');
+		strbuf_addf(&fsck_severity, "%s=%s", var + 13, value);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1470,8 +1478,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
 		if (quiet)
 			argv_array_push(&child.args, "-q");
-		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+		if (fsck_objects) {
+			if (fsck_severity.len)
+				argv_array_pushf(&child.args, "--strict=%s",
+					fsck_severity.buf);
+			else
+				argv_array_push(&child.args, "--strict");
+		}
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1488,8 +1501,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
-		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+		if (fsck_objects) {
+			if (fsck_severity.len)
+				argv_array_pushf(&child.args, "--strict=%s",
+					fsck_severity.buf);
+			else
+				argv_array_push(&child.args, "--strict");
+		}
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..82f2d62 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (starts_with(arg, "--strict=")) {
+				strict = 1;
+				fsck_set_severity(&fsck_options, arg + 9);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 06/18] fsck: Report the ID of the error/warning
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (4 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-19 15:50     ` [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                       ` (12 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

Some legacy code has objects with non-fatal fsck issues; to enable the
user to ignore those issues, let's print out the ID (e.g. when
encountering "missing-email", the user might want to call `git config
receive.fsck.warn missing-email`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
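As an illustration of the resulting message format, the sketch below builds
"<id>: <message>" from an upper-case enum key. demo_format() and its
fixed-size output buffer are assumptions made for this example; the diff
appends to a strbuf instead.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Write "<id>: <message>" into `out`, with the ID lower-cased and every
 * '_' replaced by '-'. Output is truncated if `out` is too small.
 */
static void demo_format(char *out, size_t size, const char *id,
			const char *message)
{
	size_t n = 0;

	for (; *id && n + 1 < size; id++)
		out[n++] = *id == '_' ? '-' :
			(char)tolower((unsigned char)*id);
	/* snprintf() terminates the string and appends the message */
	snprintf(out + n, size - n, ": %s", message);
}
```

The printed prefix is the same spelling that the user would put into the
config, so a message can be copied straight into a receive.fsck.* setting.
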
 fsck.c          | 19 +++++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 7d4c22c..78944f0 100644
--- a/fsck.c
+++ b/fsck.c
@@ -161,6 +161,23 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c == '_')
+			c = '-';
+		else
+			c = tolower(c);
+		strbuf_addch(sb, c);
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -169,6 +186,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_severity, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..ea0f216 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalid-tag-name: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missing-tagger-entry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (5 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 06/18] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-21  8:56       ` Junio C Hamano
  2015-01-19 15:50     ` [PATCH v2 08/18] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                       ` (11 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
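The idea can be sketched in isolation as follows: move the caller's cursor
to the next line before running any check, so that every early return leaves
it in a usable place. demo_check_ident() is a hypothetical, heavily reduced
stand-in that only looks for a "<...>" pair; it is not the function from the
diff.

```c
#include <string.h>

/*
 * Check that the line at *ident contains "<...>". Whether or not the
 * check passes, *ident ends up at the start of the next line (or at the
 * terminating NUL). Returns 0 if the line looks fine, 1 otherwise.
 */
static int demo_check_ident(const char **ident)
{
	const char *p = *ident;
	const char *eol = strchr(p, '\n');
	const char *lt;

	if (!eol)
		eol = p + strlen(p);
	/*
	 * Move the caller's cursor first, so that every return path
	 * below leaves it at the start of the next line.
	 */
	*ident = *eol == '\n' ? eol + 1 : eol;

	lt = memchr(p, '<', (size_t)(eol - p));
	if (!lt)
		return 1;	/* missing email */
	if (!memchr(lt, '>', (size_t)(eol - lt)))
		return 1;	/* unterminated email */
	return 0;
}
```

A caller that treats the non-zero return as a mere warning can then carry on
with the following header line.
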
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index 78944f0..233385b 100644
--- a/fsck.c
+++ b/fsck.c
@@ -453,40 +453,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if (end == p || *end != ' ')
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 08/18] fsck: Make fsck_commit() warn-friendly
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (6 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-01-19 15:50     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 09/18] fsck: Handle multiple authors in commits specially Johannes Schindelin
                       ` (10 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:50 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too severe to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missing-author error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
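The pattern used throughout the diff below is "report, and stop only if the
report still counts as an error". A reduced sketch, with hypothetical demo_*
names and boolean stand-ins for the real header checks:

```c
#include <stdio.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

struct demo_options {
	int bad_tree_severity;	/* DEMO_ERROR or DEMO_WARN */
	int reported;		/* number of messages emitted so far */
};

/* Returns 0 for a warning and 1 for an error, like fsck_error_function. */
static int demo_report(struct demo_options *o, int severity, const char *msg)
{
	o->reported++;
	fprintf(stderr, "%s: %s\n",
		severity == DEMO_WARN ? "warning" : "error", msg);
	return severity == DEMO_WARN ? 0 : 1;
}

/* tree_ok and author_present stand in for the real header parsing. */
static int demo_check_commit(struct demo_options *o, int tree_ok,
			     int author_present)
{
	int err;

	if (!tree_ok) {
		err = demo_report(o, o->bad_tree_severity, "bad tree sha1");
		if (err)
			return err;	/* still an error: stop here */
	}
	/* reached if the tree was fine or its problem was only a warning */
	if (!author_present)
		return demo_report(o, DEMO_ERROR, "expected 'author' line");
	return 0;
}
```

With the tree problem demoted, a second problem in the same object gets
reported as well; without the demotion the check stops at the first one.
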
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 233385b..a3b1429 100644
--- a/fsck.c
+++ b/fsck.c
@@ -508,12 +508,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -522,11 +528,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 09/18] fsck: Handle multiple authors in commits specially
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (7 preceding siblings ...)
  2015-01-19 15:50     ` [PATCH v2 08/18] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 10/18] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                       ` (9 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.warn = missing-committer, but that could hide missing tree
objects in the same commit because we cannot continue verifying any
commit object after encountering a missing committer line, while we can
continue in the case of multiple author lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
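To show why checking can continue in this case, here is a small sketch that
skips over any additional 'author' lines and hands back the position of the
next header. demo_extra_authors() is a hypothetical helper; unlike the diff,
it does not validate the ident on each line.

```c
#include <string.h>

/*
 * Count the "author " header lines beyond the first one at the start of
 * `buffer`, and leave *rest at the first line that is not an author line.
 */
static int demo_extra_authors(const char *buffer, const char **rest)
{
	int count = 0;

	while (!strncmp(buffer, "author ", 7)) {
		const char *eol = strchr(buffer, '\n');

		count++;
		/* step to the next line, or to the end of the buffer */
		buffer = eol ? eol + 1 : buffer + strlen(buffer);
	}
	*rest = buffer;
	return count > 1 ? count - 1 : 0;
}
```

Since the cursor ends up on the 'committer' line either way, the remaining
checks see the same input they would see for a well-formed commit.
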
 fsck.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fsck.c b/fsck.c
index a3b1429..ed0a669 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
@@ -545,6 +546,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 10/18] fsck: Make fsck_tag() warn-friendly
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (8 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 09/18] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
                       ` (8 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just as in fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index ed0a669..b8cbbfb 100644
--- a/fsck.c
+++ b/fsck.c
@@ -614,7 +614,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.*
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (9 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 10/18] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-21  8:59       ` Junio C Hamano
  2015-01-19 15:51     ` [PATCH v2 12/18] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                       ` (7 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..d491172 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit << EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.warn = missing-email' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missing-email" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 12/18] fsck: Disallow demoting grave fsck errors to warnings
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (10 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 13/18] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                       ` (6 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
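The rule can be summarized by the sketch below, under the assumption that
each message carries both its built-in default and its current severity.
The struct and the demo_* names are hypothetical, and the refusal is
returned as -1 where the diff calls die().

```c
enum { DEMO_FATAL = -1, DEMO_ERROR = 1, DEMO_WARN = 2 };

struct demo_msg {
	const char *id;
	int default_severity;	/* built-in classification */
	int severity;		/* current, possibly overridden */
};

/* Returns 0 on success, -1 when asked to demote a fatal message. */
static int demo_override(struct demo_msg *msg, int severity)
{
	if (severity != DEMO_ERROR && msg->default_severity == DEMO_FATAL)
		return -1;
	msg->severity = severity;
	return 0;
}

/* Fatal messages are reported to the callback as plain errors. */
static int demo_effective(const struct demo_msg *msg)
{
	return msg->severity == DEMO_FATAL ? DEMO_ERROR : msg->severity;
}
```

Keeping the fatal class internal means error callbacks only ever see the
severities they already know about.
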
 fsck.c                          | 13 +++++++++++--
 t/t5504-fetch-receive-strict.sh |  9 +++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index b8cbbfb..f2c8044 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_EMAIL, ERROR) \
@@ -40,10 +45,8 @@
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -157,6 +160,9 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 		}
 
 		msg_id = parse_msg_id(mode, len);
+		if (severity != FSCK_ERROR &&
+				msg_id_info[msg_id].severity == FSCK_FATAL)
+			die("Cannot demote %.*s", len, mode);
 		options->msg_severity[msg_id] = severity;
 		mode += len;
 	}
@@ -187,6 +193,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_FATAL)
+		msg_severity = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index d491172..2757c3a 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.warn = missing-email' '
 	grep "missing-email" act
 '
 
+test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config receive.fsck.warn unterminated-header &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminated-header" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 13/18] fsck: Optionally ignore specific fsck issues completely
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (11 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 12/18] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 14/18] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                       ` (5 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective error to "ignore".

This change "abuses" the warn=missing-email test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
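In sketch form, "ignore" means that the report path returns success before
the error callback is ever invoked. The callback type, the call counter and
the demo_* names below are assumptions made for this example only.

```c
enum { DEMO_ERROR = 1, DEMO_WARN = 2, DEMO_IGNORE = 3 };

typedef int (*demo_error_fn)(int severity, const char *message);

static int demo_calls;	/* how often the callback actually ran */

/* Example callback: counts calls, returns 0 for warnings, 1 for errors. */
static int demo_count_fn(int severity, const char *message)
{
	(void)message;
	demo_calls++;
	return severity == DEMO_WARN ? 0 : 1;
}

static int demo_report(demo_error_fn fn, int severity, const char *message)
{
	if (severity == DEMO_IGNORE)
		return 0;	/* neither printed nor counted */
	return fn(severity, message);
}
```

An ignored issue therefore behaves like a warning for the purpose of
continuing the check, but produces no output at all.
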
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 7 ++++++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index f2c8044..649c8fe 100644
--- a/fsck.c
+++ b/fsck.c
@@ -152,6 +152,8 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 				severity = FSCK_ERROR;
 			else if (!substrcmp(mode, equal, "warn"))
 				severity = FSCK_WARN;
+			else if (!substrcmp(mode, equal, "ignore"))
+				severity = FSCK_IGNORE;
 			else
 				die("Unknown fsck message severity: '%.*s'",
 					equal, mode);
@@ -193,6 +195,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_IGNORE)
+		return 0;
+
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 4349860..7be6c50 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 2757c3a..e81cedb 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -132,7 +132,12 @@ test_expect_success 'push with receive.fsck.warn = missing-email' '
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missing-email" act
+	grep "missing-email" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git  --git-dir=dst/.git config receive.fsck.ignore missing-email &&
+	git  --git-dir=dst/.git config receive.fsck.warn bad-date &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	! grep "missing-email" act
 '
 
 test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
-- 
2.0.0.rc3.9669.g840d1f9
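
For readers following along, here is a minimal, self-contained sketch of how a comma-separated list such as "ignore=missing-email,warn=bad-date" could be applied, and how an ignored message is then dropped before anything is printed. It is an illustration only, not the code from fsck.c: every `demo_*` name, the two-entry message table and the return values are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Severity levels as defined in the patch's fsck.h. */
#define FSCK_ERROR 1
#define FSCK_WARN 2
#define FSCK_IGNORE 3

/* Hypothetical two-entry message table; the real list lives in fsck.c. */
#define DEMO_NR 2
static const char *demo_ids[DEMO_NR] = { "missing-email", "bad-date" };
static int demo_severity[DEMO_NR] = { FSCK_ERROR, FSCK_ERROR };

/* Map a severity keyword of the given length to its level, or -1. */
static int demo_parse_level(const char *s, size_t len)
{
	if (len == 5 && !memcmp(s, "error", 5))
		return FSCK_ERROR;
	if (len == 4 && !memcmp(s, "warn", 4))
		return FSCK_WARN;
	if (len == 6 && !memcmp(s, "ignore", 6))
		return FSCK_IGNORE;
	return -1;
}

/*
 * Apply a list such as "ignore=missing-email,warn=bad-date".
 * Returns 0 on success, -1 on a malformed or unknown entry.
 */
static int demo_set_severity(const char *mode)
{
	while (*mode) {
		size_t len = strcspn(mode, ",");
		const char *equal = memchr(mode, '=', len);
		size_t i, id_len;
		int level;

		if (!equal)
			return -1;
		level = demo_parse_level(mode, (size_t)(equal - mode));
		if (level < 0)
			return -1;
		id_len = len - (size_t)(equal - mode) - 1;
		for (i = 0; i < DEMO_NR; i++)
			if (strlen(demo_ids[i]) == id_len &&
			    !memcmp(demo_ids[i], equal + 1, id_len))
				break;
		if (i == DEMO_NR)
			return -1;
		demo_severity[i] = level;
		mode += len;
		if (*mode == ',')
			mode++;
	}
	return 0;
}

/*
 * Mirrors the early return added to report(): ignored messages print
 * nothing. Returning non-zero only for errors is an assumption here.
 */
static int demo_report(size_t id, const char *text)
{
	if (demo_severity[id] == FSCK_IGNORE)
		return 0;
	fprintf(stderr, "%s: %s: %s\n",
		demo_severity[id] == FSCK_WARN ? "warning" : "error",
		demo_ids[id], text);
	return demo_severity[id] == FSCK_ERROR;
}
```

Calling `demo_set_severity("ignore=missing-email,warn=bad-date")` and then `demo_report(0, ...)` prints nothing, which is the behaviour the extended test checks for by looking for the absence of "missing-email" in the push output.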


* [PATCH v2 14/18] fsck: Allow upgrading fsck warnings to errors
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (12 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 13/18] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 15/18] fsck: Document the new receive.fsck.* options Johannes Schindelin
                       ` (4 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by listing `invalid-tag-name` and
`missing-tagger-entry` in the receive.fsck.error config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as used to be
the case before this commit).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 649c8fe..480cd87 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -55,10 +56,11 @@
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(INVALID_TAG_NAME, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, severity) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -200,6 +202,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
+	else if (msg_severity == FSCK_INFO)
+		msg_severity = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -658,15 +662,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.0.0.rc3.9669.g840d1f9
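
The two internal levels can be pictured with the small sketch below. The constants are the ones from the patch; `demo_effective_severity()` is a hypothetical helper written for this note. It assumes, without that code being shown in this message, that strict mode promotes plain warnings to errors, which would explain why the t5302 test running `index-pack --strict` now expects a "warning:" prefix for the missing tagger line.

```c
#define FSCK_ERROR 1
#define FSCK_WARN 2
#define FSCK_IGNORE 3
#define FSCK_FATAL -1
#define FSCK_INFO -2

/*
 * Hypothetical helper: compute the level a message is reported at,
 * given its default level and whether strict checking is enabled.
 */
static int demo_effective_severity(int dflt, int strict)
{
	/* Assumption: strict mode promotes plain warnings to errors. */
	if (dflt == FSCK_WARN && strict)
		return FSCK_ERROR;
	/* As in report(): fatal messages are printed as errors ... */
	if (dflt == FSCK_FATAL)
		return FSCK_ERROR;
	/* ... and infos are printed as warnings, even in strict mode. */
	if (dflt == FSCK_INFO)
		return FSCK_WARN;
	return dflt;
}
```

Under these assumptions an INFO message stays a warning until it is explicitly listed under receive.fsck.error, at which point the new `if (ret) goto done;` checks in fsck_tag_buffer() let the error propagate.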


* [PATCH v2 15/18] fsck: Document the new receive.fsck.* options.
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (13 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 14/18] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 22:44       ` Eric Sunshine
  2015-01-19 15:51     ` [PATCH v2 16/18] fsck: Support demoting errors to warnings Johannes Schindelin
                       ` (3 subsequent siblings)
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ae6791d..7371a5f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2130,6 +2130,31 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.*::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.*`
+	settings. These settings contain comma-separated lists of fsck
+	message IDs. For convenience, fsck prefixes the error/warning with
+	the message ID, e.g. "missing-email: invalid author/committer line
+	- missing email" means that setting `receive.fsck.ignore =
+	missing-email` will hide that issue.
++
+--
+	error::
+		a comma-separated list of fsck message IDs that should
+		trigger fsck to error out.
+	warn::
+		a comma-separated list of fsck message IDs that should be
+		displayed without causing fsck to error out.
+	ignore::
+		a comma-separated list of fsck message IDs that should be
+		ignored completely.
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 16/18] fsck: Support demoting errors to warnings
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (14 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 15/18] fsck: Document the new receive.fsck.* options Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:51     ` [PATCH v2 17/18] fsck: Introduce `git fsck --quick` Johannes Schindelin
                       ` (2 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.ignore=missing-email fsck` would hide problems with
missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.* from the receive.fsck.*
settings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 13 +++++++++++++
 builtin/fsck.c           | 15 +++++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 39 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 7371a5f..0daba8a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,6 +1208,19 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.*::
+	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
+	comma-separated lists of fsck message IDs which should trigger
+	fsck to error out, to print the message and continue, or to ignore
+	said messages, respectively.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missing-email: invalid author/committer line - missing email" means
+that setting `fsck.ignore = missing-email` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99d4538..d5403c4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,19 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (starts_with(var, "fsck.")) {
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "%s=%s", var + 5, value);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
                       const char *err)
 {
@@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index ea0f216..a79ff9f 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.ignore=multiple-authors fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 17/18] fsck: Introduce `git fsck --quick`
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (15 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 16/18] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-01-19 15:51     ` Johannes Schindelin
  2015-01-19 15:52     ` [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
  2015-01-21  9:17     ` [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:51 UTC (permalink / raw)
  To: gitster; +Cc: git

This option avoids unpacking each and every object, and just verifies the
connectivity. In particular with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	not unpacking blobs, this speeds up the operation, at the
+	expense of missing corrupt objects.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index d5403c4..c767909 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -184,6 +185,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -618,6 +621,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -654,7 +658,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index a79ff9f..1c624a3 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9


* [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (16 preceding siblings ...)
  2015-01-19 15:51     ` [PATCH v2 17/18] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-01-19 15:52     ` Johannes Schindelin
  2015-01-21  9:02       ` Junio C Hamano
  2015-01-21  9:17     ` [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  18 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-19 15:52 UTC (permalink / raw)
  To: gitster; +Cc: git

The optional new config option `receive.fsck.skiplist` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/receive-pack.c          |  9 +++++++
 fsck.c                          | 53 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 ++++++++++
 4 files changed, 75 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index da2e019..40514c2 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -116,6 +116,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (starts_with(var, "receive.fsck.skiplist")) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		if (fsck_severity.len)
+			strbuf_addch(&fsck_severity, ',');
+		strbuf_addf(&fsck_severity, "skiplist=%s", path);
+		return 0;
+	}
+
 	if (starts_with(var, "receive.fsck.")) {
 		if (fsck_severity.len)
 			strbuf_addch(&fsck_severity, ',');
diff --git a/fsck.c b/fsck.c
index 480cd87..dbf9fa1 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -117,6 +118,43 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	return severity;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -156,6 +194,17 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 				severity = FSCK_WARN;
 			else if (!substrcmp(mode, equal, "ignore"))
 				severity = FSCK_IGNORE;
+			else if (!substrcmp(mode, equal, "skiplist")) {
+				char *path = xstrndup(mode + equal + 1,
+					len - equal - 1);
+
+				if (equal == len)
+					die("skiplist requires a path");
+				init_skiplist(options, path);
+				free(path);
+				mode += len;
+				continue;
+			}
 			else
 				die("Unknown fsck message severity: '%.*s'",
 					equal, mode);
@@ -700,6 +749,10 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
+	if (options->skiplist &&
+			sha1_array_lookup(options->skiplist, obj->sha1) >= 0)
+		return 0;
+
 	if (!obj)
 		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
diff --git a/fsck.h b/fsck.h
index 7be6c50..cae280e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -29,6 +29,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_severity;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index e81cedb..21fa9c8 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skiplist' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	echo $commit > dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.warn = missing-email' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.0.0.rc3.9669.g840d1f9
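
The skip-list idea (one 40-digit hex name per line, ideally sorted so lookups can use binary search) can be sketched in isolation as follows. This is not the patch's init_skiplist(): it parses from memory instead of a file, uses a small fixed-size array, and sorts on first lookup if the input was unsorted. All `demo_*` and `DEMO_*` names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_HEXSZ 40	/* hex digits per object name */
#define DEMO_RAWSZ 20	/* raw bytes per object name */
#define DEMO_MAX 16	/* arbitrary capacity for this sketch */

struct demo_skiplist {
	unsigned char sha1[DEMO_MAX][DEMO_RAWSZ];
	size_t nr;
	int sorted;
};

static int demo_hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

static int demo_cmp(const void *a, const void *b)
{
	return memcmp(a, b, DEMO_RAWSZ);
}

/*
 * Parse "<40 hex digits>\n" records from text. Returns 0 on success,
 * -1 on a malformed record or when the list is full.
 */
static int demo_load(struct demo_skiplist *list, const char *text, size_t len)
{
	list->nr = 0;
	list->sorted = 1;
	while (len) {
		unsigned char *out;
		size_t i;

		if (len < DEMO_HEXSZ + 1 || text[DEMO_HEXSZ] != '\n' ||
		    list->nr == DEMO_MAX)
			return -1;
		out = list->sha1[list->nr];
		for (i = 0; i < DEMO_RAWSZ; i++) {
			int hi = demo_hexval(text[2 * i]);
			int lo = demo_hexval(text[2 * i + 1]);
			if (hi < 0 || lo < 0)
				return -1;
			out[i] = (unsigned char)(hi << 4 | lo);
		}
		/* Remember whether the input arrived in sorted order. */
		if (list->nr &&
		    memcmp(list->sha1[list->nr - 1], out, DEMO_RAWSZ) > 0)
			list->sorted = 0;
		list->nr++;
		text += DEMO_HEXSZ + 1;
		len -= DEMO_HEXSZ + 1;
	}
	return 0;
}

/* Binary search; sorts once if the list was loaded unsorted. */
static int demo_contains(struct demo_skiplist *list, const unsigned char *sha1)
{
	if (!list->sorted) {
		qsort(list->sha1, list->nr, DEMO_RAWSZ, demo_cmp);
		list->sorted = 1;
	}
	return bsearch(sha1, list->sha1, list->nr, DEMO_RAWSZ, demo_cmp) != NULL;
}
```

Tracking sortedness while loading is what makes a "preferably sorted" file cheaper: a sorted file can be searched right away, while an unsorted one costs a single extra sort.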


* Re: [PATCH v2 15/18] fsck: Document the new receive.fsck.* options.
  2015-01-19 15:51     ` [PATCH v2 15/18] fsck: Document the new receive.fsck.* options Johannes Schindelin
@ 2015-01-19 22:44       ` Eric Sunshine
  2015-01-20  7:24         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Eric Sunshine @ 2015-01-19 22:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Git List

On Mon, Jan 19, 2015 at 10:51 AM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ae6791d..7371a5f 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2130,6 +2130,31 @@ receive.fsckObjects::
>         Defaults to false. If not set, the value of `transfer.fsckObjects`
>         is used instead.
>
> +receive.fsck.*::
> +       When `receive.fsckObjects` is set to true, errors can be switched
> +       to warnings and vice versa by configuring the `receive.fsck.*`
> +       settings. These settings contain comma-separated lists of fsck
> +       message IDs. For convenience, fsck prefixes the error/warning with
> +       the message ID, e.g. "missing-email: invalid author/committer line
> +       - missing email" means that setting `receive.fsck.ignore =
> +       missing-email` will hide that issue.
> ++
> +--
> +       error::
> +               a comma-separated list of fsck message IDs that should be
> +               trigger fsck to error out.
> +       warn::
> +               a comma-separated list of fsck message IDs that should be
> +               displayed, but fsck should continue to error out.
> +       ignore::
> +               a comma-separated list of fsck message IDs that should be
> +               ignored completely.
> ++
> +This feature is intended to support working with legacy repositories
> +which would not pass pushing when `receive.fsckObjects = true`, allowing
> +the host to accept repositories certain known issues but still catch

s/certain/with &/

> +other issues.
> +
>  receive.unpackLimit::
>         If the number of objects received in a push is below this
>         limit then the objects will be unpacked into loose object
> --
> 2.0.0.rc3.9669.g840d1f9


* Re: [PATCH v2 15/18] fsck: Document the new receive.fsck.* options.
  2015-01-19 22:44       ` Eric Sunshine
@ 2015-01-20  7:24         ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-20  7:24 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Git List, ericsunshine

Hi Eric,

On 2015-01-19 23:44, Eric Sunshine wrote:
> On Mon, Jan 19, 2015 at 10:51 AM, Johannes Schindelin
> <johannes.schindelin@gmx.de> wrote:
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index ae6791d..7371a5f 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -2130,6 +2130,31 @@ receive.fsckObjects::
>>         Defaults to false. If not set, the value of `transfer.fsckObjects`
>>         is used instead.
>>
>> +receive.fsck.*::
>> +       When `receive.fsckObjects` is set to true, errors can be switched
>> +       to warnings and vice versa by configuring the `receive.fsck.*`
>> +       settings. These settings contain comma-separated lists of fsck
>> +       message IDs. For convenience, fsck prefixes the error/warning with
>> +       the message ID, e.g. "missing-email: invalid author/committer line
>> +       - missing email" means that setting `receive.fsck.ignore =
>> +       missing-email` will hide that issue.
>> ++
>> +--
>> +       error::
>> +               a comma-separated list of fsck message IDs that should be
>> +               trigger fsck to error out.
>> +       warn::
>> +               a comma-separated list of fsck message IDs that should be
>> +               displayed, but fsck should continue to error out.
>> +       ignore::
>> +               a comma-separated list of fsck message IDs that should be
>> +               ignored completely.
>> ++
>> +This feature is intended to support working with legacy repositories
>> +which would not pass pushing when `receive.fsckObjects = true`, allowing
>> +the host to accept repositories certain known issues but still catch
> 
> s/certain/with &/

Good catch. Fixed here (to be included in the next re-roll):

https://github.com/dscho/git/commit/2517476646835e61c33581935fc68062a8ff3f56#diff-ba92ef40c548c691816362bbdc35a613R2155

Thanks!
Dscho


* Re: [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings
  2015-01-19 15:50     ` [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-01-21  8:49       ` Junio C Hamano
  2015-01-21 17:42         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  8:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +static inline int substrcmp(const char *string, int len, const char *match)
> +{
> +	int match_len = strlen(match);
> +	if (match_len != len)
> +		return -1;
> +	return memcmp(string, match, len);
> +}

Is this what we call "starts_with()" these days?
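
As a point of comparison, the sketch below puts the quoted helper next to a plain prefix test. `demo_substrcmp()` copies the logic quoted above; `demo_starts_with()` is a hypothetical stand-in for a prefix check and is not taken from git's sources. The two may differ when the keyword is only a prefix of the given substring.

```c
#include <string.h>

/* Same logic as the quoted substrcmp(): the first len bytes must equal match exactly. */
static int demo_substrcmp(const char *string, int len, const char *match)
{
	int match_len = (int)strlen(match);
	if (match_len != len)
		return -1;
	return memcmp(string, match, len);
}

/* Hypothetical stand-in for a prefix test. */
static int demo_starts_with(const char *string, const char *prefix)
{
	return !strncmp(string, prefix, strlen(prefix));
}
```

With "warning=x" and a length of 7, `demo_substrcmp()` rejects "warn" because the lengths differ, whereas `demo_starts_with("warning=x", "warn")` accepts it.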

> +void fsck_set_severity(struct fsck_options *options, const char *mode)
> +{
> +	int severity = FSCK_ERROR;
> +
> +	if (!options->msg_severity) {
> +		int i;
> +		int *msg_severity = malloc(sizeof(int) * FSCK_MSG_MAX);

xmalloc()?


* Re: [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-19 15:50     ` [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
@ 2015-01-21  8:54       ` Junio C Hamano
  2015-01-21 18:01         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  8:54 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>  
> +	if (starts_with(var, "receive.fsck.")) {
> +		if (fsck_severity.len)
> +			strbuf_addch(&fsck_severity, ',');
> +		strbuf_addf(&fsck_severity, "%s=%s", var + 13, value);

Wouldn't it be safer to use skip_prefix() that lets you avoid the
hardcoded "var + 13" here?
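
A rough illustration of the suggestion, using a local `demo_skip_prefix()` modelled on the three-argument call form seen elsewhere in this series (`skip_prefix(buffer, "tagger ", &buffer)`); it is a stand-in written for this note, and `demo_format_setting()` is likewise hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* If str begins with prefix, store the remainder in *out and return 1. */
static int demo_skip_prefix(const char *str, const char *prefix,
			    const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/*
 * Format "<key>=<value>" for a receive.fsck.<key> variable.
 * Returns 0 (leaving buf untouched) if var is some other setting.
 */
static int demo_format_setting(const char *var, const char *value,
			       char *buf, size_t size)
{
	const char *key;

	if (!demo_skip_prefix(var, "receive.fsck.", &key))
		return 0;
	snprintf(buf, size, "%s=%s", key, value);
	return 1;
}
```

The prefix length is derived from the string itself, so there is no separate constant such as 13 to keep in sync when the prefix changes.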

> @@ -1470,8 +1478,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>  		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
>  		if (quiet)
>  			argv_array_push(&child.args, "-q");
> -		if (fsck_objects)
> -			argv_array_push(&child.args, "--strict");
> +		if (fsck_objects) {
> +			if (fsck_severity.len)
> +				argv_array_pushf(&child.args, "--strict=%s",
> +					fsck_severity.buf);
> +			else
> +				argv_array_push(&child.args, "--strict");
> +		}
>  		child.no_stdout = 1;
>  		child.err = err_fd;
>  		child.git_cmd = 1;
> @@ -1488,8 +1501,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>  
>  		argv_array_pushl(&child.args, "index-pack",
>  				 "--stdin", hdr_arg, keep_arg, NULL);
> -		if (fsck_objects)
> -			argv_array_push(&child.args, "--strict");
> +		if (fsck_objects) {
> +			if (fsck_severity.len)
> +				argv_array_pushf(&child.args, "--strict=%s",
> +					fsck_severity.buf);
> +			else
> +				argv_array_push(&child.args, "--strict");
> +		}

Hmm.  The above two hunks look suspiciously similar.  Would it be
worth to give them a single helper function?
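
One possible shape for such a helper, reduced to plain string formatting so that it stands alone. `demo_strict_arg()` is a hypothetical name, and the real code would push onto the child's argument array rather than fill a buffer.

```c
#include <stdio.h>

/*
 * Write "--strict=<severity>" when a severity list was configured,
 * and plain "--strict" otherwise.
 */
static void demo_strict_arg(const char *severity, char *buf, size_t size)
{
	if (severity && *severity)
		snprintf(buf, size, "--strict=%s", severity);
	else
		snprintf(buf, size, "--strict");
}
```

Both the unpack-objects and the index-pack branches could then call the same function when `fsck_objects` is set.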

> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index 6d17040..82f2d62 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
>  				strict = 1;
>  				continue;
>  			}
> +			if (starts_with(arg, "--strict=")) {
> +				strict = 1;
> +				fsck_set_severity(&fsck_options, arg + 9);
> +				continue;
> +			}
>  			if (starts_with(arg, "--pack_header=")) {
>  				struct pack_header *hdr;
>  				char *c;


* Re: [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly
  2015-01-19 15:50     ` [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-01-21  8:56       ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  8:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> When fsck_ident() identifies a problem with the ident, it should still
> advance the pointer to the next line so that fsck can continue in the
> case of a mere warning.

Quite sensible.

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 49 +++++++++++++++++++++++++++----------------------
>  1 file changed, 27 insertions(+), 22 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 78944f0..233385b 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -453,40 +453,45 @@ static int require_end_of_header(const void *data, unsigned long size,
>  
>  static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
>  {
> +	const char *p = *ident;
>  	char *end;
>  
> -	if (**ident == '<')
> +	*ident = strchrnul(*ident, '\n');
> +	if (**ident == '\n')
> +		(*ident)++;
> +
> +	if (*p == '<')
>  		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");


* Re: [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.*
  2015-01-19 15:51     ` [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
@ 2015-01-21  8:59       ` Junio C Hamano
  2015-01-21 18:14         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  8:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  t/t5504-fetch-receive-strict.sh | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
> index 69ee13c..d491172 100755
> --- a/t/t5504-fetch-receive-strict.sh
> +++ b/t/t5504-fetch-receive-strict.sh
> @@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
>  	test_cmp exp act
>  '
>  
> +cat >bogus-commit << EOF

"cat >bogus-commit <<\EOF", to reduce the mental burden of readers.

> +tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
> +author Bugs Bunny 1234567890 +0000
> +committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
> +
> +This commit object intentionally broken
> +EOF
> +
> +test_expect_success 'push with receive.fsck.warn = missing-email' '
> +	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
> +	git push . $commit:refs/heads/bogus &&
> +	rm -rf dst &&
> +	git init dst &&
> +	git --git-dir=dst/.git config receive.fsckobjects true &&
> +	test_must_fail git push --porcelain dst bogus &&
> +	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
> +	git push --porcelain dst bogus >act 2>&1 &&
> +	grep "missing-email" act
> +'
> +
>  test_done


* Re: [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-01-19 15:52     ` [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-01-21  9:02       ` Junio C Hamano
  2015-01-21 18:17         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  9:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> The optional new config option `receive.fsck.skiplist` specifies the path
> to a file listing the names, i.e. SHA-1s, one per line, of objects that
> are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

Makes sense, but I wonder if it makes sense to have a similar "ok to
be broken" list for "git fsck" (or perhaps they could even use the
same list) for exactly the same reason why this option makes sense.


* Re: [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
                       ` (17 preceding siblings ...)
  2015-01-19 15:52     ` [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-01-21  9:17     ` Junio C Hamano
  2015-01-21 18:24       ` Johannes Schindelin
  18 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21  9:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

The documentation did not format well.  Tentatively I added the
attached fix-up on top of the series before merging to 'pu'.

 * The wildcard in "fsck.*" and "receive.fsck.*" may have made sense
   back in v1 when the variables were an unbounded set, but v2 fixes it
   and we now have a known fixed set of variables, so let's list them
   explicitly (this is not a "fix to unbreak formatting").

 * The "--" that is not closed was giving me this:

    asciidoc: ERROR: git-config.txt: line 413: section title not permitted in delimited block
    asciidoc: ERROR: config.txt: line 2414: [blockdef-open] missing closing delimiter
    make: *** [git-config.xml] Error 1

   (this is "workaround to unbreak formatting"; I didn't check the
   actual output closely).

 * The line that begins with "- missing email" after indent was
   taken as a bulleted item or something, so I rewrapped the
   paragraph somewhat to avoid having the dash at the beginning.


 Documentation/config.txt | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6718578..aae66bb 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,7 +1208,9 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
-fsck.*::
+fsck.error::
+fsck.warn::
+fsck.ignore::
 	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
 	comma-separated lists of fsck message IDs which should trigger
 	fsck to error out, to print the message and continue, or to ignore
@@ -2138,25 +2140,26 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
-receive.fsck.*::
+receive.fsck.error::
+receive.fsck.warn::
+receive.fsck.ignore::
 	When `receive.fsckObjects` is set to true, errors can be switched
 	to warnings and vice versa by configuring the `receive.fsck.*`
 	settings. These settings contain comma-separated lists of fsck
 	message IDs. For convenience, fsck prefixes the error/warning with
-	the message ID, e.g. "missing-email: invalid author/committer line
-	- missing email" means that setting `receive.fsck.ignore =
-	missing-email` will hide that issue.
-+
---
-	error::
-		a comma-separated list of fsck message IDs that should be
-		trigger fsck to error out.
-	warn::
-		a comma-separated list of fsck message IDs that should be
-		displayed, but fsck should continue to error out.
-	ignore::
-		a comma-separated list of fsck message IDs that should be
-		ignored completely.
+	the message ID, e.g. "missing-email: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.ignore = missing-email` will hide that issue.
++
+error;;
+	a comma-separated list of fsck message IDs that should be
+	trigger fsck to error out.
+warn;;
+	a comma-separated list of fsck message IDs that should be
+	displayed, but fsck should continue to error out.
+ignore;;
+	a comma-separated list of fsck message IDs that should be
+	ignored completely.
 +
 This feature is intended to support working with legacy repositories
 which would not pass pushing when `receive.fsckObjects = true`, allowing


* Re: [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings
  2015-01-21  8:49       ` Junio C Hamano
@ 2015-01-21 17:42         ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 17:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 09:49, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +static inline int substrcmp(const char *string, int len, const char *match)
>> +{
>> +	int match_len = strlen(match);
>> +	if (match_len != len)
>> +		return -1;
>> +	return memcmp(string, match, len);
>> +}
> 
> Is this what we call "starts_with()" these days?

Unfortunately not quite: It really requires the substring specified by `string` and `len` to be identical to the full `match`. For example, `substrcmp("Hello world!", 5, "Hell")` would report a failure (because the substring "Hello" is *not* matching "Hell"), while `starts_with("Hello world!", "Hell")` would obviously succeed.
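
To make the difference concrete, here is a small sketch. `substrcmp()` is copied from the quoted patch; `starts_with_demo()` is only a stand-in written for this example (an assumption, not Git's own starts_with() implementation).

```c
#include <assert.h>
#include <string.h>

/* From the patch: 0 only if the first `len` bytes of `string` equal ALL of `match`. */
static inline int substrcmp(const char *string, int len, const char *match)
{
	int match_len = strlen(match);
	if (match_len != len)
		return -1;
	return memcmp(string, match, len);
}

/* Stand-in prefix test: non-zero if `str` begins with `prefix`. */
static int starts_with_demo(const char *str, const char *prefix)
{
	assert(str && prefix);
	return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

With these definitions, `substrcmp("Hello world!", 5, "Hell")` is non-zero (the lengths differ), `substrcmp("Hello world!", 5, "Hello")` is 0, and `starts_with_demo("Hello world!", "Hell")` is true.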

>> +void fsck_set_severity(struct fsck_options *options, const char *mode)
>> +{
>> +	int severity = FSCK_ERROR;
>> +
>> +	if (!options->msg_severity) {
>> +		int i;
>> +		int *msg_severity = malloc(sizeof(int) * FSCK_MSG_MAX);
> 
> xmalloc()?

Absolutely! Fixed.
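
For readers unfamiliar with the convention: an xmalloc()-style wrapper never returns NULL, so callers such as the severity-table setup need no error path of their own. A minimal sketch, assuming a simple "print and exit" policy (`xmalloc_demo()` and `alloc_severity_table()` are hypothetical names, not Git's implementation):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate or die; the caller never sees NULL. */
static void *xmalloc_demo(size_t size)
{
	void *ret = malloc(size ? size : 1);

	if (!ret) {
		fprintf(stderr, "fatal: out of memory (tried to allocate %lu bytes)\n",
			(unsigned long)size);
		exit(128);
	}
	return ret;
}

/* Allocate a table of `nr` ints and fill every slot with `initial`. */
static int *alloc_severity_table(size_t nr, int initial)
{
	size_t i;
	int *table;

	assert(nr > 0);
	table = xmalloc_demo(sizeof(int) * nr);
	for (i = 0; i < nr; i++)
		table[i] = initial;
	return table;
}
```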

Thanks,
Dscho


* Re: [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-21  8:54       ` Junio C Hamano
@ 2015-01-21 18:01         ` Johannes Schindelin
  2015-01-21 21:47           ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 18:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 09:54, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>>
>> +	if (starts_with(var, "receive.fsck.")) {
>> +		if (fsck_severity.len)
>> +			strbuf_addch(&fsck_severity, ',');
>> +		strbuf_addf(&fsck_severity, "%s=%s", var + 13, value);
> 
> Wouldn't it be safer to use skip_prefix() that lets you avoid the
> hardcoded "var + 13" here?

Yep, and much more elegant, too. I also fixed three more instances of the same pattern.
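
As a sketch of why this is safer than `var + 13`: the helper both tests the prefix and hands back the remainder, so no length constant can drift out of sync with the string literal. `skip_prefix_demo()` below is a stand-in modeled on Git's skip_prefix() (string, prefix, out-pointer); the body is written for this example.

```c
#include <assert.h>
#include <string.h>

/*
 * If `str` begins with `prefix`, store a pointer to the rest of `str`
 * in *out and return 1; otherwise leave *out untouched and return 0.
 */
static int skip_prefix_demo(const char *str, const char *prefix, const char **out)
{
	assert(str && prefix && out);
	while (*prefix) {
		if (*str++ != *prefix++)
			return 0;
	}
	*out = str;
	return 1;
}
```

For "receive.fsck.warn" and the prefix "receive.fsck.", the out-pointer ends up at "warn", which is exactly what the hardcoded offset computed by hand.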
 
>> @@ -1470,8 +1478,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>>  		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
>>  		if (quiet)
>>  			argv_array_push(&child.args, "-q");
>> -		if (fsck_objects)
>> -			argv_array_push(&child.args, "--strict");
>> +		if (fsck_objects) {
>> +			if (fsck_severity.len)
>> +				argv_array_pushf(&child.args, "--strict=%s",
>> +					fsck_severity.buf);
>> +			else
>> +				argv_array_push(&child.args, "--strict");
>> +		}
>>  		child.no_stdout = 1;
>>  		child.err = err_fd;
>>  		child.git_cmd = 1;
>> @@ -1488,8 +1501,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>>
>>  		argv_array_pushl(&child.args, "index-pack",
>>  				 "--stdin", hdr_arg, keep_arg, NULL);
>> -		if (fsck_objects)
>> -			argv_array_push(&child.args, "--strict");
>> +		if (fsck_objects) {
>> +			if (fsck_severity.len)
>> +				argv_array_pushf(&child.args, "--strict=%s",
>> +					fsck_severity.buf);
>> +			else
>> +				argv_array_push(&child.args, "--strict");
>> +		}
> 
> Hmm.  The above two hunks look suspiciously similar.  Would it be
> worth giving them a single helper function?

Hmm. Not sure. I see what you mean, but for now I found

+                       argv_array_pushf(&child.args, "--strict%s%s",
+                               fsck_severity.len ? "=" : "",
+                               fsck_severity.buf);

to be more elegant than to add a fully-fledged new function. But if you feel strongly, I will gladly implement a separate function; I would appreciate suggestions as to the function name...
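
A standalone sketch of that formatting trick, using snprintf() instead of argv_array_pushf() so it can be compiled on its own (`format_strict_arg()` is a hypothetical helper, and the severity string is passed in rather than read from a strbuf):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "--strict" or "--strict=<severity>" into buf, depending on
 * whether `severity` is empty. Returns what snprintf() returns.
 */
static int format_strict_arg(char *buf, size_t size, const char *severity)
{
	assert(buf && severity);
	return snprintf(buf, size, "--strict%s%s",
			*severity ? "=" : "", severity);
}
```

An empty severity yields "--strict"; "warn=missing-email" yields "--strict=warn=missing-email". The trick relies on the severity string being empty rather than NULL when nothing was configured.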

Ciao,
Dscho


* Re: [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.*
  2015-01-21  8:59       ` Junio C Hamano
@ 2015-01-21 18:14         ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 18:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 09:59, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>>  t/t5504-fetch-receive-strict.sh | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
>> index 69ee13c..d491172 100755
>> --- a/t/t5504-fetch-receive-strict.sh
>> +++ b/t/t5504-fetch-receive-strict.sh
>> @@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
>>  	test_cmp exp act
>>  '
>>
>> +cat >bogus-commit << EOF
> 
> "cat >bogus-commit <<\EOF", to reduce the mental burden of readers.

Certainly!
Dscho


* Re: [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-01-21  9:02       ` Junio C Hamano
@ 2015-01-21 18:17         ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 18:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 10:02, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> The optional new config option `receive.fsck.skiplist` specifies the path
>> to a file listing the names, i.e. SHA-1s, one per line, of objects that
>> are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.
> 
> Makes sense, but I wonder if it makes sense to have a similar "ok to
> be broken" list for "git fsck" (or perhaps they could even use the
> same list) for exactly the same reason why this option makes sense.

Sure! The most pressing use case for the skiplist feature is a Git server, hence this patch. I will implement the corresponding `git fsck` patch before re-submitting.
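
To illustrate how such a list could be consulted, here is a rough sketch that assumes the names are already loaded into memory as a sorted array of hex strings (the v3 documentation in this thread describes the file as a sorted list). `in_skiplist()` and `cmp_name()` are hypothetical names; the actual patch may store and search the list differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* bsearch() comparator: key is a C string, elem points at an array slot. */
static int cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return 1 if `name` appears in the sorted array of object names. */
static int in_skiplist(const char *const *sorted, size_t nr, const char *name)
{
	assert(name && (sorted || !nr));
	if (!nr)
		return 0;
	return bsearch(name, sorted, nr, sizeof(*sorted), cmp_name) != NULL;
}
```

Keeping the list sorted is what makes the binary search valid; an unsorted file would need sorting at load time or a linear scan.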

Ciao,
Dscho


* Re: [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery
  2015-01-21  9:17     ` [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
@ 2015-01-21 18:24       ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 18:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 10:17, Junio C Hamano wrote:
> The documentation did not format well.  Tentatively I added the
> attached fix-up on top of the series before merging to 'pu'.
> 
>  [...]

Sorry for that! I have to admit that I did not even build the documentation :-( I incorporated your fixes into the respective patches.

Ciao,
Dscho


* [PATCH v3 00/19] Introduce an internal API to interact with the fsck machinery
  2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
@ 2015-01-21 19:23   ` Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 01/19] fsck: Introduce fsck options Johannes Schindelin
                       ` (18 more replies)
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2 siblings, 19 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:23 UTC (permalink / raw)
  To: gitster; +Cc: git

At the moment, git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository has contained a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Interdiff vs v2 below the diffstat.

Johannes Schindelin (19):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.*
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.* options.
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --quick`
  fsck: git receive-pack: support excluding objects from fsck'ing
  fsck: support ignoring objects in `git fsck` via fsck.skiplist

 Documentation/config.txt        |  57 +++++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  76 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  24 +-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 525 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  27 ++-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  46 ++++
 11 files changed, 668 insertions(+), 162 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0daba8a..644411a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,7 +1208,9 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
-fsck.*::
+fsck.error::
+fsck.warn::
+fsck.ignore::
 	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
 	comma-separated lists of fsck message IDs which should trigger
 	fsck to error out, to print the message and continue, or to ignore
@@ -1221,6 +1223,13 @@ that setting `fsck.ignore = missing-email` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
@@ -2143,31 +2152,41 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
-receive.fsck.*::
+receive.fsck.error::
+receive.fsck.warn::
+receive.fsck.ignore::
 	When `receive.fsckObjects` is set to true, errors can be switched
 	to warnings and vice versa by configuring the `receive.fsck.*`
 	settings. These settings contain comma-separated lists of fsck
 	message IDs. For convenience, fsck prefixes the error/warning with
-	the message ID, e.g. "missing-email: invalid author/committer line
-	- missing email" means that setting `receive.fsck.ignore =
-	missing-email` will hide that issue.
+	the message ID, e.g. "missing-email: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.ignore = missing-email` will hide that issue.
 +
 --
-	error::
-		a comma-separated list of fsck message IDs that should be
-		trigger fsck to error out.
-	warn::
-		a comma-separated list of fsck message IDs that should be
-		displayed, but fsck should continue to error out.
-	ignore::
-		a comma-separated list of fsck message IDs that should be
-		ignored completely.
+error;;
+	a comma-separated list of fsck message IDs that should be
+	trigger fsck to error out.
+warn;;
+	a comma-separated list of fsck message IDs that should be
+	displayed, but fsck should continue to error out.
+ignore;;
+	a comma-separated list of fsck message IDs that should be
+	ignored completely.
+--
 +
 This feature is intended to support working with legacy repositories
 which would not pass pushing when `receive.fsckObjects = true`, allowing
-the host to accept repositories certain known issues but still catch
+the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/fsck.c b/builtin/fsck.c
index c767909..760b4bd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -49,9 +49,19 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
-	if (starts_with(var, "fsck.")) {
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
 		struct strbuf sb = STRBUF_INIT;
-		strbuf_addf(&sb, "%s=%s", var + 5, value);
+		strbuf_addf(&sb, "skiplist=%s", path);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	if (skip_prefix(var, "fsck.", &var)) {
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "%s=%s", var, value);
 		fsck_set_severity(&fsck_obj_options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f464ca0..b82b4dd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,10 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
-			} else if (starts_with(arg, "--strict=")) {
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
 				strict = 1;
 				do_fsck_object = 1;
-				fsck_set_severity(&fsck_options, arg + 9);
+				fsck_set_severity(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 40514c2..8e6d1a1 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -116,7 +116,7 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (starts_with(var, "receive.fsck.skiplist")) {
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
 		const char *path = is_absolute_path(value) ?
 			value : git_path("%s", value);
 		if (fsck_severity.len)
@@ -125,10 +125,9 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (starts_with(var, "receive.fsck.")) {
-		if (fsck_severity.len)
-			strbuf_addch(&fsck_severity, ',');
-		strbuf_addf(&fsck_severity, "%s=%s", var + 13, value);
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		strbuf_addf(&fsck_severity, "%s%s=%s",
+			fsck_severity.len ? "," : "", var, value);
 		return 0;
 	}
 
@@ -1487,13 +1486,10 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
 		if (quiet)
 			argv_array_push(&child.args, "-q");
-		if (fsck_objects) {
-			if (fsck_severity.len)
-				argv_array_pushf(&child.args, "--strict=%s",
-					fsck_severity.buf);
-			else
-				argv_array_push(&child.args, "--strict");
-		}
+		if (fsck_objects)
+			argv_array_pushf(&child.args, "--strict%s%s",
+				fsck_severity.len ? "=" : "",
+				fsck_severity.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1510,13 +1506,10 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
-		if (fsck_objects) {
-			if (fsck_severity.len)
-				argv_array_pushf(&child.args, "--strict=%s",
-					fsck_severity.buf);
-			else
-				argv_array_push(&child.args, "--strict");
-		}
+		if (fsck_objects)
+			argv_array_pushf(&child.args, "--strict%s%s",
+				fsck_severity.len ? "=" : "",
+				fsck_severity.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 82f2d62..fe9117c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,9 +530,9 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
-			if (starts_with(arg, "--strict=")) {
+			if (skip_prefix(arg, "--strict=", &arg)) {
 				strict = 1;
-				fsck_set_severity(&fsck_options, arg + 9);
+				fsck_set_severity(&fsck_options, arg);
 				continue;
 			}
 			if (starts_with(arg, "--pack_header=")) {
diff --git a/fsck.c b/fsck.c
index dbf9fa1..15cb8bd 100644
--- a/fsck.c
+++ b/fsck.c
@@ -169,7 +169,7 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 
 	if (!options->msg_severity) {
 		int i;
-		int *msg_severity = malloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_severity = xmalloc(sizeof(int) * FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
 			msg_severity[i] = fsck_msg_severity(i, options);
 		options->msg_severity = msg_severity;
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 21fa9c8..d367bb2 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,7 +115,7 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
-cat >bogus-commit << EOF
+cat >bogus-commit <<\EOF
 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
 author Bugs Bunny 1234567890 +0000
 committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
-- 
2.2.0.33.gc18b867


* [PATCH v3 01/19] fsck: Introduce fsck options
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
@ 2015-01-21 19:24     ` Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                       ` (17 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:24 UTC (permalink / raw)
  To: gitster; +Cc: git

We are about to introduce more settings, so, just like the diff
machinery, it makes sense to carry them around as a (pointer to a) struct
containing all of them.
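
The pattern can be sketched in miniature as follows. This is not the struct from the patch: `struct demo_options`, `demo_walk()` and `count_node()` are hypothetical, and plain ints stand in for objects. It only shows the shape: the callback and flags live in one struct, and the walker passes that struct back to the callback.

```c
#include <assert.h>
#include <stddef.h>

struct demo_options;

typedef int (*demo_walk_fn)(int node, void *data, struct demo_options *options);

struct demo_options {
	demo_walk_fn walk;   /* callback invoked for each node */
	unsigned strict:1;   /* example of a setting carried alongside it */
};

#define DEMO_OPTIONS_DEFAULT { NULL, 0 }
#define DEMO_OPTIONS_STRICT  { NULL, 1 }

/* Call options->walk on every node; stop early on a negative result. */
static int demo_walk(const int *nodes, size_t nr, void *data,
		     struct demo_options *options)
{
	size_t i;
	int res = 0;

	assert(options && options->walk);
	for (i = 0; i < nr; i++) {
		int result = options->walk(nodes[i], data, options);
		if (result < 0)
			return result;
		if (!res)
			res = result;
	}
	return res;
}

/* Example callback: count the nodes visited via the `data` pointer. */
static int count_node(int node, void *data, struct demo_options *options)
{
	(void)node;
	(void)options;
	++*(int *)data;
	return 0;
}
```

Adding another setting later then means adding a struct member, not changing every function signature along the call chain.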

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index a27515a..2241e29 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static unsigned char head_sha1[20];
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -630,6 +632,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..925f7b5 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -74,6 +74,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -191,7 +192,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -782,10 +783,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.2.0.33.gc18b867

* [PATCH v3 02/19] fsck: Introduce identifiers for fsck messages
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-01-21 19:24     ` Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                       ` (16 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:24 UTC (permalink / raw)
  To: gitster; +Cc: git

Instead of specifying whether a message by the fsck machinery constitutes
an error or a warning, let's specify an identifier relating to the
concrete problem that was encountered. This is necessary for the upcoming
support for demoting certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
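
As a rough illustration of that simplification (not part of the patch),
the sketch below formats the message once in a central helper and hands
callbacks a finished string. All cb_* names are hypothetical, and a fixed
buffer with vsnprintf() stands in for the strbuf that report() uses.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define CB_ERROR 1
#define CB_WARN 2

/* Hypothetical callback type: receives a finished string, no varargs. */
typedef int (*cb_error_fn)(const char *object_name, int severity,
			   const char *message);

/* Central helper: the only place that has to deal with va_list. */
int cb_report(cb_error_fn fn, const char *object_name, int severity,
	      const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	return fn(object_name, severity, buf);
}

/* A callback that prints, in the spirit of fsck_error_func(). */
int cb_print(const char *object_name, int severity, const char *message)
{
	fprintf(stderr, "%s in %s: %s\n",
		severity == CB_WARN ? "warning" : "error",
		object_name, message);
	return severity == CB_WARN ? 0 : 1;
}

/* A callback that records the last message instead of printing it. */
static char cb_last[256];

int cb_capture(const char *object_name, int severity, const char *message)
{
	snprintf(cb_last, sizeof(cb_last), "%s: %s", object_name, message);
	return severity == CB_WARN ? 0 : 1;
}
```

Either callback can be passed to cb_report(); neither needs va_start() or
a printf format attribute, which is the point of the change described
above.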

We could use a simple enum for the message IDs here, but we want to
guarantee that the enum values are associated with the appropriate
severity levels. Besides, we want to introduce a parser in the next commit
that maps the string representation to the enum value, hence we use a
slightly ugly preprocessor construct that can be extended for use with
said parser.
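
For readers unfamiliar with this kind of construct, here is a minimal,
standalone sketch of the idea. It is not the patch's code: the list is
shortened, every DEMO_* and demo_* name is hypothetical, and the name
table plus demo_parse_id() only guess at what a parser could look like.

```c
#include <string.h>

#define DEMO_SEV_ERROR 1
#define DEMO_SEV_WARN 2

/* One list of (id, default severity) pairs, expanded several times. */
#define FOREACH_DEMO_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MISSING_EMAIL, ERROR) \
	FUNC(HAS_DOTGIT, WARN)

/* First expansion: the enum values. */
#define DEMO_ID(id, severity) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_ID(DEMO_ID)
	DEMO_MSG_MAX
};
#undef DEMO_ID

/* Second expansion: a table indexed by the enum, in the same order. */
#define DEMO_ID(id, severity) { #id, DEMO_SEV_##severity },
static const struct {
	const char *name;
	int severity;
} demo_info[DEMO_MSG_MAX] = {
	FOREACH_DEMO_ID(DEMO_ID)
};
#undef DEMO_ID

/* Map a string such as "HAS_DOTGIT" to its enum value, or -1. */
int demo_parse_id(const char *name)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++)
		if (!strcmp(name, demo_info[i].name))
			return i;
	return -1;
}

/* Default severity; strict mode turns warnings into errors. */
int demo_severity(int id, int strict)
{
	int severity = demo_info[id].severity;

	if (strict && severity == DEMO_SEV_WARN)
		severity = DEMO_SEV_ERROR;
	return severity;
}
```

Because the enum and the table come from the same list, adding an entry
in one place keeps IDs, names and severities in sync, which a plain enum
with a hand-written table cannot guarantee.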

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  24 ++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 153 insertions(+), 77 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2241e29..99d4538 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -47,32 +47,22 @@ static int show_dangling = 1;
 #endif
 
 static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+                      const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+	        severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..30f7a48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(INVALID_OBJECT_SHA1, ERROR) \
+	FUNC(INVALID_TAG_OBJECT, ERROR) \
+	FUNC(INVALID_TREE, ERROR) \
+	FUNC(INVALID_TYPE, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NOT_SORTED, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(INVALID_TAG_NAME, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, severity) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, severity) { FSCK_##severity },
+static struct {
+	int severity;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_severity(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int severity;
+
+	severity = msg_id_info[msg_id].severity;
+	if (options->strict && severity == FSCK_WARN)
+		severity = FSCK_ERROR;
+
+	return severity;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_severity = fsck_msg_severity(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_severity, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %lu", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int severity, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.2.0.33.gc18b867


* [PATCH v3 03/19] fsck: Provide a function to parse fsck message IDs
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 01/19] fsck: Introduce fsck options Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-01-21 19:24     ` Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
                       ` (15 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:24 UTC (permalink / raw)
  To: gitster; +Cc: git

This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. It has to handle partial strings because we would
like to be able to parse, say, 'missing-email,missing-tagger-entry'
command lines.

To make the parsing robust, we generate strings from the enum keys, and
using these strings, we map lower-case, dash-separated string values
to the corresponding enum values.
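
A minimal sketch of that idea, assuming a shortened, hypothetical message
list (the DEMO_* and demo_* names are illustrative and not part of fsck.c,
and the sketch returns -1 where the patch calls die()):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, shortened message list; the real one lives in fsck.c. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	FUNC(MISSING_EMAIL) \
	FUNC(MISSING_TAGGER_ENTRY) \
	FUNC(BAD_NAME)

#define DEMO_MSG_ID(id) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(DEMO_MSG_ID)
	DEMO_MSG_MAX
};
#undef DEMO_MSG_ID

/* Stringify each enum key so names and values cannot drift apart. */
#define DEMO_MSG_ID(id) #id,
static const char *demo_id_strings[] = {
	DEMO_FOREACH_MSG_ID(DEMO_MSG_ID)
	NULL
};
#undef DEMO_MSG_ID

/* Return the index matching text[0..len), or -1 if there is none. */
static int demo_parse_msg_id(const char *text, size_t len)
{
	int i;

	assert(text != NULL);
	for (i = 0; i < DEMO_MSG_MAX; i++) {
		const char *key = demo_id_strings[i];
		size_t j;

		/* keys are upper-case with underscores; compare in user form */
		for (j = 0; j < len && key[j]; j++) {
			char c = key[j] == '_' ? '-' :
				(char)tolower((unsigned char)key[j]);
			if (text[j] != c)
				break;
		}
		if (j == len && !key[j])
			return i;
	}
	return -1;
}
```

Because the length is passed explicitly, the caller can point into the
middle of a comma-separated list without copying each item first.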

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 30f7a48..2d91e28 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,38 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, severity) { FSCK_##severity },
+#define STR(x) #x
+#define MSG_ID(id, severity) { STR(id), FSCK_##severity },
 static struct {
+	const char *id_string;
 	int severity;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_info[i].id_string;
+		/* id_string is upper-case, with underscores */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_')
+				c = '-';
+			if (text[j] != tolower(c))
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	die("Unhandled message id: %.*s", len, text);
+}
+
 static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.2.0.33.gc18b867


* [PATCH v3 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (2 preceding siblings ...)
  2015-01-21 19:24     ` [PATCH v3 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-01-21 19:24     ` Johannes Schindelin
  2015-01-21 19:24     ` [PATCH v3 05/19] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
                       ` (14 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:24 UTC (permalink / raw)
  To: gitster; +Cc: git

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The function added by this commit parses a list of settings in the form:

	warn=missing-email,error=bad-name,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we
need to take extra care to default to all FSCK_ERROR in those cases.
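
The list parsing can be sketched as follows. This is a simplified
stand-in, not the patch's code: the demo_* names and the three-entry ID
table are assumptions, and it returns -1 where fsck_set_severity() dies.
As the diff below is written, the severity name precedes the '=', and a
severity stays in effect for later items that carry no '=' of their own.

```c
#include <assert.h>
#include <string.h>

enum demo_severity { DEMO_ERROR, DEMO_WARN };

/* Hypothetical message IDs, already in lower-case, dash-separated form. */
static const char *demo_ids[] = { "missing-email", "bad-name", "bad-date" };
#define DEMO_ID_COUNT (sizeof(demo_ids) / sizeof(demo_ids[0]))

/* Non-zero if s[0..len) is exactly the NUL-terminated string word. */
static int demo_span_equals(const char *s, size_t len, const char *word)
{
	return strlen(word) == len && !memcmp(s, word, len);
}

/*
 * Apply a list such as "warn=missing-email,error=bad-name" to table.
 * Returns 0 on success, -1 on an unknown severity or message ID.
 */
static int demo_set_severity(enum demo_severity *table, const char *mode)
{
	enum demo_severity severity = DEMO_ERROR;

	assert(table != NULL && mode != NULL);
	while (*mode) {
		size_t len = strcspn(mode, " ,|"), i;
		const char *equal = memchr(mode, '=', len);

		if (!len) {		/* skip a separator */
			mode++;
			continue;
		}
		if (equal) {
			size_t key_len = (size_t)(equal - mode);

			if (demo_span_equals(mode, key_len, "error"))
				severity = DEMO_ERROR;
			else if (demo_span_equals(mode, key_len, "warn"))
				severity = DEMO_WARN;
			else
				return -1;
			mode += key_len + 1;
			len -= key_len + 1;
		}
		for (i = 0; i < DEMO_ID_COUNT; i++)
			if (demo_span_equals(mode, len, demo_ids[i]))
				break;
		if (i == DEMO_ID_COUNT)
			return -1;
		table[i] = severity;
		mode += len;
	}
	return 0;
}
```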

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fsck.h |  7 +++++--
 2 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 2d91e28..02715ee 100644
--- a/fsck.c
+++ b/fsck.c
@@ -100,13 +100,67 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 {
 	int severity;
 
-	severity = msg_id_info[msg_id].severity;
-	if (options->strict && severity == FSCK_WARN)
-		severity = FSCK_ERROR;
+	if (options->msg_severity && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+		severity = options->msg_severity[msg_id];
+	else {
+		severity = msg_id_info[msg_id].severity;
+		if (options->strict && severity == FSCK_WARN)
+			severity = FSCK_ERROR;
+	}
 
 	return severity;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+void fsck_set_severity(struct fsck_options *options, const char *mode)
+{
+	int severity = FSCK_ERROR;
+
+	if (!options->msg_severity) {
+		int i;
+		int *msg_severity = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_severity[i] = fsck_msg_severity(i, options);
+		options->msg_severity = msg_severity;
+	}
+
+	while (*mode) {
+		int len = strcspn(mode, " ,|"), equal, msg_id;
+
+		if (!len) {
+			mode++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (mode[equal] == '=')
+				break;
+
+		if (equal < len) {
+			if (!substrcmp(mode, equal, "error"))
+				severity = FSCK_ERROR;
+			else if (!substrcmp(mode, equal, "warn"))
+				severity = FSCK_WARN;
+			else
+				die("Unknown fsck message severity: '%.*s'",
+					equal, mode);
+			mode += equal + 1;
+			len -= equal + 1;
+		}
+
+		msg_id = parse_msg_id(mode, len);
+		options->msg_severity[msg_id] = severity;
+		mode += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -596,6 +650,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int severity, const char *message)
 {
+	if (severity == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..4349860 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,8 @@
 
 struct fsck_options;
 
+void fsck_set_severity(struct fsck_options *options, const char *mode);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +27,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_severity;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.2.0.33.gc18b867


* [PATCH v3 05/19] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (3 preceding siblings ...)
  2015-01-21 19:24     ` [PATCH v3 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-01-21 19:24     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
                       ` (13 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:24 UTC (permalink / raw)
  To: gitster; +Cc: git

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.warn missing-email

The value is actually a comma-separated list, and there is a
corresponding receive.fsck.error setting.

If the same key is listed in multiple receive.fsck.* lines in the
config, the last one wins.

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.
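
A rough sketch of that hand-off, using fixed-size buffers instead of
Git's strbuf and argv_array (the demo_* helpers are hypothetical): each
receive.fsck.<severity> entry is appended to one comma-separated list,
which is then attached to --strict only if it is non-empty.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one "receive.fsck.<severity> = <value>" entry to list, using
 * commas between entries. Returns 0, or -1 if list would overflow.
 */
static int demo_add_setting(char *list, size_t size,
			    const char *severity, const char *value)
{
	size_t used;
	int n;

	assert(list != NULL && severity != NULL && value != NULL);
	used = strlen(list);
	n = snprintf(list + used, size - used, "%s%s=%s",
		     used ? "," : "", severity, value);
	return (n < 0 || (size_t)n >= size - used) ? -1 : 0;
}

/* Format "--strict" or "--strict=<list>" into arg. */
static int demo_strict_arg(char *arg, size_t size, const char *list)
{
	int n = snprintf(arg, size, "--strict%s%s", *list ? "=" : "", list);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With no receive.fsck.* settings the child still sees a plain --strict, so
existing behavior is unchanged.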

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 15 +++++++++++++--
 builtin/unpack-objects.c |  5 +++++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 925f7b5..b82b4dd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,6 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_severity(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e0ce78e..18d5012 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -36,6 +36,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_severity = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +116,12 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		strbuf_addf(&fsck_severity, "%s%s=%s",
+			fsck_severity.len ? "," : "", var, value);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1471,7 +1478,9 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s%s",
+				fsck_severity.len ? "=" : "",
+				fsck_severity.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1489,7 +1498,9 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s%s",
+				fsck_severity.len ? "=" : "",
+				fsck_severity.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..fe9117c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				fsck_set_severity(&fsck_options, arg);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
-- 
2.2.0.33.gc18b867


* [PATCH v3 06/19] fsck: Report the ID of the error/warning
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (4 preceding siblings ...)
  2015-01-21 19:24     ` [PATCH v3 05/19] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                       ` (12 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

Some legacy repositories have objects with non-fatal fsck issues. To
enable the user to ignore those issues, let's print out the ID (e.g. when
encountering "missing-email", the user might want to call `git config
receive.fsck.warn missing-email`).
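
The resulting message layout can be sketched like this. It is an
illustration with a plain char buffer rather than a strbuf, and
demo_format_report() is a hypothetical name: the enum key is rewritten
into the user-facing form and placed in front of the message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<id>: <message>" into out, where <id> is id_string converted
 * from UPPER_CASE to lower-case with dashes.
 * Returns 0, or -1 if out is too small.
 */
static int demo_format_report(char *out, size_t size,
			      const char *id_string, const char *message)
{
	size_t i, id_len;

	assert(out != NULL && id_string != NULL && message != NULL);
	id_len = strlen(id_string);
	if (id_len + 2 + strlen(message) + 1 > size)
		return -1;
	for (i = 0; i < id_len; i++) {
		unsigned char c = (unsigned char)id_string[i];

		out[i] = c == '_' ? '-' : (char)tolower(c);
	}
	snprintf(out + id_len, size - id_len, ": %s", message);
	return 0;
}
```

The printed ID is then exactly the string the user would put into the
receive.fsck.* configuration.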

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c          | 19 +++++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 02715ee..09f69fe 100644
--- a/fsck.c
+++ b/fsck.c
@@ -161,6 +161,23 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c == '_')
+			c = '-';
+		else
+			c = tolower(c);
+		strbuf_addch(sb, c);
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -169,6 +186,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_severity, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..ea0f216 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalid-tag-name: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missing-tagger-entry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.2.0.33.gc18b867


* [PATCH v3 07/19] fsck: Make fsck_ident() warn-friendly
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (5 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                       ` (11 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.
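
The pattern can be sketched as below, with the checks reduced to the name
and email part. demo_check_ident() and its return codes are made up for
illustration. The caller's pointer is moved to the next line first, and
all checks then run on a local copy.

```c
#include <assert.h>
#include <string.h>

/*
 * Check that an ident line has the rough shape "Name <email> ...".
 * *ident is moved to the start of the next line before any check runs,
 * so the caller can keep parsing even when this returns non-zero.
 */
static int demo_check_ident(const char **ident)
{
	const char *p, *eol;

	assert(ident != NULL && *ident != NULL);
	p = *ident;
	eol = p + strcspn(p, "\n");
	*ident = *eol == '\n' ? eol + 1 : eol;

	if (*p == '<')
		return 1;	/* missing name before email */
	p += strcspn(p, "<>\n");
	if (*p != '<')
		return 2;	/* bad name or missing email */
	p++;
	p += strcspn(p, "<>\n");
	if (*p != '>')
		return 3;	/* bad email */
	return 0;
}
```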

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index 09f69fe..16500e3 100644
--- a/fsck.c
+++ b/fsck.c
@@ -453,40 +453,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.2.0.33.gc18b867


* [PATCH v3 08/19] fsck: Make fsck_commit() warn-friendly
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (6 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
                       ` (10 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too grave to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missing-author error to
a warning would hide a problematic committer line.
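
A reduced sketch of the control flow, assuming a report function that
returns 0 for warnings and non-zero for errors (all demo_* names are
hypothetical): a check returns early only while its message is still an
error.

```c
#include <assert.h>
#include <stdio.h>

enum demo_severity { DEMO_ERROR, DEMO_WARN };

struct demo_options {
	enum demo_severity bad_tree;	/* severity for a malformed tree line */
	enum demo_severity bad_parent;	/* severity for a malformed parent line */
	int reported;			/* number of messages emitted so far */
};

/* Returns 0 for a warning (keep going) and 1 for an error (stop). */
static int demo_report(struct demo_options *o, enum demo_severity severity,
		       const char *message)
{
	o->reported++;
	fprintf(stderr, "%s: %s\n",
		severity == DEMO_WARN ? "warning" : "error", message);
	return severity == DEMO_WARN ? 0 : 1;
}

static int demo_check_commit(struct demo_options *o,
			     int tree_ok, int parent_ok)
{
	int err;

	assert(o != NULL);
	if (!tree_ok) {
		err = demo_report(o, o->bad_tree, "bad tree sha1");
		if (err)
			return err;	/* still an error: stop here */
	}
	if (!parent_ok) {
		err = demo_report(o, o->bad_parent, "bad parent sha1");
		if (err)
			return err;
	}
	return 0;
}
```

With bad_tree demoted to a warning, a commit with both problems reports
both; with the defaults, only the first is reported.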

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 16500e3..8979357 100644
--- a/fsck.c
+++ b/fsck.c
@@ -508,12 +508,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -522,11 +528,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.2.0.33.gc18b867


* [PATCH v3 09/19] fsck: Handle multiple authors in commits specially
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (7 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                       ` (9 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.warn = missing-committer, but that could hide missing tree
objects in the same commit because we cannot continue verifying any
commit object after encountering a missing committer line, while we can
continue in the case of multiple author lines.
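
To illustrate why parsing can continue here, a small sketch that skips
surplus 'author' lines and still finds the 'committer' line. The demo_*
helpers are stand-ins, not Git's skip_prefix() or fsck_ident(), and the
ident contents are not validated.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for Git's skip_prefix(); not the real helper. */
static int demo_skip_prefix(const char *s, const char *prefix,
			    const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(s, prefix, n))
		return 0;
	*out = s + n;
	return 1;
}

/* Return a pointer to the start of the line after s. */
static const char *demo_next_line(const char *s)
{
	s += strcspn(s, "\n");
	return *s == '\n' ? s + 1 : s;
}

/*
 * Returns the number of surplus 'author' lines, or -1 if the first
 * 'author' line or the 'committer' line is missing.
 */
static int demo_count_extra_authors(const char *buffer)
{
	int extra = 0;

	assert(buffer != NULL);
	if (!demo_skip_prefix(buffer, "author ", &buffer))
		return -1;
	buffer = demo_next_line(buffer);
	while (demo_skip_prefix(buffer, "author ", &buffer)) {
		extra++;	/* the patch reports multiple-authors here */
		buffer = demo_next_line(buffer);
	}
	if (!demo_skip_prefix(buffer, "committer ", &buffer))
		return -1;
	return extra;
}
```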

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fsck.c b/fsck.c
index 8979357..3118db1 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
@@ -545,6 +546,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.2.0.33.gc18b867


* [PATCH v3 10/19] fsck: Make fsck_tag() warn-friendly
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (8 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:25     ` [PATCH v3 11/19] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
                       ` (8 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just as with fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 3118db1..4adf9ce 100644
--- a/fsck.c
+++ b/fsck.c
@@ -614,7 +614,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 11/19] fsck: Add a simple test for receive.fsck.*
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (9 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-01-21 19:25     ` Johannes Schindelin
  2015-01-21 19:26     ` [PATCH v3 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                       ` (7 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:25 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..40c7557 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit <<\EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.warn = missing-email' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missing-email" act
+'
+
 test_done
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (10 preceding siblings ...)
  2015-01-21 19:25     ` [PATCH v3 11/19] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
@ 2015-01-21 19:26     ` Johannes Schindelin
  2015-01-21 19:26     ` [PATCH v3 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                       ` (6 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:26 UTC (permalink / raw)
  To: gitster; +Cc: git

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.
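
As a rough illustration of the rule (a sketch, not the code in fsck.c), the
following standalone snippet refuses any override other than "error" for a
message whose default severity is fatal. The three-entry table, the SEV_*
names and set_severity() are made up for this example; the real code calls
die() instead of returning an error code.

```c
#include <stddef.h>
#include <string.h>

enum { SEV_FATAL = -1, SEV_ERROR = 1, SEV_WARN = 2 };

struct msg_info {
	const char *id;
	int default_severity;
};

/* Illustrative subset; the real list lives in FOREACH_MSG_ID. */
static const struct msg_info msg_table[] = {
	{ "unterminated-header", SEV_FATAL },
	{ "missing-email", SEV_ERROR },
	{ "bad-filemode", SEV_WARN },
};
#define MSG_COUNT (sizeof(msg_table) / sizeof(msg_table[0]))

/*
 * Record a severity override for the message named id.
 * Returns 0 on success, -1 for an unknown id and -2 if the request
 * would demote a fatal message.
 */
static int set_severity(int *overrides, const char *id, int severity)
{
	size_t i;

	for (i = 0; i < MSG_COUNT; i++) {
		if (strcmp(msg_table[i].id, id))
			continue;
		if (severity != SEV_ERROR &&
		    msg_table[i].default_severity == SEV_FATAL)
			return -2; /* "Cannot demote ..." in the patch */
		overrides[i] = severity;
		return 0;
	}
	return -1;
}
```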

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 13 +++++++++++--
 t/t5504-fetch-receive-strict.sh |  9 +++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 4adf9ce..27a381b 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_EMAIL, ERROR) \
@@ -40,10 +45,8 @@
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -157,6 +160,9 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 		}
 
 		msg_id = parse_msg_id(mode, len);
+		if (severity != FSCK_ERROR &&
+				msg_id_info[msg_id].severity == FSCK_FATAL)
+			die("Cannot demote %.*s", len, mode);
 		options->msg_severity[msg_id] = severity;
 		mode += len;
 	}
@@ -187,6 +193,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_FATAL)
+		msg_severity = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 40c7557..75d718f 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.warn = missing-email' '
 	grep "missing-email" act
 '
 
+test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config receive.fsck.warn unterminated-header &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminated-header" act
+'
+
 test_done
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (11 preceding siblings ...)
  2015-01-21 19:26     ` [PATCH v3 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-01-21 19:26     ` Johannes Schindelin
  2015-01-21 19:26     ` [PATCH v3 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                       ` (5 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:26 UTC (permalink / raw)
  To: gitster; +Cc: git

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective error to "ignore".

This change "abuses" the warn=missing-email test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).
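
To make the list format concrete, here is a simplified, standalone parser
for a string such as "ignore=missing-email,warn=bad-date". It is only a
sketch: the message table has three entries, and it assumes that a token
without '=' inherits the severity of the preceding token (so that
"warn=bad-date,multiple-authors" works). All names are hypothetical, and
unlike fsck_set_severity() it returns an error code instead of calling
die().

```c
#include <stddef.h>
#include <string.h>

enum { SEV_ERROR = 1, SEV_WARN = 2, SEV_IGNORE = 3 };

/* Illustrative subset of message IDs; index doubles as message number. */
static const char *const msg_ids[] = {
	"bad-date", "missing-email", "multiple-authors"
};
#define MSG_COUNT (sizeof(msg_ids) / sizeof(msg_ids[0]))

static int lookup_msg(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < MSG_COUNT; i++)
		if (strlen(msg_ids[i]) == len && !strncmp(msg_ids[i], s, len))
			return (int)i;
	return -1;
}

static int severity_from_name(const char *s, size_t len)
{
	if (len == 5 && !strncmp(s, "error", len))
		return SEV_ERROR;
	if (len == 4 && !strncmp(s, "warn", len))
		return SEV_WARN;
	if (len == 6 && !strncmp(s, "ignore", len))
		return SEV_IGNORE;
	return 0;
}

/*
 * Parse a comma-separated list of [<severity>=]<msg-id> tokens into
 * severity_of[]. Returns 0 on success, -1 on a malformed list.
 */
static int parse_severities(const char *mode, int *severity_of)
{
	int severity = 0;

	while (*mode) {
		size_t len = strcspn(mode, ",");
		const char *eq = memchr(mode, '=', len);
		const char *id = mode;
		size_t id_len = len;
		int msg;

		if (!len) {
			mode++; /* skip the separator */
			continue;
		}
		if (eq) {
			severity = severity_from_name(mode, (size_t)(eq - mode));
			id = eq + 1;
			id_len = len - (size_t)(id - mode);
		}
		if (!severity)
			return -1; /* unknown or missing severity */
		msg = lookup_msg(id, id_len);
		if (msg < 0)
			return -1; /* unknown message ID */
		severity_of[msg] = severity;
		mode += len;
	}
	return 0;
}
```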

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 7 ++++++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 27a381b..028a7ca 100644
--- a/fsck.c
+++ b/fsck.c
@@ -152,6 +152,8 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 				severity = FSCK_ERROR;
 			else if (!substrcmp(mode, equal, "warn"))
 				severity = FSCK_WARN;
+			else if (!substrcmp(mode, equal, "ignore"))
+				severity = FSCK_IGNORE;
 			else
 				die("Unknown fsck message severity: '%.*s'",
 					equal, mode);
@@ -193,6 +195,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_IGNORE)
+		return 0;
+
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 4349860..7be6c50 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 75d718f..5e54a13 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -132,7 +132,12 @@ test_expect_success 'push with receive.fsck.warn = missing-email' '
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missing-email" act
+	grep "missing-email" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git --git-dir=dst/.git config receive.fsck.ignore missing-email &&
+	git --git-dir=dst/.git config receive.fsck.warn bad-date &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	! grep "missing-email" act
 '
 
 test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (12 preceding siblings ...)
  2015-01-21 19:26     ` [PATCH v3 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-01-21 19:26     ` Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 15/19] fsck: Document the new receive.fsck.* options Johannes Schindelin
                       ` (4 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:26 UTC (permalink / raw)
  To: gitster; +Cc: git

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalid-tag-name` and
`missing-tagger-entry` to the receive.fsck.error config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as it used to be
the case before this commit).
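
The resulting report-time mapping, including the "ignore" severity from the
previous patch, can be summarized in a small standalone sketch. The numeric
values mirror the FSCK_* defines in this series, but the SEV_* names and
effective_severity() are hypothetical and exist only for illustration.

```c
/* Values mirror FSCK_INFO, FSCK_FATAL, FSCK_ERROR, FSCK_WARN, FSCK_IGNORE. */
enum {
	SEV_INFO = -2,
	SEV_FATAL = -1,
	SEV_ERROR = 1,
	SEV_WARN = 2,
	SEV_IGNORE = 3
};

/*
 * Map a configured (or default) severity to what is actually reported.
 * Returns 0 if the message is suppressed entirely.
 */
static int effective_severity(int configured)
{
	switch (configured) {
	case SEV_IGNORE:
		return 0;		/* never shown */
	case SEV_FATAL:
		return SEV_ERROR;	/* reported like any other error */
	case SEV_INFO:
		return SEV_WARN;	/* reported with the "warning:" prefix */
	default:
		return configured;
	}
}
```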

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 028a7ca..1334941 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -55,10 +56,11 @@
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(INVALID_TAG_NAME, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, severity) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -200,6 +202,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
+	else if (msg_severity == FSCK_INFO)
+		msg_severity = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -658,15 +662,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 15/19] fsck: Document the new receive.fsck.* options.
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (13 preceding siblings ...)
  2015-01-21 19:26     ` [PATCH v3 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-01-21 19:27     ` Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
                       ` (3 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ae6791d..cc4cd91 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2130,6 +2130,34 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.error::
+receive.fsck.warn::
+receive.fsck.ignore::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.*`
+	settings. These settings contain comma-separated lists of fsck
+	message IDs. For convenience, fsck prefixes the error/warning with
+	the message ID, e.g. "missing-email: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.ignore = missing-email` will hide that issue.
++
+--
+error;;
+	a comma-separated list of fsck message IDs that should
+	trigger fsck to error out.
+warn;;
+	a comma-separated list of fsck message IDs that should be
+	displayed without causing fsck to error out.
+ignore;;
+	a comma-separated list of fsck message IDs that should be
+	ignored completely.
+--
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 16/19] fsck: Support demoting errors to warnings
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (14 preceding siblings ...)
  2015-01-21 19:27     ` [PATCH v3 15/19] fsck: Document the new receive.fsck.* options Johannes Schindelin
@ 2015-01-21 19:27     ` Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
                       ` (2 subsequent siblings)
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.ignore=missing-email fsck` would hide problems with
missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.* from the receive.fsck.*
settings.
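
The separation boils down to which config prefix is stripped before the
remainder is handed to the severity parser. The following standalone sketch
shows the idea with plain C string handling instead of git's strbuf API;
fsck_mode_from_config() is a hypothetical helper, not a function in
builtin/fsck.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * If var is an "fsck.<severity>" key, write "<severity>=<value>" into
 * out and return 1. Return 0 for any other key (including
 * "receive.fsck.*") or if the result does not fit into out.
 */
static int fsck_mode_from_config(const char *var, const char *value,
				 char *out, size_t out_size)
{
	static const char prefix[] = "fsck.";
	size_t prefix_len = sizeof(prefix) - 1;
	int n;

	if (strncmp(var, prefix, prefix_len))
		return 0;
	n = snprintf(out, out_size, "%s=%s", var + prefix_len, value);
	return n >= 0 && (size_t)n < out_size;
}
```

With this, "fsck.ignore" set to "missing-email" yields the mode string
"ignore=missing-email", while a "receive.fsck.ignore" key is left for
`git receive-pack` to handle.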

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 15 +++++++++++++++
 builtin/fsck.c           | 15 +++++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 41 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index cc4cd91..115811c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,6 +1208,21 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.error::
+fsck.warn::
+fsck.ignore::
+	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
+	comma-separated lists of fsck message IDs which should trigger
+	fsck to error out, to print the message and continue, or to ignore
+	said messages, respectively.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missing-email: invalid author/committer line - missing email" means
+that setting `fsck.ignore = missing-email` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99d4538..6f5e671 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,19 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (skip_prefix(var, "fsck.", &var)) {
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "%s=%s", var, value);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
                       const char *err)
 {
@@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index ea0f216..a79ff9f 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.ignore=multiple-authors fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 17/19] fsck: Introduce `git fsck --quick`
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (15 preceding siblings ...)
  2015-01-21 19:27     ` [PATCH v3 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-01-21 19:27     ` Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git

This option avoids unpacking each and every object, and just verifies the
connectivity. In particular with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	not unpacking blobs, this speeds up the operation, at the
+	expense of missing corrupt blobs.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 6f5e671..7ae4715 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -184,6 +185,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -618,6 +621,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -654,7 +658,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index a79ff9f..1c624a3 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (16 preceding siblings ...)
  2015-01-21 19:27     ` [PATCH v3 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-01-21 19:27     ` Johannes Schindelin
  2015-01-21 19:27     ` [PATCH v3 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git

The optional new config option `receive.fsck.skiplist` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.
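
Why a sorted file is preferable can be seen from a simplified lookup
sketch: once the list is sorted, each membership test is a binary search.
The snippet below is standalone and works on 40-character hex names with
qsort()/bsearch(); the patch itself stores binary SHA-1s in a sha1_array,
so struct skiplist and the skiplist_*() helpers here are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40

struct skiplist {
	char (*names)[HEX_LEN + 1];	/* NUL-terminated hex object names */
	size_t nr;
};

static int cmp_name(const void *a, const void *b)
{
	return memcmp(a, b, HEX_LEN);
}

/* Sort once after loading so that lookups can use binary search. */
static void skiplist_sort(struct skiplist *list)
{
	qsort(list->names, list->nr, sizeof(*list->names), cmp_name);
}

/* Returns 1 if hex names an object on the (sorted) list, 0 otherwise. */
static int skiplist_contains(const struct skiplist *list, const char *hex)
{
	if (strlen(hex) != HEX_LEN)
		return 0;
	return bsearch(hex, list->names, list->nr,
		       sizeof(*list->names), cmp_name) != NULL;
}
```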

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt        |  7 ++++++
 builtin/receive-pack.c          |  9 +++++++
 fsck.c                          | 53 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 ++++++++++
 5 files changed, 82 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 115811c..636adff 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2173,6 +2173,13 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing
 the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 18d5012..8e6d1a1 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -116,6 +116,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		if (fsck_severity.len)
+			strbuf_addch(&fsck_severity, ',');
+		strbuf_addf(&fsck_severity, "skiplist=%s", path);
+		return 0;
+	}
+
 	if (skip_prefix(var, "receive.fsck.", &var)) {
 		strbuf_addf(&fsck_severity, "%s%s=%s",
 			fsck_severity.len ? "," : "", var, value);
diff --git a/fsck.c b/fsck.c
index 1334941..15cb8bd 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -117,6 +118,43 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	return severity;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -156,6 +194,17 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 				severity = FSCK_WARN;
 			else if (!substrcmp(mode, equal, "ignore"))
 				severity = FSCK_IGNORE;
+			else if (!substrcmp(mode, equal, "skiplist")) {
+				char *path = xstrndup(mode + equal + 1,
+					len - equal - 1);
+
+				if (equal == len)
+					die("skiplist requires a path");
+				init_skiplist(options, path);
+				free(path);
+				mode += len;
+				continue;
+			}
 			else
 				die("Unknown fsck message severity: '%.*s'",
 					equal, mode);
@@ -700,6 +749,10 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
+	if (obj && options->skiplist &&
+			sha1_array_lookup(options->skiplist, obj->sha1) >= 0)
+		return 0;
+
 	if (!obj)
 		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
diff --git a/fsck.h b/fsck.h
index 7be6c50..cae280e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -29,6 +29,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_severity;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 5e54a13..d367bb2 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skiplist' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	echo $commit > dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.warn = missing-email' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v3 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
                       ` (17 preceding siblings ...)
  2015-01-21 19:27     ` [PATCH v3 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-01-21 19:27     ` Johannes Schindelin
  18 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-21 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git

Identical to the support in `git receive-pack` for the config option
`receive.fsck.skiplist`, we now support ignoring given objects in
`git fsck` altogether.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt |  7 +++++++
 builtin/fsck.c           | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 636adff..644411a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1223,6 +1223,13 @@ that setting `fsck.ignore = missing-email` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7ae4715..760b4bd 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -49,6 +49,16 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "skiplist=%s", path);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
 	if (skip_prefix(var, "fsck.", &var)) {
 		struct strbuf sb = STRBUF_INIT;
 		strbuf_addf(&sb, "%s=%s", var, value);
-- 
2.2.0.33.gc18b867

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* Re: [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-21 18:01         ` Johannes Schindelin
@ 2015-01-21 21:47           ` Junio C Hamano
  2015-01-22  9:35             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-01-21 21:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>>> @@ -1488,8 +1501,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>>>
>>>  		argv_array_pushl(&child.args, "index-pack",
>>>  				 "--stdin", hdr_arg, keep_arg, NULL);
>>> -		if (fsck_objects)
>>> -			argv_array_push(&child.args, "--strict");
>>> +		if (fsck_objects) {
>>> +			if (fsck_severity.len)
>>> +				argv_array_pushf(&child.args, "--strict=%s",
>>> +					fsck_severity.buf);
>>> +			else
>>> +				argv_array_push(&child.args, "--strict");
>>> +		}
>> 
>> Hmm.  The above two hunks look suspiciously similar.  Would it be
>> worth to give them a single helper function?
>
> Hmm. Not sure. I see what you mean, but for now I found
>
> +                       argv_array_pushf(&child.args, "--strict%s%s",
> +                               fsck_severity.len ? "=" : "",
> +                               fsck_severity.buf);
>
> to be more elegant than to add a fully-fledged new function. But if
> you feel strongly, I will gladly implement a separate function; I
> would appreciate suggestions as to the function name...

Peff first introduced that trick elsewhere in our codebase, I think,
but I find it a bit too ugly.

As you accumulate fsck_severity strbuf like this anyway:

	strbuf_addf(&fsck_severity, "%s%s=%s",
        	fsck_severity.len ? "," : "", var, value);

to flip what to prefix each element on the list with, I wonder if it
is simpler to change that empty string to "=", which will allow you
to say this:

	argv_array_pushf(&child.args, "--strict%s", fsck_severity.buf);

Or even this:

	strbuf_addf(&fsck_strict_arg, "%s%s=%s",
        	fsck_strict_arg.len ? "," : "--strict=", var, value);

and then the child.args stuff can become

	if (fsck_strict_arg.len)
		argv_array_push(&child.args, fsck_strict_arg.buf);

In any case, I tend to agree with you that it is overkill to add a
helper function just to add a single element to the argument
list.

Thanks.

	

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key>
  2015-01-21 21:47           ` Junio C Hamano
@ 2015-01-22  9:35             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-22  9:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On 2015-01-21 22:47, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>>>> @@ -1488,8 +1501,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
>>>>
>>>>  		argv_array_pushl(&child.args, "index-pack",
>>>>  				 "--stdin", hdr_arg, keep_arg, NULL);
>>>> -		if (fsck_objects)
>>>> -			argv_array_push(&child.args, "--strict");
>>>> +		if (fsck_objects) {
>>>> +			if (fsck_severity.len)
>>>> +				argv_array_pushf(&child.args, "--strict=%s",
>>>> +					fsck_severity.buf);
>>>> +			else
>>>> +				argv_array_push(&child.args, "--strict");
>>>> +		}
>>>
>>> Hmm.  The above two hunks look suspiciously similar.  Would it be
>>> worth to give them a single helper function?
>>
>> Hmm. Not sure. I see what you mean, but for now I found
>>
>> +                       argv_array_pushf(&child.args, "--strict%s%s",
>> +                               fsck_severity.len ? "=" : "",
>> +                               fsck_severity.buf);
>>
>> to be more elegant than to add a fully-fledged new function. But if
>> you feel strongly, I will gladly implement a separate function; I
>> would appreciate suggestions as to the function name...
> 
> Peff first introduced that trick elsewhere in our codebase, I think,
> but I find it a bit too ugly.
> 
> As you accumulate fsck_severity strbuf like this anyway:
> 
> 	strbuf_addf(&fsck_severity, "%s%s=%s",
>         	fsck_severity.len ? "," : "", var, value);
> 
> to flip what to prefix each element on the list with, I wonder if it
> is simpler to change that empty string to "=", which will allow you
> to say this:
> 
> 	argv_array_pushf(&child.args, "--strict%s", fsck_severity.buf);

But of course! This is what I did now:

-- snip --
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8e6d1a1..08e3716 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -126,8 +126,8 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 	}
 
 	if (skip_prefix(var, "receive.fsck.", &var)) {
-		strbuf_addf(&fsck_severity, "%s%s=%s",
-			fsck_severity.len ? "," : "", var, value);
+		strbuf_addf(&fsck_severity, "%c%s=%s",
+			fsck_severity.len ? ',' : '=', var, value);
 		return 0;
 	}
 
@@ -1487,8 +1487,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_pushf(&child.args, "--strict%s%s",
-				fsck_severity.len ? "=" : "",
+			argv_array_pushf(&child.args, "--strict%s",
 				fsck_severity.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
@@ -1507,8 +1506,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_pushf(&child.args, "--strict%s%s",
-				fsck_severity.len ? "=" : "",
+			argv_array_pushf(&child.args, "--strict%s",
 				fsck_severity.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
-- snap --

> Or even this:
> 
> 	strbuf_addf(&fsck_strict_arg, "%s%s=%s",
>         	fsck_strict_arg.len ? "," : "--strict=", var, value);

Unfortunately not, because just `--strict` needs to be passed in case no severity levels were overridden.
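
To illustrate why the `'='`/`','` prefix keeps the plain `--strict` case working, here is a minimal standalone sketch. It is not the receive-pack code: `struct severity_buf`, `severity_add()` and `format_strict_arg()` are hypothetical stand-ins for the `fsck_severity` strbuf and the `argv_array_pushf()` call, built on plain `snprintf()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fsck_severity strbuf: a fixed-size buffer
 * that accumulates "=<id>=<level>,<id>=<level>,...".
 * Initialize it with { "", 0 }.
 */
struct severity_buf {
	char buf[256];
	size_t len;
};

/* The first element is prefixed with '=', every later one with ','. */
static void severity_add(struct severity_buf *sb,
			 const char *var, const char *value)
{
	int n = snprintf(sb->buf + sb->len, sizeof(sb->buf) - sb->len,
			 "%c%s=%s", sb->len ? ',' : '=', var, value);

	assert(n > 0 && (size_t)n < sizeof(sb->buf) - sb->len);
	sb->len += (size_t)n;
}

/*
 * An empty buffer yields just "--strict"; a non-empty one already starts
 * with '=', so no conditional is needed here.
 */
static void format_strict_arg(const struct severity_buf *sb,
			      char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "--strict%s", sb->buf);

	assert(n > 0 && (size_t)n < outlen);
}
```

Because the separator is chosen when an element is appended, the code that builds the child's argument list does not need to know whether any severity was overridden.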

> In any case, I tend to agree with you that it is overkill to add a
> helper function just to add a single element to the argument
> list.

I am glad we agree!

Ciao,
Dscho

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2014-12-23 17:14                       ` Junio C Hamano
  2014-12-23 17:41                         ` Johannes Schindelin
@ 2015-01-22 15:49                         ` Michael Haggerty
  2015-01-22 17:17                           ` Johannes Schindelin
  1 sibling, 1 reply; 275+ messages in thread
From: Michael Haggerty @ 2015-01-22 15:49 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin; +Cc: git, Tanay Abhra

On 12/23/2014 06:14 PM, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> On Tue, 23 Dec 2014, Junio C Hamano wrote:
>>> I suspect that it would be much better if the configuration variables
>>> were organized the other way around, e.g.
>>>
>>> 	$ git config fsck.warn missingTagger,someOtherKindOfError
>>
>> I had something similar in an earlier version of my patch series, but it
>> was shot down rightfully: if you want to allow inheriting defaults from
>> $HOME/.gitconfig, you have to configure the severity levels individually.
> 
> Hmmm.  What's wrong with "fsck.warn -missingTagger" that overrides
> the earlier one, or even "fsck.info missingTagger" after having
> "fsck.warn other,missingTagger,yetanother", with the usual "last one
> wins" rule?
> 
> Whoever shot it down "rightfully" is wrong here, I would think.

Sorry I didn't notice this earlier; Johannes, please CC me on these
series, especially the ones that I commented on earlier.

I might have been the one who "shot down" the "<severity>=<name>" style
of configuration [1].

I don't feel strongly enough to make a big deal about this, especially
considering that the other alternative has already been implemented. But
for the record, let me explain why I prefer the "<name>=<severity>"
style of configuration.

First, it is a truer representation of the data structure within the
software, which is basically one severity value for each error type.
This is not a decisive argument, but it often means that there is less
impedance mismatch between the style of configuration and the concepts
that it is configuring. For example,

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.error C,D,E

seems to be configuring two sets, but it is not. It is mysteriously
setting "C" to be an error, in seeming contradiction of the first line [2].

Second, it is not correct to say that this is just an application of the
"last setting wins" rule. The "last setting wins" rule has heretofore,
as far as I know, only covered *single* settings that take a single
value. If we applied that rule to the following:

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.warn B,F

then the net result would be "B,F". But that is not your proposal at
all; your proposal is for these two settings to be interpreted the same as

    $ git config receive.fsck.warn A,B,C,F

Similarly, the traditional last setting rule, applied to the first
example above, wouldn't cause the value of "fsck.warn" to be reduced to
"A,B", as you propose. This is not the "last setting rule" that we are
familiar with--it operates *across and within* values and across
*multiple* names rather than just across the values for a single name.

Third, the "<severity>=<name>" style is hard to inquire via the command
line, and probably also incompatible with the simplified internal config
API in git (and probably libgit2, JGit, etc). The problem is that
determining a *single* setting requires *three* configuration variables
be inquired, and that the settings for those three variables need to be
processed in the correct order, including the correct order of
interleavings. For example, how would you inquire about the configured
severity level of "missingTaggerEntry" using the shell? It would be a
mess that would necessarily have to involve "git config --get-regexp"
and error-prone parsing of comma-separated values. It would be so much
easier to type

    $ git config receive.fsck.missingtaggerentry

Fourth, the "<severity>=<name>" style would cause config files to get
cluttered up with unused values. Suppose you have earlier run

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.ignore D,E

and now you want to demote "B" to "ignore". You can do

    $ git config --add receive.fsck.ignore B

(don't forget "--add" or you've silently erased other, unrelated
settings!) This gives the behavior that you want. But now your config
file looks like

    [receive "fsck"]
            warn = A,B,C
            ignore = D,E
            ignore = B

The "B" on the first line is now just being carried along for no reason,
but it would be quite awkward to clean it up programmatically.
Effectively, these settings can only be added to but never removed
because of the way multiple properties are mashed into a single setting.


I believe that one of the main arguments for the "<severity>=<name>"
style of configuration is that it carries over more easily into
convenient command-line options. But I think it will be unusual to want
to configure these options by hand on the command line, let alone adjust
many settings at the same time. The idea isn't to make it easy to work
with repositories that have a level of breakage that fluctuates over
time. It is to make it possible to work with *specific* repositories
that have known breakage in their history. For such a repo you would
configure one or two "ignore" options one time and then never adjust
them again. (And it will also allow us to make our checks stricter in
the future without breaking existing repositories, and even to add
optional "policy" checks, like "forbid Windows-incompatible filenames".)

I would even go so far as to say that we don't *need* command-line
option versions of these settings; if somebody really needs that they
can type

    $ git -c receive.fsck.missingtaggerentry=ignore fsck

(which also has the advantage of passing the setting through to any
child processes). But *if* command-line options are considered
necessary, I don't think that using a "<name>=<severity>" style within
the config needs to rule out allowing command-line options in the form
"--<severity>=<name>,<name>" as Junio has suggested.

Looking back at this email, I guess that I'm more strongly against the
"<severity>=<name>" configuration style than I thought :-/

Michael

[1] I prefer to think that I just offered a little gentle discussion
that informed Johannes's independent decision :-)

[2] But even on these terms, it is anomalous. The usual way to
configure a set in git would be

    $ git config receive.fsck.warn A
    $ git config --add receive.fsck.warn B
    $ git config --add receive.fsck.warn C
    $ git config receive.fsck.error C
    $ git config --add receive.fsck.error D
    $ git config --add receive.fsck.error E

, which in fact has fewer of the disadvantages listed in this email.

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2015-01-22 15:49                         ` Michael Haggerty
@ 2015-01-22 17:17                           ` Johannes Schindelin
  2015-01-31 20:41                             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-22 17:17 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, Tanay Abhra

Hi Michael,

On 2015-01-22 16:49, Michael Haggerty wrote:
> On 12/23/2014 06:14 PM, Junio C Hamano wrote:
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>> On Tue, 23 Dec 2014, Junio C Hamano wrote:
>>>> I suspect that it would be much better if the configuration variables
>>>> were organized the other way around, e.g.
>>>>
>>>> 	$ git config fsck.warn missingTagger,someOtherKindOfError
>>>
>>> I had something similar in an earlier version of my patch series, but it
>>> was shot down rightfully: if you want to allow inheriting defaults from
>>> $HOME/.gitconfig, you have to configure the severity levels individually.
>>
>> Hmmm.  What's wrong with "fsck.warn -missingTagger" that overrides
>> the earlier one, or even "fsck.info missingTagger" after having
>> "fsck.warn other,missingTagger,yetanother", with the usual "last one
>> wins" rule?
>>
>> Whoever shot it down "rightfully" is wrong here, I would think.
> 
> Sorry I didn't notice this earlier; Johannes, please CC me on these
> series, especially the ones that I commented on earlier.

Very sorry, this is my fault. It can only be explained by my switching around some tools for other tools to work with email-based patch submission (which I had not done in a long time). But still, my mistake.


> [1] I prefer to think that I just offered a little gentle discussion
> that informed Johannes's independent decision :-)

You did convince me back then. I just did not want to put up a fight against Junio because I was more interested in getting this feature merged before the holidays (it does feel awkward for me to leave work unwrapped-up before leaving for an extended amount of time, but I guess I am getting more used to that).

So now I cannot avoid discussing this issue properly...

In essence, I agreed with Junio from the point of view of an elegant implementation. But then, Michael is correct that how complicated the code is matters much less than whether the feature is elegant to use.

Now let's step back a bit and think about the users who are supposed to be supported by this patch series: Git repository hosters -- such as GitHub -- need to ensure a certain cleanliness of the repositories they host (for a range of reasons, including the prevention of malicious attacks, or helping users publish their code in a correct form).

And the scenario in which the feature needs to be used is most likely started by some Git user pushing some commits, and `git receive-pack` triggering an error. Then the user files a trouble ticket and a GitHubber needs to inspect the error and the respective object. Now, in the vast majority of cases I imagine that the objects *are* faulty. However, on occasion the problem should not prevent the push, e.g. when somebody crafted a commit object with two authors, forgetting that the tools usually cannot handle such commits. Then the GitHubber has to decide on a case-by-case basis whether to demote that error to a warning and allow the object to be pushed *into that specific repository*.

I do see the need for this feature to be simple and robust, from the users' point of view. In other words, I agree with Michael that we need to avoid confusing settings such as

```
[receive.fsck]
    warn = missing-tagger-entry
    error = missing-tagger-entry
```

This feature will be used rarely enough that the poor soul stuck with interpreting the above config section won't remember that a very specific version of "last setting wins" is in effect.

If I remember correctly, Peff suggested that there needs to be a way to handle these settings in the /etc/gitconfig $HOME/.gitconfig $XDG../gitconfig .git/config cascade, but now I am puzzled whether it is even desirable to demote fsck errors globally, i.e. whether we really need to pay attention to that config cascade.

And finally, in the course of preparing this patch series, we came up with an alternative solution to the problem: the receive.fsck.skiplist (i.e. a file that contains a sorted list of SHA-1s of objects that should be skipped from fsck'ing). I am more and more convinced that this is the most convenient tool for the scenario described above: manual inspection of individual objects will tell whether it is safe to allow them onto the server or not.
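
As an illustration of why the list has to be sorted, here is a minimal standalone sketch of the lookup. It is deliberately simplified and not the code from the patch series, which looks up binary SHA-1s via `sha1_array_lookup()`. The sketch assumes the skiplist has already been read into an array of 40-character hex strings, and `cmp_hex()` and `skiplist_contains()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* bsearch() comparator: the key is a hex string, the element a pointer to one. */
static int cmp_hex(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/*
 * Returns 1 if `hex` appears in `list`, which must hold `nr` hex object
 * names sorted in strcmp() order; returns 0 otherwise.
 */
static int skiplist_contains(const char *const *list, size_t nr,
			     const char *hex)
{
	assert(list && hex);
	return bsearch(hex, list, nr, sizeof(*list), cmp_hex) != NULL;
}
```

A sorted list allows a binary search per checked object, so even a long skiplist adds only a logarithmic number of comparisons to each check.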

However, others might disagree and prefer the explicit approach, e.g. when some source generates a consistent stream of objects triggering fsck errors.

Summary: I have no preference how to specify the severity levels of fsck messages, but I will gladly change my code to whatever you (meaning Junio and Michael in particular) want to see implemented.

Thanks for helping me with this feature,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH 16/18] fsck: support demoting errors to warnings
  2015-01-22 17:17                           ` Johannes Schindelin
@ 2015-01-31 20:41                             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 20:41 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, Tanay Abhra

Hi Michael & Junio,

On 2015-01-22 18:17, Johannes Schindelin wrote:

> [...] we need to avoid confusing settings such as
> 
> ```
> [receive.fsck]
>     warn = missing-tagger-entry
>     error = missing-tagger-entry
> ```

I *think* I found a solution.

Please let me recapitulate quickly the problem Michael brought up: if we support `receive.fsck.warn` to override `receive.fsck.error` and vice versa, with comma-separated lists, then it can be quite confusing to the user, and actually quite difficult to figure out on the command-line which setting is in effect (because it really depends on the *order* of the receive.fsck.* lines, *plus* the fact that the values are comma-separated lists).

On the other hand, Junio pointed out two shortcomings with my original implementation (i.e. to support `receive.fsck.<id> = (error|warn|ignore)`): it is tedious to set multiple severity levels, and it violates the config file convention that the config variable names are CamelCased (the message IDs are dashed-lowercase instead).

The solution I just implemented (and will send out shortly in v4 of the patch series) is the following: the config variable is called receive.fsck.severity and it accepts comma-separated settings. Example:

```
[receive "fsck"]
        severity = multiple-authors=ignore,missing-tagger=error
```

Now, it is *still* not the easiest to figure out the setting from the command-line:

```sh
$ git config --get-all receive.fsck.severity |
        tr "," "\n" |
        grep ^multiple-authors= |
        tail -n 1
```

But I hope this is good enough, Michael?
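
For comparison, the same "last matching entry wins" resolution can be sketched in C. This is only an illustration of the rule the shell pipeline above implements, not code from the patch series: `effective_severity()` is a hypothetical helper, and it assumes the caller has already collected all `receive.fsck.severity` values in config order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan all values in order, split each on commas, and remember the last
 * "<id>=<level>" entry whose <id> equals `id`. Returns a pointer to the
 * level (not NUL-terminated; its length is stored in *level_len), or
 * NULL if no entry matches.
 */
static const char *effective_severity(const char *const *values, size_t nr,
				      const char *id, size_t *level_len)
{
	const char *found = NULL;
	size_t id_len, i;

	assert(values && id && level_len);
	id_len = strlen(id);

	for (i = 0; i < nr; i++) {
		const char *p = values[i];

		while (*p) {
			size_t len = strcspn(p, ",");

			if (len > id_len && !strncmp(p, id, id_len) &&
			    p[id_len] == '=') {
				found = p + id_len + 1;
				*level_len = len - id_len - 1;
			}
			p += len;
			if (*p == ',')
				p++;
		}
	}
	return found;
}
```

Later values override earlier ones, and within a single value a later entry overrides an earlier one, which matches taking the `tail -n 1` of the filtered output.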

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
  2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
@ 2015-01-31 21:04   ` Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 01/19] fsck: Introduce fsck options Johannes Schindelin
                       ` (20 more replies)
  2 siblings, 21 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:04 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

At the moment, the git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Changes vs v3: address Junio's concern regarding repeated patterns,
heed Peff's advice to avoid that pesky clang warning re: unsigned enums,
use the fsck.skiplist setting in builtin/fsck.c (not
receive.fsck.skiplist), and switch to fsck.severity to address Michael's
concern that fsck.(error|warn|ignore)'s comma-separated lists could
partially override each other; interdiff below the diffstat.

Johannes Schindelin (19):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck: Allow demoting errors to warnings
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.severity
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.severity options.
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --quick`
  fsck: git receive-pack: support excluding objects from fsck'ing
  fsck: support ignoring objects in `git fsck` via fsck.skiplist

 Documentation/config.txt        |  41 ++++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  73 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  21 +-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 532 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  27 +-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  49 ++++
 11 files changed, 656 insertions(+), 162 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 644411a..93c43d5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,17 +1208,14 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
-fsck.error::
-fsck.warn::
-fsck.ignore::
-	The `fsck.error`, `fsck.warn` and `fsck.ignore` settings specify
-	comma-separated lists of fsck message IDs which should trigger
-	fsck to error out, to print the message and continue, or to ignore
-	said messages, respectively.
+fsck.severity::
+	A comma-separated list of settings of the form `<id>=<level>` where `<id>`
+	denotes a fsck message ID such as `missing-email` and `<level>` is
+	one of `error`, `warn` and `ignore`.
 +
-For convenience, fsck prefixes the error/warning with the name of the option,
+For convenience, fsck prefixes the error/warning with the message ID,
 e.g.  "missing-email: invalid author/committer line - missing email" means
-that setting `fsck.ignore = missing-email` will hide that issue.
+that setting `fsck.severity = missing-email=ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
@@ -2152,28 +2149,15 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
-receive.fsck.error::
-receive.fsck.warn::
-receive.fsck.ignore::
+receive.fsck.severity::
 	When `receive.fsckObjects` is set to true, errors can be switched
-	to warnings and vice versa by configuring the `receive.fsck.*`
-	settings. These settings contain comma-separated lists of fsck
-	message IDs. For convenience, fsck prefixes the error/warning with
-	the message ID, e.g. "missing-email: invalid
+	to warnings and vice versa by configuring the `receive.fsck.severity`
+	setting. These settings contain comma-separated lists of the form
+	`<id>=<level>` where the `<id>` is the fsck message ID and the level
+	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
+	the error/warning with the message ID, e.g. "missing-email: invalid
 	author/committer line - missing email" means that setting
-	`receive.fsck.ignore = missing-email` will hide that issue.
-+
---
-error;;
-	a comma-separated list of fsck message IDs that should be
-	trigger fsck to error out.
-warn;;
-	a comma-separated list of fsck message IDs that should be
-	displayed, but fsck should continue to error out.
-ignore;;
-	a comma-separated list of fsck message IDs that should be
-	ignored completely.
---
+	`receive.fsck.severity = missing-email=ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which would not pass pushing when `receive.fsckObjects = true`, allowing
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7eb4ff8..81570d8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -49,7 +49,12 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
-	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+	if (strcmp(var, "fsck.severity") == 0) {
+		fsck_set_severity(&fsck_obj_options, value);
+		return 0;
+	}
+
+	if (strcmp(var, "fsck.skiplist") == 0) {
 		const char *path = is_absolute_path(value) ?
 			value : git_path("%s", value);
 		struct strbuf sb = STRBUF_INIT;
@@ -59,14 +64,6 @@ static int fsck_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (skip_prefix(var, "fsck.", &var)) {
-		struct strbuf sb = STRBUF_INIT;
-		strbuf_addf(&sb, "%s=%s", var, value);
-		fsck_set_severity(&fsck_obj_options, sb.buf);
-		strbuf_release(&sb);
-		return 0;
-	}
-
 	return git_default_config(var, value, cb);
 }
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8e6d1a1..f454e65 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -116,18 +116,17 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.severity") == 0) {
+		strbuf_addf(&fsck_severity, "%c%s",
+			fsck_severity.len ? ',' : '=', value);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsck.skiplist") == 0) {
 		const char *path = is_absolute_path(value) ?
 			value : git_path("%s", value);
-		if (fsck_severity.len)
-			strbuf_addch(&fsck_severity, ',');
-		strbuf_addf(&fsck_severity, "skiplist=%s", path);
-		return 0;
-	}
-
-	if (skip_prefix(var, "receive.fsck.", &var)) {
-		strbuf_addf(&fsck_severity, "%s%s=%s",
-			fsck_severity.len ? "," : "", var, value);
+		strbuf_addf(&fsck_severity, "%cskiplist=%s",
+			fsck_severity.len ? ',' : '=', path);
 		return 0;
 	}
 
@@ -1487,8 +1486,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_pushf(&child.args, "--strict%s%s",
-				fsck_severity.len ? "=" : "",
+			argv_array_pushf(&child.args, "--strict%s",
 				fsck_severity.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
@@ -1507,8 +1505,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_pushf(&child.args, "--strict%s%s",
-				fsck_severity.len ? "=" : "",
+			argv_array_pushf(&child.args, "--strict%s",
 				fsck_severity.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
diff --git a/fsck.c b/fsck.c
index 15cb8bd..046af02 100644
--- a/fsck.c
+++ b/fsck.c
@@ -107,7 +107,9 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 {
 	int severity;
 
-	if (options->msg_severity && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
+
+	if (options->msg_severity)
 		severity = options->msg_severity[msg_id];
 	else {
 		severity = msg_id_info[msg_id].severity;
@@ -184,35 +186,40 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 		}
 
 		for (equal = 0; equal < len; equal++)
-			if (mode[equal] == '=')
+			if (mode[equal] == '=' || mode[equal] == ':')
 				break;
 
-		if (equal < len) {
-			if (!substrcmp(mode, equal, "error"))
+		if (!substrcmp(mode, equal, "skiplist")) {
+			char *path = xstrndup(mode + equal + 1,
+				len - equal - 1);
+
+			if (equal == len)
+				die("skiplist requires a path");
+			init_skiplist(options, path);
+			free(path);
+			mode += len;
+			continue;
+		}
+
+		msg_id = parse_msg_id(mode, equal);
+
+		if (equal == len)
+			severity = FSCK_ERROR;
+		else {
+			const char *p = mode + equal + 1;
+			int len2 = len - equal - 1;
+
+			if (!substrcmp(p, len2, "error"))
 				severity = FSCK_ERROR;
-			else if (!substrcmp(mode, equal, "warn"))
+			else if (!substrcmp(p, len2, "warn"))
 				severity = FSCK_WARN;
-			else if (!substrcmp(mode, equal, "ignore"))
+			else if (!substrcmp(p, len2, "ignore"))
 				severity = FSCK_IGNORE;
-			else if (!substrcmp(mode, equal, "skiplist")) {
-				char *path = xstrndup(mode + equal + 1,
-					len - equal - 1);
-
-				if (equal == len)
-					die("skiplist requires a path");
-				init_skiplist(options, path);
-				free(path);
-				mode += len;
-				continue;
-			}
 			else
 				die("Unknown fsck message severity: '%.*s'",
-					equal, mode);
-			mode += equal + 1;
-			len -= equal + 1;
+					len2, p);
 		}
 
-		msg_id = parse_msg_id(mode, len);
 		if (severity != FSCK_ERROR &&
 				msg_id_info[msg_id].severity == FSCK_FATAL)
 			die("Cannot demote %.*s", len, mode);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 1c624a3..b32afaf 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -295,7 +295,7 @@ test_expect_success 'force fsck to ignore double author' '
 	git update-ref refs/heads/bogus "$new" &&
 	test_when_finished "git update-ref -d refs/heads/bogus" &&
 	test_must_fail git fsck &&
-	git -c fsck.ignore=multiple-authors fsck
+	git -c fsck.severity=multiple-authors=ignore fsck
 '
 
 _bz='\0'
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index d367bb2..7881e17 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -135,28 +135,31 @@ test_expect_success 'push with receive.fsck.skiplist' '
 	git push --porcelain dst bogus
 '
 
-test_expect_success 'push with receive.fsck.warn = missing-email' '
+test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
-	git --git-dir=dst/.git config receive.fsck.warn missing-email &&
+	git --git-dir=dst/.git config \
+		receive.fsck.severity missing-email=warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
 	grep "missing-email" act &&
 	git --git-dir=dst/.git branch -D bogus &&
-	git  --git-dir=dst/.git config receive.fsck.ignore missing-email &&
-	git  --git-dir=dst/.git config receive.fsck.warn bad-date &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.severity missing-email=ignore,bad-date=warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
 	test_must_fail grep "missing-email" act
 '
 
-test_expect_success 'receive.fsck.warn = unterminated-header triggers error' '
+test_expect_success \
+	'receive.fsck.severity = unterminated-header=warn triggers error' '
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
-	git --git-dir=dst/.git config receive.fsck.warn unterminated-header &&
+	git --git-dir=dst/.git config \
+		receive.fsck.severity unterminated-header=warn &&
 	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
 	grep "Cannot demote unterminated-header" act
 '
-- 
2.2.0.33.gc18b867
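
A note on the fsck_set_severity() hunk above: the tests in this patch
configure values such as "missing-email=ignore,bad-date=warn", and the
parser treats an entry without "=<severity>" as "error". The following
is only a rough sketch of that lookup, not the code in fsck.c. It
assumes entries are separated by commas, and every demo_* name is
hypothetical. The skiplist handling and the "Cannot demote" check are
left out.

```c
#include <string.h>

/* Hypothetical levels for this sketch; not the values used in fsck.h. */
enum demo_severity { DEMO_ERROR, DEMO_WARN, DEMO_IGNORE, DEMO_INVALID };

/* Does the len-byte span starting at s equal the NUL-terminated word? */
static int demo_span_equals(const char *s, size_t len, const char *word)
{
	return strlen(word) == len && !memcmp(s, word, len);
}

/*
 * Look up `id` in a list such as "missing-email=ignore,bad-date=warn".
 * Both '=' and ':' separate the id from its severity, as in the hunk.
 * Returns DEMO_INVALID if the id is absent or its severity is unknown.
 */
static enum demo_severity demo_lookup_severity(const char *list,
					       const char *id)
{
	while (*list) {
		size_t len = strcspn(list, ",");	/* this entry */
		size_t equal = 0;
		const char *p;
		size_t len2;

		while (equal < len && list[equal] != '=' &&
		       list[equal] != ':')
			equal++;

		if (demo_span_equals(list, equal, id)) {
			if (equal == len)
				return DEMO_ERROR;	/* bare id */
			p = list + equal + 1;
			len2 = len - equal - 1;
			if (demo_span_equals(p, len2, "error"))
				return DEMO_ERROR;
			if (demo_span_equals(p, len2, "warn"))
				return DEMO_WARN;
			if (demo_span_equals(p, len2, "ignore"))
				return DEMO_IGNORE;
			return DEMO_INVALID;
		}

		list += len;
		if (*list == ',')
			list++;
	}
	return DEMO_INVALID;
}
```

Comparing length-delimited spans instead of copying each entry keeps
the sketch free of allocations, which is roughly what substrcmp() does
in the hunk.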

* [PATCH v4 01/19] fsck: Introduce fsck options
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
@ 2015-01-31 21:04     ` Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                       ` (19 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:04 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We are about to introduce more fsck settings. As in the diff machinery,
it makes sense to carry them around in a single struct (passed by
pointer) that contains all of them.
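
To illustrate the pattern outside of Git: the sketch below bundles a
callback and a flag into one struct, provides initializer macros in the
style of FSCK_OPTIONS_DEFAULT/FSCK_OPTIONS_STRICT, and passes the
struct to the callback as well. All demo_* names, the `reported`
counter and the octal mode literals are assumptions made for the
example; this is not the fsck code itself.

```c
struct demo_options;

/* The callback receives the options so it can consult or update them. */
typedef int (*demo_report_fn)(const char *what,
			      struct demo_options *options);

struct demo_options {
	demo_report_fn report;	/* how problems are reported */
	unsigned strict:1;	/* also complain about tolerated cases */
	int reported;		/* illustration only: problems seen */
};

/* Default reporter for the sketch: count the problem, signal an error. */
static int demo_count_report(const char *what, struct demo_options *options)
{
	(void)what;
	options->reported++;
	return 1;
}

#define DEMO_OPTIONS_DEFAULT { demo_count_report, 0, 0 }
#define DEMO_OPTIONS_STRICT { demo_count_report, 1, 0 }

/*
 * Check a regular-file mode. 0664 is tolerated unless strict is set,
 * loosely modeled on the S_IFREG | 0664 case in fsck_tree().
 */
static int demo_check_mode(unsigned mode, struct demo_options *options)
{
	switch (mode) {
	case 0100644:
	case 0100755:
		return 0;
	case 0100664:
		if (!options->strict)
			return 0;
		/* fall through */
	default:
		return options->report("contains bad file modes", options);
	}
}
```

A new setting then only needs a new struct member and an updated
initializer; the signatures of the checking functions stay the same.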

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0c75786..0d03c15 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static unsigned char head_sha1[20];
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -630,6 +632,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..925f7b5 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -74,6 +74,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -191,7 +192,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -782,10 +783,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.2.0.33.gc18b867

* [PATCH v4 02/19] fsck: Introduce identifiers for fsck messages
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-01-31 21:04     ` Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                       ` (18 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:04 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Instead of specifying whether a message from the fsck machinery
constitutes an error or a warning, let's specify an identifier for the
concrete problem that was encountered. This is necessary for the
upcoming support for demoting certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
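
As a rough illustration of "format once, hand over a string": the
sketch below does the varargs work in a single helper and gives the
callback a finished buffer. The demo_* names and the fixed buffer sizes
are assumptions for the example; the patch itself is free to build the
message differently.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* The callback sees a ready-made string; no va_list handling needed. */
typedef int (*demo_message_fn)(int severity, const char *message,
			       void *data);

/* Format the message once, then pass the buffer to the callback. */
static int demo_report(demo_message_fn fn, void *data, int severity,
		       const char *fmt, ...)
{
	char buf[256];	/* sketch only: longer messages are truncated */
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return fn(severity, buf, data);
}

/* Example callback: remember the last message and its severity. */
struct demo_capture {
	int severity;
	char message[128];
};

static int demo_capture_fn(int severity, const char *message, void *data)
{
	struct demo_capture *c = data;

	c->severity = severity;
	strncpy(c->message, message, sizeof(c->message) - 1);
	c->message[sizeof(c->message) - 1] = '\0';
	/* warnings do not count as failures, errors do */
	return severity == DEMO_WARN ? 0 : 1;
}
```

With this split, a callback such as demo_capture_fn() stays a plain
function that takes a string, which is the simplification the
paragraph above describes for fsck_error_func().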

We could use a simple enum for the message IDs here, but we want to
guarantee that the enum values are associated with the appropriate
severity levels. Besides, we want to introduce a parser in the next commit
that maps the string representation to the enum value, hence we use the
slightly ugly preprocessor construct that is extensible for use with said
parser.
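
For readers unfamiliar with this kind of preprocessor construct, here
is a minimal standalone sketch of it. One list macro is expanded twice,
once into an enum and once into a table, so IDs, names and default
severities cannot get out of sync. The DEMO_/demo_ names are
hypothetical, HYPOTHETICAL_NOTE is an invented entry, and the lookup is
a simplification rather than the parser of the next commit.

```c
#include <assert.h>
#include <string.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* The single source of truth: ID plus default severity. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(BAD_EMAIL, ERROR) \
	FUNC(HYPOTHETICAL_NOTE, WARN)

/* First expansion: one enum value per entry. */
#define MSG_ID(id, severity) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: name and severity, in the same order as the enum. */
#define MSG_ID(id, severity) { #id, DEMO_##severity },
static const struct {
	const char *name;
	int severity;
} demo_msg_info[] = {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	{ NULL, -1 }
};
#undef MSG_ID

/* Map a string to its enum value, or -1 if it is unknown. */
static int demo_parse_msg_id(const char *name)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++)
		if (!strcmp(demo_msg_info[i].name, name))
			return i;
	return -1;
}

/* Default severity of a message ID; the ID must be in range. */
static int demo_default_severity(int id)
{
	assert(id >= 0 && id < DEMO_MSG_MAX);
	return demo_msg_info[id].severity;
}
```

Adding a message then means adding one FUNC(...) line; the enum, the
table and the string lookup all pick it up automatically.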

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  24 ++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 153 insertions(+), 77 deletions(-)
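
The simplification for callers can be sketched in isolation as follows.
This is not the code from this patch: the cb_* names, the fixed-size
buffer and the recording callback are assumptions chosen to keep the
example standalone (the patch itself uses a strbuf). It only shows the
pattern of formatting once in a central function and handing the
callback a finished string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity values, for this sketch only. */
enum { CB_ERROR = 1, CB_WARN = 2 };

/* Callback type: receives the finished message, no varargs. */
typedef int (*cb_error_fn)(int severity, const char *message);

static char cb_last_message[128];

/* Example callback: records the message; warnings are not failures. */
static int cb_record(int severity, const char *message)
{
	snprintf(cb_last_message, sizeof(cb_last_message), "%s", message);
	return severity == CB_WARN ? 0 : 1;
}

/* Formats once, then passes a plain string to the callback. */
static int cb_report(cb_error_fn fn, int severity, const char *fmt, ...)
{
	char buf[128];
	va_list ap;

	assert(fn);
	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return fn(severity, buf);
}
```

With this shape, a callback no longer needs va_list handling or a
printf format attribute; only the central reporting function does.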

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0d03c15..1f7944c 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -47,32 +47,22 @@ static int show_dangling = 1;
 #endif
 
 static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+                      const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+	        severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..30f7a48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(INVALID_OBJECT_SHA1, ERROR) \
+	FUNC(INVALID_TAG_OBJECT, ERROR) \
+	FUNC(INVALID_TREE, ERROR) \
+	FUNC(INVALID_TYPE, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NOT_SORTED, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(INVALID_TAG_NAME, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, severity) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, severity) { FSCK_##severity },
+static struct {
+	int severity;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_severity(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int severity;
+
+	severity = msg_id_info[msg_id].severity;
+	if (options->strict && severity == FSCK_WARN)
+		severity = FSCK_ERROR;
+
+	return severity;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_severity = fsck_msg_severity(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_severity, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %ld", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int severity, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.2.0.33.gc18b867

* [PATCH v4 03/19] fsck: Provide a function to parse fsck message IDs
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 01/19] fsck: Introduce fsck options Johannes Schindelin
  2015-01-31 21:04     ` [PATCH v4 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-01-31 21:04     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 05/19] fsck: Allow demoting errors to warnings Johannes Schindelin
                       ` (17 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:04 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. It takes a length in addition to the string because
we would like to parse entries directly out of lists such as
'missing-email,missing-tagger-entry', where an entry is not
NUL-terminated.

To make the parsing robust, we generate strings from the enum keys and
match lower-case, dash-separated string values against them to find the
corresponding enum values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
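The matching rule described above can be tried out with the following
standalone sketch. The sketch_* names and the three keys are picked for
illustration only, and the sketch returns -1 for an unknown ID where
the function in this patch calls die().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Made-up subset of keys: upper-case, with underscores. */
static const char *const sketch_keys[] = {
	"BAD_DATE", "MISSING_EMAIL", "MISSING_TAGGER_ENTRY"
};
#define SKETCH_NR_KEYS (sizeof(sketch_keys) / sizeof(sketch_keys[0]))

/*
 * Returns the index of the key matching the first len bytes of text,
 * which are expected in lower-case, dash-separated form, or -1.
 */
static int sketch_parse_msg_id(const char *text, size_t len)
{
	size_t i, j;

	assert(text);
	for (i = 0; i < SKETCH_NR_KEYS; i++) {
		const char *key = sketch_keys[i];

		for (j = 0; j < len && key[j]; j++) {
			char c = key[j] == '_' ? '-' :
				(char)tolower((unsigned char)key[j]);
			if (text[j] != c)
				break;
		}
		/* Both the span and the key must be fully consumed. */
		if (j == len && !key[j])
			return (int)i;
	}
	return -1;
}
```

Passing a length of 13 for 'missing-email,missing-tagger-entry' matches
only the first entry, and a mere prefix such as 'missing' matches
nothing because the key is not fully consumed.
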
 fsck.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 30f7a48..2d91e28 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,38 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, severity) { FSCK_##severity },
+#define STR(x) #x
+#define MSG_ID(id, severity) { STR(id), FSCK_##severity },
 static struct {
+	const char *id_string;
 	int severity;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_info[i].id_string;
+		/* id_string is upper-case, with underscores */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_')
+				c = '-';
+			if (text[j] != tolower(c))
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	die("Unhandled message id: %.*s", len, text);
+}
+
 static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.2.0.33.gc18b867

* [PATCH v4 05/19] fsck: Allow demoting errors to warnings
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (2 preceding siblings ...)
  2015-01-31 21:04     ` [PATCH v4 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 04/19] fsck: Offer a function to demote fsck " Johannes Schindelin
                       ` (16 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.severity missing-email=warn

The value is actually a comma-separated list.

If the same key is listed in multiple receive.fsck.severity lines in
the config, the last one wins (this can happen, for example, when both
$HOME/.gitconfig and .git/config contain severity settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
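To illustrate the hand-off, here is a small standalone sketch of how
several config values end up in one --strict argument. It deliberately
avoids strbuf and argv_array so that it compiles on its own; struct
severity_arg and both function names are hypothetical. Values are
appended in the order they are read, so a later setting for the same
key comes later in the list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for the strbuf used in the patch. */
struct severity_arg {
	char buf[256];
	size_t len;
};

/* Appends one config value: '=' before the first, ',' before the rest. */
static void severity_arg_add(struct severity_arg *arg, const char *value)
{
	size_t room;
	int n;

	assert(arg && value);
	room = sizeof(arg->buf) - arg->len;
	n = snprintf(arg->buf + arg->len, room, "%c%s",
		     arg->len ? ',' : '=', value);
	if (n < 0 || (size_t)n >= room)
		arg->buf[arg->len] = '\0';	/* drop a value that does not fit */
	else
		arg->len += (size_t)n;
}

/* Builds the option for the child: "--strict" or "--strict=<list>". */
static void build_strict_option(const struct severity_arg *arg,
				char *out, size_t outlen)
{
	snprintf(out, outlen, "--strict%s", arg->buf);
}
```

With no severity configured, the buffer stays empty and the child
receives a plain --strict, exactly as before this patch.
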
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 13 +++++++++++--
 builtin/unpack-objects.c |  5 +++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 925f7b5..b82b4dd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,6 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_severity(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e0ce78e..9b7f1a8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -36,6 +36,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_severity = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +116,12 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.severity") == 0) {
+		strbuf_addf(&fsck_severity, "%c%s",
+			fsck_severity.len ? ',' : '=', value);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1471,7 +1478,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_severity.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1489,7 +1497,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_severity.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..fe9117c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				fsck_set_severity(&fsck_options, arg);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
-- 
2.2.0.33.gc18b867

* [PATCH v4 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (3 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 05/19] fsck: Allow demoting errors to warnings Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 11/19] fsck: Add a simple test for receive.fsck.severity Johannes Schindelin
                       ` (15 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The function added by this commit parses a list of settings in the form:

	missing-email=warn,bad-name=warn,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is so far heeded only by
git fsck; other call paths (e.g. git index-pack --strict) *always*
error out, no matter what type was specified. Therefore, we need to
take extra care to default every message to FSCK_ERROR in those cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
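The list format can be exercised with the following standalone sketch
of the parsing loop. It mirrors the structure of the function in this
patch (split on separators, then on '=' or ':'), but the LVL_* and ID_*
names and the two-entry ID table are assumptions for illustration, and
it returns -1 where the real code calls die().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severity levels and message IDs, for this sketch only. */
enum { LVL_ERROR = 1, LVL_WARN = 2 };
enum { ID_MISSING_EMAIL, ID_BAD_NAME, ID_MAX };
static const char *const id_names[ID_MAX] = { "missing-email", "bad-name" };

/* True if the len bytes at s are exactly the string match. */
static int span_equals(const char *s, size_t len, const char *match)
{
	return strlen(match) == len && !memcmp(s, match, len);
}

/* Applies "id=level,id=level,..." to levels; returns 0, or -1 on error. */
static int apply_severity_list(int levels[ID_MAX], const char *mode)
{
	assert(levels && mode);
	while (*mode) {
		size_t len = strcspn(mode, " ,|"), equal, i;
		int level = LVL_ERROR, id = -1;

		if (!len) {
			mode++;		/* skip a separator */
			continue;
		}
		for (equal = 0; equal < len; equal++)
			if (mode[equal] == '=' || mode[equal] == ':')
				break;
		for (i = 0; i < ID_MAX; i++)
			if (span_equals(mode, equal, id_names[i]))
				id = (int)i;
		if (id < 0)
			return -1;	/* unknown message ID */
		if (equal < len) {
			const char *p = mode + equal + 1;
			size_t len2 = len - equal - 1;

			if (span_equals(p, len2, "warn"))
				level = LVL_WARN;
			else if (!span_equals(p, len2, "error"))
				return -1;	/* unknown severity */
		}
		levels[id] = level;
		mode += len;
	}
	return 0;
}
```

Entries are applied left to right, so when a key appears twice the
later entry overwrites the earlier one.
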
 fsck.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fsck.h |  7 +++++--
 2 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 2d91e28..b5e7d2c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -100,13 +100,73 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 {
 	int severity;
 
-	severity = msg_id_info[msg_id].severity;
-	if (options->strict && severity == FSCK_WARN)
-		severity = FSCK_ERROR;
+	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
+
+	if (options->msg_severity)
+		severity = options->msg_severity[msg_id];
+	else {
+		severity = msg_id_info[msg_id].severity;
+		if (options->strict && severity == FSCK_WARN)
+			severity = FSCK_ERROR;
+	}
 
 	return severity;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+void fsck_set_severity(struct fsck_options *options, const char *mode)
+{
+	int severity = FSCK_ERROR;
+
+	if (!options->msg_severity) {
+		int i;
+		int *msg_severity = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_severity[i] = fsck_msg_severity(i, options);
+		options->msg_severity = msg_severity;
+	}
+
+	while (*mode) {
+		int len = strcspn(mode, " ,|"), equal, msg_id;
+
+		if (!len) {
+			mode++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (mode[equal] == '=' || mode[equal] == ':')
+				break;
+
+		msg_id = parse_msg_id(mode, equal);
+
+		if (equal == len)
+			severity = FSCK_ERROR;
+		else {
+			const char *p = mode + equal + 1;
+			int len2 = len - equal - 1;
+
+			if (!substrcmp(p, len2, "error"))
+				severity = FSCK_ERROR;
+			else if (!substrcmp(p, len2, "warn"))
+				severity = FSCK_WARN;
+			else
+				die("Unknown fsck message severity: '%.*s'",
+					len2, p);
+		}
+
+		options->msg_severity[msg_id] = severity;
+		mode += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -596,6 +656,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int severity, const char *message)
 {
+	if (severity == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..4349860 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,8 @@
 
 struct fsck_options;
 
+void fsck_set_severity(struct fsck_options *options, const char *mode);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +27,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_severity;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.2.0.33.gc18b867

* [PATCH v4 11/19] fsck: Add a simple test for receive.fsck.severity
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (4 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 04/19] fsck: Offer a function to demote fsck " Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                       ` (14 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..9d49cb7 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,25 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit <<\EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config \
+		receive.fsck.severity missing-email=warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missing-email" act
+'
+
 test_done
-- 
2.2.0.33.gc18b867

* [PATCH v4 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (5 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 11/19] fsck: Add a simple test for receive.fsck.severity Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                       ` (13 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 13 +++++++++++--
 t/t5504-fetch-receive-strict.sh | 11 +++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index f72e404..e126320 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_EMAIL, ERROR) \
@@ -40,10 +45,8 @@
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -163,6 +166,9 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 					len2, p);
 		}
 
+		if (severity != FSCK_ERROR &&
+				msg_id_info[msg_id].severity == FSCK_FATAL)
+			die("Cannot demote %.*s", len, mode);
 		options->msg_severity[msg_id] = severity;
 		mode += len;
 	}
@@ -193,6 +199,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_FATAL)
+		msg_severity = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 9d49cb7..0b6af82 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -136,4 +136,15 @@ test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
 	grep "missing-email" act
 '
 
+test_expect_success \
+	'receive.fsck.severity = unterminated-header=warn triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config \
+		receive.fsck.severity unterminated-header=warn &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminated-header" act
+'
+
 test_done
-- 
2.2.0.33.gc18b867
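
The guard added above can be sketched in isolation. This is a minimal
illustration, assuming a hypothetical two-entry message table; every
`sketch_*` and `SK_*` name is invented for the example and is not part of
fsck.c, and it returns an error code where the patch calls die().

```c
#include <string.h>

/* Hypothetical severity levels; the values only mirror the spirit of the patch. */
enum sketch_severity { SK_FATAL = -1, SK_ERROR = 1, SK_WARN = 2 };

struct sketch_msg {
	const char *id;                /* lower-case, dashed message ID */
	enum sketch_severity def;      /* built-in default severity */
	enum sketch_severity current;  /* effective severity after configuration */
};

/* Hypothetical table; the real list lives in FOREACH_MSG_ID(). */
static struct sketch_msg sketch_msgs[] = {
	{ "unterminated-header", SK_FATAL, SK_FATAL },
	{ "missing-email", SK_ERROR, SK_ERROR },
};

/*
 * Returns 0 on success, -1 for an unknown ID, and -2 when the caller
 * tries to demote a message whose default severity is fatal.
 */
static int sketch_set_severity(const char *id, enum sketch_severity wanted)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_msgs) / sizeof(sketch_msgs[0]); i++) {
		if (strcmp(sketch_msgs[i].id, id))
			continue;
		/* Fatal messages may only be (re)set to "error". */
		if (wanted != SK_ERROR && sketch_msgs[i].def == SK_FATAL)
			return -2;
		sketch_msgs[i].current = wanted;
		return 0;
	}
	return -1;
}
```

The check compares against the built-in default rather than the current
value, so a fatal message stays protected even after it has been set to
"error" once.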

* [PATCH v4 07/19] fsck: Make fsck_ident() warn-friendly
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (6 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                       ` (12 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index e52b027..cd3ee48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -459,40 +459,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if (end == p || *end != ' ')
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.2.0.33.gc18b867
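
The idea of this patch (move the caller's cursor to the next line first,
then validate through a local pointer) can be shown with a much smaller
check. This is only a sketch: `sketch_check_ident()` is a hypothetical
stand-in that looks at the email part alone, and it uses portable strchr()
where the patch uses strchrnul().

```c
#include <string.h>

/*
 * Hypothetical stand-in for fsck_ident(): verifies only that the line
 * contains an "<email>" part. Whatever the outcome, *line is left
 * pointing at the start of the next line (or at the terminating NUL).
 * Returns 0 if the line looks fine, 1 otherwise.
 */
static int sketch_check_ident(const char **line)
{
	const char *p = *line;
	const char *eol = strchr(p, '\n');

	/* Advance the caller's cursor before any early return can happen. */
	*line = eol ? eol + 1 : p + strlen(p);

	p += strcspn(p, "<>\n");
	if (*p != '<')
		return 1;	/* missing email */
	p++;
	p += strcspn(p, "<>\n");
	if (*p != '>')
		return 1;	/* bad email */
	return 0;
}
```

Because the cursor is advanced up front, a caller that treats the non-zero
result as a mere warning can go straight on to the following header line.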

* [PATCH v4 08/19] fsck: Make fsck_commit() warn-friendly
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (7 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:05     ` [PATCH v4 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                       ` (11 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too severe to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missing-author error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index cd3ee48..7ce7857 100644
--- a/fsck.c
+++ b/fsck.c
@@ -514,12 +514,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -528,11 +534,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.2.0.33.gc18b867
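
The "report, then return only if the result is non-zero" pattern used
throughout this patch can be sketched as follows. All names are
hypothetical, and the sketch checks just a `tree` and an `author` line.
Unlike the patch, it finds the next line with strchr() instead of assuming
a fixed 41-byte advance; that is a choice made for this sketch only.

```c
#include <string.h>

/* Hypothetical options: whether a bad tree ID is demoted to a warning. */
struct sketch_opts {
	int demote_bad_tree;
	int warnings;	/* number of demoted problems seen so far */
};

/* Returns non-zero only when the problem still counts as an error. */
static int sketch_report(struct sketch_opts *o, int demoted)
{
	if (demoted) {
		o->warnings++;
		return 0;
	}
	return 1;
}

/* Checks "tree <40 hex>\nauthor ..." and keeps going after a demoted problem. */
static int sketch_check_commit(const char *buf, struct sketch_opts *o)
{
	int err;

	if (strncmp(buf, "tree ", 5))
		return sketch_report(o, 0);	/* cannot continue without a tree line */
	buf += 5;
	if (strspn(buf, "0123456789abcdef") != 40 || buf[40] != '\n') {
		err = sketch_report(o, o->demote_bad_tree);
		if (err)
			return err;
		/* demoted: fall through and check the remaining lines */
	}
	buf = strchr(buf, '\n');
	if (!buf)
		return sketch_report(o, 0);
	buf++;
	if (strncmp(buf, "author ", 7))
		return sketch_report(o, 0);	/* hard stop, as in the patch */
	return 0;
}
```

With the bad tree ID demoted, a second problem further down (here, a missing
author line) is still caught, which is the point of continuing.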

* [PATCH v4 10/19] fsck: Make fsck_tag() warn-friendly
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (8 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-01-31 21:05     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 15/19] fsck: Document the new receive.fsck.severity options Johannes Schindelin
                       ` (10 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:05 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just like fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index b92d8c4..f72e404 100644
--- a/fsck.c
+++ b/fsck.c
@@ -620,7 +620,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.2.0.33.gc18b867

* [PATCH v4 15/19] fsck: Document the new receive.fsck.severity options.
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (9 preceding siblings ...)
  2015-01-31 21:05     ` [PATCH v4 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
                       ` (9 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ae6791d..f893492 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2130,6 +2130,21 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.severity::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.severity`
+	setting. These settings contain comma-separated lists of the form
+	`<id>=<level>` where the `<id>` is the fsck message ID and the level
+	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
+	the error/warning with the message ID, e.g. "missing-email: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.severity = missing-email=ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.2.0.33.gc18b867
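
To make the documented value format concrete, here is a small parser for a
comma-separated list of `<id>=<level>` entries. It is a sketch under
assumptions, not the parser in fsck.c: the names, the fixed-size `id`
buffer and the return codes are invented for the example, and message IDs
are not validated against any table.

```c
#include <string.h>

enum sketch_level { SK_LVL_ERROR = 1, SK_LVL_WARN = 2, SK_LVL_IGNORE = 3 };

struct sketch_entry {
	char id[32];			/* hypothetical upper bound on ID length */
	enum sketch_level level;
};

/* Maps "error", "warn" or "ignore" (not NUL-terminated) to a level. */
static int sketch_level_from(const char *s, size_t len, enum sketch_level *out)
{
	if (len == 5 && !memcmp(s, "error", 5))
		*out = SK_LVL_ERROR;
	else if (len == 4 && !memcmp(s, "warn", 4))
		*out = SK_LVL_WARN;
	else if (len == 6 && !memcmp(s, "ignore", 6))
		*out = SK_LVL_IGNORE;
	else
		return -1;
	return 0;
}

/*
 * Parses "id=level,id=level,..." into out[0..max-1]. Returns the number
 * of entries, or -1 on a malformed entry, an unknown level, an overlong
 * ID, or too many entries. Empty entries (",,") are skipped.
 */
static int sketch_parse_severity(const char *value, struct sketch_entry *out, int max)
{
	int n = 0;

	while (*value) {
		size_t len = strcspn(value, ",");
		const char *eq = memchr(value, '=', len);
		size_t id_len;

		if (len) {
			if (!eq || eq == value || n == max)
				return -1;
			id_len = (size_t)(eq - value);
			if (id_len >= sizeof(out[n].id))
				return -1;
			if (sketch_level_from(eq + 1, len - id_len - 1, &out[n].level))
				return -1;
			memcpy(out[n].id, value, id_len);
			out[n].id[id_len] = '\0';
			n++;
		}
		value += len;
		if (*value == ',')
			value++;
	}
	return n;
}
```

For instance, the value `missing-email=ignore,bad-date=warn` yields two
entries, while a bare `missing-email` without a level is rejected.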

* [PATCH v4 09/19] fsck: Handle multiple authors in commits specially
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (10 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 15/19] fsck: Document the new receive.fsck.severity options Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                       ` (8 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.severity to missing-committer=warn, but that could hide
missing tree objects in the same commit because we cannot continue
verifying any commit object after encountering a missing committer line,
while we can continue in the case of multiple author lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fsck.c b/fsck.c
index 7ce7857..b92d8c4 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
@@ -551,6 +552,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.2.0.33.gc18b867
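
The loop added above consumes any surplus `author` lines before looking for
the `committer` line. A stripped-down sketch of that control flow follows;
the helper names are hypothetical, the ident itself is not validated, and
`sketch_skip_prefix()` merely imitates the behaviour of Git's skip_prefix()
for the purposes of the example.

```c
#include <string.h>

/* Imitation of skip_prefix(): on a match, stores the rest in *out and returns 1. */
static int sketch_skip_prefix(const char *s, const char *prefix, const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(s, prefix, n))
		return 0;
	*out = s + n;
	return 1;
}

/* Returns a pointer to the start of the next line (or to the final NUL). */
static const char *sketch_next_line(const char *s)
{
	const char *eol = strchr(s, '\n');

	return eol ? eol + 1 : s + strlen(s);
}

/*
 * Expects buf to start at the first "author " line. Returns the number
 * of surplus author lines, or -1 if the author line is missing or no
 * committer line follows. On success, *rest points at the committer ident.
 */
static int sketch_count_extra_authors(const char *buf, const char **rest)
{
	int extra = 0;

	if (!sketch_skip_prefix(buf, "author ", &buf))
		return -1;
	buf = sketch_next_line(buf);
	while (sketch_skip_prefix(buf, "author ", &buf)) {
		extra++;	/* the patch reports MULTIPLE_AUTHORS here */
		buf = sketch_next_line(buf);
	}
	if (!sketch_skip_prefix(buf, "committer ", &buf))
		return -1;
	*rest = buf;
	return extra;
}
```

Since each extra author line is skipped as a whole, the committer line is
still found and can be checked afterwards.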

* [PATCH v4 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (11 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
                       ` (7 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalid-tag-name` and
`missing-tagger-entry` in the receive.fsck.severity config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as used to be
the case before this commit).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 03ec945..c4079ed 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -55,10 +56,11 @@
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(INVALID_TAG_NAME, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, severity) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -206,6 +208,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
+	else if (msg_severity == FSCK_INFO)
+		msg_severity = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -664,15 +668,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.2.0.33.gc18b867
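
How report() translates the internal levels before calling the error
callback can be summarised in one function. This is a sketch: the numeric
values follow the #defines shown in this series, but the `SK_*` names and
`sketch_effective_severity()` are hypothetical.

```c
/* Hypothetical names; the values follow FSCK_INFO, FSCK_FATAL, etc. in this series. */
enum sketch_severity {
	SK_INFO = -2,
	SK_FATAL = -1,
	SK_ERROR = 1,
	SK_WARN = 2,
	SK_IGNORE = 3
};

/*
 * Maps a configured severity to the one handed to the error callback.
 * Returns 0 when the message should be dropped without being reported.
 */
static int sketch_effective_severity(enum sketch_severity s)
{
	switch (s) {
	case SK_IGNORE:
		return 0;		/* never reaches the callback */
	case SK_FATAL:
		return SK_ERROR;	/* fatal is shown as an error */
	case SK_INFO:
		return SK_WARN;		/* info is shown as a warning */
	default:
		return s;
	}
}
```

Only "error" and "warn" ever reach the callback, so existing callbacks do
not need to learn about the two internal levels.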

* [PATCH v4 06/19] fsck: Report the ID of the error/warning
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (12 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                       ` (6 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some legacy repositories contain objects with non-fatal fsck issues. To
enable the user to ignore those issues, let's print out the ID (e.g. when
encountering "missing-email", the user might want to call `git config
--add receive.fsck.severity missing-email=warn`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c          | 19 +++++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index b5e7d2c..e52b027 100644
--- a/fsck.c
+++ b/fsck.c
@@ -167,6 +167,23 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c == '_')
+			c = '-';
+		else
+			c = tolower(c);
+		strbuf_addch(sb, c);
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -175,6 +192,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_severity, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..ea0f216 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalid-tag-name: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missing-tagger-entry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.2.0.33.gc18b867
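
The conversion done by append_msg_id() (upper case to lower case,
underscores to dashes, then ": ") can be tried outside Git with a plain
character buffer. This is a sketch only: `sketch_format_msg_id()` is a
hypothetical name, and it writes into a caller-supplied buffer because
strbuf is not available outside the Git source tree.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Writes the user-facing form of an upper-case, underscore-separated ID
 * (e.g. "MISSING_EMAIL") into out, followed by ": ". The output is
 * truncated if out is too small; out is NUL-terminated when outsz > 0.
 */
static void sketch_format_msg_id(const char *id, char *out, size_t outsz)
{
	const char *suffix = ": ";
	size_t n = 0;

	if (!outsz)
		return;
	for (; *id && n + 1 < outsz; id++)
		out[n++] = (*id == '_') ? '-' : (char)tolower((unsigned char)*id);
	for (; *suffix && n + 1 < outsz; suffix++)
		out[n++] = *suffix;
	out[n] = '\0';
}
```

The result is the same spelling that the user types on the configuration
side, e.g. `missing-tagger-entry`.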

* [PATCH v4 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (13 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
                       ` (5 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective error to "ignore".

This change "abuses" the missing-email=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 7 ++++++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index e126320..03ec945 100644
--- a/fsck.c
+++ b/fsck.c
@@ -161,6 +161,8 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 				severity = FSCK_ERROR;
 			else if (!substrcmp(p, len2, "warn"))
 				severity = FSCK_WARN;
+			else if (!substrcmp(p, len2, "ignore"))
+				severity = FSCK_IGNORE;
 			else
 				die("Unknown fsck message severity: '%.*s'",
 					len2, p);
@@ -199,6 +201,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_severity = fsck_msg_severity(id, options), result;
 
+	if (msg_severity == FSCK_IGNORE)
+		return 0;
+
 	if (msg_severity == FSCK_FATAL)
 		msg_severity = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 4349860..7be6c50 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 0b6af82..9e4e77b 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -133,7 +133,12 @@ test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
 	git --git-dir=dst/.git config \
 		receive.fsck.severity missing-email=warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missing-email" act
+	grep "missing-email" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git --git-dir=dst/.git config --add \
+		receive.fsck.severity missing-email=ignore,bad-date=warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	! grep "missing-email" act
 '
 
 test_expect_success \
-- 
2.2.0.33.gc18b867

* [PATCH v4 16/19] fsck: Support demoting errors to warnings
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (14 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
                       ` (4 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.severity=missing-email=ignore fsck` would hide
problems with missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.severity from the
receive.fsck.severity settings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 12 ++++++++++++
 builtin/fsck.c           | 12 ++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 35 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f893492..4c0a13d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1208,6 +1208,18 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.severity::
+	A comma-separated list of entries of the form `<id>=<level>` where `<id>`
+	denotes a fsck message ID such as `missing-email` and `<level>` is
+	one of `error`, `warn` and `ignore`.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missing-email: invalid author/committer line - missing email" means
+that setting `fsck.severity = missing-email=ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1f7944c..9e5cc31 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,16 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (strcmp(var, "fsck.severity") == 0) {
+		fsck_set_severity(&fsck_obj_options, value);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
                       const char *err)
 {
@@ -638,6 +648,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index ea0f216..0f15b74 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.severity=multiple-authors=ignore fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.2.0.33.gc18b867

* [PATCH v4 17/19] fsck: Introduce `git fsck --quick`
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (15 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:06     ` [PATCH v4 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
                       ` (3 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This option avoids unpacking each and every object, and just verifies the
connectivity. In particular with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	not unpacking blobs, this speeds up the operation, at the expense
+	of missing corrupt blobs.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 9e5cc31..cf61aad 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -181,6 +182,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -615,6 +618,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -651,7 +655,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 0f15b74..b32afaf 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.2.0.33.gc18b867


* [PATCH v4 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (16 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-01-31 21:06     ` Johannes Schindelin
  2015-01-31 21:07     ` [PATCH v4 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
                       ` (2 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:06 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The optional new config option `receive.fsck.skiplist` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that `git push` reports as too problematic to enter the repository, and
to add those objects' SHA-1s to a (preferably sorted) file once it has
been determined that the objects are legitimate and should be allowed
to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
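As a rough illustration of what a skip list does (this is not the code
in this patch, which uses git's `sha1_array`): the sketch below collects
40-digit hex object names, sorts them on the first lookup, and answers
lookups by binary search. `struct skiplist`, `skiplist_append()` and
`skiplist_contains()` are names made up for this example, and the C
library's `qsort()`/`bsearch()` stand in for git's own helpers.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40	/* a SHA-1 written as hexadecimal digits */

/* Hypothetical container: fixed-width, NUL-terminated hex names. */
struct skiplist {
	char (*names)[HEX_LEN + 1];
	size_t nr, alloc;
	int sorted;
};

static int name_cmp(const void *a, const void *b)
{
	return memcmp(a, b, HEX_LEN);
}

/* Append one name; returns -1 unless it is exactly 40 hex digits. */
int skiplist_append(struct skiplist *list, const char *hex)
{
	size_t i;

	for (i = 0; i < HEX_LEN; i++)
		if (!isxdigit((unsigned char)hex[i]))
			return -1;
	if (hex[HEX_LEN] != '\0')
		return -1;
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 16;
		void *p = realloc(list->names, alloc * sizeof(*list->names));

		if (!p)
			return -1;
		list->names = p;
		list->alloc = alloc;
	}
	/* Store names in lowercase so that comparisons are uniform. */
	for (i = 0; i < HEX_LEN; i++)
		list->names[list->nr][i] = (char)tolower((unsigned char)hex[i]);
	list->names[list->nr][HEX_LEN] = '\0';
	list->nr++;
	list->sorted = 0;
	return 0;
}

/* Returns 1 if the name is on the list, 0 otherwise. */
int skiplist_contains(struct skiplist *list, const char *hex)
{
	char key[HEX_LEN + 1];
	size_t i;

	if (!list->nr)
		return 0;
	for (i = 0; i < HEX_LEN; i++) {
		if (!isxdigit((unsigned char)hex[i]))
			return 0;
		key[i] = (char)tolower((unsigned char)hex[i]);
	}
	key[HEX_LEN] = '\0';
	if (!list->sorted) {
		/* Sort lazily, once, before the first binary search. */
		qsort(list->names, list->nr, sizeof(*list->names), name_cmp);
		list->sorted = 1;
	}
	return bsearch(key, list->names, list->nr,
		       sizeof(*list->names), name_cmp) != NULL;
}
```

In this sketch an unsorted input still works and merely costs one sort
on the first lookup, which matches the "preferably sorted" wording
above.
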
 Documentation/config.txt        |  7 ++++++
 builtin/receive-pack.c          |  8 ++++++
 fsck.c                          | 54 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 +++++++++
 5 files changed, 82 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4c0a13d..e685aef 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2157,6 +2157,13 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing
 the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored, such as invalid committer email addresses.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 9b7f1a8..f454e65 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -122,6 +122,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		strbuf_addf(&fsck_severity, "%cskiplist=%s",
+			fsck_severity.len ? ',' : '=', path);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
diff --git a/fsck.c b/fsck.c
index c4079ed..046af02 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -119,6 +120,43 @@ static int fsck_msg_severity(enum fsck_msg_id msg_id,
 	return severity;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %.40s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -151,6 +189,18 @@ void fsck_set_severity(struct fsck_options *options, const char *mode)
 			if (mode[equal] == '=' || mode[equal] == ':')
 				break;
 
+		if (!substrcmp(mode, equal, "skiplist")) {
+			char *path;
+
+			if (equal == len)
+				die("skiplist requires a path");
+			path = xstrndup(mode + equal + 1, len - equal - 1);
+			init_skiplist(options, path);
+			free(path);
+			mode += len;
+			continue;
+		}
+
 		msg_id = parse_msg_id(mode, equal);
 
 		if (equal == len)
@@ -706,6 +756,10 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
+	if (obj && options->skiplist &&
+			sha1_array_lookup(options->skiplist, obj->sha1) >= 0)
+		return 0;
+
 	if (!obj)
 		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
diff --git a/fsck.h b/fsck.h
index 7be6c50..cae280e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -29,6 +29,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_severity;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 9e4e77b..7881e17 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skiplist' '
+	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	echo $commit > dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
 	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.2.0.33.gc18b867


* [PATCH v4 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (17 preceding siblings ...)
  2015-01-31 21:06     ` [PATCH v4 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-01-31 21:07     ` Johannes Schindelin
  2015-02-02 11:41     ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-01-31 21:07 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Analogous to the support in `git receive-pack` for the config option
`receive.fsck.skiplist`, `git fsck` now supports ignoring given objects
altogether via `fsck.skiplist`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt |  7 +++++++
 builtin/fsck.c           | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e685aef..93c43d5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1220,6 +1220,13 @@ that setting `fsck.severity = missing-email=ignore` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored, such as invalid committer email addresses.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index cf61aad..81570d8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -54,6 +54,16 @@ static int fsck_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "skiplist=%s", path);
+		fsck_set_severity(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
-- 
2.2.0.33.gc18b867


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (18 preceding siblings ...)
  2015-01-31 21:07     ` [PATCH v4 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
@ 2015-02-02 11:41     ` Johannes Schindelin
  2015-02-02 12:43       ` Michael Haggerty
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
  20 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-02-02 11:41 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Hi all (in particular Junio),

On 2015-01-31 22:04, Johannes Schindelin wrote:

> [...] switch to fsck.severity to address Michael's
> concerns that letting fsck.(error|warn|ignore)'s comma-separated lists
> possibly overriding each other partially;

Having participated in the CodingStyle thread, I came to the conclusion that the fsck.severity solution favors syntax over intuitiveness.

Therefore, I would like to support the case for `fsck.level.missingAuthor` (note that there is an extra ".level." in contrast to earlier suggestions).

The benefits:

- it is very, very easy to understand

- cumulative settings are intuitively cumulative, i.e. setting `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail` completely unaffected

- it is very easy to enquire and set the levels via existing `git config` calls

Now, there is one downside, but *only* if we ignore Postel's law.

Postel's law ("be lenient in what you accept as input, but strict in your output") would dictate that our message ID parser accept both "missing-author" and "missingAuthor" if we follow the inconsistent practice of using lowercase-dashed keys on the command-line but CamelCased ones in the config.
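
For illustration only, a lenient comparison in the spirit of Postel's law could look like the sketch below. `msg_id_equal()` is a made-up name, not an existing function, and the sketch simply ignores case and dashes, so it also accepts spellings such as "MISSING-AUTHOR".

```c
#include <ctype.h>

/*
 * Compare two message IDs, ignoring case and dashes, so that
 * "missing-author" and "missingAuthor" count as the same ID.
 * Returns 1 on a match, 0 otherwise.
 */
int msg_id_equal(const char *a, const char *b)
{
	for (;;) {
		/* Dashes carry no information in this lenient scheme. */
		while (*a == '-')
			a++;
		while (*b == '-')
			b++;
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		if (!*a)
			return 1;	/* both strings ended together */
		a++;
		b++;
	}
}
```

With this, `msg_id_equal("missing-author", "missingAuthor")` matches, while "missingAuthor" and "missingEmail" do not.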

However, earlier Junio made very clear that the parser is required to fail to parse "missing-author" in the config, and to fail to parse "missingAuthor" on the command-line.

Therefore, the design I recommend above will require two, minimally different parsers for essentially the same thing.

IMHO this is a downside that is by far outweighed by the ease of use of the new feature, therefore I am willing to bear the burden of implementation.

Do you agree?

Ciao,
Dscho


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-02 11:41     ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
@ 2015-02-02 12:43       ` Michael Haggerty
  2015-02-02 16:48         ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Michael Haggerty @ 2015-02-02 12:43 UTC (permalink / raw)
  To: Johannes Schindelin, gitster; +Cc: git, peff

On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
> Hi all (in particular Junio),
> 
> On 2015-01-31 22:04, Johannes Schindelin wrote:
> 
>> [...] switch to fsck.severity to address Michael's concerns that
>> letting fsck.(error|warn|ignore)'s comma-separated lists possibly
>> overriding each other partially;
> 
> Having participated in the CodingStyle thread, I came to the
> conclusion that the fsck.severity solution favors syntax over
> intuitiveness.
> 
> Therefore, I would like to support the case for
> `fsck.level.missingAuthor` (note that there is an extra ".level." in
> contrast to earlier suggestions).

Why "level"?

> The benefits:
> 
> - it is very, very easy to understand
> 
> - cumulative settings are intuitively cumulative, i.e. setting
> `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
> completely unaffected
> 
> - it is very easy to enquire and set the levels via existing `git
> config` calls
> 
> Now, there is one downside, but *only* if we ignore Postel's law.
> 
> Postel's law ("be lenient in what you accept as input, but strict in
> your output") would dictate that our message ID parser accept both
> "missing-author" and "missingAuthor" if we follow the inconsistent
> practice of using lowercase-dashed keys on the command-line but
> CamelCased ones in the config.
> 
> However, earlier Junio made very clear that the parser is required to
> fail to parse "missing-author" in the config, and to fail to parse
> "missingAuthor" on the command-line.
> 
> Therefore, the design I recommend above will require two, minimally
> different parsers for essentially the same thing.
> 
> IMHO this is a downside that is by far outweighed by the ease of use
> of the new feature, therefore I am willing to bear the burden of
> implementation.

I again encourage you to consider skipping the implementation of
command-line options entirely. It's not like users are going to want to
use different options for different invocations. Let them use

    git -c fsck.level.missingAuthor=ignore fsck

if they really want to play around, then

    git config fsck.level.missingAuthor ignore

to make it permanent. After that they will never have to worry about
that option again.

And Postel needn't be offended :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-02 12:43       ` Michael Haggerty
@ 2015-02-02 16:48         ` Johannes Schindelin
  2015-02-03 15:11           ` Michael Haggerty
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-02-02 16:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: gitster, git, peff

Hi Michael,

On 2015-02-02 13:43, Michael Haggerty wrote:
> On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
>> Hi all (in particular Junio),
>>
>> On 2015-01-31 22:04, Johannes Schindelin wrote:
>>
>>> [...] switch to fsck.severity to address Michael's concerns that
>>> letting fsck.(error|warn|ignore)'s comma-separated lists possibly
>>> overriding each other partially;
>>
>> Having participated in the CodingStyle thread, I came to the
>> conclusion that the fsck.severity solution favors syntax over
>> intuitiveness.
>>
>> Therefore, I would like to support the case for
>> `fsck.level.missingAuthor` (note that there is an extra ".level." in
>> contrast to earlier suggestions).
> 
> Why "level"?

"Severity level", or "error level". Maybe ".severity." would be better?

>> The benefits:
>>
>> - it is very, very easy to understand
>>
>> - cumulative settings are intuitively cumulative, i.e. setting
>> `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
>> completely unaffected
>>
>> - it is very easy to enquire and set the levels via existing `git
>> config` calls
>>
>> Now, there is one downside, but *only* if we ignore Postel's law.
>>
>> Postel's law ("be lenient in what you accept as input, but strict in
>> your output") would dictate that our message ID parser accept both
>> "missing-author" and "missingAuthor" if we follow the inconsistent
>> practice of using lowercase-dashed keys on the command-line but
>> CamelCased ones in the config.
>>
>> However, earlier Junio made very clear that the parser is required to
>> fail to parse "missing-author" in the config, and to fail to parse
>> "missingAuthor" on the command-line.
>>
>> Therefore, the design I recommend above will require two, minimally
>> different parsers for essentially the same thing.
>>
>> IMHO this is a downside that is by far outweighed by the ease of use
>> of the new feature, therefore I am willing to bear the burden of
>> implementation.
> 
> I again encourage you to consider skipping the implementation of
> command-line options entirely. It's not like users are going to want to
> use different options for different invocations. Let them use
> 
>     git -c fsck.level.missingAuthor=ignore fsck
> 
> if they really want to play around, then
> 
>     git config fsck.level.missingAuthor ignore
> 
> to make it permanent. After that they will never have to worry about
> that option again.

Unfortunately, I have to pass the `receive.fsck.*` settings from `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the command-line, because it is `git-receive-pack` that consumes the config setting, but it is one of `git-unpack-objects` and `git-index-pack` that has to act on it...

> And Postel needn't be offended :-)

;-)

Ciao,
Dscho


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-02 16:48         ` Johannes Schindelin
@ 2015-02-03 15:11           ` Michael Haggerty
  2015-02-03 16:33             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Michael Haggerty @ 2015-02-03 15:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: gitster, git, peff

On 02/02/2015 05:48 PM, Johannes Schindelin wrote:
> On 2015-02-02 13:43, Michael Haggerty wrote:
>> On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
>>> Hi all (in particular Junio),
>>>
>>> On 2015-01-31 22:04, Johannes Schindelin wrote:
>>>
>>>> [...] switch to fsck.severity to address Michael's concerns that
>>>> letting fsck.(error|warn|ignore)'s comma-separated lists possibly
>>>> overriding each other partially;
>>>
>>> Having participated in the CodingStyle thread, I came to the
>>> conclusion that the fsck.severity solution favors syntax over
>>> intuitiveness.
>>>
>>> Therefore, I would like to support the case for
>>> `fsck.level.missingAuthor` (note that there is an extra ".level." in
>>> contrast to earlier suggestions).
>>
>> Why "level"?
> 
> "Severity level", or "error level". Maybe ".severity." would be better?

Sorry, I should have been clearer. I understand why the word "level"
makes sense, as opposed to, say, "peanut-butter". What I don't
understand is why a middle word is needed at all. In the config file it
will look like

[fsck "level"]
        missingAuthor = error

, which looks funny. "level" is a constant, so it seems superfluous.

If anything, it might be more useful to allow an optional middle word to
allow the strictness level to be adjusted based on which command
encounters the problem. For example, if you want to tolerate existing
commits that have missing authors, but not allow any new ones to be
pushed, you could set

[strictness]
        missingAuthor = ignore
[strictness "receive-pack"]
        missingAuthor = error

(There's probably a better word than "strictness", but you get the idea.)
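
A minimal sketch of how such a lookup might behave, assuming the
settings have already been read into a table: a value from the
command-specific subsection wins, otherwise the unqualified one applies,
and within each group the last entry wins. `struct strictness_entry`
and `lookup_level()` are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <string.h>

struct strictness_entry {
	const char *command;	/* NULL for the plain [strictness] section */
	const char *msg_id;
	const char *level;
};

/*
 * Return the level configured for msg_id when encountered by command,
 * or NULL if nothing was configured for that message ID.
 */
const char *lookup_level(const struct strictness_entry *entries, size_t nr,
			 const char *command, const char *msg_id)
{
	const char *general = NULL, *specific = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (strcmp(entries[i].msg_id, msg_id))
			continue;
		if (!entries[i].command)
			general = entries[i].level;
		else if (command && !strcmp(entries[i].command, command))
			specific = entries[i].level;
	}
	return specific ? specific : general;
}
```

With the two sections above loaded, a lookup for "missingAuthor" on
behalf of "receive-pack" would yield "error", and any other command
would get "ignore".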

>>> The benefits:
>>>
>>> - it is very, very easy to understand
>>>
>>> - cumulative settings are intuitively cumulative, i.e. setting
>>> `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
>>> completely unaffected
>>>
>>> - it is very easy to enquire and set the levels via existing `git
>>> config` calls
>>>
>>> Now, there is one downside, but *only* if we ignore Postel's law.
>>>
>>> Postel's law ("be lenient in what you accept as input, but strict in
>>> your output") would dictate that our message ID parser accept both
>>> "missing-author" and "missingAuthor" if we follow the inconsistent
>>> practice of using lowercase-dashed keys on the command-line but
>>> CamelCased ones in the config.
>>>
>>> However, earlier Junio made very clear that the parser is required to
>>> fail to parse "missing-author" in the config, and to fail to parse
>>> "missingAuthor" on the command-line.
>>>
>>> Therefore, the design I recommend above will require two, minimally
>>> different parsers for essentially the same thing.
>>>
>>> IMHO this is a downside that is by far outweighed by the ease of use
>>> of the new feature, therefore I am willing to bear the burden of
>>> implementation.
>>
>> I again encourage you to consider skipping the implementation of
>> command-line options entirely. It's not like users are going to want to
>> use different options for different invocations. Let them use
>>
>>     git -c fsck.level.missingAuthor=ignore fsck
>>
>> if they really want to play around, then
>>
>>     git config fsck.level.missingAuthor ignore
>>
>> to make it permanent. After that they will never have to worry about
>> that option again.
> 
> Unfortunately, I have to pass the `receive.fsck.*` settings from
> `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
> command-line, because it is `git-receive-pack` that consumes the config
> setting, but it is one of `git-unpack-objects` and `git-index-pack` that
> has to act on it...

Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
mechanism? If I run

    git -c foo.bar=baz $CMD

, then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
"'foo.bar=baz'", which causes child processes to treat that value as a
configuration setting. I don't have a lot of experience with this but I
think it should do what you need.
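
To make the format concrete, here is a minimal sketch that pulls one
setting out of a string such as "'foo.bar=baz'". It assumes that entries
are single-quoted and separated by spaces, ignores the escaping rules
for embedded quotes, compares keys exactly, and returns the first match.
`find_config_param()` is a hypothetical helper, not a git function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up key in a string of space-separated, single-quoted
 * "key=value" entries. On success the value is copied into out and 0
 * is returned; otherwise -1 is returned.
 */
int find_config_param(const char *params, const char *key,
		      char *out, size_t outlen)
{
	size_t keylen = strlen(key);
	const char *p = params;

	while ((p = strchr(p, '\'')) != NULL) {
		const char *start = p + 1;
		const char *end = strchr(start, '\'');

		if (!end)
			return -1;	/* unterminated quote */
		if ((size_t)(end - start) > keylen &&
		    !strncmp(start, key, keylen) && start[keylen] == '=') {
			size_t vallen = (size_t)(end - start) - keylen - 1;

			if (vallen >= outlen)
				return -1;	/* value does not fit */
			memcpy(out, start + keylen + 1, vallen);
			out[vallen] = '\0';
			return 0;
		}
		p = end + 1;	/* continue after the closing quote */
	}
	return -1;
}
```

A child process could call this on the result of
`getenv("GIT_CONFIG_PARAMETERS")` to see the settings its parent was
given.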

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-03 15:11           ` Michael Haggerty
@ 2015-02-03 16:33             ` Johannes Schindelin
  2015-02-04  3:50               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-02-03 16:33 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: gitster, git, peff

Hi Michael,

On 2015-02-03 16:11, Michael Haggerty wrote:
> On 02/02/2015 05:48 PM, Johannes Schindelin wrote:
>> On 2015-02-02 13:43, Michael Haggerty wrote:
>>> On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
>>>> Hi all (in particular Junio),
>>>>
>>>> On 2015-01-31 22:04, Johannes Schindelin wrote:
>>>>
>>>>> [...] switch to fsck.severity to address Michael's concerns that
>>>>> letting fsck.(error|warn|ignore)'s comma-separated lists possibly
>>>>> overriding each other partially;
>>>>
>>>> Having participated in the CodingStyle thread, I came to the
>>>> conclusion that the fsck.severity solution favors syntax over
>>>> intuitiveness.
>>>>
>>>> Therefore, I would like to support the case for
>>>> `fsck.level.missingAuthor` (note that there is an extra ".level." in
>>>> contrast to earlier suggestions).
>>>
>>> Why "level"?
>>
>> "Severity level", or "error level". Maybe ".severity." would be better?
> 
> Sorry, I should have been clearer. I understand why the word "level"
> makes sense, as opposed to, say, "peanut-butter". What I don't
> understand is why a middle word is needed at all. In the config file it
> will look like
> 
> [fsck "level"]
>         missingAuthor = error
> 
> , which looks funny. "level" is a constant, so it seems superfluous.
> 
> If anything, it might be more useful to allow an optional middle word to
> allow the strictness level to be adjusted based on which command
> encounters the problem. For example, if you want to tolerate existing
> commits that have missing authors, but not allow any new ones to be
> pushed, you could set
> 
> [strictness]
>         missingAuthor = ignore
> [strictness "receive-pack"]
>         missingAuthor = error
> 
> (There's probably a better word than "strictness", but you get the idea.)

Ah. Well, the idea of the middle constant is to separate the severity levels from all other fsck (or receive.fsck) settings. The 'fsck.skiplist' setting that I introduce in this patch series, for example, looks pretty much the same as 'fsck.missingauthor', but they have different roles.

This becomes important when I want to catch obvious problems such as a misspelled 'fsck.missingautor': with an extra '.level', I can be certain that an unknown key is a typo rather than a config setting unrelated to the severity levels.
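
A minimal sketch of that check, assuming keys of the form `fsck.level.<id>` and a small table of known IDs. `classify_key()`, `enum key_kind` and the contents of `known_msg_ids` are hypothetical and only serve this example; `strcasecmp()` is the POSIX function from `<strings.h>`.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Hypothetical list of known message IDs, for illustration only. */
static const char *known_msg_ids[] = {
	"missingAuthor", "missingEmail", "invalidEmail"
};

enum key_kind { KEY_UNRELATED, KEY_KNOWN_LEVEL, KEY_UNKNOWN_LEVEL };

/*
 * Anything below "fsck.level." must name a known message ID; if it
 * does not, it can be reported as a typo. Keys outside that namespace
 * (e.g. "fsck.skiplist") are left to other code.
 */
enum key_kind classify_key(const char *var)
{
	static const char prefix[] = "fsck.level.";
	size_t i, prefix_len = sizeof(prefix) - 1;

	if (strncmp(var, prefix, prefix_len))
		return KEY_UNRELATED;
	for (i = 0; i < sizeof(known_msg_ids) / sizeof(*known_msg_ids); i++)
		if (!strcasecmp(var + prefix_len, known_msg_ids[i]))
			return KEY_KNOWN_LEVEL;
	return KEY_UNKNOWN_LEVEL;
}
```

Without the middle constant, 'fsck.missingautor' would be indistinguishable from an unrelated setting such as 'fsck.skiplist'.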

>>>> The benefits:
>>>>
>>>> - it is very, very easy to understand
>>>>
>>>> - cumulative settings are intuitively cumulative, i.e. setting
>>>> `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
>>>> completely unaffected
>>>>
>>>> - it is very easy to enquire and set the levels via existing `git
>>>> config` calls
>>>>
>>>> Now, there is one downside, but *only* if we ignore Postel's law.
>>>>
>>>> Postel's law ("be lenient in what you accept as input, but strict in
>>>> your output") would dictate that our message ID parser accept both
>>>> "missing-author" and "missingAuthor" if we follow the inconsistent
>>>> practice of using lowercase-dashed keys on the command-line but
>>>> CamelCased ones in the config.
>>>>
>>>> However, earlier Junio made very clear that the parser is required to
>>>> fail to parse "missing-author" in the config, and to fail to parse
>>>> "missingAuthor" on the command-line.
>>>>
>>>> Therefore, the design I recommend above will require two, minimally
>>>> different parsers for essentially the same thing.
>>>>
>>>> IMHO this is a downside that is by far outweighed by the ease of use
>>>> of the new feature, therefore I am willing to bear the burden of
>>>> implementation.
>>>
>>> I again encourage you to consider skipping the implementation of
>>> command-line options entirely. It's not like users are going to want to
>>> use different options for different invocations. Let them use
>>>
>>>     git -c fsck.level.missingAuthor=ignore fsck
>>>
>>> if they really want to play around, then
>>>
>>>     git config fsck.level.missingAuthor ignore
>>>
>>> to make it permanent. After that they will never have to worry about
>>> that option again.
>>
>> Unfortunately, I have to pass the `receive.fsck.*` settings from
>> `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
>> command-line, because it is `git-receive-pack` that consumes the config
>> setting, but it is one of `git-unpack-objects` and `git-index-pack` that
>> has to act on it...
> 
> Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
> mechanism? If I run
> 
>     git -c foo.bar=baz $CMD
> 
> , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
> "'foo.bar=baz'", which causes child processes to treat that value as a
> configuration setting. I don't have a lot of experience with this but I
> think it should do what you need.

This is true, but please remember that the receive.fsck.* settings should be heeded by index-pack/unpack-objects *only* if one of the latter programs is called by receive-pack. It would therefore be a little funny (or wrong, depending on your point of view) if, say, index-pack respected the receive.fsck.* settings when called directly.

That is why receive-pack adds a `--strict` command-line option when receive.fsckobjects is set to true instead of letting index-pack (or unpack-objects) look at the config variable receive.fsckobjects itself.

In the same spirit, I extend the `--strict` command-line option with an optional list of severity level overrides (e.g. `--strict=missing-author=ignore,...`) if receive-pack was configured to override those levels.

Now, as this optional argument is intended for internal use only, we could declare that it is okay to pass CamelCased stuff there, even if it disagrees with our command-line option conventions. If it is not okay, I will have to write a rewriter that consumes the CamelCased config settings and rewrites them as lowercase-dashed settings, to be passed to index-pack or unpack-objects, which in turn parses the lowercase-dashed settings.

It might be seen as unnecessarily complicated, which is why I argued in favor of letting the parser accept both forms (which incidentally would heed Postel's law, too). But I understand if Junio does not want that, because it technically violates the conventions he wants to see established. I just need a definitive statement which way to go so I can implement it.
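
For illustration only, the rewriter mentioned above could look roughly like the following sketch. The name `camel_to_dashed()` and its buffer-based interface are hypothetical and not part of the patch series; the sketch merely assumes that every upper-case ASCII letter after the first character starts a new word.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the series): rewrite a camelCased
 * message ID such as "missingAuthor" as "missing-author".
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int camel_to_dashed(const char *in, char *out, size_t outlen)
{
	size_t n = 0;

	if (!outlen)
		return -1;

	for (; *in; in++) {
		unsigned char c = (unsigned char)*in;

		if (isupper(c)) {
			/* an upper-case letter starts a new word, except at the start */
			if (n > 0) {
				if (n + 1 >= outlen)
					return -1;
				out[n++] = '-';
			}
			c = (unsigned char)tolower(c);
		}
		/* always leave room for the terminating NUL */
		if (n + 1 >= outlen)
			return -1;
		out[n++] = (char)c;
	}
	out[n] = '\0';
	return 0;
}
```

Letting the parser accept both forms would make such a helper unnecessary, which is the trade-off discussed above.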

Ciao,
Dscho


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-03 16:33             ` Johannes Schindelin
@ 2015-02-04  3:50               ` Junio C Hamano
  2015-02-04 11:02                 ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-02-04  3:50 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Michael Haggerty, git, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>> [fsck "level"]
>>         missingAuthor = error
>> 
>> , which looks funny. "level" is a constant, so it seems superfluous.

Yes, it is superfluous, but is one way to avoid the ambiguity with
"skiplist".  Structuring it like this would not be so bad, either,
though.

	[fsck]
        	error = missingAuthor[, other kinds of errors...]

A small set like {ignore, warn, error} is easily maintainable not to
conflict with "skiplist" and others.

So that "avoid ambiguity with skiplist" does not favor either choice
in any significant way.

> This becomes important when I want to catch obvious problems such as
> fsck.missingautor': if I have an extra '.level', I can be certain that
> it is a typo rather than a config setting unrelated to the severity
> levels.

"[fsck] error = missingAutor" would let you catch the typo in a similar
way with the same context clue, so this does not decide which is
better, either.

One clear benefit I can see in it is that you can do

	git config fsck.level.missingAuthor

in scripts that want to learn the current setting for a single
variable.  With "fsck.error=missingAuthor[,other kinds]", you would
instead have to do a bit more silly post-processing

	git config -l | sed -ne '
		/^fsck\./{
			# make it "var=,token1,token2,token3,"
			s/=/=,/
			s/$/,/
			s/[ 	]*//g
			s|^fsck\.\([^=]*\)=.*,missingAuthor,.*|\1|p
		}
	' | tail -n 1

to grab the last fsck.{error,ignore,...}= thing that has the token
(I personally do not think the latter is so bad, though).

I wonder if

	[fsckError]
        	missingAuthor = error
                missingTagger = warn

wouldn't be a better way, though.  We'd keep the easier scripting

	git config fsckError.missingTagger

There is nothing that says that the top-level grouping must match
the Git subcommand name.  Nothing says that one Git subcommand can
own at most one namespace, either.  Nothing stops us from reserving
fsckError top-level namespace for variable name collision avoidance
with other fsck.* variables, if that gives us a better system.

>>> Unfortunately, I have to pass the `receive.fsck.*` settings from
>>> `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
>>> command-line, because it is `git-receive-pack` that consumes the config
>>> setting, but it is one of `git-unpack-objects` and `git-index-pack` that
>>> has to act on it...

But receive-pack at some point decides what, if anything, needs to
be passed when invoking unpack-objects, or index-pack, no?  Why is
it hard to pass "-c var=val" at the beginning where it would have
passed "--strictness=var=val" at the end?

>> Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
>> mechanism? If I run
>> 
>>     git -c foo.bar=baz $CMD
>> 
>> , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
>> "'foo.bar=baz'", which causes child processes to treat that value as a
>> configuration setting. I don't have a lot of experience with this but I
>> think it should do what you need.
>
> This is true, but please remember that the receive.fsck.* settings
> should be heeded by index-pack/unpack-objects *only* if one of the
> latter programs is called by receive-pack. It would therefore be a
> little funny (or wrong, depending on your point of view) if, say,
> index-pack would respect the receive.fsck.* settings.

That means it would be fine if receive-pack invokes (when it sees
receive.fsck.severity=missingAuthor=error,missingTagger=warn config
meant for it and was told with receive.fsckObjects to check the
incoming objects) a command line like this:

	git -c fsckError.missingAuthor=error \
	    -c fsckError.missingTagger=warn \
	    index-pack $args...

(or whatever variable names and name structure we settle on).  And
the index-pack command does not have to even know there are
receive.fsck.* variables at all, no?

Another way to do that may be for receive-pack to invoke

	git index-pack --use-fsck-severity=receive.fsck $args...

to instruct it to look at receive.fsck.* variables, again when and
only when receive-pack wants to do so.  I think either way would be
fine, as this communication is an internal implementation detail
between receive-pack and index-pack and is not meant to be exposed
to the end users anyway.


* Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery
  2015-02-04  3:50               ` Junio C Hamano
@ 2015-02-04 11:02                 ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-02-04 11:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, peff

Hi Junio,

On 2015-02-04 04:50, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>>> [fsck "level"]
>>>         missingAuthor = error
>>>
>>> , which looks funny. "level" is a constant, so it seems superfluous.
> 
> Yes, it is superfluous, but is one way to avoid the ambiguity with
> "skiplist".  Structuring it like this would not be so bad, either,
> though.
> 
> 	[fsck]
>         	error = missingAuthor[, other kinds of errors...]
> 
> A small set like {ignore, warn, error} is easily maintainable not to
> conflict with "skiplist" and others.

But you augmented the case against the {ignore, warn, error} structure with this yourself:

> With "fsck.error=missingAuthor[,other kinds]", you would
> instead have to do a bit more silly post-processing
> 
> 	git config -l | sed -ne '
>         	/^fsck\./{
> 			# make it "var=,token1,token2,token3,"
> 			s/=/=,/
>                         s/$/,/
>                         s/[ 	]*//g
> 			s|^fsck\.\([^=]*\)=.*,missingAuthor,.*|\1|p
> 		}
> 	' | tail -n 1

If a script to determine the current state of affairs is already convoluted, how complicated would it be for users of this feature? No, let's go with Michael's suggestion and have a much more elegant (read: easy to grasp) configuration.

I see that the extra ".level." is universally frowned on, and therefore am convinced that it would be bad to add it.

As to the conflict with "skiplist", by your own argument ("small set ... is easily maintainable") it is not a problem at all.

>> This becomes important when I want to catch obvious problems such as
>> fsck.missingautor': if I have an extra '.level', I can be certain that
>> it is a typo rather than a config setting unrelated to the severity
>> levels.
> 
> "[fsck] error = missingAutor" would let you catch the typo in a similar
> way with the same context clue, so this does not decide which is
> better, either.

No, you are correct, this does not decide it. What decides that `fsck.missingAuthor = error` is better than `fsck.error = missingAuthor` is that with the former, the user does not need to wrap her head around the significance of the order of mutually overriding `error`, `warn` and `ignore` settings.

As to the typo handling, you are absolutely correct that it is easy to handle "skiplist" first and then expect all other fsck.* settings (or receive.fsck.* settings) to match a message ID. To make things even smoother, I think it would make most sense to only *warn* about typos instead of *erroring out*.
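
The handling order described above (check "skiplist" first, then expect everything else to be a message ID, and only warn about unknown keys) can be sketched as follows. This is a minimal illustration, not the code from the series: `classify_fsck_key()`, `enum key_kind` and the short `known_ids` table are assumptions made for this example, and `strcasecmp()` is assumed to be available from POSIX `<strings.h>`.

```c
#include <stdio.h>
#include <strings.h>	/* strcasecmp(), POSIX */

enum key_kind { KEY_SKIPLIST, KEY_MSG_ID, KEY_UNKNOWN };

/* Illustrative subset of message IDs; the real list lives in fsck.c. */
static const char *known_ids[] = {
	"missingAuthor", "missingEmail", "missingTaggerEntry",
	"multipleAuthors", "invalidTagName", NULL
};

/*
 * Classify the part of a config key that follows "fsck." (or
 * "receive.fsck."). Config keys are matched case-insensitively.
 */
enum key_kind classify_fsck_key(const char *key)
{
	int i;

	/* "skiplist" is handled first, so it can never be mistaken for an ID */
	if (!strcasecmp(key, "skiplist"))
		return KEY_SKIPLIST;

	for (i = 0; known_ids[i]; i++)
		if (!strcasecmp(key, known_ids[i]))
			return KEY_MSG_ID;

	/* probably a typo: warn instead of erroring out */
	fprintf(stderr, "warning: Skipping unknown msg id '%s'\n", key);
	return KEY_UNKNOWN;
}
```

With this ordering, a misspelling such as `missingautor` yields a warning and is otherwise ignored, while `skiplist` never reaches the message ID lookup.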

> I wonder if
> 
> 	[fsckError]
>         	missingAuthor = error
>                 missingTagger = warn

Hmm. I am not so sure that `fsckError` is the correct term. After all, we can also upgrade warnings to errors, not only demote errors to warnings. So I guess the `fsck.*` naming scheme is not so bad after all!

>>>> Unfortunately, I have to pass the `receive.fsck.*` settings from
>>>> `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
>>>> command-line, because it is `git-receive-pack` that consumes the config
>>>> setting, but it is one of `git-unpack-objects` and `git-index-pack` that
>>>> has to act on it...
> 
> But receive-pack at some point decides what, if anything, needs to
> be passed when invoking unpack-objects, or index-pack, no?  Why is
> it hard to pass "-c var=val" at the beginning where it would have
> passed "--strictness=var=val" at the end?

It is not hard, it is just inelegant: 1) we would have to introduce a new function to handle the config settings in `unpack-objects`, 2) the command-line would be longer (for what benefit?), and 3) receive-pack uses run_command() to launch the children [*1*] and wouldn't you find it super-ugly if that argv started with a "-c", "..."? I would.

We already have command-line handling in both index-pack and unpack-objects (and config handling only in the former), and we already have --strict handling in particular. Arguably, the severity levels are so tightly connected to that --strict option that, at least in my opinion, it makes sense to treat them as optional parameters to that command-line option.

So in short, I maintain that passing the options via the config mechanism would just make things more complicated, and in no way better.

>>> Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
>>> mechanism? If I run
>>>
>>>     git -c foo.bar=baz $CMD
>>>
>>> , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
>>> "'foo.bar=baz'", which causes child processes to treat that value as a
>>> configuration setting. I don't have a lot of experience with this but I
>>> think it should do what you need.
>>
>> This is true, but please remember that the receive.fsck.* settings
>> should be heeded by index-pack/unpack-objects *only* if one of the
>> latter programs is called by receive-pack. It would therefore be a
>> little funny (or wrong, depending on your point of view) if, say,
>> index-pack would respect the receive.fsck.* settings.
> 
> That means it would be fine if receive-pack invokes (when it sees
> receive.fsck.severity=missingAuthor=error,missingTagger=warn config
> meant for it and was told with receive.fsckObjects to check the
> incoming objects) a command line like this:
> 
> 	git -c fsckError.missingAuthor=error \
>             -c fsckError.missingTagger=warn \
> 		index-pack $args...
> 
> (or whatever variable names and name structure we settle on).  And
> the index-pack command does not have to even know there are
> receive.fsck.* variables at all, no?

It does not know that at all already, so yes: I agree. It does not even need to know that those settings were config variables. Even better: it should not let config settings interfere with the settings receive-pack wants index-pack to use.

So there you have it, another reason why passing the levels via command-line options is conceptually much more sound than passing them via fake config settings.

> I think either way would be fine, as this communication is an internal implementation detail between receive-pack and index-pack and is not meant to be exposed to the end users anyway.

Yeah, I think we have really spent enough time discussing something as unimportant as this detail.

Just to make sure: are you okay with passing CamelCased settings (e.g. `--strict=missingAuthor=warn,...`) as part of that internal communication between receive-pack and index-pack/unpack-objects? That is my preference by now.
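
As a rough sketch of the internal hand-off asked about above, receive-pack could assemble the option like this. The function `build_strict_arg()` and `struct override` are hypothetical names used only for this illustration; real code would more likely append to a strbuf. The first override is introduced with '=', every later one with ','.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical pair of message ID and message type, e.g. { "missingAuthor", "warn" }. */
struct override {
	const char *id;
	const char *type;
};

/*
 * Build "--strict" or "--strict=<id>=<type>,<id>=<type>,..." in buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int build_strict_arg(char *buf, size_t buflen,
		     const struct override *ov, size_t nr)
{
	size_t used, i;
	int n = snprintf(buf, buflen, "--strict");

	if (n < 0 || (size_t)n >= buflen)
		return -1;
	used = (size_t)n;

	for (i = 0; i < nr; i++) {
		/* '=' introduces the list, ',' separates later entries */
		n = snprintf(buf + used, buflen - used, "%c%s=%s",
			     i ? ',' : '=', ov[i].id, ov[i].type);
		if (n < 0 || (size_t)n >= buflen - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Without any overrides the result is a plain `--strict`, so the existing behavior stays unchanged when no `receive.fsck.*` levels are configured.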

Ciao,
Dscho

Footnote *1*: https://github.com/msysgit/git/blob/c47d6ec67188cec2a782bc245aa7df4e3cbdbc01/builtin/receive-pack.c#L992-L1001


* [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
  2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                       ` (19 preceding siblings ...)
  2015-02-02 11:41     ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
@ 2015-06-18 20:07     ` Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 01/19] fsck: Introduce fsck options Johannes Schindelin
                         ` (20 more replies)
  20 siblings, 21 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:07 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

At the moment, the git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Changes since v4 (sorry for the long delay):

- the config settings' convention changed as discussed. Example:
  `fsck.severity = missing-author=warn` is now
  `fsck.missingAuthor = warn` (or `fsck.missingauthor = warn` because
  config settings are traditionally case-insensitive). As a consequence,
  the command-line parameter passed on to `index-pack` and
  `unpack-objects` also uses camelCased values.

- previously, we errored out when encountering an unknown message id;
  now we warn instead.

- I now use `msg_type` consistently where I used `severity` before because
  it appears clearer to me.

- the skiplist handling is now done only in the error case, for enhanced
  performance. While at it, I fixed a potential segmentation fault that
  occurred when a NULL object was dereferenced in order to look it up in
  the skiplist.
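
To illustrate the first item: with the new convention, a user-supplied ID such as `missingTaggerEntry` has to match the internal upper-case, underscore-separated identifier `MISSING_TAGGER_ENTRY` case-insensitively and without the underscores. The stand-alone sketch below shows one way to do that comparison; `msg_id_matches()` is a hypothetical name and the code is a simplification, not the `parse_msg_id()` from the interdiff.

```c
#include <ctype.h>

/*
 * Return 1 if `text` (e.g. "missingTaggerEntry") names the internal
 * identifier `id` (e.g. "MISSING_TAGGER_ENTRY"), 0 otherwise.
 * `id` is assumed to consist of upper-case ASCII letters and '_'.
 */
int msg_id_matches(const char *id, const char *text)
{
	while (*id) {
		if (*id == '_') {
			/* underscores have no counterpart in the user's spelling */
			id++;
			continue;
		}
		if (toupper((unsigned char)*text) != *id)
			return 0;
		id++;
		text++;
	}
	/* both strings must be exhausted for a match */
	return !*text;
}
```

Under this scheme the old dashed spelling (`missing-tagger-entry`) no longer matches, which is consistent with the test updates in the interdiff.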

Interdiff below the diffstat. It's huge. Sorry.

Johannes Schindelin (19):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck (receive-pack): Allow demoting errors to warnings
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.<msg-id>
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.<msg-id> options
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --quick`
  fsck: git receive-pack: support excluding objects from fsck'ing
  fsck: support ignoring objects in `git fsck` via fsck.skiplist

 Documentation/config.txt        |  39 +++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  75 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  25 +-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 553 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  31 ++-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  51 ++++
 11 files changed, 686 insertions(+), 163 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b5b1a22..5aba63a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1250,14 +1250,13 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
-fsck.severity::
-	A comma-separated lists of of the form `<id>=<level>` where `<id>`
-	denotes a fsck message ID such as `missing-email` and `<level>` is
-	one of `error`, `warn` and `ignore`.
+fsck.<msg-id>::
+	Allows overriding the message type (error, warn or ignore) of a
+	specific message ID such as `missingemail`.
 +
 For convenience, fsck prefixes the error/warning with the message ID,
-e.g.  "missing-email: invalid author/committer line - missing email" means
-that setting `fsck.severity = missing-email=ignore` will hide that issue.
+e.g.  "missingemail: invalid author/committer line - missing email" means
+that setting `fsck.missingemail = ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
@@ -2224,15 +2223,14 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
-receive.fsck.severity::
+receive.fsck.<msg-id>::
 	When `receive.fsckObjects` is set to true, errors can be switched
-	to warnings and vice versa by configuring the `receive.fsck.severity`
-	setting. These settings contain comma-separated lists of the form
-	`<id>=<level>` where the `<id>` is the fsck message ID and the level
+	to warnings and vice versa by configuring the `receive.fsck.<msg-id>`
+	setting where the `<msg-id>` is the fsck message ID and the value
 	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
-	the error/warning with the message ID, e.g. "missing-email: invalid
+	the error/warning with the message ID, e.g. "missingemail: invalid
 	author/committer line - missing email" means that setting
-	`receive.fsck.severity = missing-email=ignore` will hide that issue.
+	`receive.fsck.missingemail = ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which would not pass pushing when `receive.fsckObjects = true`, allowing
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 32e476a..ce538ac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -49,8 +49,8 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
-	if (strcmp(var, "fsck.severity") == 0) {
-		fsck_set_severity(&fsck_obj_options, value);
+	if (skip_prefix(var, "fsck.", &var)) {
+		fsck_set_msg_type(&fsck_obj_options, var, -1, value, -1);
 		return 0;
 	}
 
@@ -59,7 +59,7 @@ static int fsck_config(const char *var, const char *value, void *cb)
 			value : git_path("%s", value);
 		struct strbuf sb = STRBUF_INIT;
 		strbuf_addf(&sb, "skiplist=%s", path);
-		fsck_set_severity(&fsck_obj_options, sb.buf);
+		fsck_set_msg_types(&fsck_obj_options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
 	}
@@ -67,11 +67,11 @@ static int fsck_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
-static void objreport(struct object *obj, const char *severity,
-                      const char *err)
+static void objreport(struct object *obj, const char *msg_type,
+			const char *err)
 {
 	fprintf(stderr, "%s in %s %s: %s\n",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
+		msg_type, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
 static int objerror(struct object *obj, const char *err)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 94c64ab..98e14fe 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1636,7 +1636,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (skip_prefix(arg, "--strict=", &arg)) {
 				strict = 1;
 				do_fsck_object = 1;
-				fsck_set_severity(&fsck_options, arg);
+				fsck_set_msg_types(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4cbeb14..80574f9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -19,6 +19,7 @@
 #include "tag.h"
 #include "gpg-interface.h"
 #include "sigchain.h"
+#include "fsck.h"
 
 static const char receive_pack_usage[] = "git receive-pack <git-dir>";
 
@@ -36,7 +37,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
-static struct strbuf fsck_severity = STRBUF_INIT;
+static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -116,17 +117,20 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (strcmp(var, "receive.fsck.severity") == 0) {
-		strbuf_addf(&fsck_severity, "%c%s",
-			fsck_severity.len ? ',' : '=', value);
-		return 0;
-	}
-
 	if (strcmp(var, "receive.fsck.skiplist") == 0) {
 		const char *path = is_absolute_path(value) ?
 			value : git_path("%s", value);
-		strbuf_addf(&fsck_severity, "%cskiplist=%s",
-			fsck_severity.len ? ',' : '=', path);
+		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
+			fsck_msg_types.len ? ',' : '=', path);
+		return 0;
+	}
+
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		if (is_valid_msg_type(var, value))
+			strbuf_addf(&fsck_msg_types, "%c%s=%s",
+				fsck_msg_types.len ? ',' : '=', var, value);
+		else
+			warning("Skipping unknown msg id '%s'", var);
 		return 0;
 	}
 
@@ -1506,7 +1510,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
 			argv_array_pushf(&child.args, "--strict%s",
-				fsck_severity.buf);
+				fsck_msg_types.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1525,7 +1529,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
 			argv_array_pushf(&child.args, "--strict%s",
-				fsck_severity.buf);
+				fsck_msg_types.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index fe9117c..7cc086f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -532,7 +532,7 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 			}
 			if (skip_prefix(arg, "--strict=", &arg)) {
 				strict = 1;
-				fsck_set_severity(&fsck_options, arg);
+				fsck_set_msg_types(&fsck_options, arg);
 				continue;
 			}
 			if (starts_with(arg, "--pack_header=")) {
diff --git a/fsck.c b/fsck.c
index 046af02..9b8981e 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,7 +63,7 @@
 	FUNC(INVALID_TAG_NAME, INFO) \
 	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
-#define MSG_ID(id, severity) FSCK_MSG_##id,
+#define MSG_ID(id, msg_type) FSCK_MSG_##id,
 enum fsck_msg_id {
 	FOREACH_MSG_ID(MSG_ID)
 	FSCK_MSG_MAX
@@ -71,10 +71,10 @@ enum fsck_msg_id {
 #undef MSG_ID
 
 #define STR(x) #x
-#define MSG_ID(id, severity) { STR(id), FSCK_##severity },
+#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
 static struct {
 	const char *id_string;
-	int severity;
+	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
 	{ NULL, -1 }
@@ -85,39 +85,42 @@ static int parse_msg_id(const char *text, int len)
 {
 	int i, j;
 
+	if (len < 0)
+		len = strlen(text);
+
 	for (i = 0; i < FSCK_MSG_MAX; i++) {
 		const char *key = msg_id_info[i].id_string;
-		/* id_string is upper-case, with underscores */
+		/* match id_string case-insensitively, without underscores. */
 		for (j = 0; j < len; j++) {
 			char c = *(key++);
 			if (c == '_')
-				c = '-';
-			if (text[j] != tolower(c))
+				c = *(key++);
+			if (toupper(text[j]) != c)
 				break;
 		}
 		if (j == len && !*key)
 			return i;
 	}
 
-	die("Unhandled message id: %.*s", len, text);
+	return -1;
 }
 
-static int fsck_msg_severity(enum fsck_msg_id msg_id,
+static int fsck_msg_type(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-	int severity;
+	int msg_type;
 
 	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
 
-	if (options->msg_severity)
-		severity = options->msg_severity[msg_id];
+	if (options->msg_type)
+		msg_type = options->msg_type[msg_id];
 	else {
-		severity = msg_id_info[msg_id].severity;
-		if (options->strict && severity == FSCK_WARN)
-			severity = FSCK_ERROR;
+		msg_type = msg_id_info[msg_id].msg_type;
+		if (options->strict && msg_type == FSCK_WARN)
+			msg_type = FSCK_ERROR;
 	}
 
-	return severity;
+	return msg_type;
 }
 
 static void init_skiplist(struct fsck_options *options, const char *path)
@@ -165,66 +168,87 @@ static inline int substrcmp(const char *string, int len, const char *match)
 	return memcmp(string, match, len);
 }
 
-void fsck_set_severity(struct fsck_options *options, const char *mode)
+static int parse_msg_type(const char *str, int len)
+{
+	if (len < 0)
+		len = strlen(str);
+
+	if (!substrcmp(str, len, "error"))
+		return FSCK_ERROR;
+	else if (!substrcmp(str, len, "warn"))
+		return FSCK_WARN;
+	else if (!substrcmp(str, len, "ignore"))
+		return FSCK_IGNORE;
+	else
+		die("Unknown fsck message type: '%.*s'",
+				len, str);
+}
+
+int is_valid_msg_type(const char *msg_id, const char *msg_type)
 {
-	int severity = FSCK_ERROR;
+	if (parse_msg_id(msg_id, -1) < 0)
+		return 0;
+	parse_msg_type(msg_type, -1);
+	return 1;
+}
+
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len)
+{
+	int id = parse_msg_id(msg_id, msg_id_len), type;
+
+	if (id < 0)
+		die("Unhandled message id: %.*s", msg_id_len, msg_id);
+	type = parse_msg_type(msg_type, msg_type_len);
 
-	if (!options->msg_severity) {
+	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
+		die("Cannot demote %.*s to %.*s", msg_id_len, msg_id,
+				msg_type_len, msg_type);
+
+	if (!options->msg_type) {
 		int i;
-		int *msg_severity = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
-			msg_severity[i] = fsck_msg_severity(i, options);
-		options->msg_severity = msg_severity;
+			msg_type[i] = fsck_msg_type(i, options);
+		options->msg_type = msg_type;
 	}
 
-	while (*mode) {
-		int len = strcspn(mode, " ,|"), equal, msg_id;
+	options->msg_type[id] = type;
+}
+
+void fsck_set_msg_types(struct fsck_options *options, const char *values)
+{
+	while (*values) {
+		int len = strcspn(values, " ,|"), equal;
 
 		if (!len) {
-			mode++;
+			values++;
 			continue;
 		}
 
 		for (equal = 0; equal < len; equal++)
-			if (mode[equal] == '=' || mode[equal] == ':')
+			if (values[equal] == '=' || values[equal] == ':')
 				break;
 
-		if (!substrcmp(mode, equal, "skiplist")) {
-			char *path = xstrndup(mode + equal + 1,
+		if (!substrcmp(values, equal, "skiplist")) {
+			char *path = xstrndup(values + equal + 1,
 				len - equal - 1);
 
 			if (equal == len)
 				die("skiplist requires a path");
 			init_skiplist(options, path);
 			free(path);
-			mode += len;
+			values += len;
 			continue;
 		}
 
-		msg_id = parse_msg_id(mode, equal);
-
 		if (equal == len)
-			severity = FSCK_ERROR;
-		else {
-			const char *p = mode + equal + 1;
-			int len2 = len - equal - 1;
-
-			if (!substrcmp(p, len2, "error"))
-				severity = FSCK_ERROR;
-			else if (!substrcmp(p, len2, "warn"))
-				severity = FSCK_WARN;
-			else if (!substrcmp(p, len2, "ignore"))
-				severity = FSCK_IGNORE;
-			else
-				die("Unknown fsck message severity: '%.*s'",
-					len2, p);
-		}
+			die("Missing '=': '%.*s'", len, values);
 
-		if (severity != FSCK_ERROR &&
-				msg_id_info[msg_id].severity == FSCK_FATAL)
-			die("Cannot demote %.*s", len, mode);
-		options->msg_severity[msg_id] = severity;
-		mode += len;
+		fsck_set_msg_type(options, values, equal,
+				values + equal + 1, len - equal - 1);
+		values += len;
 	}
 }
 
@@ -235,11 +259,8 @@ static void append_msg_id(struct strbuf *sb, const char *msg_id)
 
 		if (!c)
 			break;
-		if (c == '_')
-			c = '-';
-		else
-			c = tolower(c);
-		strbuf_addch(sb, c);
+		if (c != '_')
+			strbuf_addch(sb, tolower(c));
 	}
 
 	strbuf_addstr(sb, ": ");
@@ -251,21 +272,25 @@ static int report(struct fsck_options *options, struct object *object,
 {
 	va_list ap;
 	struct strbuf sb = STRBUF_INIT;
-	int msg_severity = fsck_msg_severity(id, options), result;
+	int msg_type = fsck_msg_type(id, options), result;
+
+	if (msg_type == FSCK_IGNORE)
+		return 0;
 
-	if (msg_severity == FSCK_IGNORE)
+	if (options->skiplist && object &&
+			sha1_array_lookup(options->skiplist, object->sha1) >= 0)
 		return 0;
 
-	if (msg_severity == FSCK_FATAL)
-		msg_severity = FSCK_ERROR;
-	else if (msg_severity == FSCK_INFO)
-		msg_severity = FSCK_WARN;
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-	result = options->error_func(object, msg_severity, sb.buf);
+	result = options->error_func(object, msg_type, sb.buf);
 	strbuf_release(&sb);
 	va_end(ap);
 
@@ -756,10 +781,6 @@ static int fsck_tag(struct tag *tag, const char *data,
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
-	if (options->skiplist &&
-			sha1_array_lookup(options->skiplist, obj->sha1) >= 0)
-		return 0;
-
 	if (!obj)
 		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
@@ -778,9 +799,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int severity, const char *message)
+int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
-	if (severity == FSCK_WARN) {
+	if (msg_type == FSCK_WARN) {
 		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
 		return 0;
 	}
diff --git a/fsck.h b/fsck.h
index cae280e..cab9c65 100644
--- a/fsck.h
+++ b/fsck.h
@@ -7,7 +7,11 @@
 
 struct fsck_options;
 
-void fsck_set_severity(struct fsck_options *options, const char *mode);
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len);
+void fsck_set_msg_types(struct fsck_options *options, const char *values);
+int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
 /*
  * callback function for fsck_walk
@@ -28,7 +32,7 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
-	int *msg_severity;
+	int *msg_type;
 	struct sha1_array *skiplist;
 };
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index b32afaf..471e2ea 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid-tag-name: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: missing-tagger-entry: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalidtagname: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
@@ -295,7 +295,7 @@ test_expect_success 'force fsck to ignore double author' '
 	git update-ref refs/heads/bogus "$new" &&
 	test_when_finished "git update-ref -d refs/heads/bogus" &&
 	test_must_fail git fsck &&
-	git -c fsck.severity=multiple-authors=ignore fsck
+	git -c fsck.multipleauthors=ignore fsck
 '
 
 _bz='\0'
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 7881e17..1ada54c 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -124,44 +124,46 @@ This commit object intentionally broken
 EOF
 
 test_expect_success 'push with receive.fsck.skiplist' '
-	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
-	echo $commit > dst/.git/SKIP &&
+	echo $commit >dst/.git/SKIP &&
 	git push --porcelain dst bogus
 '
 
-test_expect_success 'push with receive.fsck.severity = missing-email=warn' '
-	commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+test_expect_success 'push with receive.fsck.missingemail=warn' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config \
-		receive.fsck.severity missing-email=warn &&
+		receive.fsck.missingemail warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missing-email" act &&
+	grep "missingemail" act &&
 	git --git-dir=dst/.git branch -D bogus &&
 	git  --git-dir=dst/.git config --add \
-		receive.fsck.severity missing-email=ignore,bad-date=warn &&
+		receive.fsck.missingemail ignore &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.baddate warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	test_must_fail grep "missing-email" act
+	test_must_fail grep "missingemail" act
 '
 
 test_expect_success \
-	'receive.fsck.severity = unterminated-header=warn triggers error' '
+	'receive.fsck.unterminatedheader=warn triggers error' '
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	git --git-dir=dst/.git config \
-		receive.fsck.severity unterminated-header=warn &&
+		receive.fsck.unterminatedheader warn &&
 	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
-	grep "Cannot demote unterminated-header" act
+	grep "Cannot demote unterminatedheader" act
 '
 
 test_done

-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 01/19] fsck: Introduce fsck options
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
@ 2015-06-18 20:07       ` Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                         ` (19 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:07 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We are about to introduce more fsck settings. As in the diff machinery,
it therefore makes sense to carry them around as (a pointer to) a struct
containing all of them.
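
As an illustration of the pattern only (not the code from this patch), the
following minimal sketch bundles callbacks and flags into one struct, so
that checks take a single options pointer instead of a growing parameter
list. All demo_* and DEMO_* names are hypothetical, and object names are
plain strings rather than git's struct object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

struct demo_options;

/* Callback types; the walk callback also receives the options in use. */
typedef int (*demo_walk_func)(const char *name, void *data,
			      struct demo_options *options);
typedef int (*demo_error_func)(const char *name, int type, const char *msg);

/* All settings travel together; new ones can be added without
 * touching the signature of every check. */
struct demo_options {
	demo_walk_func walk;	/* unused in this sketch */
	demo_error_func error_func;
	unsigned strict:1;
};

static int demo_reports;	/* number of messages reported so far */

/* Default reporter: warnings do not count as failures. */
static int demo_error_function(const char *name, int type, const char *msg)
{
	demo_reports++;
	fprintf(stderr, "%s in %s: %s\n",
		type == DEMO_WARN ? "warning" : "error", name, msg);
	return type == DEMO_WARN ? 0 : 1;
}

#define DEMO_OPTIONS_DEFAULT { NULL, demo_error_function, 0 }
#define DEMO_OPTIONS_STRICT { NULL, demo_error_function, 1 }

/* A check that consults both the flag and the callback via options. */
static int demo_check_mode(const char *name, unsigned mode,
			   struct demo_options *options)
{
	assert(options && options->error_func);
	if (mode == 0100664 && options->strict)
		return options->error_func(name, DEMO_WARN,
					   "contains bad file modes");
	return 0;
}
```

A caller would declare "struct demo_options opts = DEMO_OPTIONS_STRICT;"
once and pass &opts to every check; only in strict mode does the 0664
mode get reported.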

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2679793..981dca5 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static struct object_id head_oid;
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -638,6 +640,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 48fa472..87ae9ba 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -75,6 +75,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -192,7 +193,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -838,10 +839,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 02/19] fsck: Introduce identifiers for fsck messages
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-06-18 20:07       ` Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                         ` (18 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:07 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Instead of specifying whether a message by the fsck machinery constitutes
an error or a warning, let's specify an identifier relating to the
concrete problem that was encountered. This is necessary for upcoming
support to be able to demote certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
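
A rough sketch of that simplification, under the assumption of a fixed-size
buffer (the patch itself uses a strbuf): one central helper does the
varargs formatting, and callbacks only ever see a finished string. The
demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* Callbacks take a ready-made message, not a format plus va_list. */
typedef int (*demo_error_func)(int type, const char *message);

static char demo_last_message[256];

/* Example callback: remembers the message it was handed. */
static int demo_collect(int type, const char *message)
{
	snprintf(demo_last_message, sizeof(demo_last_message), "%s", message);
	return type == DEMO_WARN ? 0 : 1;
}

/* The only function that has to deal with varargs. */
static int demo_report(demo_error_func error_func, int type,
		       const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return error_func(type, buf);
}
```

Calling demo_report(demo_collect, DEMO_ERROR, "bad date: %d", 42) hands
the callback the already-formatted text, so each callback stays a plain
function of (type, message).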

We could use a simple enum for the message IDs here, but we want to
guarantee that each enum value is associated with the appropriate
message type (i.e. error or warning). Besides, the next commit
introduces a parser that maps the string representation to the enum
value, hence we use a slightly ugly preprocessor construct that can be
reused by said parser.
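
The preprocessor construct can be sketched as follows. This is a reduced
illustration with three IDs, not the table from the patch; the DEMO_*
names and the demo_parse_msg_id() helper are hypothetical stand-ins. A
single list macro is expanded twice, once into the enum and once into a
table of names and default message types, so the two cannot drift apart:

```c
#include <assert.h>
#include <string.h>

#define DEMO_ERROR 1
#define DEMO_WARN 2

/* Single source of truth: every ID is listed with its default type. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MISSING_EMAIL, ERROR) \
	FUNC(MISSING_TAGGER_ENTRY, WARN)

/* First expansion: one enum constant per message ID. */
#define DEMO_MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(DEMO_MSG_ID)
	DEMO_MSG_MAX
};
#undef DEMO_MSG_ID

/* Second expansion: a table, indexed by the enum, holding the
 * string form of each ID and its default message type. */
#define DEMO_MSG_ID(id, msg_type) { #id, DEMO_##msg_type },
static const struct {
	const char *id_string;
	int msg_type;
} demo_msg_id_info[DEMO_MSG_MAX + 1] = {
	DEMO_FOREACH_MSG_ID(DEMO_MSG_ID)
	{ NULL, -1 }
};
#undef DEMO_MSG_ID

/* Hypothetical parser: map the string form back to the enum value,
 * or return -1 if the string names no known ID. */
static int demo_parse_msg_id(const char *text)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++)
		if (!strcmp(text, demo_msg_id_info[i].id_string))
			return i;
	return -1;
}
```

Adding a message then means adding one FUNC(...) line; the enum, the
default type and the parseable name all follow from it.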

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  26 +++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 154 insertions(+), 78 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 981dca5..fff38fe 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,33 +46,23 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
-static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+static void objreport(struct object *obj, const char *msg_type,
+			const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+		msg_type, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..ed0bfc3 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(INVALID_OBJECT_SHA1, ERROR) \
+	FUNC(INVALID_TAG_OBJECT, ERROR) \
+	FUNC(INVALID_TREE, ERROR) \
+	FUNC(INVALID_TYPE, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NOT_SORTED, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(INVALID_TAG_NAME, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, msg_type) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+static struct {
+	int msg_type;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_type(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int msg_type;
+
+	msg_type = msg_id_info[msg_id].msg_type;
+	if (options->strict && msg_type == FSCK_WARN)
+		msg_type = FSCK_ERROR;
+
+	return msg_type;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_type = fsck_msg_type(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_type, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %ld", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 03/19] fsck: Provide a function to parse fsck message IDs
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 01/19] fsck: Introduce fsck options Johannes Schindelin
  2015-06-18 20:07       ` [PATCH v5 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-06-18 20:07       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
                         ` (17 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:07 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

These functions will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. The upcoming `fsck_set_msg_types()` function has to
handle partial strings because we would like to be able to parse, say,
'missingemail=warn,missingtaggerentry=warn' command line parameters
(which will be passed by receive-pack to index-pack and unpack-objects).

To make the parsing robust, we generate strings from the enum keys and
match user-supplied strings against them case-insensitively, ignoring
the underscores in the keys, to find the corresponding enum values.
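
The matching rule can be sketched on its own as follows. This is a simplified variant with a hypothetical three-entry key table and made-up `demo_*` names, not the function from the diff below:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical key table; the real keys are generated from the enum. */
static const char *demo_ids[] = {
	"BAD_DATE", "MISSING_EMAIL", "MISSING_TAGGER_ENTRY"
};

/*
 * Return the index of the key matching text[0..len), or -1.
 * The comparison ignores case and skips underscores in the key,
 * so "missingemail" matches "MISSING_EMAIL".
 */
static int demo_parse_msg_id(const char *text, size_t len)
{
	size_t i, j;

	for (i = 0; i < sizeof(demo_ids) / sizeof(demo_ids[0]); i++) {
		const char *key = demo_ids[i];

		for (j = 0; j < len; j++) {
			while (*key == '_')
				key++;
			if (!*key || toupper((unsigned char)text[j]) != *key)
				break;
			key++;
		}
		/* Both the input and the key must be fully consumed. */
		if (j == len && !*key)
			return (int)i;
	}

	return -1;
}
```

Passing an explicit length is what allows matching a partial string: `demo_parse_msg_id("missingemail=warn", 12)` only looks at the part before the `=`.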

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index ed0bfc3..4595c7f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,41 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+#define STR(x) #x
+#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
 static struct {
+	const char *id_string;
 	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	if (len < 0)
+		len = strlen(text);
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_info[i].id_string;
+		/* match id_string case-insensitively, without underscores. */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_')
+				c = *(key++);
+			if (toupper(text[j]) != c)
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	return -1;
+}
+
 static int fsck_msg_type(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (2 preceding siblings ...)
  2015-06-18 20:07       ` [PATCH v5 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
                         ` (16 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The `fsck_set_msg_types()` function added by this commit parses a list
of settings in the form:

	missingemail=warn,badname=warn,...
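
A stand-alone sketch of parsing such a list, under simplifying assumptions: a hypothetical three-entry ID table, exact lowercase ID matching, and a -1 return value on malformed input where the real function calls die(). The `demo_*` names are not Git API:

```c
#include <string.h>

/* Hypothetical stand-ins for FSCK_ERROR / FSCK_WARN. */
enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

static const char *demo_ids[] = { "missingemail", "badname", "baddate" };
#define DEMO_ID_COUNT (sizeof(demo_ids) / sizeof(demo_ids[0]))

/* Compare the first len bytes of string against the whole of match. */
static int demo_substrcmp(const char *string, size_t len, const char *match)
{
	if (strlen(match) != len)
		return -1;
	return memcmp(string, match, len);
}

/* Apply "id=type,id=type,..." to types[]; return 0, or -1 on bad input. */
static int demo_set_msg_types(int *types, const char *values)
{
	while (*values) {
		size_t len = strcspn(values, " ,|"), equal, i;
		const char *type_str;
		size_t type_len;
		int type;

		if (!len) {
			values++;	/* skip a separator */
			continue;
		}

		for (equal = 0; equal < len; equal++)
			if (values[equal] == '=' || values[equal] == ':')
				break;
		if (equal == len)
			return -1;	/* no '=' in this entry */

		type_str = values + equal + 1;
		type_len = len - equal - 1;
		if (!demo_substrcmp(type_str, type_len, "error"))
			type = DEMO_ERROR;
		else if (!demo_substrcmp(type_str, type_len, "warn"))
			type = DEMO_WARN;
		else
			return -1;	/* unknown message type */

		for (i = 0; i < DEMO_ID_COUNT; i++)
			if (!demo_substrcmp(values, equal, demo_ids[i]))
				break;
		if (i == DEMO_ID_COUNT)
			return -1;	/* unknown message ID */

		types[i] = type;
		values += len;
	}
	return 0;
}
```

Entries are applied from left to right, so a later entry for the same ID overrides an earlier one.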

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we need
to take extra care to set all message types to FSCK_ERROR by default in
those cases.
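
One way to picture the defaulting is a per-ID override table that is created lazily and seeded from the effective defaults, so that strict callers start out with every message type set to error. The following is only a sketch with hypothetical `demo_*` names and a two-entry table, using plain malloc() where Git would use xmalloc():

```c
#include <stdlib.h>

/* Hypothetical stand-ins for FSCK_ERROR / FSCK_WARN and two message IDs. */
enum { DEMO_ERROR = 1, DEMO_WARN = 2 };
enum { DEMO_MSG_BAD_DATE, DEMO_MSG_HAS_DOTGIT, DEMO_MSG_MAX };

static const int demo_defaults[DEMO_MSG_MAX] = { DEMO_ERROR, DEMO_WARN };

struct demo_options {
	int strict;	/* treat default warnings as errors */
	int *msg_type;	/* per-ID overrides, NULL until first needed */
};

/* Effective type: the override table if present, else the defaults. */
static int demo_msg_type(const struct demo_options *o, int id)
{
	int type;

	if (o->msg_type)
		return o->msg_type[id];

	type = demo_defaults[id];
	if (o->strict && type == DEMO_WARN)
		type = DEMO_ERROR;
	return type;
}

/* Override one ID; the table is seeded from the effective defaults. */
static int demo_override(struct demo_options *o, int id, int type)
{
	if (!o->msg_type) {
		int i, *table = malloc(sizeof(*table) * DEMO_MSG_MAX);

		if (!table)
			return -1;
		for (i = 0; i < DEMO_MSG_MAX; i++)
			table[i] = demo_msg_type(o, i);
		o->msg_type = table;
	}

	o->msg_type[id] = type;
	return 0;
}
```

With `strict` set, overriding one ID to a warning leaves all other IDs at error, including those whose built-in default is a warning.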

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fsck.h | 10 ++++++--
 2 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 4595c7f..7db81d2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -103,13 +103,85 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 {
 	int msg_type;
 
-	msg_type = msg_id_info[msg_id].msg_type;
-	if (options->strict && msg_type == FSCK_WARN)
-		msg_type = FSCK_ERROR;
+	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
+
+	if (options->msg_type)
+		msg_type = options->msg_type[msg_id];
+	else {
+		msg_type = msg_id_info[msg_id].msg_type;
+		if (options->strict && msg_type == FSCK_WARN)
+			msg_type = FSCK_ERROR;
+	}
 
 	return msg_type;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+static int parse_msg_type(const char *str, int len)
+{
+	if (len < 0)
+		len = strlen(str);
+
+	if (!substrcmp(str, len, "error"))
+		return FSCK_ERROR;
+	else if (!substrcmp(str, len, "warn"))
+		return FSCK_WARN;
+	else
+		die("Unknown fsck message type: '%.*s'",
+				len, str);
+}
+
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len)
+{
+	int id = parse_msg_id(msg_id, msg_id_len), type;
+
+	if (id < 0)
+		die("Unhandled message id: %.*s", msg_id_len, msg_id);
+	type = parse_msg_type(msg_type, msg_type_len);
+
+	if (!options->msg_type) {
+		int i;
+		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_type[i] = fsck_msg_type(i, options);
+		options->msg_type = msg_type;
+	}
+
+	options->msg_type[id] = type;
+}
+
+void fsck_set_msg_types(struct fsck_options *options, const char *values)
+{
+	while (*values) {
+		int len = strcspn(values, " ,|"), equal;
+
+		if (!len) {
+			values++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (values[equal] == '=' || values[equal] == ':')
+				break;
+
+		if (equal == len)
+			die("Missing '=': '%.*s'", len, values);
+
+		fsck_set_msg_type(options, values, equal,
+				values + equal + 1, len - equal - 1);
+		values += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -599,6 +671,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
+	if (msg_type == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..edb4540 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,11 @@
 
 struct fsck_options;
 
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len);
+void fsck_set_msg_types(struct fsck_options *options, const char *values);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +30,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_type;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 05/19] fsck (receive-pack): Allow demoting errors to warnings
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (3 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
                         ` (15 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missingemail warn

The value is actually a comma-separated list.

If the same key is listed in multiple receive.fsck.<msg-id> lines in
the config, the last one wins (this can happen, for example, when both
$HOME/.gitconfig and .git/config contain message type settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.
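
To make the hand-off concrete, here is a small sketch of how the collected settings could be turned into that optional argument. The fixed-size buffers and the `demo_*` helpers are assumptions made for the illustration; they stand in for the strbuf and argv_array calls used in the diff:

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one "id=type" setting to a NUL-terminated buffer. The first
 * entry is prefixed with '=', later ones with ',', so the buffer can
 * be glued directly onto "--strict".
 */
static void demo_add_setting(char *buf, size_t size,
			     const char *id, const char *type)
{
	size_t used = strlen(buf);

	snprintf(buf + used, size - used, "%c%s=%s",
		 used ? ',' : '=', id, type);
}

/* Build the argument for the child; empty settings give plain --strict. */
static void demo_strict_arg(char *out, size_t size, const char *settings)
{
	snprintf(out, size, "--strict%s", settings);
}
```

With no settings configured, the result is plain `--strict`; after adding `missingemail` and `missingtaggerentry` as `warn`, it becomes `--strict=missingemail=warn,missingtaggerentry=warn`.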

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 17 +++++++++++++++--
 builtin/unpack-objects.c |  5 +++++
 fsck.c                   |  8 ++++++++
 fsck.h                   |  1 +
 5 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 87ae9ba..98e14fe 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1633,6 +1633,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_msg_types(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 94d0571..3afe8f8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -19,6 +19,7 @@
 #include "tag.h"
 #include "gpg-interface.h"
 #include "sigchain.h"
+#include "fsck.h"
 
 static const char receive_pack_usage[] = "git receive-pack <git-dir>";
 
@@ -36,6 +37,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +117,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		if (is_valid_msg_type(var, value))
+			strbuf_addf(&fsck_msg_types, "%c%s=%s",
+				fsck_msg_types.len ? ',' : '=', var, value);
+		else
+			warning("Skipping unknown msg id '%s'", var);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1490,7 +1501,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1508,7 +1520,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..7cc086f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				fsck_set_msg_types(&fsck_options, arg);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
diff --git a/fsck.c b/fsck.c
index 7db81d2..0c7cc26 100644
--- a/fsck.c
+++ b/fsck.c
@@ -138,6 +138,14 @@ static int parse_msg_type(const char *str, int len)
 				len, str);
 }
 
+int is_valid_msg_type(const char *msg_id, const char *msg_type)
+{
+	if (parse_msg_id(msg_id, -1) < 0)
+		return 0;
+	parse_msg_type(msg_type, -1);
+	return 1;
+}
+
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, int msg_id_len,
 		const char *msg_type, int msg_type_len)
diff --git a/fsck.h b/fsck.h
index edb4540..738c9df 100644
--- a/fsck.h
+++ b/fsck.h
@@ -10,6 +10,7 @@ void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, int msg_id_len,
 		const char *msg_type, int msg_type_len);
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
+int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
 /*
  * callback function for fsck_walk
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 06/19] fsck: Report the ID of the error/warning
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (4 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                         ` (14 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some legacy code has objects with non-fatal fsck issues; to enable the
user to ignore those issues, let's print out the ID (e.g. when
encountering "missingemail", the user might want to call `git config
--add receive.fsck.missingemail warn`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
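A minimal sketch, not part of the patch, of the ID formatting described
above: the enum-style name is lowercased, underscores are dropped, and
": " is appended. It uses a fixed buffer instead of a strbuf, and the
name format_msg_id() is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for append_msg_id(): writes the lowercased,
 * underscore-free form of `id`, followed by ": ", into `out`.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int format_msg_id(const char *id, char *out, size_t outlen)
{
	size_t n = 0;

	for (; *id; id++) {
		if (*id == '_')
			continue;
		if (n + 1 >= outlen)
			return -1;
		out[n++] = (char)tolower((unsigned char)*id);
	}
	/* Room is needed for ':', ' ' and the terminating NUL. */
	if (n + 3 > outlen)
		return -1;
	out[n++] = ':';
	out[n++] = ' ';
	out[n] = '\0';
	return 0;
}
```

With a sufficiently large buffer, "MISSING_EMAIL" becomes
"missingemail: ", which is the prefix a user would copy into the
config key.
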
 fsck.c          | 16 ++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 0c7cc26..47cb686 100644
--- a/fsck.c
+++ b/fsck.c
@@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c != '_')
+			strbuf_addch(sb, tolower(c));
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_type, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..286a643 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: invalidtagname: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 07/19] fsck: Make fsck_ident() warn-friendly
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (5 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                         ` (13 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
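A simplified sketch, not part of the patch, of the pattern described
above: the caller's cursor is moved to the next line before any check
runs, so every return path leaves it in a usable position. The check
itself (an ident line must contain '<') and the name check_line() are
hypothetical simplifications, and strchr() is used instead of
strchrnul() to stay portable.

```c
#include <string.h>

/* Returns 0 if the line looks fine, 1 if a problem was found. */
static int check_line(const char **cursor)
{
	const char *p = *cursor;
	const char *eol = strchr(p, '\n');

	/* Advance the caller's cursor first; parsing uses the local `p`. */
	*cursor = eol ? eol + 1 : p + strlen(p);

	p += strcspn(p, "<\n");
	if (*p != '<')
		return 1;	/* problem found, cursor already advanced */
	return 0;
}
```

After a failing call, the caller can go on to inspect the following
header line if the problem was only a warning.
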
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index 47cb686..8a1eea3 100644
--- a/fsck.c
+++ b/fsck.c
@@ -479,40 +479,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (6 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
                         ` (12 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too severe to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missingauthor error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
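A reduced sketch, not part of the patch, of the control flow this change
introduces: a check returns early only when the report is still an
error, and falls through when it has been demoted. All names are
hypothetical, and the convention that the reporting callback returns 0
for warnings and nonzero for errors is an assumption.

```c
enum severity { SEV_ERROR, SEV_WARN };

struct check_state {
	enum severity bad_tree_severity;	/* configurable */
	int warnings;				/* warnings emitted so far */
};

/* Assumed convention: warnings are counted and return 0, errors return 1. */
static int report(struct check_state *st, enum severity sev)
{
	if (sev == SEV_WARN) {
		st->warnings++;
		return 0;
	}
	return 1;
}

static int check_commit(struct check_state *st, int tree_ok, int has_author)
{
	int err;

	if (!tree_ok) {
		err = report(st, st->bad_tree_severity);
		if (err)
			return err;	/* still an error: stop here */
	}
	/* Reached even after a bad tree line if that was only a warning. */
	if (!has_author)
		return report(st, SEV_ERROR);
	return 0;
}
```

With bad_tree_severity set to SEV_WARN, a commit with a bad tree line
and no author line is still rejected because of the missing author.
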
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 8a1eea3..31d218d 100644
--- a/fsck.c
+++ b/fsck.c
@@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -548,11 +554,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 09/19] fsck: Handle multiple authors in commits specially
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (7 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:08       ` [PATCH v5 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
                         ` (11 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.missingcommitter to warn, but that could hide
missing tree objects in the same commit because we cannot continue
verifying any commit object after encountering a missing committer line,
while we can continue in the case of multiple author lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
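A small sketch, not part of the patch, of the loop idea: after the first
'author' line, any further 'author' lines are consumed one by one so
that parsing can resume at the 'committer' line. The helper names are
hypothetical, and the input is assumed to start at the first 'author'
line (i.e. after the 'tree' and 'parent' lines).

```c
#include <string.h>

/* Returns a pointer to the start of the next line (or to the NUL). */
static const char *next_line(const char *p)
{
	const char *eol = strchr(p, '\n');

	return eol ? eol + 1 : p + strlen(p);
}

/* Returns the number of surplus 'author' lines, or -1 if there is none at all. */
static int count_extra_author_lines(const char *header)
{
	int extra = 0;

	if (strncmp(header, "author ", 7))
		return -1;
	header = next_line(header);
	while (!strncmp(header, "author ", 7)) {
		extra++;
		header = next_line(header);
	}
	return extra;
}
```

In the patch itself each surplus line triggers a multipleauthors report
and is then validated like the first one.
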
 fsck.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fsck.c b/fsck.c
index 31d218d..856221d 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
@@ -571,6 +572,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 11/19] fsck: Add a simple test for receive.fsck.<msg-id>
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (8 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-06-18 20:08       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                         ` (10 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:08 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..3f7e96a 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,25 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit <<\EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.missingemail=warn' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config \
+		receive.fsck.missingemail warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missingemail" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 10/19] fsck: Make fsck_tag() warn-friendly
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (9 preceding siblings ...)
  2015-06-18 20:08       ` [PATCH v5 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                         ` (9 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just as with fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 856221d..bd4bfc2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -640,7 +640,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (10 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                         ` (8 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
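A minimal sketch, not part of the patch, of the rule described above: a
message whose default type is fatal may only be "overridden" to error,
never to anything weaker. The enum and the function name are
hypothetical; the patch itself calls die() instead of returning a
status.

```c
enum msg_type { MSG_FATAL, MSG_ERROR, MSG_WARN };

/* Returns 0 if the override is allowed, -1 if it would demote a fatal message. */
static int check_override(enum msg_type default_type, enum msg_type requested)
{
	if (default_type == MSG_FATAL && requested != MSG_ERROR)
		return -1;
	return 0;
}
```

Ordinary errors stay freely demotable; only the fatal class is pinned.
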
 fsck.c                          | 14 ++++++++++++--
 t/t5504-fetch-receive-strict.sh | 11 +++++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index bd4bfc2..2b2a360 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_EMAIL, ERROR) \
@@ -40,10 +45,8 @@
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NOT_SORTED, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -157,6 +160,10 @@ void fsck_set_msg_type(struct fsck_options *options,
 		die("Unhandled message id: %.*s", msg_id_len, msg_id);
 	type = parse_msg_type(msg_type, msg_type_len);
 
+	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
+		die("Cannot demote %.*s to %.*s", msg_id_len, msg_id,
+				msg_type_len, msg_type);
+
 	if (!options->msg_type) {
 		int i;
 		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
@@ -213,6 +220,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 3f7e96a..0d64229 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -136,4 +136,15 @@ test_expect_success 'push with receive.fsck.missingemail=warn' '
 	grep "missingemail" act
 '
 
+test_expect_success \
+	'receive.fsck.unterminatedheader=warn triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config \
+		receive.fsck.unterminatedheader warn &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminatedheader" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (11 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                         ` (7 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

An fsck issue in a legacy repository might be so common that one would
rather not bother the user by mentioning it at all. With this change,
that is possible by setting the respective message type to "ignore".

This change "abuses" the missingemail=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
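A rough sketch, not part of the patch, of looking up one message ID in a
comma-separated list such as the one passed via --strict=... The
"id=type" shape of each item is an assumption here, as are all names;
unlike the real code, an unknown type is reported to the caller rather
than being fatal, and the first match wins.

```c
#include <string.h>

enum msg_type { TYPE_UNKNOWN = -1, TYPE_ERROR, TYPE_WARN, TYPE_IGNORE };

static enum msg_type parse_type(const char *s, size_t len)
{
	if (len == 5 && !strncmp(s, "error", 5))
		return TYPE_ERROR;
	if (len == 4 && !strncmp(s, "warn", 4))
		return TYPE_WARN;
	if (len == 6 && !strncmp(s, "ignore", 6))
		return TYPE_IGNORE;
	return TYPE_UNKNOWN;
}

/* Finds `id` in a list like "a=warn,b=ignore"; TYPE_UNKNOWN if absent or malformed. */
static enum msg_type lookup_type(const char *list, const char *id)
{
	size_t id_len = strlen(id);

	while (*list) {
		size_t item_len = strcspn(list, ",");

		if (item_len > id_len && !strncmp(list, id, id_len) &&
		    list[id_len] == '=')
			return parse_type(list + id_len + 1,
					  item_len - id_len - 1);
		list += item_len;
		if (*list == ',')
			list++;
	}
	return TYPE_UNKNOWN;
}
```

For the list "missingemail=ignore,baddate=warn", looking up "baddate"
yields the warn type and "missingemail" yields the ignore type.
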
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 9 ++++++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 2b2a360..0f7eb22 100644
--- a/fsck.c
+++ b/fsck.c
@@ -137,6 +137,8 @@ static int parse_msg_type(const char *str, int len)
 		return FSCK_ERROR;
 	else if (!substrcmp(str, len, "warn"))
 		return FSCK_WARN;
+	else if (!substrcmp(str, len, "ignore"))
+		return FSCK_IGNORE;
 	else
 		die("Unknown fsck message type: '%.*s'",
 				len, str);
@@ -220,6 +222,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 738c9df..7e49372 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 0d64229..cb077b7 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -133,7 +133,14 @@ test_expect_success 'push with receive.fsck.missingemail=warn' '
 	git --git-dir=dst/.git config \
 		receive.fsck.missingemail warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missingemail" act
+	grep "missingemail" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.missingemail ignore &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.baddate warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	test_must_fail grep "missingemail" act
 '
 
 test_expect_success \
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (12 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
                         ` (6 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalidtagname` and
`missingtaggerentry` in the receive.fsck.<msg-id> config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as used to be
the case before this commit).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
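A small sketch, not part of the patch, of how the internal message
classes could collapse into what the user sees when no per-message
override is configured. The promotion of plain warnings to errors in
strict mode is an assumption about the rest of the series (it is not
shown in this patch), and all names are hypothetical.

```c
enum msg_type { T_FATAL, T_ERROR, T_WARN, T_INFO };

/* Maps a default message class to the type that is finally reported. */
static enum msg_type reported_type(enum msg_type def, int strict)
{
	if (def == T_FATAL)
		return T_ERROR;		/* fatal is shown as an error */
	if (def == T_INFO)
		return T_WARN;		/* info is shown as a warning, even when strict */
	if (def == T_WARN && strict)
		return T_ERROR;		/* assumed strict-mode promotion */
	return def;
}
```

Under these assumptions, moving missingtaggerentry from the warning
class to the info class is what turns the "error:" prefix into
"warning:" in the t5302 expectation, which runs index-pack with
--strict.
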
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 0f7eb22..a5e7dfb 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -55,10 +56,11 @@
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(INVALID_TAG_NAME, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, msg_type) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -227,6 +229,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -685,15 +689,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 15/19] fsck: Document the new receive.fsck.<msg-id> options
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (13 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
                         ` (5 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 3e37b93..306ab7a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2205,6 +2205,20 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.<msg-id>::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.<msg-id>`
+	setting where the `<msg-id>` is the fsck message ID and the value
+	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
+	the error/warning with the message ID, e.g. "missingemail: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.missingemail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 16/19] fsck: Support demoting errors to warnings
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (14 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:09       ` [PATCH v5 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
                         ` (4 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missingemail=ignore fsck` would hide
problems with missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.<msg-id> from the
receive.fsck.<msg-id> settings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
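A minimal sketch, not part of the patch, of the key handling in the
config callback: only keys that start with "fsck." are treated as
message overrides, so "receive.fsck.*" keys are left alone here, in
line with the strict separation described above. strip_prefix() is a
hypothetical local helper written in the spirit of git's skip_prefix(),
and fsck_key_to_msg_id() is likewise made up for illustration.

```c
#include <string.h>

/* If `str` starts with `prefix`, store the remainder in `out` and return 1. */
static int strip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Returns the message ID named by an "fsck.<msg-id>" key, or NULL for other keys. */
static const char *fsck_key_to_msg_id(const char *var)
{
	const char *id;

	if (strip_prefix(var, "fsck.", &id) && *id)
		return id;
	return NULL;
}
```

A key such as "fsck.missingemail" yields "missingemail", while
"receive.fsck.missingemail" and unrelated keys yield NULL and would be
handed on to the default config handler.
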
 Documentation/config.txt | 11 +++++++++++
 builtin/fsck.c           | 12 ++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 306ab7a..41fd460 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1250,6 +1250,17 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.<msg-id>::
+	Allows overriding the message type (error, warn or ignore) of a
+	specific message ID such as `missingemail`.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missingemail: invalid author/committer line - missing email" means
+that setting `fsck.missingemail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index fff38fe..6de9f3e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,16 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (skip_prefix(var, "fsck.", &var)) {
+		fsck_set_msg_type(&fsck_obj_options, var, -1, value, -1);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *msg_type,
 			const char *err)
 {
@@ -646,6 +656,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 286a643..fe4bb03 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.multipleauthors=ignore fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v5 17/19] fsck: Introduce `git fsck --quick`
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (15 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-06-18 20:09       ` Johannes Schindelin
  2015-06-18 20:10       ` [PATCH v5 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
                         ` (3 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:09 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This option avoids unpacking each and every object, and just verifies the
connectivity. Particularly with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	not unpacking blobs, this speeds up the operation, at the
+	expense of missing corrupt objects.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 6de9f3e..75fcb5f 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -181,6 +182,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -623,6 +626,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -659,7 +663,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index fe4bb03..471e2ea 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v5 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (16 preceding siblings ...)
  2015-06-18 20:09       ` [PATCH v5 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-06-18 20:10       ` Johannes Schindelin
  2015-06-18 20:10       ` [PATCH v5 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
                         ` (2 subsequent siblings)
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:10 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The optional new config option `receive.fsck.skiplist` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt        |  7 ++++++
 builtin/receive-pack.c          |  8 ++++++
 fsck.c                          | 54 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 +++++++++
 5 files changed, 82 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41fd460..5f45115 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2230,6 +2230,13 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing
 the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3afe8f8..80574f9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -117,6 +117,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
+			fsck_msg_types.len ? ',' : '=', path);
+		return 0;
+	}
+
 	if (skip_prefix(var, "receive.fsck.", &var)) {
 		if (is_valid_msg_type(var, value))
 			strbuf_addf(&fsck_msg_types, "%c%s=%s",
diff --git a/fsck.c b/fsck.c
index a5e7dfb..9b8981e 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -122,6 +123,43 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 	return msg_type;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -193,6 +231,18 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 			if (values[equal] == '=' || values[equal] == ':')
 				break;
 
+		if (!substrcmp(values, equal, "skiplist")) {
+			char *path = xstrndup(values + equal + 1,
+				len - equal - 1);
+
+			if (equal == len)
+				die("skiplist requires a path");
+			init_skiplist(options, path);
+			free(path);
+			values += len;
+			continue;
+		}
+
 		if (equal == len)
 			die("Missing '=': '%.*s'", len, values);
 
@@ -227,6 +277,10 @@ static int report(struct fsck_options *options, struct object *object,
 	if (msg_type == FSCK_IGNORE)
 		return 0;
 
+	if (options->skiplist && object &&
+			sha1_array_lookup(options->skiplist, object->sha1) >= 0)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 	else if (msg_type == FSCK_INFO)
diff --git a/fsck.h b/fsck.h
index 7e49372..cab9c65 100644
--- a/fsck.h
+++ b/fsck.h
@@ -33,6 +33,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_type;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index cb077b7..1ada54c 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skiplist' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	echo $commit >dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.missingemail=warn' '
 	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v5 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (17 preceding siblings ...)
  2015-06-18 20:10       ` [PATCH v5 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-06-18 20:10       ` Johannes Schindelin
  2015-06-18 22:11       ` [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
  20 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-18 20:10 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Analogous to `git receive-pack`'s support for the config option
`receive.fsck.skiplist`, `git fsck` can now ignore given objects
altogether via `fsck.skiplist`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt |  7 +++++++
 builtin/fsck.c           | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5f45115..5aba63a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1261,6 +1261,13 @@ that setting `fsck.missingemail = ignore` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 75fcb5f..ce538ac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -54,6 +54,16 @@ static int fsck_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "skiplist=%s", path);
+		fsck_set_msg_types(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
-- 
2.3.1.windows.1.9.g8c01ab4


* Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (18 preceding siblings ...)
  2015-06-18 20:10       ` [PATCH v5 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
@ 2015-06-18 22:11       ` Junio C Hamano
  2015-06-19  0:04         ` Johannes Schindelin
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
  20 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-18 22:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> At the moment, the git-fsck's integrity checks are targeted toward the
> end user, i.e. the error messages are really just messages, intended for
> human consumption.
>
> Under certain circumstances, some of those errors should be allowed to
> be turned into mere warnings, though, because the cost of fixing the
> issues might well be larger than the cost of carrying those flawed
> objects.

> Interdiff below the diffstat. It's huge. Sorry.

Heh, no need to say sorry, though.  A large interdiff means you did
a lot more work, after all.

I haven't had a chance to go through all the patches, but one
thing I noticed that did not appear in the interdiff is that some of
the message IDs are unclear.  For example, there are BAD_something,
INVALID_something and MISSING_something.  The last one is in a
different category and is good, but how are the former two
differentiated?  Do they follow some systematic rules, or are they
named after the way they happened to be reported in the original
textual error message?

Some of the questionable groups are:

    BAD_DATE DATE_OVERFLOW

    BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE

    BAD_PARENT_SHA1 INVALID_OBJECT_SHA1

Also it is unclear if NOT_SORTED is to be used ever for any error
other than a tree object sorted incorrectly, or if we start noticing
a new error that something is not sorted, we will reuse this one.

I also briefly wondered if fsck.skipList should be finer grained
than "these are known to be broken, do not bother reporting problems
with them" (e.g. I know v0.99 lacks "tagger" so I want to squelch
MISSING_TAGGER_ENTRY for it, but I want to be notified on any other
errors).  But that only matters if we update Git to a version with a
new fsck that knows yet more kinds of breakages, so it is not a huge
issue, and the simplicity of "be silent on these objects" is
probably better overall.

Thanks.


* Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-18 22:11       ` [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
@ 2015-06-19  0:04         ` Johannes Schindelin
  2015-06-19 17:33           ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19  0:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 00:11, Junio C Hamano wrote:

> I haven't had a chance to go through the all the patches, but one
> thing I noticed that did not appear in the interdiff is that some of
> the message IDs are unclear.  For example, there are BAD_something,
> INVALID_something and MISSING_something.  The last one is in a
> different category and is good, but how are the former two
> differenciated?  Do they follow some systematic rules, or they are
> named after the way how they happened to be reported in the original
> textual error message?

I basically made up names on the go, based on the messages.

> Some of the questionable groups are:
> 
>     BAD_DATE DATE_OVERFLOW

I guess it should be BAD_DATE_OVERFLOW to be more consistent?

>     BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE
> 
>     BAD_PARENT_SHA1 INVALID_OBJECT_SHA1

So how about s/INVALID_/BAD_/g?

> Also it is unclear if NOT_SORTED is to be used ever for any error
> other than a tree object sorted incorrectly, or if we start noticing
> a new error that something is not sorted, we will reuse this one.

s/NOT_SORTED/TREE_&/ maybe?

> I also briefly wondered if fsck.skipList should be finer grained
> than "these are know to be broken, do not bother reporting problems
> with them" (e.g. I know v0.99 lacks "tagger" so I want to squelch
> MISSING_TAGGER_ENTRY for it, but I want to be notified on any other
> errors).  But that only matters if we update Git to a version with a
> new fsck that knows yet more kinds of breakages, so it is not a huge
> issue, and the simplicity of "be silent on these objects" is
> probably better overall.

Well, the idea of skiplist is to say: "I have inspected this object and determined that errors in it should be ignored." As such, it does not really matter what problems future Git versions report because the person populating the skiplist is supposed to test thoroughly, rather than just asking `git fsck` what is going on.
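
To make the "be silent on these objects" model concrete, here is a minimal,
self-contained sketch of a skip list: fixed 41-byte records (40 hex digits
plus a newline), a sortedness check while loading, and a lookup that sorts on
demand. It follows the shape of init_skiplist() in patch 18/19, but the names
(struct skiplist, skiplist_load, skiplist_contains) are made up for this
illustration and it parses an in-memory buffer rather than a file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40
#define RAW_LEN 20

/* Hypothetical stand-in for the sha1_array used in the series. */
struct skiplist {
	unsigned char (*sha1)[RAW_LEN];
	size_t nr, alloc;
	int sorted;
};

static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Parse one record: 40 hex digits followed by '\n'. Returns 0 if valid. */
int parse_record(const char *rec, unsigned char *out)
{
	int i;
	for (i = 0; i < RAW_LEN; i++) {
		int hi = hexval(rec[2 * i]);
		int lo = hi < 0 ? -1 : hexval(rec[2 * i + 1]);
		if (lo < 0)
			return -1;
		out[i] = (unsigned char)(hi << 4 | lo);
	}
	return rec[HEX_LEN] == '\n' ? 0 : -1;
}

/* Append all records in buf; remember whether the input was sorted. */
int skiplist_load(struct skiplist *list, const char *buf, size_t len)
{
	size_t off;

	if (len % (HEX_LEN + 1))
		return -1; /* trailing partial record */
	if (!list->nr)
		list->sorted = 1;
	for (off = 0; off < len; off += HEX_LEN + 1) {
		unsigned char sha1[RAW_LEN];

		if (parse_record(buf + off, sha1))
			return -1;
		if (list->nr == list->alloc) {
			size_t alloc = list->alloc ? 2 * list->alloc : 16;
			void *p = realloc(list->sha1, alloc * RAW_LEN);
			if (!p)
				return -1;
			list->sha1 = p;
			list->alloc = alloc;
		}
		if (list->nr &&
		    memcmp(list->sha1[list->nr - 1], sha1, RAW_LEN) > 0)
			list->sorted = 0;
		memcpy(list->sha1[list->nr++], sha1, RAW_LEN);
	}
	return 0;
}

static int cmp_sha1(const void *a, const void *b)
{
	return memcmp(a, b, RAW_LEN);
}

/* Binary search; an unsorted list is sorted once, on first lookup. */
int skiplist_contains(struct skiplist *list, const unsigned char *sha1)
{
	if (!list->nr)
		return 0;
	if (!list->sorted) {
		qsort(list->sha1, list->nr, RAW_LEN, cmp_sha1);
		list->sorted = 1;
	}
	return bsearch(sha1, list->sha1, list->nr, RAW_LEN, cmp_sha1) != NULL;
}
```

A pre-sorted file skips the qsort() entirely, which is presumably why the
documentation asks for a sorted list; an unsorted one still works in this
sketch, just with a one-time sorting cost.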

And yes, the motivation for this feature is to keep it super-simple. ;-)

Ciao,
Dscho


* [PATCH v6 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
                         ` (19 preceding siblings ...)
  2015-06-18 22:11       ` [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
@ 2015-06-19 13:32       ` Johannes Schindelin
  2015-06-19 13:32         ` [PATCH v6 01/19] fsck: Introduce fsck options Johannes Schindelin
                           ` (19 more replies)
  20 siblings, 20 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:32 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

At the moment, the git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Changes since v5:

- The INVALID_* names have been renamed to BAD_* names; DATE_OVERFLOW is
  now BAD_DATE_OVERFLOW and NOT_SORTED is now TREE_NOT_SORTED

Interdiff below the diffstat.
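
Note that renaming a message ID also changes what users type in their config
and what appears in fsck's output. Judging by the test expectations in this
series (BAD_TAG_NAME is reported as `badtagname`, MULTIPLE_AUTHORS is
configured as `fsck.multipleauthors`), the user-visible key appears to be the
ID lowercased with underscores removed. The helper below is only a sketch of
that mapping; msg_id_to_key() is a hypothetical name, not a function from
fsck.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a message ID such as "BAD_TAG_NAME" into its lowercase,
 * underscore-free form ("badtagname"). Returns 0 on success and -1
 * if `out` cannot hold the result including the trailing NUL.
 */
int msg_id_to_key(const char *id, char *out, size_t outlen)
{
	size_t n = 0;

	if (!outlen)
		return -1;
	for (; *id; id++) {
		if (*id == '_')
			continue;
		if (n + 1 >= outlen)
			return -1; /* no room left for the NUL */
		out[n++] = (char)tolower((unsigned char)*id);
	}
	out[n] = '\0';
	return 0;
}
```

Under that assumption, a setting written for v5 such as
`fsck.invalidtagname` would need to become `fsck.badtagname` with this
iteration.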

Johannes Schindelin (19):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck (receive-pack): Allow demoting errors to warnings
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.<msg-id>
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.<msg-id> options
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --quick`
  fsck: git receive-pack: support excluding objects from fsck'ing
  fsck: support ignoring objects in `git fsck` via fsck.skiplist

 Documentation/config.txt        |  39 +++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  75 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  25 +-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 553 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  31 ++-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  51 ++++
 11 files changed, 686 insertions(+), 163 deletions(-)

diff --git a/fsck.c b/fsck.c
index 9b8981e..f80b508 100644
--- a/fsck.c
+++ b/fsck.c
@@ -19,17 +19,17 @@
 	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_DATE_OVERFLOW, ERROR) \
 	FUNC(BAD_EMAIL, ERROR) \
 	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_OBJECT_SHA1, ERROR) \
 	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TAG_OBJECT, ERROR) \
 	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE, ERROR) \
 	FUNC(BAD_TREE_SHA1, ERROR) \
-	FUNC(DATE_OVERFLOW, ERROR) \
+	FUNC(BAD_TYPE, ERROR) \
 	FUNC(DUPLICATE_ENTRIES, ERROR) \
-	FUNC(INVALID_OBJECT_SHA1, ERROR) \
-	FUNC(INVALID_TAG_OBJECT, ERROR) \
-	FUNC(INVALID_TREE, ERROR) \
-	FUNC(INVALID_TYPE, ERROR) \
 	FUNC(MISSING_AUTHOR, ERROR) \
 	FUNC(MISSING_COMMITTER, ERROR) \
 	FUNC(MISSING_EMAIL, ERROR) \
@@ -46,8 +46,8 @@
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
-	FUNC(NOT_SORTED, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(TREE_NOT_SORTED, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
@@ -60,7 +60,7 @@
 	FUNC(NULL_SHA1, WARN) \
 	FUNC(ZERO_PADDED_FILEMODE, WARN) \
 	/* infos (reported as warnings, but ignored by default) */ \
-	FUNC(INVALID_TAG_NAME, INFO) \
+	FUNC(BAD_TAG_NAME, INFO) \
 	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, msg_type) FSCK_MSG_##id,
@@ -525,7 +525,7 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 	if (has_dup_entries)
 		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += report(options, &item->object, FSCK_MSG_NOT_SORTED, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -580,7 +580,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(p, &end, 10)))
-		return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	p = end + 1;
@@ -659,7 +659,7 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (err)
 		return err;
 	if (!commit->tree)
-		return report(options, &commit->object, FSCK_MSG_INVALID_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -712,7 +712,7 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		if (ret)
 			goto done;
 	}
@@ -728,7 +728,7 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = report(options, &tag->object, FSCK_MSG_INVALID_TYPE, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
@@ -744,7 +744,7 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0)) {
-		ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME,
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 		if (ret)
@@ -773,7 +773,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return report(options, &tag->object, FSCK_MSG_INVALID_TAG_OBJECT, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_BAD_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -782,7 +782,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return report(options, obj, FSCK_MSG_INVALID_OBJECT_SHA1, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 471e2ea..2863a8a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,7 +231,7 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalidtagname: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format
 	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v6 01/19] fsck: Introduce fsck options
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
@ 2015-06-19 13:32         ` Johannes Schindelin
  2015-06-19 19:03           ` Junio C Hamano
  2015-06-19 13:32         ` [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                           ` (18 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:32 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Just like the diff machinery, we are about to introduce more settings,
therefore it makes sense to carry them around as a (pointer to a) struct
containing all of them.
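
As a rough illustration of the pattern (not the actual fsck.h definitions),
the sketch below moves the walk callback and the strict flag out of the
function arguments and into an options struct that is passed by pointer, so
that later settings can be added without touching every call site. All toy_*
names are invented for this example, and the object model is reduced to a
single outgoing link.

```c
#include <assert.h>
#include <stddef.h>

struct toy_object {
	int id;
	struct toy_object *child; /* one outgoing link, for brevity */
};

struct toy_options;
typedef int (*toy_walk_fn)(struct toy_object *obj, void *data,
			   struct toy_options *options);

/* All settings travel together; callers pass a pointer to this. */
struct toy_options {
	toy_walk_fn walk;
	unsigned strict:1;
	int visited; /* example of per-caller state kept in the struct */
};

#define TOY_OPTIONS_DEFAULT { NULL, 0, 0 }

/* Invoke options->walk on the object linked from obj, if there is one. */
int toy_walk(struct toy_object *obj, void *data, struct toy_options *options)
{
	if (!obj || !options->walk)
		return -1;
	if (!obj->child)
		return 0;
	return options->walk(obj->child, data, options);
}

/* Sample callback: count every object it is handed. */
int count_visit(struct toy_object *obj, void *data,
		struct toy_options *options)
{
	(void)data;
	if (!obj)
		return 1;
	options->visited++;
	return 0;
}
```

The callback receives the options pointer as well, mirroring the extra
parameter this patch adds to mark_object(), mark_used() and friends.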

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2679793..981dca5 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static struct object_id head_oid;
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -638,6 +640,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 48fa472..87ae9ba 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -75,6 +75,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -192,7 +193,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -838,10 +839,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			fsck_options.walk = mark_link;
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.3.1.windows.1.9.g8c01ab4
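
The pattern this patch applies (bundling the walk callback, the error
callback and the strict flag into one options struct that is passed
down the call chain) can be illustrated outside of git. The following
is only a minimal sketch: struct node, struct check_options, walk_node()
and check_node() are hypothetical stand-ins, not part of git's fsck API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are git's real types. */
struct node {
	int value;
	struct node *child;
};

struct check_options;

typedef int (*walk_fn)(struct node *n, void *data,
		       struct check_options *options);
typedef int (*error_fn)(struct node *n, const char *message);

/* One struct carries everything the checker needs. */
struct check_options {
	walk_fn walk;
	error_fn error_func;
	unsigned strict:1;
};

static int count_errors;

/* Error callback: counts the problems reported to it. */
static int count_error(struct node *n, const char *message)
{
	(void)n;
	(void)message;
	count_errors++;
	return 1;
}

/* Walk callback: counts visited nodes through the data pointer. */
static int count_node(struct node *n, void *data,
		      struct check_options *options)
{
	(void)options;
	if (!n)
		return -1;
	++*(int *)data;
	return 0;
}

/* Checks one node; negative values are only flagged in strict mode. */
static int check_node(struct node *n, struct check_options *options)
{
	assert(options && options->error_func);
	if (n->value < 0 && options->strict)
		return options->error_func(n, "negative value");
	return 0;
}

/* Calls options->walk on the direct child, if there is one. */
static int walk_node(struct node *n, void *data,
		     struct check_options *options)
{
	if (!n)
		return -1;
	if (!n->child)
		return 0;
	return options->walk(n->child, data, options);
}
```

A caller fills in the struct once, e.g. { count_node, count_error, 0 },
and passes its address to both walk_node() and check_node(); adding a
further setting later then only touches the struct, not every function
signature in between.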

* [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
  2015-06-19 13:32         ` [PATCH v6 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-06-19 13:32         ` Johannes Schindelin
  2015-06-19 19:06           ` Junio C Hamano
  2015-06-19 13:32         ` [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                           ` (17 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:32 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Instead of specifying whether a message by the fsck machinery constitutes
an error or a warning, let's specify an identifier relating to the
concrete problem that was encountered. This is necessary for upcoming
support to be able to demote certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.
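
To illustrate the idea of formatting once and handing the callback a
finished string, here is a minimal sketch. It is not the code from this
patch: report_formatted(), message_fn and remember_message() are
hypothetical names, and a fixed-size buffer with vsnprintf() is assumed
where the patch itself uses a strbuf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives an already formatted message. */
typedef int (*message_fn)(int msg_type, const char *message);

static char last_message[256];

/* Example callback: no va_list handling, it just stores the string. */
static int remember_message(int msg_type, const char *message)
{
	strncpy(last_message, message, sizeof(last_message) - 1);
	last_message[sizeof(last_message) - 1] = '\0';
	return msg_type;
}

/* Formats centrally, then passes the finished string to the callback. */
static int report_formatted(message_fn cb, int msg_type,
			    const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	assert(cb && fmt);
	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	return cb(msg_type, buf);
}
```

With this split, only the one reporting function deals with varargs;
every callback has a plain (type, string) signature.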

We could use a plain enum for the message IDs here, but we want to
guarantee that each enum value is associated with the appropriate
message type (i.e. error or warning). Besides, we want to introduce a
parser in the next commit that maps the string representation to the
enum value, hence we use the slightly ugly preprocessor construct that
is extensible for use with said parser.
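
For readers unfamiliar with this kind of preprocessor construct, a
shortened sketch follows. The list and all DEMO_* / demo_* names are
made up for illustration, and the name lookup only hints at how the
same list could feed a parser; it is an assumption, not the parser
introduced in the next commit.

```c
#include <assert.h>
#include <string.h>

enum severity { SEV_ERROR = 1, SEV_WARN = 2 };

/* Hypothetical, shortened list: each entry pairs an ID with a severity. */
#define FOREACH_DEMO_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MISSING_EMAIL, ERROR) \
	FUNC(HAS_DOT, WARN)

/* First expansion: the enum values. */
#define DEMO_ID(id, sev) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_ID(DEMO_ID)
	DEMO_MSG_MAX
};
#undef DEMO_ID

/* Second expansion: a table in the same order, so it cannot drift. */
#define DEMO_ID(id, sev) { #id, SEV_##sev },
static const struct {
	const char *name;
	int severity;
} demo_info[DEMO_MSG_MAX] = {
	FOREACH_DEMO_ID(DEMO_ID)
};
#undef DEMO_ID

/* Looks up the default severity of a message ID. */
static int demo_severity(int id)
{
	assert(id >= 0 && id < DEMO_MSG_MAX);
	return demo_info[id].severity;
}

/* Returns the ID for a name, or -1 if the name is unknown. */
static int demo_parse_id(const char *name)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++)
		if (!strcmp(demo_info[i].name, name))
			return i;
	return -1;
}
```

Because both the enum and the table are generated from one list, adding
an entry in a single place keeps the ID, its severity and its string
representation in sync.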

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fsck.c |  26 +++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 154 insertions(+), 78 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 981dca5..fff38fe 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,33 +46,23 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
-static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+static void objreport(struct object *obj, const char *msg_type,
+			const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+		msg_type, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..ab24618 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_DATE_OVERFLOW, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_OBJECT_SHA1, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TAG_OBJECT, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(BAD_TYPE, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(TREE_NOT_SORTED, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(BAD_TAG_NAME, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, msg_type) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+static struct {
+	int msg_type;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_type(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int msg_type;
+
+	msg_type = msg_id_info[msg_id].msg_type;
+	if (options->strict && msg_type == FSCK_WARN)
+		msg_type = FSCK_ERROR;
+
+	return msg_type;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_type = fsck_msg_type(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_type, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %lu", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_BAD_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
  2015-06-19 13:32         ` [PATCH v6 01/19] fsck: Introduce fsck options Johannes Schindelin
  2015-06-19 13:32         ` [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-06-19 13:32         ` Johannes Schindelin
  2015-06-19 19:13           ` Junio C Hamano
  2015-06-19 13:33         ` [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
                           ` (16 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:32 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. The upcoming `fsck_set_msg_types()` function has to
handle partial strings because we would like to be able to parse, say,
'missingemail=warn,missingtaggerentry=warn' command line parameters
(which will be passed by receive-pack to index-pack and unpack-objects).

To make the parsing robust, we generate strings from the enum keys, and
using these keys, we match up strings without underscores
case-insensitively to the corresponding enum values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
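For illustration only, not part of the patch: a minimal, self-contained
sketch of the matching rule described above, under the assumption that
user input never contains underscores. The names sketch_ids,
sketch_id_matches() and sketch_parse_id() are made up for this note, and
the three-entry table merely stands in for the one generated from the
enum keys.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the table generated from the enum keys. */
static const char *const sketch_ids[] = {
	"MISSING_EMAIL",
	"MISSING_TAGGER_ENTRY",
	"BAD_NAME",
};

/*
 * Return 1 if the first len bytes of text spell key once the
 * underscores in key are skipped and case is ignored, else 0.
 */
static int sketch_id_matches(const char *text, size_t len, const char *key)
{
	size_t j = 0;

	assert(text && key);
	for (; *key; key++) {
		if (*key == '_')
			continue;	/* underscores exist only in the key */
		if (j == len ||
		    toupper((unsigned char)text[j]) != (unsigned char)*key)
			return 0;
		j++;
	}
	return j == len;	/* reject input with trailing characters */
}

/* Return the index of the matching ID, or -1 if there is none. */
static int sketch_parse_id(const char *text, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_ids) / sizeof(sketch_ids[0]); i++)
		if (sketch_id_matches(text, len, sketch_ids[i]))
			return (int)i;
	return -1;
}
```

Passing an explicit length is what allows a caller to match only the
"missingemail" part of "missingemail=warn", e.g.
sketch_parse_id("missingemail=warn", 12).
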
 fsck.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index ab24618..da5717c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,41 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+#define STR(x) #x
+#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
 static struct {
+	const char *id_string;
 	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text, int len)
+{
+	int i, j;
+
+	if (len < 0)
+		len = strlen(text);
+
+	for (i = 0; i < FSCK_MSG_MAX; i++) {
+		const char *key = msg_id_info[i].id_string;
+		/* match id_string case-insensitively, without underscores. */
+		for (j = 0; j < len; j++) {
+			char c = *(key++);
+			if (c == '_')
+				c = *(key++);
+			if (toupper(text[j]) != c)
+				break;
+		}
+		if (j == len && !*key)
+			return i;
+	}
+
+	return -1;
+}
+
 static int fsck_msg_type(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (2 preceding siblings ...)
  2015-06-19 13:32         ` [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-06-19 13:33         ` Johannes Schindelin
  2015-06-19 19:26           ` Junio C Hamano
  2015-06-19 13:33         ` [PATCH v6 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
                           ` (15 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:33 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The `fsck_set_msg_types()` function added by this commit parses a list
of settings in the form:

	missingemail=warn,badname=warn,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we need
to take extra care to set all message types to FSCK_ERROR by default in
those cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
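For illustration only, not part of the patch: a self-contained sketch of
how a list such as "missingemail=warn,badname=warn" can be split with
strcspn(), following the same delimiters as fsck_set_msg_types() in the
diff below. The struct sketch_setting and the function
sketch_split_settings() are hypothetical, and where the patch calls
die() the sketch returns -1 instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record for one "id=type" setting found in the list. */
struct sketch_setting {
	const char *id;		/* points into the input, not NUL-terminated */
	size_t id_len;
	const char *type;	/* points into the input, not NUL-terminated */
	size_t type_len;
};

/*
 * Split values on ' ', ',' or '|' and each item on its first '=' or ':'.
 * Stores at most max settings in out and returns the number stored,
 * or -1 if an item has no separator.
 */
static int sketch_split_settings(const char *values,
				 struct sketch_setting *out, int max)
{
	int n = 0;

	assert(values && out);
	while (*values && n < max) {
		size_t len = strcspn(values, " ,|");
		size_t equal;

		if (!len) {		/* skip a single delimiter */
			values++;
			continue;
		}

		for (equal = 0; equal < len; equal++)
			if (values[equal] == '=' || values[equal] == ':')
				break;
		if (equal == len)
			return -1;	/* item without '=' or ':' */

		out[n].id = values;
		out[n].id_len = equal;
		out[n].type = values + equal + 1;
		out[n].type_len = len - equal - 1;
		n++;
		values += len;
	}
	return n;
}
```

The settings are returned as pointer/length pairs into the input, which
mirrors why the functions in this patch take explicit length arguments
instead of requiring NUL-terminated strings.
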
 fsck.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 fsck.h | 10 ++++++--
 2 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index da5717c..8c3caff 100644
--- a/fsck.c
+++ b/fsck.c
@@ -103,13 +103,85 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 {
 	int msg_type;
 
-	msg_type = msg_id_info[msg_id].msg_type;
-	if (options->strict && msg_type == FSCK_WARN)
-		msg_type = FSCK_ERROR;
+	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
+
+	if (options->msg_type)
+		msg_type = options->msg_type[msg_id];
+	else {
+		msg_type = msg_id_info[msg_id].msg_type;
+		if (options->strict && msg_type == FSCK_WARN)
+			msg_type = FSCK_ERROR;
+	}
 
 	return msg_type;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+	int match_len = strlen(match);
+	if (match_len != len)
+		return -1;
+	return memcmp(string, match, len);
+}
+
+static int parse_msg_type(const char *str, int len)
+{
+	if (len < 0)
+		len = strlen(str);
+
+	if (!substrcmp(str, len, "error"))
+		return FSCK_ERROR;
+	else if (!substrcmp(str, len, "warn"))
+		return FSCK_WARN;
+	else
+		die("Unknown fsck message type: '%.*s'",
+				len, str);
+}
+
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len)
+{
+	int id = parse_msg_id(msg_id, msg_id_len), type;
+
+	if (id < 0)
+		die("Unhandled message id: %.*s", msg_id_len, msg_id);
+	type = parse_msg_type(msg_type, msg_type_len);
+
+	if (!options->msg_type) {
+		int i;
+		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_type[i] = fsck_msg_type(i, options);
+		options->msg_type = msg_type;
+	}
+
+	options->msg_type[id] = type;
+}
+
+void fsck_set_msg_types(struct fsck_options *options, const char *values)
+{
+	while (*values) {
+		int len = strcspn(values, " ,|"), equal;
+
+		if (!len) {
+			values++;
+			continue;
+		}
+
+		for (equal = 0; equal < len; equal++)
+			if (values[equal] == '=' || values[equal] == ':')
+				break;
+
+		if (equal == len)
+			die("Missing '=': '%.*s'", len, values);
+
+		fsck_set_msg_type(options, values, equal,
+				values + equal + 1, len - equal - 1);
+		values += len;
+	}
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -599,6 +671,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
+	if (msg_type == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..edb4540 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,11 @@
 
 struct fsck_options;
 
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, int msg_id_len,
+		const char *msg_type, int msg_type_len);
+void fsck_set_msg_types(struct fsck_options *options, const char *values);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +30,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_type;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 05/19] fsck (receive-pack): Allow demoting errors to warnings
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (3 preceding siblings ...)
  2015-06-19 13:33         ` [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-06-19 13:33         ` Johannes Schindelin
  2015-06-19 13:33         ` [PATCH v6 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
                           ` (14 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:33 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missingemail warn

The value is actually a comma-separated list.

If the same key is listed in multiple receive.fsck.<msg-id> lines in
the config, the last one wins (this can happen, for example, when both
$HOME/.gitconfig and .git/config contain message type settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
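For illustration only, not part of the patch: a self-contained sketch of
how the optional argument to --strict is accumulated, using the same
"%c%s=%s" scheme as the strbuf_addf() call in the diff below (the first
pair is introduced by '=', every later pair by ','). The function
sketch_add_msg_type() and its fixed-size buffer are assumptions made for
this note; the patch itself uses a strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one "<id>=<type>" pair to buf (capacity cap, NUL-terminated).
 * The first pair is introduced by '=', later ones by ',', so that
 * "--strict" followed by buf forms a single option.
 * Returns 0 on success, -1 if the pair does not fit.
 */
static int sketch_add_msg_type(char *buf, size_t cap,
			       const char *id, const char *type)
{
	size_t used;
	int n;

	assert(buf && id && type);
	used = strlen(buf);
	n = snprintf(buf + used, cap - used, "%c%s=%s",
		     used ? ',' : '=', id, type);
	if (n < 0 || (size_t)n >= cap - used) {
		buf[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 0;
}
```

With an empty buffer the child is passed plain "--strict"; after adding
missingemail=warn and badname=warn it would be passed
"--strict=missingemail=warn,badname=warn".
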
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 17 +++++++++++++++--
 builtin/unpack-objects.c |  5 +++++
 fsck.c                   |  8 ++++++++
 fsck.h                   |  1 +
 5 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 87ae9ba..98e14fe 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1633,6 +1633,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_msg_types(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 94d0571..3afe8f8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -19,6 +19,7 @@
 #include "tag.h"
 #include "gpg-interface.h"
 #include "sigchain.h"
+#include "fsck.h"
 
 static const char receive_pack_usage[] = "git receive-pack <git-dir>";
 
@@ -36,6 +37,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +117,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		if (is_valid_msg_type(var, value))
+			strbuf_addf(&fsck_msg_types, "%c%s=%s",
+				fsck_msg_types.len ? ',' : '=', var, value);
+		else
+			warning("Skipping unknown msg id '%s'", var);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1490,7 +1501,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1508,7 +1520,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..7cc086f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				fsck_set_msg_types(&fsck_options, arg);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
diff --git a/fsck.c b/fsck.c
index 8c3caff..8e6faa8 100644
--- a/fsck.c
+++ b/fsck.c
@@ -138,6 +138,14 @@ static int parse_msg_type(const char *str, int len)
 				len, str);
 }
 
+int is_valid_msg_type(const char *msg_id, const char *msg_type)
+{
+	if (parse_msg_id(msg_id, -1) < 0)
+		return 0;
+	parse_msg_type(msg_type, -1);
+	return 1;
+}
+
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, int msg_id_len,
 		const char *msg_type, int msg_type_len)
diff --git a/fsck.h b/fsck.h
index edb4540..738c9df 100644
--- a/fsck.h
+++ b/fsck.h
@@ -10,6 +10,7 @@ void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, int msg_id_len,
 		const char *msg_type, int msg_type_len);
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
+int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
 /*
  * callback function for fsck_walk
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 06/19] fsck: Report the ID of the error/warning
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (4 preceding siblings ...)
  2015-06-19 13:33         ` [PATCH v6 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
@ 2015-06-19 13:33         ` Johannes Schindelin
  2015-06-19 19:28           ` Junio C Hamano
  2015-06-19 13:33         ` [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                           ` (13 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:33 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some legacy code has objects with non-fatal fsck issues; to enable the
user to ignore those issues, let's print out the ID (e.g. when
encountering "missingemail", the user might want to call `git config
--add receive.fsck.missingemail warn`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
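For illustration only, not part of the patch: a self-contained sketch of
the conversion from an enum key to the ID that is printed, i.e. the
inverse of the parsing rule (drop underscores, lowercase, append ": ").
The function sketch_format_msg_id() and its caller-supplied buffer are
assumptions made for this note; the patch appends to a strbuf instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Write the user-visible form of an enum key into out: underscores are
 * dropped, letters are lowercased and ": " is appended, so that
 * "MISSING_EMAIL" becomes "missingemail: ".
 * Returns 0 on success, -1 if out (capacity cap) is too small.
 */
static int sketch_format_msg_id(char *out, size_t cap, const char *key)
{
	size_t n = 0;

	assert(out && key);
	for (; *key; key++) {
		if (*key == '_')
			continue;
		if (n + 1 >= cap)
			return -1;
		out[n++] = (char)tolower((unsigned char)*key);
	}
	if (n + 3 > cap)	/* room for ':', ' ' and the NUL */
		return -1;
	out[n++] = ':';
	out[n++] = ' ';
	out[n] = '\0';
	return 0;
}
```

The printed prefix is therefore the same string a user would put after
"receive.fsck." in the config, which is the point of reporting the ID.
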
 fsck.c          | 16 ++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 8e6faa8..0b3e18f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c != '_')
+			strbuf_addch(sb, tolower(c));
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_type, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..d6d3b13 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (5 preceding siblings ...)
  2015-06-19 13:33         ` [PATCH v6 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-06-19 13:33         ` Johannes Schindelin
  2015-06-19 19:48           ` Junio C Hamano
  2015-06-19 13:33         ` [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                           ` (12 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:33 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
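For illustration only, not part of the patch: a self-contained sketch of
the "advance first, inspect through a local pointer" pattern described
above. The function sketch_check_line() is hypothetical, its validation
is a placeholder (the line merely has to contain '<'), and it uses plain
strchr() where the patch uses strchrnul().

```c
#include <assert.h>
#include <string.h>

/*
 * Validate one header line at *cursor. Whatever the outcome, *cursor
 * is first moved past the line's '\n' (or to the terminating NUL),
 * so the caller can go on to the next header after a mere warning.
 * Returns 0 if the line looks fine, 1 otherwise.
 */
static int sketch_check_line(const char **cursor)
{
	const char *p, *eol;

	assert(cursor && *cursor);
	p = *cursor;
	eol = strchr(p, '\n');
	*cursor = eol ? eol + 1 : p + strlen(p);

	/* Inspect the line through the local pointer only. */
	for (; *p && *p != '\n'; p++)
		if (*p == '<')
			return 0;
	return 1;
}
```

Because the caller's pointer is updated before any check can return
early, a failing line leaves the cursor at the start of the next header
instead of somewhere in the middle of the broken one.
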
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index 0b3e18f..9faaf53 100644
--- a/fsck.c
+++ b/fsck.c
@@ -479,40 +479,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (6 preceding siblings ...)
  2015-06-19 13:33         ` [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-06-19 13:33         ` Johannes Schindelin
  2015-06-19 20:12           ` Junio C Hamano
  2015-06-19 13:34         ` [PATCH v6 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
                           ` (11 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:33 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too severe to simply ignore. For example,
when the header lines are mixed up, we punt after encountering an
incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missingauthor error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
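For illustration only, not part of the patch: a self-contained sketch of
the control flow this change introduces, where a check returns early
only if the reporter says the problem is an error. The names
sketch_report() and sketch_check(), the severity enum and the two
made-up conditions are assumptions for this note and do not correspond
to real fsck checks.

```c
#include <assert.h>
#include <string.h>

enum sketch_severity { SKETCH_WARN, SKETCH_ERROR };

/* Hypothetical reporter: counts the message, nonzero only for errors. */
static int sketch_report(enum sketch_severity severity, int *reported)
{
	(*reported)++;
	return severity == SKETCH_ERROR;
}

/*
 * Check two made-up conditions on a header. first_severity says how
 * the first problem is classified; the second is always an error.
 * Returns 0 if nothing fatal was found, nonzero otherwise.
 */
static int sketch_check(const char *header,
			enum sketch_severity first_severity, int *reported)
{
	int err;

	assert(header && reported);
	if (!strstr(header, "tree ")) {
		err = sketch_report(first_severity, reported);
		if (err)
			return err;	/* an error stops the checks here */
	}
	/* Reached when the first check passed or was only a warning. */
	if (!strstr(header, "author "))
		return sketch_report(SKETCH_ERROR, reported);
	return 0;
}
```

With the first problem classified as an error only one message is
reported; once it is a warning, the second check runs as well and its
result decides the return value.
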
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 9faaf53..9fe9f48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -548,11 +554,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 09/19] fsck: Handle multiple authors in commits specially
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (7 preceding siblings ...)
  2015-06-19 13:33         ` [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 20:16           ` Junio C Hamano
  2015-06-19 13:34         ` [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                           ` (10 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.missingcommitter = warn, but that could hide
missing tree objects in the same commit because we cannot continue
verifying any commit object after encountering a missing committer line,
while we can continue in the case of multiple author lines.
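
As a rough sketch of why surplus author lines are recoverable: the
parser can simply consume them one by one and then resume at the
committer line. The helper names below (skip_line_with_prefix(),
count_extra_authors()) are made up for this example; the real code uses
skip_prefix() and fsck_ident().

```c
#include <string.h>

/*
 * Hypothetical helper: if *buf starts with prefix, advance *buf past
 * the end of that line and return 1; otherwise leave it alone.
 */
static int skip_line_with_prefix(const char **buf, const char *prefix)
{
	size_t n = strlen(prefix);
	const char *eol;

	if (strncmp(*buf, prefix, n))
		return 0;
	eol = strchr(*buf + n, '\n');
	*buf = eol ? eol + 1 : *buf + strlen(*buf);
	return 1;
}

/*
 * Returns the number of surplus 'author' lines, or -1 if the first
 * author line or the committer line is missing.
 */
static int count_extra_authors(const char *buf)
{
	int extra = 0;

	if (!skip_line_with_prefix(&buf, "author "))
		return -1;
	/* Each additional author line is a separate, skippable problem. */
	while (skip_line_with_prefix(&buf, "author "))
		extra++;
	if (!skip_line_with_prefix(&buf, "committer "))
		return -1;
	return extra;
}
```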

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fsck.c b/fsck.c
index 9fe9f48..0cfa4d0 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(TREE_NOT_SORTED, ERROR) \
@@ -571,6 +572,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+		if (err)
+			return err;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (8 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 20:18           ` Junio C Hamano
  2015-06-19 13:34         ` [PATCH v6 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
                           ` (9 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just as with fsck_commit(), there are certain problems that could hide
other issues with the same tag object. For example, if the 'type' line is
not encountered in the correct position, the 'tag' line – if there is
any – would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 0cfa4d0..21e3052 100644
--- a/fsck.c
+++ b/fsck.c
@@ -640,7 +640,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 11/19] fsck: Add a simple test for receive.fsck.<msg-id>
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (9 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 13:34         ` [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                           ` (8 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..3f7e96a 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,25 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit <<\EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.missingemail=warn' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config \
+		receive.fsck.missingemail warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missingemail" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (10 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 20:21           ` Junio C Hamano
  2015-06-19 13:34         ` [PATCH v6 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                           ` (7 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.
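
A minimal sketch of the rule, assuming a tiny table of three message
IDs: a fatal message may only be (re)set to "error", anything else is
refused. Where the patch calls die(), this sketch returns -1. The
msg_table, MSG_* names and set_msg_type() are hypothetical stand-ins.

```c
#include <string.h>

enum msg_type { MSG_FATAL = -1, MSG_ERROR = 1, MSG_WARN = 2 };

struct msg_info {
	const char *id;
	enum msg_type default_type;
};

/* Hypothetical excerpt of the message table. */
static const struct msg_info msg_table[] = {
	{ "nulinheader", MSG_FATAL },
	{ "unterminatedheader", MSG_FATAL },
	{ "missingemail", MSG_ERROR },
};
#define MSG_COUNT (sizeof(msg_table) / sizeof(msg_table[0]))

/*
 * Record the requested type for id in configured[] (one slot per table
 * entry). Returns 0 on success, -1 for an unknown id or when a fatal
 * message would be demoted.
 */
static int set_msg_type(enum msg_type *configured, const char *id,
			enum msg_type requested)
{
	size_t i;

	for (i = 0; i < MSG_COUNT; i++) {
		if (strcmp(msg_table[i].id, id))
			continue;
		if (msg_table[i].default_type == MSG_FATAL &&
		    requested != MSG_ERROR)
			return -1; /* cannot demote a fatal error */
		configured[i] = requested;
		return 0;
	}
	return -1;
}
```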

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 14 ++++++++++++--
 t/t5504-fetch-receive-strict.sh | 11 +++++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 21e3052..a4fbce3 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_DATE_OVERFLOW, ERROR) \
@@ -39,11 +44,9 @@
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(TREE_NOT_SORTED, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -157,6 +160,10 @@ void fsck_set_msg_type(struct fsck_options *options,
 		die("Unhandled message id: %.*s", msg_id_len, msg_id);
 	type = parse_msg_type(msg_type, msg_type_len);
 
+	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
+		die("Cannot demote %.*s to %.*s", msg_id_len, msg_id,
+				msg_type_len, msg_type);
+
 	if (!options->msg_type) {
 		int i;
 		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
@@ -213,6 +220,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 3f7e96a..0d64229 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -136,4 +136,15 @@ test_expect_success 'push with receive.fsck.missingemail=warn' '
 	grep "missingemail" act
 '
 
+test_expect_success \
+	'receive.fsck.unterminatedheader=warn triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config \
+		receive.fsck.unterminatedheader warn &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminatedheader" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (11 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 13:34         ` [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                           ` (6 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective message type to "ignore".

This change "abuses" the missingemail=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).
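
To illustrate the comma-separated form, here is a small sketch that
looks up one message ID in a list such as
"missingemail=ignore,baddate=warn". It assumes '=' as the only
key/value separator and lets the last matching entry win; parse_type()
and lookup_type() are hypothetical names, not the functions in fsck.c.

```c
#include <string.h>

enum msg_type { MSG_UNKNOWN = 0, MSG_ERROR = 1, MSG_WARN = 2, MSG_IGNORE = 3 };

/* Map a (not NUL-terminated) type name of the given length to a type. */
static enum msg_type parse_type(const char *s, size_t len)
{
	if (len == 5 && !strncmp(s, "error", 5))
		return MSG_ERROR;
	if (len == 4 && !strncmp(s, "warn", 4))
		return MSG_WARN;
	if (len == 6 && !strncmp(s, "ignore", 6))
		return MSG_IGNORE;
	return MSG_UNKNOWN;
}

/*
 * Return the type configured for id in a comma-separated list of
 * "<msg-id>=<type>" entries, or MSG_UNKNOWN if id is not listed.
 * Later entries override earlier ones.
 */
static enum msg_type lookup_type(const char *list, const char *id)
{
	size_t id_len = strlen(id);
	enum msg_type found = MSG_UNKNOWN;

	while (*list) {
		size_t len = strcspn(list, ",");
		const char *eq = memchr(list, '=', len);

		if (eq) {
			size_t key_len = (size_t)(eq - list);

			if (key_len == id_len && !strncmp(list, id, id_len))
				found = parse_type(eq + 1, len - key_len - 1);
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return found;
}
```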

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 9 ++++++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index a4fbce3..b728646 100644
--- a/fsck.c
+++ b/fsck.c
@@ -137,6 +137,8 @@ static int parse_msg_type(const char *str, int len)
 		return FSCK_ERROR;
 	else if (!substrcmp(str, len, "warn"))
 		return FSCK_WARN;
+	else if (!substrcmp(str, len, "ignore"))
+		return FSCK_IGNORE;
 	else
 		die("Unknown fsck message type: '%.*s'",
 				len, str);
@@ -220,6 +222,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 738c9df..7e49372 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 0d64229..cb077b7 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -133,7 +133,14 @@ test_expect_success 'push with receive.fsck.missingemail=warn' '
 	git --git-dir=dst/.git config \
 		receive.fsck.missingemail warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missingemail" act
+	grep "missingemail" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.missingemail ignore &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.baddate warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	test_must_fail grep "missingemail" act
 '
 
 test_expect_success \
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (12 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-06-19 13:34         ` Johannes Schindelin
  2015-06-19 20:22           ` Junio C Hamano
  2015-06-19 13:35         ` [PATCH v6 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
                           ` (5 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:34 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalidtagname` and
`missingtaggerentry` in the receive.fsck.<msg-id> config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as used to be
the case before this commit).
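
A sketch of how the internal types might map to what the user sees.
The FATAL and INFO mappings follow this series; the strict-mode upgrade
of plain warnings to errors is an assumption about code outside this
patch. reported_type() and the MSG_* names are hypothetical.

```c
enum msg_type { MSG_INFO = -2, MSG_FATAL = -1, MSG_ERROR = 1, MSG_WARN = 2 };

/*
 * Map a message's default type to the type that is finally reported.
 * Assumption: strict mode upgrades plain warnings to errors; "info"
 * messages are not plain warnings, so they escape that upgrade and
 * are then shown as warnings.
 */
static enum msg_type reported_type(enum msg_type type, int strict)
{
	if (strict && type == MSG_WARN)
		return MSG_ERROR;
	if (type == MSG_FATAL)
		return MSG_ERROR;
	if (type == MSG_INFO)
		return MSG_WARN;
	return type;
}
```

Under these assumptions, a missing tagger line stays a warning even in
strict mode, which matches the "warning:" prefix the updated test
expects.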

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index b728646..dedad01 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -50,15 +51,16 @@
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
-	FUNC(BAD_TAG_NAME, WARN) \
 	FUNC(EMPTY_NAME, WARN) \
 	FUNC(FULL_PATHNAME, WARN) \
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(BAD_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, msg_type) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -227,6 +229,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -685,15 +689,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 15/19] fsck: Document the new receive.fsck.<msg-id> options
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (13 preceding siblings ...)
  2015-06-19 13:34         ` [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-06-19 13:35         ` Johannes Schindelin
  2015-06-19 13:35         ` [PATCH v6 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
                           ` (4 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:35 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 3e37b93..306ab7a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2205,6 +2205,20 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.<msg-id>::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.<msg-id>`
+	setting where the `<msg-id>` is the fsck message ID and the value
+	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
+	the error/warning with the message ID, e.g. "missingemail: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.missingemail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 16/19] fsck: Support demoting errors to warnings
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (14 preceding siblings ...)
  2015-06-19 13:35         ` [PATCH v6 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
@ 2015-06-19 13:35         ` Johannes Schindelin
  2015-06-19 13:35         ` [PATCH v6 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
                           ` (3 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:35 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missingemail=ignore fsck` would hide
problems with missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.<msg-id> from the
receive.fsck.<msg-id> settings.
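
The separation of the two namespaces can be pictured as a plain prefix
match on the config key, as in this sketch. msg_id_from_key() is a
hypothetical helper; the patch itself uses skip_prefix().

```c
#include <string.h>

/*
 * If key starts with prefix, return the remainder (the message ID);
 * otherwise return NULL. A consumer looking for "fsck." therefore
 * never sees "receive.fsck." keys, and vice versa.
 */
static const char *msg_id_from_key(const char *key, const char *prefix)
{
	size_t n = strlen(prefix);

	return strncmp(key, prefix, n) ? NULL : key + n;
}
```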

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 11 +++++++++++
 builtin/fsck.c           | 12 ++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 306ab7a..41fd460 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1250,6 +1250,17 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.<msg-id>::
+	Allows overriding the message type (error, warn or ignore) of a
+	specific message ID such as `missingemail`.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missingemail: invalid author/committer line - missing email" means
+that setting `fsck.missingemail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index fff38fe..6de9f3e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,16 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (skip_prefix(var, "fsck.", &var)) {
+		fsck_set_msg_type(&fsck_obj_options, var, -1, value, -1);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *msg_type,
 			const char *err)
 {
@@ -646,6 +656,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index d6d3b13..922c346 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.multipleauthors=ignore fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (15 preceding siblings ...)
  2015-06-19 13:35         ` [PATCH v6 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-06-19 13:35         ` Johannes Schindelin
  2015-06-19 20:32           ` Junio C Hamano
  2015-06-19 13:35         ` [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
                           ` (2 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:35 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This option avoids unpacking each and every object, and just verifies the
connectivity. In particular with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 	 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--quick::
+	Check only the connectivity of tags, commits and tree objects. By
+	avoiding to unpack blobs, this speeds up the operation, at the
+	expense of missing corrupt objects.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 6de9f3e..75fcb5f 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -181,6 +182,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (quick && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -623,6 +626,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -659,7 +663,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!quick)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 922c346..2863a8a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+	rm -rf quick &&
+	git init quick &&
+	(
+		cd quick &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --quick &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --quick
+	)
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (16 preceding siblings ...)
  2015-06-19 13:35         ` [PATCH v6 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-06-19 13:35         ` Johannes Schindelin
  2015-06-19 20:39           ` Junio C Hamano
  2015-06-22  4:21           ` Junio C Hamano
  2015-06-19 13:35         ` [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  19 siblings, 2 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:35 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The optional new config option `receive.fsck.skiplist` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.
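
To show why a sorted file is preferable, here is a sketch that scans a
skip list held in memory, notes whether it is sorted, and then looks up
an object name by binary search when it can. It is a simplification
under stated assumptions: names are kept as 40 lowercase hex digits
instead of being converted to binary, and scan_skiplist() and
skiplist_contains() are hypothetical names.

```c
#include <string.h>

#define HEX_LEN 40
#define REC_LEN (HEX_LEN + 1) /* 40 hex digits plus '\n' */

/*
 * Validate a NUL-terminated skip list with one object name per line.
 * Returns the number of records, or -1 if a record is malformed.
 * *sorted is set to 1 if the records are in ascending order.
 */
static int scan_skiplist(const char *buf, int *sorted)
{
	size_t len = strlen(buf), off;
	int n = 0;

	*sorted = 1;
	if (len % REC_LEN)
		return -1;
	for (off = 0; off < len; off += REC_LEN) {
		const char *rec = buf + off;

		if (strspn(rec, "0123456789abcdef") != HEX_LEN ||
		    rec[HEX_LEN] != '\n')
			return -1;
		/* Compare with the previous record to track sortedness. */
		if (n > 0 && memcmp(rec - REC_LEN, rec, HEX_LEN) > 0)
			*sorted = 0;
		n++;
	}
	return n;
}

/* Return 1 if hex is among the n records in buf, 0 otherwise. */
static int skiplist_contains(const char *buf, int n, int sorted,
			     const char *hex)
{
	int lo = 0, hi = n, i;

	if (!sorted) {
		/* Unsorted lists fall back to a linear scan. */
		for (i = 0; i < n; i++)
			if (!memcmp(buf + (size_t)i * REC_LEN, hex, HEX_LEN))
				return 1;
		return 0;
	}
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = memcmp(buf + (size_t)mid * REC_LEN, hex, HEX_LEN);

		if (!cmp)
			return 1;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```

Comparing lowercase hex strings byte by byte gives the same order as
comparing the binary hashes, so the binary search stays valid in this
simplified form.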

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt        |  7 ++++++
 builtin/receive-pack.c          |  8 ++++++
 fsck.c                          | 54 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 +++++++++
 5 files changed, 82 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41fd460..5f45115 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2230,6 +2230,13 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing
 the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3afe8f8..80574f9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -117,6 +117,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
+			fsck_msg_types.len ? ',' : '=', path);
+		return 0;
+	}
+
 	if (skip_prefix(var, "receive.fsck.", &var)) {
 		if (is_valid_msg_type(var, value))
 			strbuf_addf(&fsck_msg_types, "%c%s=%s",
diff --git a/fsck.c b/fsck.c
index dedad01..f80b508 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -122,6 +123,43 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 	return msg_type;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
 	int match_len = strlen(match);
@@ -193,6 +231,18 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 			if (values[equal] == '=' || values[equal] == ':')
 				break;
 
+		if (!substrcmp(values, equal, "skiplist")) {
+			char *path = xstrndup(values + equal + 1,
+				len - equal - 1);
+
+			if (equal == len)
+				die("skiplist requires a path");
+			init_skiplist(options, path);
+			free(path);
+			values += len;
+			continue;
+		}
+
 		if (equal == len)
 			die("Missing '=': '%.*s'", len, values);
 
@@ -227,6 +277,10 @@ static int report(struct fsck_options *options, struct object *object,
 	if (msg_type == FSCK_IGNORE)
 		return 0;
 
+	if (options->skiplist && object &&
+			sha1_array_lookup(options->skiplist, object->sha1) >= 0)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 	else if (msg_type == FSCK_INFO)
diff --git a/fsck.h b/fsck.h
index 7e49372..cab9c65 100644
--- a/fsck.h
+++ b/fsck.h
@@ -33,6 +33,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_type;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index cb077b7..1ada54c 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skiplist' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	echo $commit >dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.missingemail=warn' '
 	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (17 preceding siblings ...)
  2015-06-19 13:35         ` [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-06-19 13:35         ` Johannes Schindelin
  2015-06-19 20:40           ` Junio C Hamano
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 13:35 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Mirroring the support in `git receive-pack` for the config option
`receive.fsck.skiplist`, `git fsck` can now ignore the listed objects
altogether via `fsck.skiplist`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt |  7 +++++++
 builtin/fsck.c           | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5f45115..5aba63a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1261,6 +1261,13 @@ that setting `fsck.missingemail = ignore` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 75fcb5f..ce538ac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -54,6 +54,16 @@ static int fsck_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "fsck.skiplist") == 0) {
+		const char *path = is_absolute_path(value) ?
+			value : git_path("%s", value);
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addf(&sb, "skiplist=%s", path);
+		fsck_set_msg_types(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
-- 
2.3.1.windows.1.9.g8c01ab4


* Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-19  0:04         ` Johannes Schindelin
@ 2015-06-19 17:33           ` Junio C Hamano
  2015-06-19 19:43             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 17:33 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> I basically made up names on the go, based on the messages.
>
>> Some of the questionable groups are:
>> 
>>     BAD_DATE DATE_OVERFLOW
>
> I guess it should be BAD_DATE_OVERFLOW to be more consistent?

I am not sure about "consistency", but surely a common prefix would
help readers to group things.  But for this particular group, I was
wondering if singling out "integer overflow", "zero stuffed
timestamp", etc. as finer sub-errors of "you have a bad
timestamp" was beneficial.

>>     BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE
>> 
>>     BAD_PARENT_SHA1 INVALID_OBJECT_SHA1
>
> So how about s/INVALID_/BAD_/g?

It is not just about the distinction between INVALID and BAD.

I was basically wondering what rule decides which one among
BAD_TREE_SHA1, INVALID_OBJECT_SHA1 and INVALID_TREE I would get when
I have a random non-hexdigit string in various places, e.g. after
'tree ' in the object header of a commit object, after 'tag ' in a
tag object that says 'type tree', etc.

>> Also it is unclear if NOT_SORTED is to be used ever for any error
>> other than a tree object sorted incorrectly, or if we start noticing
>> a new error that something is not sorted, we will reuse this one.
>
> s/NOT_SORTED/TREE_&/ maybe?

If that error is specific to tree sorting order, then that would be
a definite improvement.

Thanks.


* Re: [PATCH v6 01/19] fsck: Introduce fsck options
  2015-06-19 13:32         ` [PATCH v6 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-06-19 19:03           ` Junio C Hamano
  2015-06-20 12:33             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 48fa472..87ae9ba 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -75,6 +75,7 @@ static int nr_threads;
>  static int from_stdin;
>  static int strict;
>  static int do_fsck_object;
> +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;

So there is a global fsck_options used throughout the entire
session here.

> @@ -838,10 +839,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
>  			if (!obj)
>  				die(_("invalid %s"), typename(type));
>  			if (do_fsck_object &&
> -			    fsck_object(obj, buf, size, 1,
> -				    fsck_error_function))
> +			    fsck_object(obj, buf, size, &fsck_options))
>  				die(_("Error in object"));

And that is used here to inspect each and every object we encounter.

> -			if (fsck_walk(obj, mark_link, NULL))
> +			fsck_options.walk = mark_link;

Then we do a call to fsck_walk() starting from this object, letting
mark_link() inspect it and set the LINK bit.

> +			if (fsck_walk(obj, NULL, &fsck_options))
>  				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));

Since nobody else sets fsck_options.walk to any other value, and
nobody else calls fsck_walk(), shouldn't that assignment be done
only once somewhere a lot higher in the callchain?  The apparent
"overriding while inspecting this object" that does not have any
corresponding "now we are done, so revert it to the original value"
puzzled me, and I am sure it would puzzle future readers of this
code.
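
A minimal sketch of the "assign once, higher in the callchain" shape,
using hypothetical stand-in types and names (`sketch_options`,
`sketch_setup()`, `sketch_inspect()`) rather than the real index-pack
code:

```c
#include <stddef.h>

struct object;	/* opaque; only passed through in this sketch */

typedef int (*sketch_walk_fn)(struct object *obj, int type, void *data);

struct sketch_options {
	sketch_walk_fn walk;
};

static struct sketch_options sketch_options;

/* Stand-in for mark_link(): merely counts how often it is called. */
static int sketch_count_link(struct object *obj, int type, void *data)
{
	(void)obj;
	(void)type;
	++*(int *)data;
	return 0;
}

/* Runs once during start-up, before any object is inspected. */
static void sketch_setup(void)
{
	sketch_options.walk = sketch_count_link;
}

/* Per-object code only reads the options; it never reassigns them. */
static int sketch_inspect(struct object *obj, void *data)
{
	return sketch_options.walk(obj, 0, data);
}
```

Keeping the assignment in one set-up function means a reader of the
per-object path never has to wonder whether the callback is restored
afterwards.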


* Re: [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages
  2015-06-19 13:32         ` [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-06-19 19:06           ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Instead of specifying whether a message by the fsck machinery constitutes
> an error or a warning, let's specify an identifier relating to the
> concrete problem that was encountered. This is necessary for upcoming
> support to be able to demote certain errors to warnings.
>
> In the process, simplify the requirements on the calling code: instead of
> having to handle full-blown varargs in every callback, we now send a
> string buffer ready to be used by the callback.
>
> We could use a simple enum for the message IDs here, but we want to
> guarantee that the enum values are associated with the appropriate
> message types (i.e. error or warning?). Besides, we want to introduce a
> parser in the next commit that maps the string representation to the
> enum value, hence we use the slightly ugly preprocessor construct that
> is extensible for use with said parser.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

Nicely implemented.  Looks good.


* Re: [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs
  2015-06-19 13:32         ` [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-06-19 19:13           ` Junio C Hamano
  2015-06-21 13:46             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
>  static struct {
> +	const char *id_string;
>  	int msg_type;
>  } msg_id_info[FSCK_MSG_MAX + 1] = {
>  	FOREACH_MSG_ID(MSG_ID)
> -	{ -1 }
> +	{ NULL, -1 }
>  };
>  #undef MSG_ID
>  
> +static int parse_msg_id(const char *text, int len)
> +{
> +	int i, j;
> +
> +	if (len < 0)
> +		len = strlen(text);
> +
> +	for (i = 0; i < FSCK_MSG_MAX; i++) {

I wonder if an array without a sentinel at the end, iterated with
ARRAY_SIZE(), may be a leaner way to do these, especially as this is
all limited to this single file.
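
A minimal sketch of the sentinel-free variant, assuming a local
ARRAY_SIZE() macro and a made-up three-entry table (the identifiers and
constants are illustrative, not the real FOREACH_MSG_ID list):

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

enum { SKETCH_ERROR = 1, SKETCH_WARN = 2 };

/* No trailing { NULL, -1 } entry: the array size bounds the loop. */
static const struct {
	const char *id_string;
	int msg_type;
} sketch_msg_info[] = {
	{ "BAD_DATE", SKETCH_ERROR },
	{ "MISSING_EMAIL", SKETCH_ERROR },
	{ "BAD_TAG_NAME", SKETCH_WARN },
};

/* Return the index of id in the table, or -1 if it is not present. */
static int sketch_find_id(const char *id)
{
	size_t i;

	for (i = 0; i < SKETCH_ARRAY_SIZE(sketch_msg_info); i++)
		if (!strcmp(sketch_msg_info[i].id_string, id))
			return (int)i;
	return -1;
}
```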

> +		const char *key = msg_id_info[i].id_string;
> +		/* match id_string case-insensitively, without underscores. */
> +		for (j = 0; j < len; j++) {
> +			char c = *(key++);
> +			if (c == '_')
> +				c = *(key++);

s/if/while/ perhaps?

> +			if (toupper(text[j]) != c)

I know the performance would not matter very much but calling
toupper() for each letter in the user input FSCK_MSG_MAX times
sounds rather inefficient.

Would it make sense to make the caller upcase instead (or upcase
upfront in the function)?
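
Both suggestions (skipping underscores with `while`, and upcasing the
user input once) could be combined roughly as follows. This is a sketch
under assumptions: the table is made up, the 64-byte buffer limit is
arbitrary, and `sketch_parse_msg_id()` is a hypothetical name:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *const sketch_ids[] = {
	"BAD_DATE", "MISSING_EMAIL", "BAD_TAG_NAME",
};

/* Return the index of the matching ID, or -1 if text matches none. */
static int sketch_parse_msg_id(const char *text)
{
	char upper[64];
	size_t len = strlen(text), i, j;

	if (len >= sizeof(upper))
		return -1;
	/* Upcase the user input once, not once per table entry. */
	for (j = 0; j < len; j++)
		upper[j] = (char)toupper((unsigned char)text[j]);

	for (i = 0; i < sizeof(sketch_ids) / sizeof(sketch_ids[0]); i++) {
		const char *key = sketch_ids[i];

		for (j = 0; j < len; j++) {
			/* "while", so runs of underscores are skipped too */
			while (*key == '_')
				key++;
			if (upper[j] != *key)
				break;
			key++;
		}
		if (j == len && !*key)
			return (int)i;
	}
	return -1;
}
```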

> +				break;
> +		}
> +		if (j == len && !*key)
> +			return i;
> +	}
> +
> +	return -1;
> +}
> +
>  static int fsck_msg_type(enum fsck_msg_id msg_id,
>  	struct fsck_options *options)
>  {


* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-19 13:33         ` [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-06-19 19:26           ` Junio C Hamano
  2015-06-21 13:59             ` Johannes Schindelin
  2015-06-22 15:24             ` Johannes Schindelin
  0 siblings, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/fsck.c b/fsck.c
> index da5717c..8c3caff 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -103,13 +103,85 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
>  {
>  	int msg_type;
>  
> -	msg_type = msg_id_info[msg_id].msg_type;
> -	if (options->strict && msg_type == FSCK_WARN)
> -		msg_type = FSCK_ERROR;
> +	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
> +
> +	if (options->msg_type)
> +		msg_type = options->msg_type[msg_id];
> +	else {
> +		msg_type = msg_id_info[msg_id].msg_type;
> +		if (options->strict && msg_type == FSCK_WARN)
> +			msg_type = FSCK_ERROR;
> +	}
>  
>  	return msg_type;
>  }

Nice.

> +static inline int substrcmp(const char *string, int len, const char *match)
> +{
> +	int match_len = strlen(match);
> +	if (match_len != len)
> +		return -1;
> +	return memcmp(string, match, len);
> +}

What this does looks suspiciously like starts_with(), but its name
"substrcmp()" does not give any hint that this is about the beginning
part of "string"; if anything, it gives a wrong hint that it may be
any substring.  prefixcmp() might be a better name but that was the
old name for !starts_with() so we cannot use it here.  It is a
mouthful, but starts_with_counted() may be a better fit.

But the whole thing may be moot.

If we take the "why not upcase the end-user string upfront"
suggestion from the previous review, fsck_set_msg_types() would have
an upcased copy of the end-user string that it can muck with; it can
turn "badfoo=error,poorbar=..." into "BADFOO=error,POORBAR=..."
that is stored in its own writable memory (possibly a strbuf), and
at that point it can afford to NUL-terminate BADFOO=error after
finding where one specification ends with strcspn() before calling
fsck_set_msg_type(), which in turn calls parse_msg_type().

So all parse_msg_type() needs to do is just !strcmp().
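
A rough sketch of that flow, assuming a writable copy made with malloc()
and a caller-supplied callback in place of fsck_set_msg_type(). All
`sketch_*` names are hypothetical, and errors are returned rather than
die()d on so that the sketch stays testable:

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { SKETCH_ERROR = 1, SKETCH_WARN = 2 };

typedef void (*sketch_set_fn)(const char *id, int type, void *data);

/* With NUL-terminated input, plain strcmp() is all that is needed. */
static int sketch_parse_type(const char *s)
{
	if (!strcmp(s, "error"))
		return SKETCH_ERROR;
	if (!strcmp(s, "warn"))
		return SKETCH_WARN;
	return -1;
}

/* Example callback: appends "ID:<type>;" to a fixed-size log buffer. */
struct sketch_log { char buf[128]; };

static void sketch_record(const char *id, int type, void *data)
{
	struct sketch_log *log = data;
	size_t used = strlen(log->buf);

	snprintf(log->buf + used, sizeof(log->buf) - used, "%s:%d;", id, type);
}

/* Split "id=type,id:type ..." in a private copy; return 0 or -1. */
static int sketch_set_msg_types(const char *values, sketch_set_fn set,
				void *data)
{
	size_t size = strlen(values) + 1;
	char *copy = malloc(size), *p;
	int ret = 0;

	if (!copy)
		return -1;
	memcpy(copy, values, size);

	for (p = copy; *p; ) {
		size_t len = strcspn(p, " ,|");
		char *next = p + len, *eq, *c;
		int type;

		if (*next)
			*next++ = '\0';	/* terminate this spec in place */
		if (!len) {
			p = next;	/* empty spec: skip the separator */
			continue;
		}
		eq = p + strcspn(p, "=:");
		if (!*eq) {
			ret = -1;	/* missing '=' */
			break;
		}
		*eq = '\0';
		for (c = p; *c; c++)	/* upcase only the ID part */
			*c = (char)toupper((unsigned char)*c);
		type = sketch_parse_type(eq + 1);
		if (type < 0) {
			ret = -1;	/* unknown message type */
			break;
		}
		set(p, type, data);
		p = next;
	}
	free(copy);
	return ret;
}
```

Because every piece is NUL-terminated before it is handed on, the
counted-length comparisons become unnecessary.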

> +
> +static int parse_msg_type(const char *str, int len)
> +{
> +	if (len < 0)
> +		len = strlen(str);
> +
> +	if (!substrcmp(str, len, "error"))
> +		return FSCK_ERROR;
> +	else if (!substrcmp(str, len, "warn"))
> +		return FSCK_WARN;

> +	else
> +		die("Unknown fsck message type: '%.*s'",
> +				len, str);
> +}
> +
> +void fsck_set_msg_type(struct fsck_options *options,
> +		const char *msg_id, int msg_id_len,
> +		const char *msg_type, int msg_type_len)
> +{
> +	int id = parse_msg_id(msg_id, msg_id_len), type;
> +
> +	if (id < 0)
> +		die("Unhandled message id: %.*s", msg_id_len, msg_id);
> +	type = parse_msg_type(msg_type, msg_type_len);
> +
> +	if (!options->msg_type) {
> +		int i;
> +		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
> +		for (i = 0; i < FSCK_MSG_MAX; i++)
> +			msg_type[i] = fsck_msg_type(i, options);
> +		options->msg_type = msg_type;
> +	}
> +
> +	options->msg_type[id] = type;
> +}
> +
> +void fsck_set_msg_types(struct fsck_options *options, const char *values)
> +{
> +	while (*values) {
> +		int len = strcspn(values, " ,|"), equal;
> +
> +		if (!len) {
> +			values++;
> +			continue;
> +		}
> +
> +		for (equal = 0; equal < len; equal++)
> +			if (values[equal] == '=' || values[equal] == ':')
> +				break;
> +
> +		if (equal == len)
> +			die("Missing '=': '%.*s'", len, values);
> +
> +		fsck_set_msg_type(options, values, equal,
> +				values + equal + 1, len - equal - 1);
> +		values += len;
> +	}
> +}
> +
>  __attribute__((format (printf, 4, 5)))
>  static int report(struct fsck_options *options, struct object *object,
>  	enum fsck_msg_id id, const char *fmt, ...)
> @@ -599,6 +671,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
>  
>  int fsck_error_function(struct object *obj, int msg_type, const char *message)
>  {
> +	if (msg_type == FSCK_WARN) {
> +		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
> +		return 0;
> +	}
>  	error("object %s: %s", sha1_to_hex(obj->sha1), message);
>  	return 1;
>  }
> diff --git a/fsck.h b/fsck.h
> index f6f268a..edb4540 100644
> --- a/fsck.h
> +++ b/fsck.h
> @@ -6,6 +6,11 @@
>  
>  struct fsck_options;
>  
> +void fsck_set_msg_type(struct fsck_options *options,
> +		const char *msg_id, int msg_id_len,
> +		const char *msg_type, int msg_type_len);
> +void fsck_set_msg_types(struct fsck_options *options, const char *values);
> +
>  /*
>   * callback function for fsck_walk
>   * type is the expected type of the object or OBJ_ANY
> @@ -25,10 +30,11 @@ struct fsck_options {
>  	fsck_walk_func walk;
>  	fsck_error error_func;
>  	unsigned strict:1;
> +	int *msg_type;
>  };
>  
> -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
> -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
> +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
> +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
>  
>  /* descend in all linked child objects
>   * the return value is:


* Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
  2015-06-19 13:33         ` [PATCH v6 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-06-19 19:28           ` Junio C Hamano
  2015-06-19 21:34             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Some legacy code has objects with non-fatal fsck issues; to enable the
> user to ignore those issues, let's print out the ID (e.g. when
> encountering "missingemail", the user might want to call `git config
> --add receive.fsck.missingemail warn`).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c          | 16 ++++++++++++++++
>  t/t1450-fsck.sh |  4 ++--
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 8e6faa8..0b3e18f 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
>  	}
>  }
>  
> +static void append_msg_id(struct strbuf *sb, const char *msg_id)
> +{
> +	for (;;) {
> +		char c = *(msg_id)++;
> +
> +		if (!c)
> +			break;
> +		if (c != '_')
> +			strbuf_addch(sb, tolower(c));
> +	}
> +
> +	strbuf_addstr(sb, ": ");
> +}
> +
>  __attribute__((format (printf, 4, 5)))
>  static int report(struct fsck_options *options, struct object *object,
>  	enum fsck_msg_id id, const char *fmt, ...)
> @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object,
>  	struct strbuf sb = STRBUF_INIT;
>  	int msg_type = fsck_msg_type(id, options), result;
>  
> +	append_msg_id(&sb, msg_id_info[id].id_string);


Nice.  The append function can be made a bit more context sensitive
to upcase a char immediately after _ to make it easier to cut and
paste into "git config" and keep the result readable, I think.

	git config --add receive.fsck.missingEmail warn
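
A minimal sketch of such a context-sensitive conversion, writing into a
plain caller-supplied buffer instead of a strbuf so that it stays
self-contained (`sketch_camel_id()` is a hypothetical name, not part of
the patch):

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Convert an ID such as "MISSING_EMAIL" to "missingEmail".
 * Returns the length written, or -1 if out is too small.
 */
static int sketch_camel_id(const char *id, char *out, size_t outsz)
{
	size_t n = 0;
	int upcase_next = 0;

	for (; *id; id++) {
		unsigned char c = (unsigned char)*id;

		if (c == '_') {
			/* drop the underscore, upcase what follows it */
			upcase_next = 1;
			continue;
		}
		if (n + 1 >= outsz)
			return -1;
		out[n++] = (char)(upcase_next ? toupper(c) : tolower(c));
		upcase_next = 0;
	}
	if (!outsz)
		return -1;
	out[n] = '\0';
	return (int)n;
}
```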

>  	va_start(ap, fmt);
>  	strbuf_vaddf(&sb, fmt, ap);
>  	result = options->error_func(object, msg_type, sb.buf);
> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index cfb32b6..d6d3b13 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
>  	git fsck --tags 2>out &&
>  
>  	cat >expect <<-EOF &&
> -	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
> -	warning in tag $tag: invalid format - expected '\''tagger'\'' line
> +	warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format
> +	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
>  	EOF
>  	test_cmp expect out
>  '


* Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-19 17:33           ` Junio C Hamano
@ 2015-06-19 19:43             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 19:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

first of all: the improvements discussed here are already part of v6.

On 2015-06-19 19:33, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> I basically made up names on the go, based on the messages.
>>
>>> Some of the questionable groups are:
>>>
>>>     BAD_DATE DATE_OVERFLOW
>>
>> I guess it should be BAD_DATE_OVERFLOW to be more consistent?
> 
> I am not sure about "consistency", but surely a common prefix would
> help readers to group things.  But for this particular group, I was
> wondering if singling out "integer overflow", "zero stuffed
> timestamp", etc. as finer sub-errors of "you have a bad
> timestamp" was beneficial.

Well, someone thought it a good idea to print different error messages, and I took that as an indicator that there is merit in being able to distinguish these issues from one another.
 
>>>     BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE
>>>
>>>     BAD_PARENT_SHA1 INVALID_OBJECT_SHA1
>>
>> So how about s/INVALID_/BAD_/g?
> 
> It is not just about the distinction between INVALID and BAD.
> 
> I was basically wondering what rule decides which one among
> BAD_TREE_SHA1, INVALID_OBJECT_SHA1 and INVALID_TREE I would get when
> I have a random non-hexdigit string in various places, e.g. after
> 'tree ' in the object header of a commit object, after 'tag ' in a
> tag object that says 'type tree', etc.

To be honest, I think the IDs do not really matter as much as your comment makes it sound: the IDs' sole purpose is to make the message type configurable (read: whether to error out, warn or ignore the issue). The real information is in the actual message (and I did not change any message, therefore I could not make things worse than they are right now).

Example: you would never read BAD_TREE_SHA1, but instead: "badtreesha1: invalid 'tree' line format - bad sha1".

For BAD_OBJECT_SHA1 (as it is now called), there are actually two code paths generating the error: "invalid 'object' line format - bad sha1" and "no valid object to fsck".

And for BAD_TREE it is: "could not load commit's tree <SHA1>".

Thus, from the error message it should be really clear what is going on.

>>> Also it is unclear if NOT_SORTED is to be used ever for any error
>>> other than a tree object sorted incorrectly, or if we start noticing
>>> a new error that something is not sorted, we will reuse this one.
>>
>> s/NOT_SORTED/TREE_&/ maybe?
> 
> If that error is specific to tree sorting order, then that would be
> a definite improvement.

Yes, that is the case. Tree objects are assumed to list their contents in order, and this ID applies to the problem where a tree object's list is out of order.

As I said, I already made both discussed changes part of v6.

Ciao,
Dscho


* Re: [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly
  2015-06-19 13:33         ` [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-06-19 19:48           ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 19:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> When fsck_ident() identifies a problem with the ident, it should still
> advance the pointer to the next line so that fsck can continue in the
> case of a mere warning.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

Makes sense.


* Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-19 13:33         ` [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-06-19 20:12           ` Junio C Hamano
  2015-06-19 20:52             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:12 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> When fsck_commit() identifies a problem with the commit, it should try
> to make it possible to continue checking the commit object, in case the
> user wants to demote the detected errors to mere warnings.

That makes sense.

> Note that some problems are too problematic to simply ignore. For
> example, when the header lines are mixed up, we punt after encountering
> an incorrect line. Therefore, demoting certain warnings to errors can
> hide other problems. Example: demoting the missingauthor error to
> a warning would hide a problematic committer line.

Is this a warning to end-users (which would be better placed in the
documentation), or a "because some of them are too problematic to
ignore" rationale that forgot to add the conclusion "hence we do not
keep going in this code" (which belongs in the log message, if that is
what is going on)?

I notice that there are many instances of

	if (object does not pass some test)
		return report(...);

that do not do "err = report(); if (err) return;" in this function
after applying this patch.

I think that answers the above question.  The answer is "because
some are too problematic, even after this patch, we give up parsing
the remainder of the commit object once we hit certain errors,
leaving some other errors that appear later in the object
undetected".

I think that is a sensible design decision, but the proposed log
message forgets to say so.

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 28 ++++++++++++++++++++--------
>  1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 9faaf53..9fe9f48 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
>  
>  	if (!skip_prefix(buffer, "tree ", &buffer))
>  		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
> -	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
> -		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
> +	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
> +		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
> +		if (err)
> +			return err;
> +	}

I do not think this "if (err) return err;", which uses the return
value of report(), makes sense.

All the errors that use this pattern are isolated ones that do not
break parsing of the remainder (e.g. an extra > in the author ident
may break "author ", but that does not prevent us from checking
"committer ").

Your report() switches its return value based on the user setting;
specifically, it returns 0 if the user tells us to ignore/skip or
warn.  Which means that the user will see all warnings, but we stop
at the first error.

Shouldn't we continue regardless of the end-user setting in order to
show errors on other fields, too?
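
The "keep going" alternative could look roughly like the sketch below.
It uses stand-in checks (a missing '@' counts as a bad ident) and
hypothetical `sk_*` names; it is not the real fsck_commit_buffer():

```c
#include <string.h>

enum { SK_IGNORE, SK_WARN, SK_ERROR };

struct sk_result {
	int warnings;
	int errors;
};

/* Like report(): returns non-zero only when the issue is an error. */
static int sk_report(struct sk_result *res, int msg_type)
{
	if (msg_type == SK_IGNORE)
		return 0;
	if (msg_type == SK_WARN) {
		res->warnings++;
		return 0;
	}
	res->errors++;
	return 1;
}

/*
 * Check two independent fields. An error in the first one is
 * accumulated with |= instead of returned, so the second field is
 * still checked whatever the configured message type is.
 */
static int sk_check_commit(const char *author, const char *committer,
			   int msg_type, struct sk_result *res)
{
	int err = 0;

	if (!strchr(author, '@'))
		err |= sk_report(res, msg_type);
	if (!strchr(committer, '@'))
		err |= sk_report(res, msg_type);
	return err;
}
```

With this shape the set of reported problems no longer depends on
whether the first one was configured as an error or as a warning.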


* Re: [PATCH v6 09/19] fsck: Handle multiple authors in commits specially
  2015-06-19 13:34         ` [PATCH v6 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-06-19 20:16           ` Junio C Hamano
  2015-06-19 21:04             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:16 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>  	err = fsck_ident(&buffer, &commit->object, options);
>  	if (err)
>  		return err;
> +	while (skip_prefix(buffer, "author ", &buffer)) {
> +		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
> +		if (err)
> +			return err;
> +		err = fsck_ident(&buffer, &commit->object, options);
> +		if (err)
> +			return err;
> +	}

Hmph, naively I would have expected that you wouldn't need an
extra call to fsck_ident() here, and instead would see something
like this:

	author_count = 0;
	while (skip_prefix("author ")) {
        	author_count++;
                ... do the existing check as-is ...
	}
        if (author_count < 1)
        	err |= report(missing author);
	else if (author_count > 1)
        	err |= report(multiple authors);
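
A self-contained rendering of that counting idea, with the per-line
ident check left out and hypothetical names throughout (strncmp() stands
in for skip_prefix(), and the return codes are made up):

```c
#include <string.h>

enum { SK_OK = 0, SK_MISSING_AUTHOR = 1, SK_MULTIPLE_AUTHORS = 2 };

/* Count the consecutive "author " lines at the start of buf. */
static int sk_count_authors(const char *buf)
{
	int count = 0;

	while (!strncmp(buf, "author ", 7)) {
		const char *eol = strchr(buf, '\n');

		count++;
		/* the existing per-line ident check would go here */
		if (!eol)
			break;
		buf = eol + 1;
	}
	return count;
}

/* Decide once, after the loop, which problem (if any) to report. */
static int sk_check_authors(const char *buf)
{
	int n = sk_count_authors(buf);

	if (n < 1)
		return SK_MISSING_AUTHOR;
	if (n > 1)
		return SK_MULTIPLE_AUTHORS;
	return SK_OK;
}
```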


* Re: [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
  2015-06-19 13:34         ` [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-06-19 20:18           ` Junio C Hamano
  2015-06-19 21:06             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> When fsck_tag() identifies a problem with the tag, it should try
> to make it possible to continue checking the tag object, in case the
> user wants to demote the detected errors to mere warnings.

I agree with that.  But if FSCK_MSG_BAD_OBJECT_SHA1 is an ignorable
error, why should we still have a conditional "goto done" here?

Shouldn't we be parsing the object the same way regardless?

>
> Just like fsck_commit(), there are certain problems that could hide other
> issues with the same tag object. For example, if the 'type' line is not
> encountered in the correct position, the 'tag' line – if there is any –
> would not be handled at all.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fsck.c b/fsck.c
> index 0cfa4d0..21e3052 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -640,7 +640,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
>  	}
>  	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
>  		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
> -		goto done;
> +		if (ret)
> +			goto done;
>  	}
>  	buffer += 41;

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-19 13:34         ` [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-06-19 20:21           ` Junio C Hamano
  2015-06-19 21:09             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Some kinds of errors are intrinsically unrecoverable (e.g. errors while
> uncompressing objects). It does not make sense to allow demoting them to
> mere warnings.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c                          | 14 ++++++++++++--
>  t/t5504-fetch-receive-strict.sh | 11 +++++++++++
>  2 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/fsck.c b/fsck.c
> index 21e3052..a4fbce3 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -9,7 +9,12 @@
>  #include "refs.h"
>  #include "utf8.h"
>  
> +#define FSCK_FATAL -1
> +
>  #define FOREACH_MSG_ID(FUNC) \
> +	/* fatal errors */ \
> +	FUNC(NUL_IN_HEADER, FATAL) \
> +	FUNC(UNTERMINATED_HEADER, FATAL) \
>  	/* errors */ \
>  	FUNC(BAD_DATE, ERROR) \
>  	FUNC(BAD_DATE_OVERFLOW, ERROR) \
> @@ -39,11 +44,9 @@
>  	FUNC(MISSING_TYPE, ERROR) \
>  	FUNC(MISSING_TYPE_ENTRY, ERROR) \
>  	FUNC(MULTIPLE_AUTHORS, ERROR) \
> -	FUNC(NUL_IN_HEADER, ERROR) \
>  	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
>  	FUNC(TREE_NOT_SORTED, ERROR) \
>  	FUNC(UNKNOWN_TYPE, ERROR) \
> -	FUNC(UNTERMINATED_HEADER, ERROR) \

I think the end result very much makes a good sense, but why didn't
this list enumerate the errors in the above "final" order from the
beginning in 02/19?
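
For readers unfamiliar with the pattern, the list in the diff is an
X-macro: one list is expanded several times with different definitions of
FUNC. The sketch below shows the idea with an abbreviated list. All DEMO_
and demo_ names are hypothetical, and the rule in demo_can_demote() is only
an assumption about how a "fatal" default could be used.

```c
#include <stddef.h>

#define DEMO_FATAL -1
#define DEMO_ERROR 1
#define DEMO_WARN  2

/* Abbreviated, hypothetical list; the one in the patch is much longer. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	/* fatal errors */ \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(UNTERMINATED_HEADER, FATAL) \
	/* errors */ \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MULTIPLE_AUTHORS, ERROR) \
	/* warnings */ \
	FUNC(BAD_FILEMODE, WARN)

/* First expansion: one enum constant per message ID. */
#define MSG_ID(id, severity) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: a table of ID strings and default severities. */
#define MSG_ID(id, severity) { #id, DEMO_##severity },
static const struct {
	const char *id_string;
	int default_type;
} demo_msg_info[] = {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	{ NULL, 0 }
};
#undef MSG_ID

/* In this sketch, a message may be demoted only if it is not fatal. */
static int demo_can_demote(enum demo_msg_id id)
{
	return demo_msg_info[id].default_type != DEMO_FATAL;
}
```

Because both expansions come from the same list, moving an entry between
categories is a one-line change, which is what the diff above does.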

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-06-19 13:34         ` [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-06-19 20:22           ` Junio C Hamano
  2015-06-19 21:10             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:22 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>  #define FSCK_FATAL -1
> +#define FSCK_INFO -2
>  
>  #define FOREACH_MSG_ID(FUNC) \
>  	/* fatal errors */ \
> @@ -50,15 +51,16 @@
>  	FUNC(ZERO_PADDED_DATE, ERROR) \
>  	/* warnings */ \
>  	FUNC(BAD_FILEMODE, WARN) \
> -	FUNC(BAD_TAG_NAME, WARN) \
>  	FUNC(EMPTY_NAME, WARN) \
>  	FUNC(FULL_PATHNAME, WARN) \
>  	FUNC(HAS_DOT, WARN) \
>  	FUNC(HAS_DOTDOT, WARN) \
>  	FUNC(HAS_DOTGIT, WARN) \
> -	FUNC(MISSING_TAGGER_ENTRY, WARN) \
>  	FUNC(NULL_SHA1, WARN) \
> -	FUNC(ZERO_PADDED_FILEMODE, WARN)
> +	FUNC(ZERO_PADDED_FILEMODE, WARN) \
> +	/* infos (reported as warnings, but ignored by default) */ \
> +	FUNC(BAD_TAG_NAME, INFO) \
> +	FUNC(MISSING_TAGGER_ENTRY, INFO)

Exactly the same comment as 12/19 applies to this change; not only
complaints but also "result makes sense" part.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 13:35         ` [PATCH v6 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
@ 2015-06-19 20:32           ` Junio C Hamano
  2015-06-19 20:42             ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> This option avoids unpacking each and all objects, and just verifies the
> connectivity.

That sounds like marketing ;-)

"Wow this does not unpack unnecessarily, wait, it needs to unpack
and parse 3 out of 4 kinds of objects?"

Jokes aside, given that you should regularly repack your repository
anyway, I do not think it is such a big downside that this mode
misses corrupt objects, and the 1 out of 4 kinds of objects,
i.e. blobs, occupy the major part of the repository storage, so this new
mode probably makes sense.

> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 922c346..2863a8a 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
>  	test_must_fail git -C missing fsck
>  '
>  
> +test_expect_success 'fsck --quick' '
> +	rm -rf quick &&
> +	git init quick &&
> +	(
> +		cd quick &&
> +		touch empty &&
> +		git add empty &&
> +		test_commit empty &&
> +		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
> +		rm -f $empty &&
> +		echo invalid >$empty &&
> +		test_must_fail git fsck --strict &&
> +		git fsck --strict --quick &&
> +		tree=$(git rev-parse HEAD:) &&
> +		suffix=${tree#??} &&
> +		tree=.git/objects/${tree%$suffix}/$suffix &&
> +		rm -f $tree &&
> +		echo invalid >$tree &&
> +		test_must_fail git fsck --strict --quick
> +	)
> +'
> +
>  test_done

I see a few impedance mismatches here.  For --quick, I would have
expected that the addition would be in t/perf/, not here.

Also the fact that quickness comes by cheating on blobs is an
implementation detail; in the future, perhaps somebody may come up
with a way to do a quick fsck while making sure blob corruption is
also detected.  The new test that expects "--quick" to ignore a
corrupt blob forbids such progress.

If the option name was "--ignore-corrupt-blob", then the above
change is 100% justified, though.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-19 13:35         ` [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-06-19 20:39           ` Junio C Hamano
  2015-06-20 12:45             ` Johannes Schindelin
  2015-06-22  4:21           ` Junio C Hamano
  1 sibling, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:39 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +	if (strcmp(var, "receive.fsck.skiplist") == 0) {
> +		const char *path = is_absolute_path(value) ?
> +			value : git_path("%s", value);

This "either absolute or inside $GIT_DIR" looks somewhat strange to
me.  Shouldn't we mimic what "git config --path" does instead?

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-06-19 13:35         ` [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
@ 2015-06-19 20:40           ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +	if (strcmp(var, "fsck.skiplist") == 0) {
> +		const char *path = is_absolute_path(value) ?
> +			value : git_path("%s", value);

Same comment as 18/19.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 20:32           ` Junio C Hamano
@ 2015-06-19 20:42             ` Johannes Schindelin
  2015-06-19 20:53               ` Junio C Hamano
  2015-06-20  3:26               ` Junio C Hamano
  0 siblings, 2 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 20:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:32, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> This option avoids unpacking each and all objects, and just verifies the
>> connectivity.
> 
> That sounds like marketing ;-)
> 
> "Wow this does not unpack unnecessarily, wait, it needs to unpack
> and parse 3 out of 4 kinds of objects?"

Hah, you caught me there. I wanted to say "blob objects".

> Jokes aside, given that you should regularly repack your repository
> anyway, I do not think it is such a big downside that this mode
> misses corrupt objects, and the 1 out of 4 kinds of objects,
> i.e. blobs, occupy the major part of the repository storage, so this new
> mode probably makes sense.

It actually makes a ton of sense as a kind of light-weight check ;-) Try it, it is really much, much faster than a full fsck.

>> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
>> index 922c346..2863a8a 100755
>> --- a/t/t1450-fsck.sh
>> +++ b/t/t1450-fsck.sh
>> @@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
>>  	test_must_fail git -C missing fsck
>>  '
>>
>> +test_expect_success 'fsck --quick' '
>> +	rm -rf quick &&
>> +	git init quick &&
>> +	(
>> +		cd quick &&
>> +		touch empty &&
>> +		git add empty &&
>> +		test_commit empty &&
>> +		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
>> +		rm -f $empty &&
>> +		echo invalid >$empty &&
>> +		test_must_fail git fsck --strict &&
>> +		git fsck --strict --quick &&
>> +		tree=$(git rev-parse HEAD:) &&
>> +		suffix=${tree#??} &&
>> +		tree=.git/objects/${tree%$suffix}/$suffix &&
>> +		rm -f $tree &&
>> +		echo invalid >$tree &&
>> +		test_must_fail git fsck --strict --quick
>> +	)
>> +'
>> +
>>  test_done
> 
> I see a few impedance mismatches here.  For --quick, I would have
> expected that the addition would be in t/perf/, not here.
> 
> Also the fact that quickness comes by cheating on blobs is an
> implementation detail; in the future, perhaps somebody may come up
> with a way to do a quick fsck while making sure blob corruption is
> also detected.  The new test that expects "--quick" to ignore a
> corrupt blob forbids such progress.
> 
> If the option name was "--ignore-corrupt-blob", then the above
> change is 100% justified, though.

The meaning of "quick" that I was thinking of was not the same as "fast", but more like "just a quick check". As in "quick & dirty" ;-)

The point is not to ignore corrupt blobs, by the way, it is to check the connectivity only, and save substantial amounts of time doing so.

Can you think of a name for the option that is as short as `--quick` but means the same as `--connectivity-only`?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-19 20:12           ` Junio C Hamano
@ 2015-06-19 20:52             ` Johannes Schindelin
  2015-06-19 21:01               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 20:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:12, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Note that some problems are too problematic to simply ignore. For
>> example, when the header lines are mixed up, we punt after encountering
>> an incorrect line. Therefore, demoting certain errors to warnings can
>> hide other problems. Example: demoting the missingauthor error to
>> a warning would hide a problematic committer line.
> 
> Is this a warning to end-users (which should be better in the doc),
> or "because some of them are too problematic to ignore" that forgot
> to add the explanation "hence we do not keep going in this code"
> (which should be in the log message if that is what is going on)?

It was intended to offer the explanation for the design decision you commented on later:

> I notice that there are many instances of
> 
> 	if (object does not pass some test)
> 		return report(...);
> 
> that do not do "err = report(); if (err) return;" in this function
> after applying this patch.
> 
> I think that answers the above question.  The answer is "because
> some are too problematic, even after this patch, we give up parsing
> the remainder of the commit object once we hit certain errors,
> leaving some other errors that appear later in the object
> undetected".
> 
> I think that is a sensible design decision, but the proposed log
> message forgets to say so.
> 
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>>  fsck.c | 28 ++++++++++++++++++++--------
>>  1 file changed, 20 insertions(+), 8 deletions(-)
>>
>> diff --git a/fsck.c b/fsck.c
>> index 9faaf53..9fe9f48 100644
>> --- a/fsck.c
>> +++ b/fsck.c
>> @@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
>>
>>  	if (!skip_prefix(buffer, "tree ", &buffer))
>>  		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
>> -	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
>> -		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
>> +	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
>> +		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
>> +		if (err)
>> +			return err;
>> +	}
> 
> I do not think this "if (err) return err;" that uses the return
> value of report(), makes sense.
> 
> As all the errors that use this pattern are isolated ones that do
> not break parsing of the remainder (e.g. author ident had an extra >
> in it may break "author " but that does not prevent us from checking
> "committer ").
> 
> Your report() switches its return value based on the user setting;
> specifically, it returns 0 if the user tells us to ignore/skip or
> warn.  Which means that the user will see all warnings, but we stop
> at the first error.
> 
> Shouldn't we continue regardless of the end-user setting in order to
> show errors on other fields, too?

I can make that happen, but please note that this is a change of behavior: we always stopped upon the first error.

It was my intention not to change behavior in that way without a proper reason, and I saw none.

I actually see a really good reason to *keep* the current behavior: one of the most prominent users of this code path is `git receive-pack --strict`. It is used heavily by GitHub to ensure at least a certain level of validity of pushed objects. Now, for this use case it is easy to see that you want to stop *as soon as an error was encountered*. And as GitHub sponsors my work on this patch series, my main aim is to support their use case.

Having said that, I agree that it could actually make sense for `git fsck` to show all errors, or at least to have an option to do so.

But that is a story for another night ;-)
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 20:42             ` Johannes Schindelin
@ 2015-06-19 20:53               ` Junio C Hamano
  2015-06-19 23:57                 ` Scott Schmit
  2015-06-21  4:55                 ` Michael Haggerty
  2015-06-20  3:26               ` Junio C Hamano
  1 sibling, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 20:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Can you think of a name for the option that is as short as `--quick`
> but means the same as `--connectivity-only`?

No I can't.  I think `--connectivity-only` is a very good name that
is unfortunately a mouthful; I agree that we need a name that is as
short as `--xxxxx` that means the same as `--connectivity-only`.  I
do not think `--quick` is that word; it does not mean such a thing.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-19 20:52             ` Johannes Schindelin
@ 2015-06-19 21:01               ` Junio C Hamano
  2015-06-19 23:43                 ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 21:01 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>> I do not think this "if (err) return err;" that uses the return
>> value of report(), makes sense.
>> 
>> As all the errors that use this pattern are isolated ones that do
>> not break parsing of the remainder (e.g. author ident had an extra >
>> in it may break "author " but that does not prevent us from checking
>> "committer ").
>> 
>> Your report() switches its return value based on the user setting;
>> specifically, it returns 0 if the user tells us to ignore/skip or
>> warn.  Which means that the user will see all warnings, but we stop
>> at the first error.
>> 
>> Shouldn't we continue regardless of the end-user setting in order to
>> show errors on other fields, too?
>
> I can make that happen, but please note that this is a change of
> behavior: we always stopped upon the first error.

Yeah, and we always died when we saw an error, without giving users an
option to turn it down.  So?

> It was my intention not to change behavior in that way without a
> proper reason, and I saw none.

What would be the end-user experience if you stopped at the first
error?  You see an error, add an "fsck.<msg-id> = ignore" and rerun,
only to find another error and rinse and repeat?  Wouldn't you
rather see all of them and add the "ignore" to cover them in one go?

> I actually see a really good reason to *keep* the current behavior:
> one of the most prominent users of this code path is `git receive-pack
> --strict`. It is used heavily by GitHub to ensure at least a certain
> level of validity of pushed objects. Now, for this use case it is easy
> to see that you want to stop *as soon as an error was
> encountered*. And as GitHub sponsors my work on this patch series, my
> main aim is to support their use case.

While I understand that use case, I do not think stopping after
showing three more errors in a single commit would make much
difference in the bigger picture.
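
The two behaviours under discussion can be contrasted in a small sketch.
Everything here is hypothetical (the demo_ names, the fixed set of three
checks). It assumes only what is described above: a report function that
returns 0 for ignored or warned messages and nonzero for errors.

```c
enum demo_type { DEMO_IGNORE, DEMO_WARN, DEMO_ERROR };

struct demo_options {
	enum demo_type type[3];	/* configured severity of each check */
	int reported;		/* number of messages shown so far */
};

/* Nonzero only for errors; warnings are shown but return 0. */
static int demo_report(struct demo_options *o, int check)
{
	if (o->type[check] == DEMO_IGNORE)
		return 0;
	o->reported++;
	return o->type[check] == DEMO_ERROR;
}

/* Stop at the first check whose report returns nonzero. */
static int demo_check_fail_fast(struct demo_options *o, const int failed[3])
{
	for (int i = 0; i < 3; i++) {
		if (failed[i]) {
			int err = demo_report(o, i);

			if (err)
				return err;
		}
	}
	return 0;
}

/* Keep going and OR the results, so that every problem is shown. */
static int demo_check_collect_all(struct demo_options *o, const int failed[3])
{
	int err = 0;

	for (int i = 0; i < 3; i++)
		if (failed[i])
			err |= demo_report(o, i);
	return err;
}
```

With three failing checks all configured as errors, the fail-fast variant
shows one message and the collecting variant shows three; both return
nonzero, so the overall verdict for the object is the same.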

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 09/19] fsck: Handle multiple authors in commits specially
  2015-06-19 20:16           ` Junio C Hamano
@ 2015-06-19 21:04             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 21:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:16, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>>  	err = fsck_ident(&buffer, &commit->object, options);
>>  	if (err)
>>  		return err;
>> +	while (skip_prefix(buffer, "author ", &buffer)) {
>> +		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
>> +		if (err)
>> +			return err;
>> +		err = fsck_ident(&buffer, &commit->object, options);
>> +		if (err)
>> +			return err;
>> +	}
> 
> Hmph, naively I would have expected that you wouldn't need an
> extra call to fsck_ident() here, and instead would see something
> like this:
> 
> 	author_count = 0;
> 	while (skip_prefix("author ")) {
> 		author_count++;
> 		... do the existing check as-is ...
> 	}
> 	if (author_count < 1)
> 		err |= report(missing author);
> 	else if (author_count > 1)
> 		err |= report(multiple authors);

Good idea! I fixed this in my branch and it will be part of v7.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
  2015-06-19 20:18           ` Junio C Hamano
@ 2015-06-19 21:06             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 21:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:18, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> When fsck_tag() identifies a problem with the tag, it should try
>> to make it possible to continue checking the tag object, in case the
>> user wants to demote the detected errors to mere warnings.
> 
> I agree with that.  But if FSCK_MSG_BAD_OBJECT_SHA1 is an ignorable
> error, why should we still have a conditional "goto done" here?
> 
> Shouldn't we be parsing the object the same way regardless?

Same reason as I mentioned before: in `git receive-pack --strict` we want to fail early, there is no sense to keep going when we know that nothing of the rest will make it to the server.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-19 20:21           ` Junio C Hamano
@ 2015-06-19 21:09             ` Johannes Schindelin
  2015-06-19 23:30               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 21:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:21, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Some kinds of errors are intrinsically unrecoverable (e.g. errors while
>> uncompressing objects). It does not make sense to allow demoting them to
>> mere warnings.
>>
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>>  fsck.c                          | 14 ++++++++++++--
>>  t/t5504-fetch-receive-strict.sh | 11 +++++++++++
>>  2 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/fsck.c b/fsck.c
>> index 21e3052..a4fbce3 100644
>> --- a/fsck.c
>> +++ b/fsck.c
>> @@ -9,7 +9,12 @@
>>  #include "refs.h"
>>  #include "utf8.h"
>>
>> +#define FSCK_FATAL -1
>> +
>>  #define FOREACH_MSG_ID(FUNC) \
>> +	/* fatal errors */ \
>> +	FUNC(NUL_IN_HEADER, FATAL) \
>> +	FUNC(UNTERMINATED_HEADER, FATAL) \
>>  	/* errors */ \
>>  	FUNC(BAD_DATE, ERROR) \
>>  	FUNC(BAD_DATE_OVERFLOW, ERROR) \
>> @@ -39,11 +44,9 @@
>>  	FUNC(MISSING_TYPE, ERROR) \
>>  	FUNC(MISSING_TYPE_ENTRY, ERROR) \
>>  	FUNC(MULTIPLE_AUTHORS, ERROR) \
>> -	FUNC(NUL_IN_HEADER, ERROR) \
>>  	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
>>  	FUNC(TREE_NOT_SORTED, ERROR) \
>>  	FUNC(UNKNOWN_TYPE, ERROR) \
>> -	FUNC(UNTERMINATED_HEADER, ERROR) \
> 
> I think the end result very much makes a good sense, but why didn't
> this list enumerate the errors in the above "final" order from the
> beginning in 02/19?

Because they are alphabetically ordered, within message type categories, that is; this helped me develop with more ease (you do not want to know how many hundreds of times I ran an interactive rebase on all of these patches...).

And from the point of view of a development story (which a patch series is), it would puzzle me, as a reader, if those two out of all the others were in front in 02/19, when they are no different from the others at that stage.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-06-19 20:22           ` Junio C Hamano
@ 2015-06-19 21:10             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 21:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:22, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>>  #define FSCK_FATAL -1
>> +#define FSCK_INFO -2
>>
>>  #define FOREACH_MSG_ID(FUNC) \
>>  	/* fatal errors */ \
>> @@ -50,15 +51,16 @@
>>  	FUNC(ZERO_PADDED_DATE, ERROR) \
>>  	/* warnings */ \
>>  	FUNC(BAD_FILEMODE, WARN) \
>> -	FUNC(BAD_TAG_NAME, WARN) \
>>  	FUNC(EMPTY_NAME, WARN) \
>>  	FUNC(FULL_PATHNAME, WARN) \
>>  	FUNC(HAS_DOT, WARN) \
>>  	FUNC(HAS_DOTDOT, WARN) \
>>  	FUNC(HAS_DOTGIT, WARN) \
>> -	FUNC(MISSING_TAGGER_ENTRY, WARN) \
>>  	FUNC(NULL_SHA1, WARN) \
>> -	FUNC(ZERO_PADDED_FILEMODE, WARN)
>> +	FUNC(ZERO_PADDED_FILEMODE, WARN) \
>> +	/* infos (reported as warnings, but ignored by default) */ \
>> +	FUNC(BAD_TAG_NAME, INFO) \
>> +	FUNC(MISSING_TAGGER_ENTRY, INFO)
> 
> Exactly the same comment as 12/19 applies to this change; not only
> complaints but also "result makes sense" part.

And my explanation is the same ;-) At 02/19 time, it would just puzzle me, as a reader, to see special treatment without any good reason.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
  2015-06-19 19:28           ` Junio C Hamano
@ 2015-06-19 21:34             ` Johannes Schindelin
  2015-06-19 23:26               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-19 21:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 21:28, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Some legacy code has objects with non-fatal fsck issues; to enable the
>> user to ignore those issues, let's print out the ID (e.g. when
>> encountering "missingemail", the user might want to call `git config
>> --add receive.fsck.missingemail=warn`).
>>
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
>>  fsck.c          | 16 ++++++++++++++++
>>  t/t1450-fsck.sh |  4 ++--
>>  2 files changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/fsck.c b/fsck.c
>> index 8e6faa8..0b3e18f 100644
>> --- a/fsck.c
>> +++ b/fsck.c
>> @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
>>  	}
>>  }
>>
>> +static void append_msg_id(struct strbuf *sb, const char *msg_id)
>> +{
>> +	for (;;) {
>> +		char c = *(msg_id)++;
>> +
>> +		if (!c)
>> +			break;
>> +		if (c != '_')
>> +			strbuf_addch(sb, tolower(c));
>> +	}
>> +
>> +	strbuf_addstr(sb, ": ");
>> +}
>> +
>>  __attribute__((format (printf, 4, 5)))
>>  static int report(struct fsck_options *options, struct object *object,
>>  	enum fsck_msg_id id, const char *fmt, ...)
>> @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object,
>>  	struct strbuf sb = STRBUF_INIT;
>>  	int msg_type = fsck_msg_type(id, options), result;
>>
>> +	append_msg_id(&sb, msg_id_info[id].id_string);
> 
> 
> Nice.  The append function can be made a bit more context sensitive
> to upcase a char immediately after _ to make it easier to cut and
> paste into "git config" and keep the result readable, I think.
> 
> 	git config --add receive.fsck.missingEmail=warn

Okay. I camelCased the IDs; it is a bit sore on my eyes in the command-line output, and the config variables are case-insensitive, anyway, but your wish is my command... I changed it locally, it will be part of v7.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
  2015-06-19 21:34             ` Johannes Schindelin
@ 2015-06-19 23:26               ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 23:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Michael Haggerty, Jeff King

That "... can be made ..." was not my wish but more like "the way the
code is structured it is possible for somebody to do such a thing
easily, well done" compliment ;-)

The message names will have to be shown somewhere in the
documentation, and in Documentation/ we try to use camelCase to show
the word boundary; it would be better to match that, as this output is
meant to be used there.
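
A sketch of the camelCase conversion being discussed, written against a
plain char buffer rather than git's strbuf so that it stands alone. The
function name demo_camelcase_msg_id() and its signature are hypothetical;
this is not the code that ended up in v7.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Convert "MISSING_EMAIL" to "missingEmail": drop underscores, upcase the
 * character following each underscore, lowercase everything else.
 * Writes at most size - 1 characters plus a terminating NUL.
 */
static void demo_camelcase_msg_id(char *out, size_t size, const char *msg_id)
{
	size_t n = 0;
	int upcase_next = 0;

	if (!size)
		return;
	for (; *msg_id && n + 1 < size; msg_id++) {
		unsigned char c = (unsigned char)*msg_id;

		if (c == '_') {
			upcase_next = 1;
			continue;
		}
		out[n++] = (char)(upcase_next ? toupper(c) : tolower(c));
		upcase_next = 0;
	}
	out[n] = '\0';
}
```

Called with "MISSING_EMAIL", this fills the buffer with "missingEmail",
which keeps the word boundary visible when the ID is pasted into a
configuration key.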

On Fri, Jun 19, 2015 at 2:34 PM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> Hi Junio,
>
> On 2015-06-19 21:28, Junio C Hamano wrote:
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>
>>> Some legacy code has objects with non-fatal fsck issues; to enable the
>>> user to ignore those issues, let's print out the ID (e.g. when
>>> encountering "missingemail", the user might want to call `git config
>>> --add receive.fsck.missingemail=warn`).
>>>
>>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>>> ---
>>>  fsck.c          | 16 ++++++++++++++++
>>>  t/t1450-fsck.sh |  4 ++--
>>>  2 files changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fsck.c b/fsck.c
>>> index 8e6faa8..0b3e18f 100644
>>> --- a/fsck.c
>>> +++ b/fsck.c
>>> @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
>>>      }
>>>  }
>>>
>>> +static void append_msg_id(struct strbuf *sb, const char *msg_id)
>>> +{
>>> +    for (;;) {
>>> +            char c = *(msg_id)++;
>>> +
>>> +            if (!c)
>>> +                    break;
>>> +            if (c != '_')
>>> +                    strbuf_addch(sb, tolower(c));
>>> +    }
>>> +
>>> +    strbuf_addstr(sb, ": ");
>>> +}
>>> +
>>>  __attribute__((format (printf, 4, 5)))
>>>  static int report(struct fsck_options *options, struct object *object,
>>>      enum fsck_msg_id id, const char *fmt, ...)
>>> @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object,
>>>      struct strbuf sb = STRBUF_INIT;
>>>      int msg_type = fsck_msg_type(id, options), result;
>>>
>>> +    append_msg_id(&sb, msg_id_info[id].id_string);
>>
>>
>> Nice.  The append function can be made a bit more context sensitive
>> to upcase a char immediately after _ to make it easier to cut and
>> paste into "git config" and keep the result readable, I think.
>>
>>       git config --add receive.fsck.missingEmail=warn
>
> Okay. I camelCased the IDs; it is a bit sore on my eyes in the command-line output, and the config variables are case-insensitive, anyway, but your wish is my command... I changed it locally, it will be part of v7.
>
> Ciao,
> Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-19 21:09             ` Johannes Schindelin
@ 2015-06-19 23:30               ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 23:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Michael Haggerty, Jeff King

Ahh, I didn't see that they were not grouped by object types, features
or any meaningful axis.
That explains it (i.e. I can now understand why the original list was
ordered differently from the final order).


On Fri, Jun 19, 2015 at 2:09 PM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> Hi Junio,
>
> On 2015-06-19 22:21, Junio C Hamano wrote:
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>
>>> Some kinds of errors are intrinsically unrecoverable (e.g. errors while
>>> uncompressing objects). It does not make sense to allow demoting them to
>>> mere warnings.
>>>
>>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>>> ---
>>>  fsck.c                          | 14 ++++++++++++--
>>>  t/t5504-fetch-receive-strict.sh | 11 +++++++++++
>>>  2 files changed, 23 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fsck.c b/fsck.c
>>> index 21e3052..a4fbce3 100644
>>> --- a/fsck.c
>>> +++ b/fsck.c
>>> @@ -9,7 +9,12 @@
>>>  #include "refs.h"
>>>  #include "utf8.h"
>>>
>>> +#define FSCK_FATAL -1
>>> +
>>>  #define FOREACH_MSG_ID(FUNC) \
>>> +    /* fatal errors */ \
>>> +    FUNC(NUL_IN_HEADER, FATAL) \
>>> +    FUNC(UNTERMINATED_HEADER, FATAL) \
>>>      /* errors */ \
>>>      FUNC(BAD_DATE, ERROR) \
>>>      FUNC(BAD_DATE_OVERFLOW, ERROR) \
>>> @@ -39,11 +44,9 @@
>>>      FUNC(MISSING_TYPE, ERROR) \
>>>      FUNC(MISSING_TYPE_ENTRY, ERROR) \
>>>      FUNC(MULTIPLE_AUTHORS, ERROR) \
>>> -    FUNC(NUL_IN_HEADER, ERROR) \
>>>      FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
>>>      FUNC(TREE_NOT_SORTED, ERROR) \
>>>      FUNC(UNKNOWN_TYPE, ERROR) \
>>> -    FUNC(UNTERMINATED_HEADER, ERROR) \
>>
>> I think the end result very much makes a good sense, but why didn't
>> this list enumerate the errors in the above "final" order from the
>> beginning in 02/19?
>
> Because they are alphabetically ordered, within message type categories, that is; this helped me develop with more ease (you do not want to know how many hundreds of times I ran an interactive rebase on all of these patches...).
>
> And from the point of view of a development story (which a patch series is), it would puzzle me, as a reader, if those two out of all the others were in front in 02/19, when they are no different from the others at that stage.
>
> Ciao,
> Dscho
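
A minimal, self-contained sketch of the idea in this patch, assuming a shortened message list and a hypothetical `set_msg_type()` helper (the real fsck.c has many more IDs and reports the problem differently; the values chosen for `FSCK_ERROR` and `FSCK_WARN` here are also assumptions of the sketch):

```c
#include <stddef.h>

#define FSCK_FATAL -1
#define FSCK_ERROR 1
#define FSCK_WARN 2

/* Shortened, illustrative list; fatal IDs are grouped in front. */
#define FOREACH_MSG_ID(FUNC) \
	/* fatal errors */ \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(UNTERMINATED_HEADER, FATAL) \
	/* errors */ \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MULTIPLE_AUTHORS, ERROR)

/* First expansion: one enum constant per message ID. */
#define MSG_ID(id, msg_type) FSCK_MSG_##id,
enum fsck_msg_id {
	FOREACH_MSG_ID(MSG_ID)
	FSCK_MSG_MAX
};
#undef MSG_ID

/* Second expansion: the default type for each message ID. */
#define MSG_ID(id, msg_type) { #id, FSCK_##msg_type },
static const struct {
	const char *id_string;
	int msg_type;
} msg_id_info[FSCK_MSG_MAX + 1] = {
	FOREACH_MSG_ID(MSG_ID)
	{ NULL, -1 }
};
#undef MSG_ID

/*
 * Hypothetical helper: record an override for one message ID.
 * Returns -1 when asked to demote a fatal error, 0 otherwise.
 */
static int set_msg_type(int *overrides, enum fsck_msg_id id, int new_type)
{
	if (msg_id_info[id].msg_type == FSCK_FATAL && new_type != FSCK_ERROR)
		return -1;
	overrides[id] = new_type;
	return 0;
}
```

Because both the enum and the table are generated from the same list, moving the two fatal IDs to the front only requires touching `FOREACH_MSG_ID`.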

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-19 21:01               ` Junio C Hamano
@ 2015-06-19 23:43                 ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-19 23:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Junio C Hamano <gitster@pobox.com> writes:

> What would be the end-user experience if you stopped at the first
> error?  You see an error, add an "fsck.<msg-id> = ignore" and rerun,
> only to find another error and rinse and repeat?  Wouldn't you
> rather see all of them and add the "ignore" to cover them in one go?
>
>> I actually see a really good reason to *keep* the current behavior:
>> one of the most prominent users of this code path is `git receive-pack
>> --strict`. It is used heavily by GitHub to ensure at least a certain
>> level of validity of pushed objects. Now, for this use case it is easy
>> to see that you want to stop *as soon as an error was
>> encountered*. And as GitHub sponsors my work on this patch series, my
>> main aim is to support their use case.
>
> While I understand that use case, I do not think stopping after
> showing three more errors in a single commit would make much
> difference in the bigger picture.

I actually changed my mind.  The above talks about the value given
to the end user by noticing as many errors as possible in a single
object, but I'd think fsck.<msg-id> is pretty much useless as a tool
to keep using a repository with a malformed object in its history.  When you
are told object d6602ec is bad (that's the v0.99 tag that does not
have tagger field), you would never want to say "in this repository,
any tag without tagger is allowed", because you would still want to
catch and prevent future breakages of the same kind in new tags.

And the way to do so is to say "I know the object d6602ec is bad, so
do not report breakage you find to me" by using the skip-list.  For
that use case, showing _all_ errors (or warnings for that matter)
does not add any value.

So let's stop at the first error as your patch did.
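
To make the skip-list idea concrete, here is a rough sketch, assuming the list is kept as a sorted array of hexadecimal object names. The names `cmp_hex()` and `in_skip_list()` are hypothetical, and this is not necessarily how the series stores or looks up the list.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for bsearch(): both sides point at a string pointer. */
static int cmp_hex(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Hypothetical lookup: returns 1 if the object name `hex` is in the
 * sorted skip-list, 0 otherwise. Problems found in a listed object
 * would simply not be reported, while the same kind of breakage in
 * any other (e.g. newly pushed) object is still caught.
 */
static int in_skip_list(const char *const *sorted, size_t nr, const char *hex)
{
	return bsearch(&hex, sorted, nr, sizeof(*sorted), cmp_hex) != NULL;
}
```

The point of keying on the object name rather than on the message ID is that the exemption stays scoped to the known-bad object.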

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 20:53               ` Junio C Hamano
@ 2015-06-19 23:57                 ` Scott Schmit
  2015-06-20  3:24                   ` Junio C Hamano
  2015-06-21  4:55                 ` Michael Haggerty
  1 sibling, 1 reply; 275+ messages in thread
From: Scott Schmit @ 2015-06-19 23:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git, mhagger, peff

On Fri, Jun 19, 2015 at 01:53:01PM -0700, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > Can you think of a name for the option that is as short as `--quick`
> > but means the same as `--connectivity-only`?
> 
> No I can't.  I think `--connectivity-only` is a very good name that
> is unfortunately a mouthful, I agree that we need a name that is as
> short as `--xxxxx` that means the same as `--connectivity-only`.  I
> do not think `--quick` is that word; it does not mean such a thing.

How about `--linkage` or `--links`?

-- 
Scott Schmit

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 23:57                 ` Scott Schmit
@ 2015-06-20  3:24                   ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-20  3:24 UTC (permalink / raw)
  To: Scott Schmit; +Cc: Johannes Schindelin, git, mhagger, peff

Scott Schmit <i.grok@comcast.net> writes:

> On Fri, Jun 19, 2015 at 01:53:01PM -0700, Junio C Hamano wrote:
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> 
>> > Can you think of a name for the option that is as short as `--quick`
>> > but means the same as `--connectivity-only`?
>> 
>> No I can't.  I think `--connectivity-only` is a very good name that
>> is unfortunately a mouthful, I agree that we need a name that is as
>> short as `--xxxxx` that means the same as `--connectivity-only`.  I
>> do not think `--quick` is that word; it does not mean such a thing.
>
> How about `--linkage` or `--links`?

Even though "link" may be shorter than "connectivity", the real
difficulty is to come up with a phrase that conveys the "-only" part
of the fully spelled name, which is more important, without spending
the 5 letters it takes to say "-only".

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 20:42             ` Johannes Schindelin
  2015-06-19 20:53               ` Junio C Hamano
@ 2015-06-20  3:26               ` Junio C Hamano
  1 sibling, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-20  3:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

>> Jokes aside, given that you should regularly repack your repository
>> anyway, I do not think it is such a big downside that this mode
>> misses corrupt objects, and one of the 4 kinds of objects,
>> i.e. blobs, occupies a major part of the repository storage, so this
>> new mode probably makes sense.
>
> It actually makes a ton of sense as a kind of light-weight check ;-)

Yes, didn't I agree that it makes sense already?

> The meaning of "quick" that I was thinking of was not the same as
> "fast", but more like "just a quick check". As in "quick & dirty" ;-)

Sure.

> The point is not to ignore corrupt blobs, by the way, it is to check
> the connectivity only, and save substantial amounts of time doing so.

Yeah, I understand that; after all, that is exactly why I said "make
sure it does not notice corrupt blobs" is not a good test.

On the other hand, you are also checking that it notices a broken
tree (which makes it impossible to complete connectivity check),
which is a good test.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 01/19] fsck: Introduce fsck options
  2015-06-19 19:03           ` Junio C Hamano
@ 2015-06-20 12:33             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-20 12:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 21:03, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
>> index 48fa472..87ae9ba 100644
>> --- a/builtin/index-pack.c
>> +++ b/builtin/index-pack.c
>> @@ -75,6 +75,7 @@ static int nr_threads;
>>  static int from_stdin;
>>  static int strict;
>>  static int do_fsck_object;
>> +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
> 
> So there is a global fsck_options used throughout the entire
> session here.
> 
>> @@ -838,10 +839,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
>>  			if (!obj)
>>  				die(_("invalid %s"), typename(type));
>>  			if (do_fsck_object &&
>> -			    fsck_object(obj, buf, size, 1,
>> -				    fsck_error_function))
>> +			    fsck_object(obj, buf, size, &fsck_options))
>>  				die(_("Error in object"));
> 
> And that is used here to inspect each and every object we encounter.
> 
>> -			if (fsck_walk(obj, mark_link, NULL))
>> +			fsck_options.walk = mark_link;
> 
> Then we do a call to fsck_walk() starting from this object, letting
> mark_link() inspect it and set the LINK bit.
> 
>> +			if (fsck_walk(obj, NULL, &fsck_options))
>>  				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
> 
> Since nobody else sets fsck_options.walk to any other value, and
> nobody else calls fsck_walk(), shouldn't that assignment be done
> only once somewhere a lot higher in the callchain?  The apparent
> "overriding while inspecting this object" that does not have any
> corresponding "now we are done, so revert it to the original value"
> puzzled me, and I am sure it would puzzle future readers of this
> code.

Good point. I guess I was really wary that a configured walk function might change the behavior of `fsck_object()`. But after inspecting the code paths carefully, I conclude that the walk function is really only used in the `fsck_walk_*()` family of functions.

I changed that locally, will be part of v7.

Ciao,
Dscho
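
Roughly, the change amounts to something like the following sketch. All types and names here (`struct walk_options`, `setup_options()`, `check_object()`) are simplified stand-ins invented for illustration, not the actual `struct fsck_options` or index-pack code.

```c
#include <stddef.h>

#define FLAG_LINK 1

/* Simplified stand-ins for the real object and options structures. */
struct object {
	unsigned flags;
	struct object *child;
};

struct walk_options;
typedef int (*walk_func)(struct object *obj, struct walk_options *options);

struct walk_options {
	walk_func walk;
};

/* Walk callback: mark a reachable child, fail if it is missing. */
static int mark_link(struct object *obj, struct walk_options *options)
{
	(void)options;
	if (!obj)
		return -1;
	obj->flags |= FLAG_LINK;
	return 0;
}

/* Done once, high up in the call chain, instead of per object. */
static void setup_options(struct walk_options *options)
{
	options->walk = mark_link;
}

/* Per-object work: only uses the callback, never reassigns it. */
static int check_object(struct object *obj, struct walk_options *options)
{
	return options->walk(obj->child, options);
}
```

With the assignment hoisted into the setup step, the per-object code no longer looks as if it temporarily overrides a setting that it then forgets to restore.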

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-19 20:39           ` Junio C Hamano
@ 2015-06-20 12:45             ` Johannes Schindelin
  2015-06-20 17:28               ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-20 12:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 22:39, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +	if (strcmp(var, "receive.fsck.skiplist") == 0) {
>> +		const char *path = is_absolute_path(value) ?
>> +			value : git_path("%s", value);
> 
> This "either absolute or inside $GIT_DIR" looks somewhat strange to
> me.  Shouldn't we mimic what "git config --path" does instead?

Okay. That would also support user paths.

There is a problem, though: `git_config_pathname()` accepts a `const char **` parameter to set the path, yet I need to `free()` that pointer afterwards because it has been obtained through `expand_user_path()` which detaches that buffer from a `strbuf`.

In regular use cases, that does not matter much because the path obtained via `git_config_pathname()` is typically stored in a global variable, but in my case, I append it to a global variable.

Of course I could cast the value to a `char *` and `free()` it, but that would assume too much about the implementation detail of `git_config_pathname()` for my taste.

The easiest cop out would be to recapitulate the code in `git_config_pathname()`, but I think the correct solution would be to change the signature of `git_config_pathname()` to accept a `char **` instead. Do you agree? However, the diff looks really ugly:

-- snipsnap --
diff --git a/builtin/commit.c b/builtin/commit.c
index 254477f..946b31b 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1494,7 +1494,7 @@ static int git_commit_config(const char *k, const char *v, void *cb)
 	int status;
 
 	if (!strcmp(k, "commit.template"))
-		return git_config_pathname(&template_file, k, v);
+		return git_config_pathname((char **)&template_file, k, v);
 	if (!strcmp(k, "commit.status")) {
 		include_status = git_config_bool(k, v);
 		return 0;
diff --git a/builtin/config.c b/builtin/config.c
index 7188405..a249faa 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -130,7 +130,7 @@ static int format_config(struct strbuf *buf, const char *key_, const char *value
 		else
 			sprintf(value, "%d", v);
 	} else if (types == TYPE_PATH) {
-		if (git_config_pathname(&vptr, key_, value_) < 0)
+		if (git_config_pathname((char **)&vptr, key_, value_) < 0)
 			return -1;
 		must_free_vptr = 1;
 	} else if (value_) {
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 4335738..0572cae 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -177,7 +177,8 @@ free_return:
 static int git_init_db_config(const char *k, const char *v, void *cb)
 {
 	if (!strcmp(k, "init.templatedir"))
-		return git_config_pathname(&init_db_template_dir, k, v);
+		return git_config_pathname((char **)&init_db_template_dir,
+				k, v);
 
 	return 0;
 }
diff --git a/builtin/log.c b/builtin/log.c
index 8781049..8751173 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -743,7 +743,8 @@ static int git_format_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "format.signature"))
 		return git_config_string(&signature, var, value);
 	if (!strcmp(var, "format.signaturefile"))
-		return git_config_pathname(&signature_file, var, value);
+		return git_config_pathname((char **)&signature_file,
+				var, value);
 	if (!strcmp(var, "format.coverletter")) {
 		if (value && !strcasecmp(value, "auto")) {
 			config_cover_letter = COVER_AUTO;
diff --git a/cache.h b/cache.h
index 4f55466..19af52e 100644
--- a/cache.h
+++ b/cache.h
@@ -1442,7 +1442,7 @@ extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
 extern int git_config_maybe_bool(const char *, const char *);
 extern int git_config_string(const char **, const char *, const char *);
-extern int git_config_pathname(const char **, const char *, const char *);
+extern int git_config_pathname(char **, const char *, const char *);
 extern int git_config_set_in_file(const char *, const char *, const char *);
 extern int git_config_set(const char *, const char *);
 extern int git_config_parse_key(const char *, char **, int *);
diff --git a/config.c b/config.c
index 29fa012..51781b8 100644
--- a/config.c
+++ b/config.c
@@ -670,7 +670,7 @@ int git_config_string(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
-int git_config_pathname(const char **dest, const char *var, const char *value)
+int git_config_pathname(char **dest, const char *var, const char *value)
 {
 	if (!value)
 		return config_error_nonbool(var);
@@ -714,7 +714,8 @@ static int git_default_core_config(const char *var, const char *value)
 	}
 
 	if (!strcmp(var, "core.attributesfile"))
-		return git_config_pathname(&git_attributes_file, var, value);
+		return git_config_pathname((char **)&git_attributes_file,
+				var, value);
 
 	if (!strcmp(var, "core.bare")) {
 		is_bare_repository_cfg = git_config_bool(var, value);
@@ -862,7 +863,7 @@ static int git_default_core_config(const char *var, const char *value)
 		return git_config_string(&askpass_program, var, value);
 
 	if (!strcmp(var, "core.excludesfile"))
-		return git_config_pathname(&excludes_file, var, value);
+		return git_config_pathname((char **)&excludes_file, var, value);
 
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
@@ -989,7 +990,8 @@ static int git_default_push_config(const char *var, const char *value)
 static int git_default_mailmap_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "mailmap.file"))
-		return git_config_pathname(&git_mailmap_file, var, value);
+		return git_config_pathname((char **)&git_mailmap_file,
+				var, value);
 	if (!strcmp(var, "mailmap.blob"))
 		return git_config_string(&git_mailmap_blob, var, value);
 
@@ -1506,7 +1508,7 @@ int git_configset_get_pathname(struct config_set *cs, const char *key, const cha
 {
 	const char *value;
 	if (!git_configset_get_value(cs, key, &value))
-		return git_config_pathname(dest, key, value);
+		return git_config_pathname((char **)dest, key, value);
 	else
 		return 1;
 }
diff --git a/diff.c b/diff.c
index 87b16d5..e029b75 100644
--- a/diff.c
+++ b/diff.c
@@ -204,7 +204,8 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "diff.wordregex"))
 		return git_config_string(&diff_word_regex_cfg, var, value);
 	if (!strcmp(var, "diff.orderfile"))
-		return git_config_pathname(&diff_order_file_cfg, var, value);
+		return git_config_pathname((char **)&diff_order_file_cfg,
+				var, value);
 
 	if (!strcmp(var, "diff.ignoresubmodules"))
 		handle_ignore_submodules_arg(&default_diff_options, value);

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-20 12:45             ` Johannes Schindelin
@ 2015-06-20 17:28               ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-20 17:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> There is a problem, though: `git_config_pathname()` accepts a
> `const char **` parameter to set the path, yet I need to `free()`
> that pointer afterwards because it has been obtained through
> `expand_user_path()` which detaches that buffer from a `strbuf`.

"I have 'const char *' because I do not ever change the string
myself after getting it from an API function, but free() does not
want to free a const pointer" occurs sometimes in our codebase and
it is OK to cast the constness away (many callsites to free()
already do so).
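
As a small illustration of that pattern (a sketch only; `get_pathname()` is a hypothetical stand-in for an API that hands out a heap-allocated string through a `const char **`, not `git_config_pathname()` itself):

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in API: stores a freshly allocated copy of value in *dest. */
static int get_pathname(const char **dest, const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);
	*dest = copy;
	return 0;
}

/* Caller that only needs the path temporarily. */
static size_t use_path_once(const char *value)
{
	const char *path;
	size_t len;

	if (get_pathname(&path, value) < 0)
		return 0;
	len = strlen(path); /* ... read-only use of the path ... */
	free((char *)path); /* cast the constness away for free() */
	return len;
}
```

The cast is confined to the one place where ownership ends, so the signature of the API does not need to change.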

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-19 20:53               ` Junio C Hamano
  2015-06-19 23:57                 ` Scott Schmit
@ 2015-06-21  4:55                 ` Michael Haggerty
  2015-06-21  5:09                   ` Randall S. Becker
                                     ` (2 more replies)
  1 sibling, 3 replies; 275+ messages in thread
From: Michael Haggerty @ 2015-06-21  4:55 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin; +Cc: git, peff

On 06/19/2015 10:53 PM, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Can you think of a name for the option that is as short as `--quick`
>> but means the same as `--connectivity-only`?
> 
> No I can't.  I think `--connectivity-only` is a very good name that
> is unfortunately a mouthful, I agree that we need a name that is as
> short as `--xxxxx` that means the same as `--connectivity-only`.  I
> do not think `--quick` is that word; it does not mean such a thing.

`--connectivity-only` says that "of all the things that fsck can do,
skip everything except for the connectivity check". But the switch
really affects not the connectivity part of the checks (that part is
done in either case), but the blob part. So, if we ignore the length of
the option name for a moment, it seems like the options should be
something like `--check-blob-integrity`/`--no-check-blob-integrity`. The
default would remain `--check-blob-integrity` of course, but

* Someday there might be a config setting that people can use to change
the default behavior of fsck to `--no-check-blob-integrity`.
* Someday there might be other expensive types of checks [1] that we
want to turn on/off independent of blob integrity checks.

But now that I'm writing this, a silly question occurs to me: Do we need
an overall option like this at all? If I demote all blob-integrity
checks to "ignore" via the mechanism that you have added, then shouldn't
fsck automatically detect that it doesn't have to open the blobs at all
and enable this speedup automatically? So maybe
`--(no-)?check-blob-integrity` is actually a shorthand for turning a few
more specific checks on/off at once.

As for thinking of a shorter name for the option: assuming the blob
integrity checks can be turned on and off independently as described
above, then I think it is reasonable to *also* add a `--quick` option
defined as

--quick: Skip some expensive checks, dramatically reducing the
    runtime of `git fsck`. Currently this is equivalent to
    `--no-check-blob-integrity`.

In the future if we invent other expensive checks we might also add them
to the list of things that are skipped by `--quick`.

Michael

[1] For example, if LFS or something like it ever became part of
standard Git, one could imagine a super-expensive
`--check-lfs-object-availability` check that would default to OFF
but that one would sometimes turn on by hand.

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 275+ messages in thread

* RE: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21  4:55                 ` Michael Haggerty
@ 2015-06-21  5:09                   ` Randall S. Becker
  2015-06-21 14:40                     ` Johannes Schindelin
  2015-06-21 12:01                   ` Johannes Schindelin
  2015-06-21 17:15                   ` Junio C Hamano
  2 siblings, 1 reply; 275+ messages in thread
From: Randall S. Becker @ 2015-06-21  5:09 UTC (permalink / raw)
  To: 'Michael Haggerty', 'Junio C Hamano',
	'Johannes Schindelin'
  Cc: git, peff

On June 21, 2015 12:56 AM, Michael Haggerty wrote:
> On 06/19/2015 10:53 PM, Junio C Hamano wrote:
> > Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> >
> >> Can you think of a name for the option that is as short as `--quick`
> >> but means the same as `--connectivity-only`?
> >
> > No I can't.  I think `--connectivity-only` is a very good name that is
> > unfortunately a mouthful, I agree that we need a name that is as short
> > as `--xxxxx` that means the same as `--connectivity-only`.  I do not
> > think `--quick` is that word; it does not mean such a thing.
> 
> `--connectivity-only` says that "of all the things that fsck can do, skip
> everything except for the connectivity check". But the switch really affects
> not the connectivity part of the checks (that part is done in either case),
> but the blob part. So, if we ignore the length of the option name for a
> moment, it seems like the options should be something like
> `--check-blob-integrity`/`--no-check-blob-integrity`. The default would
> remain `--check-blob-integrity` of course, but
>
> * Someday there might be a config setting that people can use to change the
> default behavior of fsck to `--no-check-blob-integrity`.
> * Someday there might be other expensive types of checks [1] that we want to
> turn on/off independent of blob integrity checks.
>
> But now that I'm writing this, a silly question occurs to me: Do we need an
> overall option like this at all? If I demote all blob-integrity checks to
> "ignore" via the mechanism that you have added, then shouldn't fsck
> automatically detect that it doesn't have to open the blobs at all and enable
> this speedup automatically? So maybe `--(no-)?check-blob-integrity` is
> actually a shorthand for turning a few more specific checks on/off at once.
>
> As for thinking of a shorter name for the option: assuming the blob integrity
> checks can be turned on and off independently as described above, then I
> think it is reasonable to *also* add a `--quick` option defined as
>
> --quick: Skip some expensive checks, dramatically reducing the
>     runtime of `git fsck`. Currently this is equivalent to
>     `--no-check-blob-integrity`.
>
> In the future if we invent other expensive checks we might also add them to
> the list of things that are skipped by `--quick`.

Synonym suggestions: --links or --relations
I was going to include --refs but that may be ambiguous. Links also has
meaning so it's probably out and --hitch may just be silly and needlessly
introducing a new term.

Cheers,
Randall

-- Brief whoami: NonStop&UNIX developer since approximately
UNIX(421664400)/NonStop(211288444200000000)
-- In my real life, I talk too much.

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21  4:55                 ` Michael Haggerty
  2015-06-21  5:09                   ` Randall S. Becker
@ 2015-06-21 12:01                   ` Johannes Schindelin
  2015-06-21 17:15                   ` Junio C Hamano
  2 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 12:01 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, peff

Hi Michael,

On 2015-06-21 06:55, Michael Haggerty wrote:
> On 06/19/2015 10:53 PM, Junio C Hamano wrote:
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>
>>> Can you think of a name for the option that is as short as `--quick`
>>> but means the same as `--connectivity-only`?
>>
>> No I can't.  I think `--connectivity-only` is a very good name that
>> is unfortunately a mouthful, I agree that we need a name that is as
>> short as `--xxxxx` that means the same as `--connectivity-only`.  I
>> do not think `--quick` is that word; it does not mean such a thing.
> 
> `--connectivity-only` says that "of all the things that fsck can do,
> skip everything except for the connectivity check". But the switch
> really affects not the connectivity part of the checks (that part is
> done in either case), but the blob part.

Right, so `--skip-blobs` would be a better name, I guess, if you follow Junio's reasoning.

But...

> [...]
> As for thinking of a shorter name for the option: assuming the blob
> integrity checks can be turned on and off independently as described
> above, then I think it is reasonable to *also* add a `--quick` option
> defined as
> 
> --quick: Skip some expensive checks, dramatically reducing the
>     runtime of `git fsck`. Currently this is equivalent to
>     `--no-check-blob-integrity`.

This was my idea, without bothering to introduce the `--no-check-blob-integrity`. I was really thinking along the lines: If you just want to check quickly whether your repository is in good shape, without wanting to check too deeply, then `--quick` is your friend. I just *happened* to think of skipping blobs as a way to trade off accuracy for speed, but really, the reason why I introduced `--quick` was to have a way to check much faster if somewhat less thoroughly.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs
  2015-06-19 19:13           ` Junio C Hamano
@ 2015-06-21 13:46             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 13:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 21:13, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
>>  static struct {
>> +	const char *id_string;
>>  	int msg_type;
>>  } msg_id_info[FSCK_MSG_MAX + 1] = {
>>  	FOREACH_MSG_ID(MSG_ID)
>> -	{ -1 }
>> +	{ NULL, -1 }
>>  };
>>  #undef MSG_ID
>>
>> +static int parse_msg_id(const char *text, int len)
>> +{
>> +	int i, j;
>> +
>> +	if (len < 0)
>> +		len = strlen(text);
>> +
>> +	for (i = 0; i < FSCK_MSG_MAX; i++) {
> 
> I wonder if an array without sentinel at the end with ARRAY_SIZE() may
> be a leaner way to do these, especially as this is all limited to
> this single file.

The real reason is that I cannot prevent a trailing comma with the way I construct the array, so FSCK_MSG_MAX is a natural way to support compilers that do not allow something like

    enum { A, B, C, };

>> +		const char *key = msg_id_info[i].id_string;
>> +		/* match id_string case-insensitively, without underscores. */
>> +		for (j = 0; j < len; j++) {
>> +			char c = *(key++);
>> +			if (c == '_')
>> +				c = *(key++);
> 
> s/if/while/ perhaps?

Actually, I want to prevent double underscores so that no ambiguity occurs between the camelCased version of, say, JUNIO_HAMANO and JUNIO__HAMANO.

I inserted an `assert()`.

>> +			if (toupper(text[j]) != c)
> 
> I know the performance would not matter very much but calling
> toupper() for each letter in the user input FSCK_MSG_MAX times
> sounds rather inefficient.
> 
> Would it make sense to make the caller upcase instead (or upcase
> upfront in the function)?

As you said, performance plays a lesser role here than simplicity. The strings we receive here are passed in read-only, therefore I would have to `strdup()` them first and then make sure to `free()` them. That was too un-simple for my taste.

Having said that, if you disagree, I will introduce the complexity, though. Do you want me to do that?

Ciao,
Dscho
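
For reference, the matching loop under discussion can be tried out in isolation with a sketch like this one. The three-entry `id_strings` table and the name `parse_id()` are assumptions made for the example; the logic follows the quoted hunk, comparing the user's text case-insensitively against the ID while skipping a single underscore in the ID.

```c
#include <ctype.h>
#include <string.h>

/* Illustrative subset of message IDs; the real table is generated. */
static const char *const id_strings[] = {
	"BAD_DATE",
	"MISSING_EMAIL",
	"MULTIPLE_AUTHORS"
};
#define NR_IDS (sizeof(id_strings) / sizeof(id_strings[0]))

/*
 * Returns the index of the ID matching text[0..len), or -1.
 * A negative len means "text is NUL-terminated".
 */
static int parse_id(const char *text, int len)
{
	size_t i;
	int j;

	if (len < 0)
		len = (int)strlen(text);

	for (i = 0; i < NR_IDS; i++) {
		const char *key = id_strings[i];

		for (j = 0; j < len; j++) {
			char c = *(key++);
			if (c == '_') /* the ID has no double underscores */
				c = *(key++);
			if (toupper((unsigned char)text[j]) != c)
				break;
		}
		/* all of text matched, and the ID is used up as well */
		if (j == len && !*key)
			return (int)i;
	}
	return -1;
}
```

Note that no copy of the read-only input is needed: the upcasing happens per character during the comparison, which is the simplicity-over-speed trade-off described above.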

^ permalink raw reply	[flat|nested] 275+ messages in thread

* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-19 19:26           ` Junio C Hamano
@ 2015-06-21 13:59             ` Johannes Schindelin
  2015-06-21 17:36               ` Junio C Hamano
  2015-06-22 15:24             ` Johannes Schindelin
  1 sibling, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 13:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 21:26, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +static inline int substrcmp(const char *string, int len, const char *match)
>> +{
>> +	int match_len = strlen(match);
>> +	if (match_len != len)
>> +		return -1;
>> +	return memcmp(string, match, len);
>> +}
> 
> What this does looks suspiciously like starts_with(), but its name
> "substrcmp()" does not give any hint that this is about the beginning
> part of "string"; if anything, it gives a wrong hint that it may be
> any substring.  prefixcmp() might be a better name but that was the
> old name for !starts_with() so we cannot use it here.  It is a
> mouthful, but starts_with_counted() may be.

It is actually not `prefixcmp()` because it requires that `match` has precisely `len` bytes, while `prefixcmp()` or `starts_with()` would not care as long as the first `strlen(match)` bytes of `string` match `match`.

Maybe `fixed_length_strcmp()` would be an appropriate name, but it is pretty long and *still* does not convey exactly what this function is about.

Also please note that this is a `static inline` function, i.e. its definition does not bleed outside of this file.

So I hope that you are satisfied with simply adding

    /** compares a counted string to a NUL-terminated one. */

as comment above the `substrcmp()` function?

> But the whole thing may be moot.
> 
> If we take the "why not upcase the end-user string upfront"
> suggestion from the previous review, fsck_set_msg_types() would have
> an upcased copy of the end-user string that it can muck with; it can
> turn "badfoo=error,poorbar=..." into "BADFOO=error,POORBAR=..."
> that is stored in its own writable memory (possibly a strbuf), and
> at that point it can afford to NUL-terminate BADFOO=error after
> finding where one specification ends with strcspn() before calling
> fsck_set_msg_type(), which in turn calls parse_msg_type().

Hmm. I really do not like that kind of thinking, i.e. having to duplicate, then modify data to be able to call the API, only to have to modify the data back afterwards, and eventually having to unallocate the data in all code paths. That feels just very inelegant to me.

I agree that it cannot be helped, sometimes, when you *have* to do exactly such mucking with data on a copy, be it to avoid additional complexity or poor performance. But I simply do not see the need for that here.

Ciao,
Dscho
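
To illustrate why no writable copy is needed, here is a sketch that walks a comma-separated "id=type" list in place using counted strings. `substrcmp()` is the helper quoted above; `count_type()` and the accepted list format are assumptions made for this example only.

```c
#include <string.h>

/* compares a counted string to a NUL-terminated one */
static int substrcmp(const char *string, int len, const char *match)
{
	int match_len = (int)strlen(match);

	if (match_len != len)
		return -1;
	return memcmp(string, match, len);
}

/*
 * Hypothetical example: count the entries of "id=type,id=type,..."
 * whose type equals `wanted`, without modifying or copying `values`.
 */
static int count_type(const char *values, const char *wanted)
{
	int count = 0;

	while (*values) {
		size_t len = strcspn(values, ","); /* one "id=type" entry */
		const char *eq = memchr(values, '=', len);

		if (eq) {
			const char *type = eq + 1;
			int type_len = (int)(values + len - type);

			if (!substrcmp(type, type_len, wanted))
				count++;
		}
		values += len;
		if (*values == ',')
			values++;
	}
	return count;
}
```

Since the length check comes first, "warning" does not match "warn" even though it starts with it, which is the difference from `starts_with()` pointed out above.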

^ permalink raw reply	[flat|nested] 275+ messages in thread

* RE: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21  5:09                   ` Randall S. Becker
@ 2015-06-21 14:40                     ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 14:40 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: 'Michael Haggerty', 'Junio C Hamano', git, peff

Hi Randall,

On 2015-06-21 07:09, Randall S. Becker wrote:
> On June 21, 2015 12:56 AM, Michael Haggerty wrote:
>
>> As for thinking of a shorter name for the option: assuming the blob integrity
>> checks can be turned on and off independently as described above, then I think
>> it is reasonable to *also* add a `--quick` option defined as
>>
>> --quick: Skip some expensive checks, dramatically reducing the
>>     runtime of `git fsck`. Currently this is equivalent to
>>     `--no-check-blob-integrity`.
>>
>> In the future if we invent other expensive checks we might also add them to the
>> list of things that are skipped by `--quick`.
> 
> Synonym suggestions: --links or --relations
> I was going to include --refs but that may be ambiguous. Links also has
> meaning so it's probably out and --hitch may just be silly and needlessly
> introducing a new term.

I appreciate your input, but I think in this case, "links" and "relations" would just add new terminology, and I would like to keep using existing terminology as much as possible (because we already have a confusingly large glossary). I also see "link" as a problematic term because it resembles "gitlink" -- which means something entirely different.

My favorite is still `--quick` because in contrast to Junio, to me it *has* a connotation of "less thoroughly". "Get rich quick" just does not mean "Earn a lot of money by doing a thorough job, just faster than you usually would".

If Junio insists, I will of course rename it to `--skip-blobs` because that is really what we do. However, it has been my experience over and over again that letting implementation details bleed through to the user interface always comes back to haunt you[*1*]. In this case, the purpose of the option really just was to cut a few corners to run `git fsck` more quickly. The fact that the corner I cut here is to skip unpacking of the blobs just happens to be the implementation detail.

So, Junio, what is it: keep `--quick` and clarify in the documentation that we cut corners (and which corners), or rename it to `--skip-blobs`? Your call.

Ciao,
Dscho

Footnote *1*: You would not believe how often I had to apologize for renaming the `edit-patch-series` script to `interactive rebase`. You see, technically it is correct to call it an interactive rebase: it really is that, from a purely technical, soul-less point of view. From the users' point of view, however, the name `edit-patch-series` would relate exactly what the command is *about*, as opposed to *how* it works. To most users, calling it "rebase -i" is as understandable as "sympodial fasciculation and regrafting of disconnate vernates" (and don't ask me what that even means).


* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21  4:55                 ` Michael Haggerty
  2015-06-21  5:09                   ` Randall S. Becker
  2015-06-21 12:01                   ` Johannes Schindelin
@ 2015-06-21 17:15                   ` Junio C Hamano
  2015-06-21 18:27                     ` Johannes Schindelin
  2 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-21 17:15 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Johannes Schindelin, git, peff

Michael Haggerty <mhagger@alum.mit.edu> writes:

> But now that I'm writing this, a silly question occurs to me: Do we need
> an overall option like this at all? If I demote all blob-integrity
> checks to "ignore" via the mechanism that you have added, then shouldn't
> fsck automatically detect that it doesn't have to open the blobs at all
> and enable this speedup automatically?

That's brilliant.

Just to make sure I am reading you correctly, you mean the current
overall structure:

	if (! "is the blob loadable and well-formed?") {
		report("BAD BLOB");
		... which is equivalent to ...
		if ("is bad_blob ignored?")
			; /* no-op */
		else {
			output "BAD BLOB";
			if ("is bad_blob an error?")
				return 1; /* error */
		}
	}
	... other checks ...

can be turned into this structure:

	if ("is bad_blob ignored?")
		;
	else if (! "is the blob loadable and well-formed?") {
		report("BAD BLOB");
		... which would be equivalent to ...
		output "BAD BLOB";
		if ("is bad_blob an error?")
			return 1; /* error */
	}
	... other checks ...

I think that makes tons of sense.  With one minor caveat.  In the
above "rewrite" I deliberately described report() in the updated
flow to always output, but that would force all checkers to adopt
"do not unconditionally check; if we do not report, do not even
bother checking", which (1) would be a large change to what Dscho
currently has, and (2) might not apply to certain kinds of error
conditions.

But that minor caveat is easily addressed by keeping the "if we are
set to ignore this error, just return 0" in report().  Some codepaths
(like the "BAD BLOB" above) may not exercise it by bypassing the
call to report() upfront and that is perfectly fine.
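
The combination described above can be sketched in C roughly as follows. All names here (`msg_type`, `options`, `check_blob()`, `blob_is_well_formed()`) are hypothetical and chosen for this illustration only; they are not the identifiers used in the series:

```c
#include <stdio.h>

enum msg_type { MSG_ERROR, MSG_WARN, MSG_IGNORE };	/* hypothetical */

struct options {
	enum msg_type bad_blob;	/* configured severity for "BAD BLOB" */
	int checks_run;		/* counts expensive checks, for illustration */
};

/* Stand-in for an expensive check such as inflating a blob. */
static int blob_is_well_formed(struct options *o, int blob_ok)
{
	o->checks_run++;
	return blob_ok;
}

/* report() keeps its own "ignore" guard, so other callers need no change. */
static int report(enum msg_type type, const char *msg)
{
	if (type == MSG_IGNORE)
		return 0;
	fprintf(stderr, "%s: %s\n",
		type == MSG_ERROR ? "error" : "warning", msg);
	return type == MSG_ERROR;
}

static int check_blob(struct options *o, int blob_ok)
{
	if (o->bad_blob == MSG_IGNORE)
		return 0;	/* bypass the expensive check entirely */
	if (!blob_is_well_formed(o, blob_ok))
		return report(o->bad_blob, "BAD BLOB");
	return 0;
}
```

Checkers that cannot cheaply decide up front may still call `report()` unconditionally; the guard inside `report()` keeps them correct.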

I like that idea.

> So maybe
> `--(no-)?check-blob-integrity` is actually a shorthand for turning a few
> more specific checks on/off at once.
>
> As for thinking of a shorter name for the option: assuming the blob
> integrity checks can be turned on and off independently as described
> above, then I think it is reasonable to *also* add a `--quick` option
> defined as
>
> --quick: Skip some expensive checks, dramatically reducing the
>     runtime of `git fsck`. Currently this is equivalent to
>     `--no-check-blob-integrity`.
>
> In the future if we invent other expensive checks we might also add them
> to the list of things that are skipped by `--quick`.

Yes, that is doubly brilliant. Taken together with the auto-skipping
of the checks based on the report settings, it makes it unnecessary
to even introduce --[no-]check-blob-integrity or any other new
knobs.

Very well analysed.  I am happy with the "--quick" with that
definition.

Thanks.


* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-21 13:59             ` Johannes Schindelin
@ 2015-06-21 17:36               ` Junio C Hamano
  2015-06-21 18:23                 ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-21 17:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Hmm. I really do not like that kind of thinking, i.e. having to
> duplicate, then modify data to be able to call the API, only to have
> to modify the data back afterwards, and eventually having to
> unallocate the data in all code paths. That feels just very inelegant
> to me.

You can see in our codebase that I have avoided touching end-user
strings by using a substring in place by introducing a new API
function or using an existing API function that takes <ptr, len> for
exactly that reason.  There also are cases where we are better off
if we make a copy upfront at a very high level in the callchain if
that makes the processing of that string deeper in the callchain
much simpler without customized helpers that take counted strings.

And 03/19 and this one taken together, I think it is an example of
the latter [*1*].

You not only need to invent counted string comparison in 04/19 but
also need upcasing byte-by-byte comparison in a loop in 03/19; both
of which can be made much simpler if you massaged the end-user input
"foo=error,bar=ignore" into "FOO=error,BAR=ignore" and allowed the
code to loop over it to turn ',' into NUL while parsing each
individual piece (i.e. "FOO=error").

So contrary to what you said in response to my review on 03/19, I
view this as not "adding complexity" but its total opposite.  It is
to make the code and logic much less complex by paying the price for
one copied (and massaged) string.
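
A minimal sketch of that approach, under stated assumptions: `for_each_spec()`, `spec_fn` and `collect()` are hypothetical names invented for this example, and plain `malloc()` stands in for whatever buffer (possibly a strbuf) the real code would use. Only the message ID part is upcased; the value is left as typed:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: receives one NUL-terminated "ID=type" piece. */
typedef void (*spec_fn)(const char *spec, void *data);

static int for_each_spec(const char *input, spec_fn fn, void *data)
{
	size_t i, n = strlen(input);
	char *copy = malloc(n + 1), *p;
	int in_value = 0;

	if (!copy)
		return -1;
	/* Copy once, upcasing everything before '=' in each piece. */
	for (i = 0; i <= n; i++) {
		char c = input[i];
		if (c == '=')
			in_value = 1;
		else if (c == ',')
			in_value = 0;
		copy[i] = in_value ? c : (char)toupper((unsigned char)c);
	}
	/* Turn each ',' into NUL and hand out the pieces one by one. */
	for (p = copy; *p; ) {
		size_t len = strcspn(p, ",");
		int last = (p[len] == '\0');
		p[len] = '\0';
		if (len)
			fn(p, data);
		if (last)
			break;
		p += len + 1;
	}
	free(copy);
	return 0;
}

/* Example callback that records the pieces it is given. */
struct collected { char items[8][64]; int nr; };

static void collect(const char *spec, void *data)
{
	struct collected *c = data;
	if (c->nr < 8) {
		strncpy(c->items[c->nr], spec, 63);
		c->items[c->nr][63] = '\0';
		c->nr++;
	}
}
```

Because each piece is NUL-terminated in the private copy, the callee can use plain `strcmp()` and needs no counted-string helper.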

Having said all that, as long as the result functions correctly, I
suspect that it may not matter much either way.  As I already said,
"we can demote this error to a warning (or ignored)" is much less
useful in practice than the "we know this and that object in this
project does not pass fsck, so please do not bother checking" in my
mind, and that makes me think that this part of the series is much
less important than the skip-list thing.  Unnecessary complexity in
the code and otherwise useless helper functions may be something
that can be simplified following my advice, but that can be done as
a code clean-up, simplification and optimization by somebody else
later if they really cared.

Thanks.


[Footnote]

*1* Everything I say in my messages is what "I think", so saying "I
think" before saying what I think is redundant, but this one needs
that, because I think ;-) it is in the "taste" territory to view
each individual case and decide which one of these two opposite
approaches is more appropriate.


* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-21 17:36               ` Junio C Hamano
@ 2015-06-21 18:23                 ` Johannes Schindelin
  2015-06-21 18:47                   ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 18:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-21 19:36, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Hmm. I really do not like that kind of thinking, i.e. having to
>> duplicate, then modify data to be able to call the API, only to have
>> to modify the data back afterwards, and eventually having to
>> unallocate the data in all code paths. That feels just very inelegant
>> to me.
> 
> You can see in our codebase that I have avoided touching end-user
> strings by using a substring in place by introducing a new API
> function or using an existing API function that takes <ptr, len> for
> exactly that reason.  There also are cases where we are better off
> if we make a copy upfront at a very high level in the callchain if
> that makes the processing of that string deeper in the callchain
> much simpler without customized helpers that take counted strings.
> 
> And 03/19 and this one taken together, I think it is an example of
> the latter [*1*].
> 
> You not only need to invent counted string comparison in 04/19 but
> also need upcasing byte-by-byte comparison in a loop in 03/19; both
> of which can be made much simpler if you massaged the end-user input
> "foo=error,bar=ignore" into "FOO=error,BAR=ignore" and allowed the
> code to loop over it to turn ',' into NUL while parsing each
> individual piece (i.e. "FOO=error").
> 
> So contrary to what you said in response to my review on 03/19, I
> view this as not "adding complexity" but its total opposite.  It is
> to make the code and logic much less complex by paying the price for
> one copied (and massaged) string.

How about I implement your suggestion tomorrow, then show the diff between the two versions and we can assess what looks to be simpler (i.e. more maintainable)?

BTW I agree about the skip-list feature being the most important outcome of this patch series; it only occurred to me as a useful feature at the very end of my work on the fsck IDs, but then I did not want to throw away all of that previous work ;-)

Ciao,
Dscho


* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21 17:15                   ` Junio C Hamano
@ 2015-06-21 18:27                     ` Johannes Schindelin
  2015-06-21 20:35                       ` Junio C Hamano
  0 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-21 18:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, peff

Hi Junio,

On 2015-06-21 19:15, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> But now that I'm writing this, a silly question occurs to me: Do we need
>> an overall option like this at all? If I demote all blob-integrity
>> checks to "ignore" via the mechanism that you have added, then shouldn't
>> fsck automatically detect that it doesn't have to open the blobs at all
>> and enable this speedup automatically?
> 
> That's brilliant.
> 
> Just to make sure I am reading you correctly, you mean the current
> overall structure:
> 
> [...]

The way I read Michael's mail, he actually meant something different: if all of the blob-related errors/warnings are switched to "ignore", simply skip unpacking the blobs. If I read this correctly, I would like to point out that introducing a future blob-related check would require scripts to be changed to benefit from the speed-up, unless the `--quick` option is *also* introduced.

The way I read your suggestion, however, was to introduce *yet another* message ID: BAD_BLOB. If that one is turned to "ignore", we simply won't look at blobs' contents at all.

I like that idea, together with:

>> --quick: Skip some expensive checks, dramatically reducing the
>>     runtime of `git fsck`. Currently this is equivalent to
>>     `--no-check-blob-integrity`.

... where this would become `-c fsck.badBlob=ignore`.

>> In the future if we invent other expensive checks we might also add them
>> to the list of things that are skipped by `--quick`.
> 
> Yes, that is doubly brilliant. Taken together with the auto-skipping
> of the checks based on the report settings, it makes it unnecessary
> to even introduce --[no-]check-blob-integrity or any other new
> knobs.
> 
> Very well analysed.  I am happy with the "--quick" with that
> definition.

Did I understand you correctly? If so, I will gladly implement this tomorrow and send out v7.

Ciao,
Dscho


* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-21 18:23                 ` Johannes Schindelin
@ 2015-06-21 18:47                   ` Junio C Hamano
  0 siblings, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-21 18:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> How about I implement your suggestion tomorrow, then show the diff
> between the two versions and we can assess what looks to be simpler
> (i.e. more maintainable)?

I'm indifferent at this point, partly because we agree that what
you have is OK as long as it works, and more importantly because
Michael's suggestion to turn "check unconditionally and only control
whether we ignore, warn or error out on the result" into "do not even
check if we are told to ignore" is a much more productive thing to
spend your time on.


* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21 18:27                     ` Johannes Schindelin
@ 2015-06-21 20:35                       ` Junio C Hamano
  2015-06-21 20:46                         ` Junio C Hamano
  2015-06-22 13:01                         ` Johannes Schindelin
  0 siblings, 2 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-21 20:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Michael Haggerty, git, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> On 2015-06-21 19:15, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> That's brilliant.
>> 
>> Just to make sure I am reading you correctly, you mean the current
>> overall structure:
>> 
>> [...]
>
> The way I read Michael's mail, he actually meant something different:
> if all of the blob-related errors/warnings are switched to "ignore",
> simply skip unpacking the blobs.

That is how I read his mail, too.

But because IIRC we do not check anything special with blob other
than we can read it correctly, my description of "overall structure"
stayed at a very high conceptual level.  The unpacking may happen at
a much higher level in the code, i.e. it comes way before this part
of the logic flow:

        if ("is bad_blob ignored?")
		;
	else if (! "is the blob loadable and well-formed?") {

in which case the "is bad_blob ignored?" check may have to happen
before we unpack the object.

And I do not suggest introducing yet another BAD_BLOB error class; I
would have guessed that you already have an error class for objects
that are not stored correctly (be it truncated loose object, checksum
mismatch in the packed base object, or corrupt delta in pack).

It so happens that blob is the only type of object that does not
have outgoing links that are needed for the connectivity check, so
even if you allow ignoring the "error class for objects that are not
stored correctly", you would still have to read trees, commits and
tags; it would be a natural consequence of ignoring that class of
errors that you would get a quick-and-dirty fsck by not unpacking blobs.

Of course, that assumes that you can tell an object is a blob
without unpacking.  If a tree entry mentions an object to be a blob
by having 100644 as its mode, unless you unpack the object pointed
at by that tree entry to make sure it is a blob, you wouldn't be
able to detect a case where a non-blob object is stored with 100644
mode, which would be an error in the containing tree object that we
may want to detect.  I am not sure if "skipping inflation of blobs,
but still ensure connectivity and tree integrity" is really a viable
mode of quick-and-dirty operation.  I would imagine you would need
to lose a bit more than "we don't bother reading blobs" (which is OK
by me, but I am just pointing out that (1) I do not mean to say we
should add BAD_BLOB as a new class, and (2) the checks that Michael's
--quick automatically bypasses may not be limited to the "we cannot
read this blob object" class, but may also need to include checks for
some form of tree integrity violation).

Thanks.


* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21 20:35                       ` Junio C Hamano
@ 2015-06-21 20:46                         ` Junio C Hamano
  2015-06-22 13:01                         ` Johannes Schindelin
  1 sibling, 0 replies; 275+ messages in thread
From: Junio C Hamano @ 2015-06-21 20:46 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Michael Haggerty, git, peff

Junio C Hamano <gitster@pobox.com> writes:

> Of course, that assumes that you can tell an object is a blob
> without unpacking.  If a tree entry mentions an object to be a blob
> by having 100644 as its mode, unless you unpack the object pointed
> at by that tree entry to make sure it is a blob, you wouldn't be
> able to detect a case where a non-blob object is stored with 100644
> mode, which would be an error in the containing tree object that we
> may want to detect.

Heh, I was being stupid here.  Of course, you can tell an object is
a blob without fully inflating it (we should already be doing that
in sha1_object_info(), inflating only the object header in loose
objects or inflating only the early parts of delta in packs to
follow the delta chain to learn the type of the object), and that
would certainly save us substantial zlib cost.
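
As a rough illustration of why the type can be learned cheaply: an inflated loose object begins with a header of the form `<type> <size>` followed by a NUL byte, so only the first few inflated bytes are needed. The sketch below assumes those leading bytes have already been inflated (the zlib step is left out), and `parse_loose_header()` is a hypothetical helper, not the function named above:

```c
#include <stddef.h>
#include <string.h>

/*
 * Parse "<type> <decimal size>\0" from the already-inflated leading
 * bytes of a loose object. Returns 0 on success, -1 on malformed input.
 */
static int parse_loose_header(const char *buf, size_t len,
			      char *type, size_t type_sz,
			      unsigned long *size)
{
	const char *sp = memchr(buf, ' ', len);
	const char *end = memchr(buf, '\0', len);
	const char *p;
	size_t tlen;
	unsigned long n = 0;

	if (!sp || !end || sp > end || sp == buf)
		return -1;
	tlen = (size_t)(sp - buf);
	if (tlen >= type_sz || sp + 1 == end)
		return -1;
	for (p = sp + 1; p < end; p++) {
		if (*p < '0' || *p > '9')
			return -1;
		n = n * 10 + (unsigned long)(*p - '0');
	}
	memcpy(type, buf, tlen);
	type[tlen] = '\0';
	*size = n;
	return 0;
}
```

Packed objects need a different path (following the delta chain to the base object), which this sketch does not cover.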

Sorry about the noise.


* Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-19 13:35         ` [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
  2015-06-19 20:39           ` Junio C Hamano
@ 2015-06-22  4:21           ` Junio C Hamano
  2015-06-22  8:49             ` Johannes Schindelin
  1 sibling, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-22  4:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Michael Haggerty, Jeff King

On Fri, Jun 19, 2015 at 6:35 AM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
>
> @@ -227,6 +277,10 @@ static int report(struct fsck_options *options, struct object *object,
>         if (msg_type == FSCK_IGNORE)
>                 return 0;
>
> +       if (options->skiplist && object &&
> +                       sha1_array_lookup(options->skiplist, object->sha1) >= 0)
> +               return 0;
> +
>         if (msg_type == FSCK_FATAL)
>                 msg_type = FSCK_ERROR;
>         else if (msg_type == FSCK_INFO)

I just double checked this patch because I wanted to make sure this
was applied in the
report() function (i.e. behave as if FSCK_IGNORE was specified for
specific objects on
the skip list), and I am happy to see that it indeed is the case.

That was because I briefly feared that skip could be done before going
through the usual
verification chain, which would have been very wrong (e.g. we may want
not to hear about
missing tagger in v2.6.11-tree tag, but nevertheless we would want to
check all the tree
contents pointed at by that tag, as that tree may not be reachable by
any other way).

So this one looks good.
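
The placement matters because a skip-list lookup inside the reporting function only silences messages about an object; it does not stop the traversal of what that object points at. A simplified, self-contained sketch follows. The names and the flat sorted array are assumptions made for this example, with standard `bsearch()` standing in for the lookup used in the patch:

```c
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20	/* raw SHA-1 length */

enum msg_type { MSG_ERROR, MSG_WARN, MSG_IGNORE };	/* hypothetical */

struct options {
	const unsigned char *skiplist;	/* skiplist_nr sorted IDs, ID_LEN bytes each */
	size_t skiplist_nr;
};

static int cmp_id(const void *a, const void *b)
{
	return memcmp(a, b, ID_LEN);
}

/* Returns non-zero only for an error that is neither ignored nor skipped. */
static int report(const struct options *o, const unsigned char *id,
		  enum msg_type type)
{
	if (type == MSG_IGNORE)
		return 0;
	/* Objects on the skip list behave as if their messages were ignored. */
	if (o->skiplist && id &&
	    bsearch(id, o->skiplist, o->skiplist_nr, ID_LEN, cmp_id))
		return 0;
	return type == MSG_ERROR;
}
```

The caller keeps walking the object graph regardless of what `report()` returns for a skipped object, so a tree reachable only through a skipped tag is still checked.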


* Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-22  4:21           ` Junio C Hamano
@ 2015-06-22  8:49             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22  8:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Michael Haggerty, Jeff King

Hi Junio,

On 2015-06-22 06:21, Junio C Hamano wrote:
> On Fri, Jun 19, 2015 at 6:35 AM, Johannes Schindelin
> <johannes.schindelin@gmx.de> wrote:
>>
>> @@ -227,6 +277,10 @@ static int report(struct fsck_options *options, struct object *object,
>>         if (msg_type == FSCK_IGNORE)
>>                 return 0;
>>
>> +       if (options->skiplist && object &&
>> +                       sha1_array_lookup(options->skiplist, object->sha1) >= 0)
>> +               return 0;
>> +
>>         if (msg_type == FSCK_FATAL)
>>                 msg_type = FSCK_ERROR;
>>         else if (msg_type == FSCK_INFO)
> 
> I just double checked this patch because I wanted to make sure this
> was applied in the
> report() function (i.e. behave as if FSCK_IGNORE was specified for
> specific objects on
> the skip list), and I am happy to see that it indeed is the case.
> 
> That was because I briefly feared that skip could be done before going
> through the usual
> verification chain, which would have been very wrong (e.g. we may want
> not to hear about
> missing tagger in v2.6.11-tree tag, but nevertheless we would want to
> check all the tree
> contents pointed at by that tag, as that tree may not be reachable by
> any other way).

To be honest, an earlier iteration actually did have that test much earlier in the call chain, but I had changed it to the current location in v5.

My rationale was slightly different from yours: I wanted to affect the performance as little as possible. So looking up each and every object in the skip list (which I expect to be relatively small) seemed to be wasteful. And then it occurred to me that it would make much more sense to just make the skip-list functionality equivalent to the "ignore" message type.

It just occurred to me, however, that one thing is possibly surprising with either version of the skip list functionality: if a certain object is corrupt on disk, it cannot be skipped via the skip-list, as the object is *still* unpacked (which would fail in the case of a corrupt object).

I will document that in the man page.

Ciao,
Dscho


* Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
  2015-06-21 20:35                       ` Junio C Hamano
  2015-06-21 20:46                         ` Junio C Hamano
@ 2015-06-22 13:01                         ` Johannes Schindelin
  1 sibling, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 13:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, peff

Hi Junio,

On 2015-06-21 22:35, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> On 2015-06-21 19:15, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>> That's brilliant.
>>>
>>> Just to make sure I am reading you correctly, you mean the current
>>> overall structure:
>>>
>>> [...]
>>
>> The way I read Michael's mail, he actually meant something different:
>> if all of the blob-related errors/warnings are switched to "ignore",
>> simply skip unpacking the blobs.
> 
> That is how I read his mail, too.
> 
> But because IIRC we do not check anything special with blob other
> than we can read it correctly, my description of "overall structure"
> stayed at a very high conceptual level.  The unpacking may happen at
> a much higher level in the code, i.e. it comes way before this part
> of the logic flow:
> 
>         if ("is bad_blob ignored?")
> 		;
> 	else if (! "is the blob loadable and well-formed?") {
> 
> in which case the "is bad_blob ignored?" check may have to happen
> before we unpack the object.
> 
> And I do not suggest introducing yet another BAD_BLOB error class; I
> would have guessed that you already have an error class for objects
> that are not stored correctly (be it truncated loose object, checksum
> mismatch in the packed base object, or corrupt delta in pack).

Sadly, there is no BAD_BLOB class. The reason is that we actually perform no test on blobs, as you pointed out, except for the implicit one: read it as a blob object.

And reading them even only partially would still imply a lot of I/O, taking away much of the performance improvement I wanted to achieve here.

Further, please note that the `--quick` option *solely* impacts `git fsck`, not `git receive-pack`, because we actually really skipped everything except the connectivity test.

To allow this discussion to be resolved without further ado, I therefore renamed the `--quick` option to `--connectivity-only`, as even I realize that there is not much of a check left if not even author or committer lines are tested.

Ciao,
Dscho


* Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-19 19:26           ` Junio C Hamano
  2015-06-21 13:59             ` Johannes Schindelin
@ 2015-06-22 15:24             ` Johannes Schindelin
  1 sibling, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-19 21:26, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +static inline int substrcmp(const char *string, int len, const char *match)
>> +{
>> +	int match_len = strlen(match);
>> +	if (match_len != len)
>> +		return -1;
>> +	return memcmp(string, match, len);
>> +}
> 
> What this does looks suspiciously like starts_with(), but its name
> "substrcmp()" does not give any hint that this is about the beginning
> part of "string"; if anything, it gives a wrong hint that it may be
> any substring.  prefixcmp() might be a better name but that was the
> old name for !starts_with() so we cannot use it here.  It is a
> mouthful, but starts_with_counted() may be.
> 
> But the whole thing may be moot.
> 
> If we take the "why not upcase the end-user string upfront"
> suggestion from the previous review, fsck_set_msg_types() would have
> an upcased copy of the end-user string that it can muck with; it can
> turn "badfoo=error,poorbar=..." into "BADFOO=error,POORBAR=..."
> that is stored in its own writable memory (possibly a strbuf), and
> at that point it can afford to NUL-terminate BADFOO=error after
> finding where one specification ends with strcspn() before calling
> fsck_set_msg_type(), which in turn calls parse_msg_type().
> 
> So all parse_msg_type() needs to do is just !strcmp().

Turns out that the diffstat says it saves 10 lines. So I changed it according to your suggestion. Part of v7.

Ciao,
Dscho


* [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
                           ` (18 preceding siblings ...)
  2015-06-19 13:35         ` [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
@ 2015-06-22 15:24         ` Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 01/19] fsck: Introduce fsck options Johannes Schindelin
                             ` (19 more replies)
  19 siblings, 20 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:24 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

At the moment, the git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.

Changes since v6:

- camelCased message IDs

- multiple author checking now as suggested by Junio

- renamed `--quick` to `--connectivity-only`, better commit message

- `fsck.skipList` is now handled correctly (and not mistaken for a message
  type setting)

- `fsck.skipList` can handle user paths now

- index-pack configures the walk function in a more logical place now

- simplified code by avoiding working on partial strings (i.e. removed
  `substrcmp()`). This saves 10 lines. To accommodate parsing config
  variables directly, we now work on lowercased message IDs; unfortunately
  this means that we cannot use them in append_msg_id() because that
  function wants to append camelCased message IDs.
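
To make the last point concrete, the following sketch shows why two spellings are involved. It assumes message IDs are declared in a form like `MISSING_EMAIL`; `convert_id()` is a hypothetical helper written for this illustration and is not part of the series:

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Derive a config-style ID from an identifier such as "MISSING_EMAIL":
 *   camel == 0: "missingemail"  (lowercased, for matching config keys)
 *   camel != 0: "missingEmail"  (camelCased, for display)
 */
static void convert_id(const char *id, char *out, size_t out_sz, int camel)
{
	size_t n = 0;
	int upcase_next = 0;

	if (!out_sz)
		return;
	for (; *id && n + 1 < out_sz; id++) {
		unsigned char c = (unsigned char)*id;
		if (c == '_') {
			upcase_next = camel;	/* drop '_', maybe upcase next */
			continue;
		}
		out[n++] = (char)(upcase_next ? toupper(c) : tolower(c));
		upcase_next = 0;
	}
	out[n] = '\0';
}
```

The lowercased form loses the word boundaries, which is why it cannot be turned back into the camelCased form without consulting the original identifier.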

Interdiff below diffstat.

Johannes Schindelin (19):
  fsck: Introduce fsck options
  fsck: Introduce identifiers for fsck messages
  fsck: Provide a function to parse fsck message IDs
  fsck: Offer a function to demote fsck errors to warnings
  fsck (receive-pack): Allow demoting errors to warnings
  fsck: Report the ID of the error/warning
  fsck: Make fsck_ident() warn-friendly
  fsck: Make fsck_commit() warn-friendly
  fsck: Handle multiple authors in commits specially
  fsck: Make fsck_tag() warn-friendly
  fsck: Add a simple test for receive.fsck.<msg-id>
  fsck: Disallow demoting grave fsck errors to warnings
  fsck: Optionally ignore specific fsck issues completely
  fsck: Allow upgrading fsck warnings to errors
  fsck: Document the new receive.fsck.<msg-id> options
  fsck: Support demoting errors to warnings
  fsck: Introduce `git fsck --connectivity-only`
  fsck: git receive-pack: support excluding objects from fsck'ing
  fsck: support ignoring objects in `git fsck` via fsck.skiplist

 Documentation/config.txt        |  41 +++
 Documentation/git-fsck.txt      |   7 +-
 builtin/fsck.c                  |  78 ++++--
 builtin/index-pack.c            |  13 +-
 builtin/receive-pack.c          |  28 +-
 builtin/unpack-objects.c        |  16 +-
 fsck.c                          | 554 +++++++++++++++++++++++++++++++---------
 fsck.h                          |  30 ++-
 t/t1450-fsck.sh                 |  37 ++-
 t/t5302-pack-index.sh           |   2 +-
 t/t5504-fetch-receive-strict.sh |  51 ++++
 11 files changed, 692 insertions(+), 165 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5aba63a..69dda93 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1252,11 +1252,11 @@ filter.<driver>.smudge::
 
 fsck.<msg-id>::
 	Allows overriding the message type (error, warn or ignore) of a
-	specific message ID such as `missingemail`.
+	specific message ID such as `missingEmail`.
 +
 For convenience, fsck prefixes the error/warning with the message ID,
-e.g.  "missingemail: invalid author/committer line - missing email" means
-that setting `fsck.missingemail = ignore` will hide that issue.
+e.g.  "missingEmail: invalid author/committer line - missing email" means
+that setting `fsck.missingEmail = ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
@@ -1267,6 +1267,7 @@ fsck.skipList::
 	be ignored. This feature is useful when an established project
 	should be accepted despite early commits containing errors that
 	can be safely ignored such as invalid committer email addresses.
+	Note: corrupt objects cannot be skipped with this setting.
 
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
@@ -2228,9 +2229,9 @@ receive.fsck.<msg-id>::
 	to warnings and vice versa by configuring the `receive.fsck.<msg-id>`
 	setting where the `<msg-id>` is the fsck message ID and the value
 	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
-	the error/warning with the message ID, e.g. "missingemail: invalid
+	the error/warning with the message ID, e.g. "missingEmail: invalid
 	author/committer line - missing email" means that setting
-	`receive.fsck.missingemail = ignore` will hide that issue.
+	`receive.fsck.missingEmail = ignore` will hide that issue.
 +
 This feature is intended to support working with legacy repositories
 which would not pass pushing when `receive.fsckObjects = true`, allowing
@@ -2243,6 +2244,7 @@ receive.fsck.skipList::
 	be ignored. This feature is useful when an established project
 	should be accepted despite early commits containing errors that
 	can be safely ignored such as invalid committer email addresses.
+	Note: corrupt objects cannot be skipped with this setting.
 
 receive.unpackLimit::
 	If the number of objects received in a push is below this
diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index b98fb43..84ee92e 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
-	 [--[no-]dangling] [--[no-]progress] [<object>*]
+	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]dangling] [--[no-]progress] [--connectivity-only] [<object>*]
 
 DESCRIPTION
 -----------
@@ -60,10 +60,10 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
---quick::
+--connectivity-only::
 	Check only the connectivity of tags, commits and tree objects. By
 	avoiding to unpack blobs, this speeds up the operation, at the
-	expense of missing corrupt objects.
+	expense of missing corrupt objects or other problematic issues.
 
 --strict::
 	Enable more strict checking, namely to catch a file mode
diff --git a/builtin/fsck.c b/builtin/fsck.c
index ce538ac..7e3df20 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,7 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
-static int quick;
+static int connectivity_only;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -49,21 +49,24 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
-	if (skip_prefix(var, "fsck.", &var)) {
-		fsck_set_msg_type(&fsck_obj_options, var, -1, value, -1);
-		return 0;
-	}
-
 	if (strcmp(var, "fsck.skiplist") == 0) {
-		const char *path = is_absolute_path(value) ?
-			value : git_path("%s", value);
+		const char *path;
 		struct strbuf sb = STRBUF_INIT;
+
+		if (git_config_pathname(&path, var, value))
+			return 1;
 		strbuf_addf(&sb, "skiplist=%s", path);
+		free((char *) path);
 		fsck_set_msg_types(&fsck_obj_options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
 	}
 
+	if (skip_prefix(var, "fsck.", &var)) {
+		fsck_set_msg_type(&fsck_obj_options, var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -192,7 +195,7 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
-		if (quick && has_sha1_file(obj->sha1))
+		if (connectivity_only && has_sha1_file(obj->sha1))
 			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
@@ -636,7 +639,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
-	OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
+	OPT_BOOL(0, "connectivity-only", &connectivity_only, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -673,7 +676,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	if (!quick)
+	if (!connectivity_only)
 		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 98e14fe..f0d283b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -841,7 +841,6 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (do_fsck_object &&
 			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			fsck_options.walk = mark_link;
 			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
@@ -1616,6 +1615,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		usage(index_pack_usage);
 
 	check_replace_refs = 0;
+	fsck_options.walk = mark_link;
 
 	reset_pack_idx_option(&opts);
 	git_config(git_index_pack_config, &opts);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 80574f9..3fbed23 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -118,10 +118,13 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 	}
 
 	if (strcmp(var, "receive.fsck.skiplist") == 0) {
-		const char *path = is_absolute_path(value) ?
-			value : git_path("%s", value);
+		const char *path;
+
+		if (git_config_pathname(&path, var, value))
+			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
+		free((char *) path);
 		return 0;
 	}
 
diff --git a/fsck.c b/fsck.c
index f80b508..a677b50 100644
--- a/fsck.c
+++ b/fsck.c
@@ -71,37 +71,42 @@ enum fsck_msg_id {
 #undef MSG_ID
 
 #define STR(x) #x
-#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
+#define MSG_ID(id, msg_type) { STR(id), NULL, FSCK_##msg_type },
 static struct {
 	const char *id_string;
+	const char *lowercased;
 	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ NULL, -1 }
+	{ NULL, NULL, -1 }
 };
 #undef MSG_ID
 
-static int parse_msg_id(const char *text, int len)
+static int parse_msg_id(const char *text)
 {
-	int i, j;
-
-	if (len < 0)
-		len = strlen(text);
-
-	for (i = 0; i < FSCK_MSG_MAX; i++) {
-		const char *key = msg_id_info[i].id_string;
-		/* match id_string case-insensitively, without underscores. */
-		for (j = 0; j < len; j++) {
-			char c = *(key++);
-			if (c == '_')
-				c = *(key++);
-			if (toupper(text[j]) != c)
-				break;
+	int i;
+
+	if (!msg_id_info[0].lowercased) {
+		/* convert id_string to lower case, without underscores. */
+		for (i = 0; i < FSCK_MSG_MAX; i++) {
+			const char *p = msg_id_info[i].id_string;
+			int len = strlen(p);
+			char *q = xmalloc(len);
+
+			msg_id_info[i].lowercased = q;
+			while (*p)
+				if (*p == '_')
+					p++;
+				else
+					*(q)++ = tolower(*(p)++);
+			*q = '\0';
 		}
-		if (j == len && !*key)
-			return i;
 	}
 
+	for (i = 0; i < FSCK_MSG_MAX; i++)
+		if (!strcmp(text, msg_id_info[i].lowercased))
+			return i;
+
 	return -1;
 }
 
@@ -160,51 +165,37 @@ static void init_skiplist(struct fsck_options *options, const char *path)
 		skiplist.sorted = 1;
 }
 
-static inline int substrcmp(const char *string, int len, const char *match)
-{
-	int match_len = strlen(match);
-	if (match_len != len)
-		return -1;
-	return memcmp(string, match, len);
-}
-
-static int parse_msg_type(const char *str, int len)
+static int parse_msg_type(const char *str)
 {
-	if (len < 0)
-		len = strlen(str);
-
-	if (!substrcmp(str, len, "error"))
+	if (!strcmp(str, "error"))
 		return FSCK_ERROR;
-	else if (!substrcmp(str, len, "warn"))
+	else if (!strcmp(str, "warn"))
 		return FSCK_WARN;
-	else if (!substrcmp(str, len, "ignore"))
+	else if (!strcmp(str, "ignore"))
 		return FSCK_IGNORE;
 	else
-		die("Unknown fsck message type: '%.*s'",
-				len, str);
+		die("Unknown fsck message type: '%s'", str);
 }
 
 int is_valid_msg_type(const char *msg_id, const char *msg_type)
 {
-	if (parse_msg_id(msg_id, -1) < 0)
+	if (parse_msg_id(msg_id) < 0)
 		return 0;
-	parse_msg_type(msg_type, -1);
+	parse_msg_type(msg_type);
 	return 1;
 }
 
 void fsck_set_msg_type(struct fsck_options *options,
-		const char *msg_id, int msg_id_len,
-		const char *msg_type, int msg_type_len)
+		const char *msg_id, const char *msg_type)
 {
-	int id = parse_msg_id(msg_id, msg_id_len), type;
+	int id = parse_msg_id(msg_id), type;
 
 	if (id < 0)
-		die("Unhandled message id: %.*s", msg_id_len, msg_id);
-	type = parse_msg_type(msg_type, msg_type_len);
+		die("Unhandled message id: %s", msg_id);
+	type = parse_msg_type(msg_type);
 
 	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
-		die("Cannot demote %.*s to %.*s", msg_id_len, msg_id,
-				msg_type_len, msg_type);
+		die("Cannot demote %s to %s", msg_id, msg_type);
 
 	if (!options->msg_type) {
 		int i;
@@ -219,37 +210,39 @@ void fsck_set_msg_type(struct fsck_options *options,
 
 void fsck_set_msg_types(struct fsck_options *options, const char *values)
 {
-	while (*values) {
-		int len = strcspn(values, " ,|"), equal;
+	char *buf = xstrdup(values), *to_free = buf;
+	int done = 0;
 
+	while (!done) {
+		int len = strcspn(buf, " ,|"), equal;
+
+		done = !buf[len];
 		if (!len) {
-			values++;
+			buf++;
 			continue;
 		}
+		buf[len] = '\0';
 
-		for (equal = 0; equal < len; equal++)
-			if (values[equal] == '=' || values[equal] == ':')
-				break;
-
-		if (!substrcmp(values, equal, "skiplist")) {
-			char *path = xstrndup(values + equal + 1,
-				len - equal - 1);
+		for (equal = 0; equal < len &&
+				buf[equal] != '=' && buf[equal] != ':'; equal++)
+			buf[equal] = tolower(buf[equal]);
+		buf[equal] = '\0';
 
+		if (!strcmp(buf, "skiplist")) {
 			if (equal == len)
 				die("skiplist requires a path");
-			init_skiplist(options, path);
-			free(path);
-			values += len;
+			init_skiplist(options, buf + equal + 1);
+			buf += len + 1;
 			continue;
 		}
 
 		if (equal == len)
-			die("Missing '=': '%.*s'", len, values);
+			die("Missing '=': '%s'", buf);
 
-		fsck_set_msg_type(options, values, equal,
-				values + equal + 1, len - equal - 1);
-		values += len;
+		fsck_set_msg_type(options, buf, buf + equal + 1);
+		buf += len + 1;
 	}
+	free(to_free);
 }
 
 static void append_msg_id(struct strbuf *sb, const char *msg_id)
@@ -261,6 +254,10 @@ static void append_msg_id(struct strbuf *sb, const char *msg_id)
 			break;
 		if (c != '_')
 			strbuf_addch(sb, tolower(c));
+		else {
+			assert(*msg_id);
+			strbuf_addch(sb, *(msg_id)++);
+		}
 	}
 
 	strbuf_addstr(sb, ": ");
@@ -600,7 +597,7 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
-	unsigned parent_count, parent_line_count = 0;
+	unsigned parent_count, parent_line_count = 0, author_count;
 	int err;
 
 	if (require_end_of_header(buffer, size, &commit->object, options))
@@ -640,19 +637,19 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 				return err;
 		}
 	}
-	if (!skip_prefix(buffer, "author ", &buffer))
-		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, options);
-	if (err)
-		return err;
+	author_count = 0;
 	while (skip_prefix(buffer, "author ", &buffer)) {
-		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
-		if (err)
-			return err;
+		author_count++;
 		err = fsck_ident(&buffer, &commit->object, options);
 		if (err)
 			return err;
 	}
+	if (author_count < 1)
+		err = report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
+	else if (author_count > 1)
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+	if (err)
+		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
diff --git a/fsck.h b/fsck.h
index cab9c65..dded84b 100644
--- a/fsck.h
+++ b/fsck.h
@@ -8,8 +8,7 @@
 struct fsck_options;
 
 void fsck_set_msg_type(struct fsck_options *options,
-		const char *msg_id, int msg_id_len,
-		const char *msg_type, int msg_type_len);
+		const char *msg_id, const char *msg_type);
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
 int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 2863a8a..956673b 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: badTagName: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missingTaggerEntry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
@@ -295,7 +295,7 @@ test_expect_success 'force fsck to ignore double author' '
 	git update-ref refs/heads/bogus "$new" &&
 	test_when_finished "git update-ref -d refs/heads/bogus" &&
 	test_must_fail git fsck &&
-	git -c fsck.multipleauthors=ignore fsck
+	git -c fsck.multipleAuthors=ignore fsck
 '
 
 _bz='\0'
@@ -431,11 +431,11 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
-test_expect_success 'fsck --quick' '
-	rm -rf quick &&
-	git init quick &&
+test_expect_success 'fsck --connectivity-only' '
+	rm -rf connectivity-only &&
+	git init connectivity-only &&
 	(
-		cd quick &&
+		cd connectivity-only &&
 		touch empty &&
 		git add empty &&
 		test_commit empty &&
@@ -443,13 +443,13 @@ test_expect_success 'fsck --quick' '
 		rm -f $empty &&
 		echo invalid >$empty &&
 		test_must_fail git fsck --strict &&
-		git fsck --strict --quick &&
+		git fsck --strict --connectivity-only &&
 		tree=$(git rev-parse HEAD:) &&
 		suffix=${tree#??} &&
 		tree=.git/objects/${tree%$suffix}/$suffix &&
 		rm -f $tree &&
 		echo invalid >$tree &&
-		test_must_fail git fsck --strict --quick
+		test_must_fail git fsck --strict --connectivity-only
 	)
 '
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 1ada54c..6a1f89e 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,19 +123,19 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
-test_expect_success 'push with receive.fsck.skiplist' '
+test_expect_success 'push with receive.fsck.skipList' '
 	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
 	git init dst &&
-	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config receive.fsckObjects true &&
 	test_must_fail git push --porcelain dst bogus &&
-	git --git-dir=dst/.git config receive.fsck.skiplist SKIP &&
+	git --git-dir=dst/.git config receive.fsck.skipList SKIP &&
 	echo $commit >dst/.git/SKIP &&
 	git push --porcelain dst bogus
 '
 
-test_expect_success 'push with receive.fsck.missingemail=warn' '
+test_expect_success 'push with receive.fsck.missingEmail=warn' '
 	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
 	rm -rf dst &&
@@ -143,20 +143,20 @@ test_expect_success 'push with receive.fsck.missingemail=warn' '
 	git --git-dir=dst/.git config receive.fsckobjects true &&
 	test_must_fail git push --porcelain dst bogus &&
 	git --git-dir=dst/.git config \
-		receive.fsck.missingemail warn &&
+		receive.fsck.missingEmail warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missingemail" act &&
+	grep "missingEmail" act &&
 	git --git-dir=dst/.git branch -D bogus &&
 	git  --git-dir=dst/.git config --add \
-		receive.fsck.missingemail ignore &&
+		receive.fsck.missingEmail ignore &&
 	git  --git-dir=dst/.git config --add \
-		receive.fsck.baddate warn &&
+		receive.fsck.badDate warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	test_must_fail grep "missingemail" act
+	test_must_fail grep "missingEmail" act
 '
 
 test_expect_success \
-	'receive.fsck.unterminatedheader=warn triggers error' '
+	'receive.fsck.unterminatedHeader=warn triggers error' '
 	rm -rf dst &&
 	git init dst &&
 	git --git-dir=dst/.git config receive.fsckobjects true &&
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v7 01/19] fsck: Introduce fsck options
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
                             ` (18 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We are about to introduce more settings, so it makes sense to carry them
around as (a pointer to) a struct containing all of them, just as the
diff machinery does.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
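
For readers unfamiliar with the pattern, here is a minimal, standalone
sketch of what "carrying the settings around in a struct" looks like. It is
not the fsck code itself: every demo_* name is hypothetical, and only the
general shape (a walk callback and a strict flag stored in the struct, plus
DEFAULT/STRICT initializer macros) mirrors what this patch does with
struct fsck_options.

```c
#include <assert.h>
#include <stddef.h>

struct demo_options;

/* The callback receives the options it was invoked through. */
typedef int (*demo_walk_fn)(int node, void *data,
			    struct demo_options *options);

struct demo_options {
	demo_walk_fn walk;
	unsigned strict:1;
};

/* Initializers in the spirit of FSCK_OPTIONS_DEFAULT / FSCK_OPTIONS_STRICT. */
#define DEMO_OPTIONS_DEFAULT { NULL, 0 }
#define DEMO_OPTIONS_STRICT { NULL, 1 }

/* The walker takes no callback parameter; it reads it from the options. */
static int demo_walk(const int *nodes, size_t nr, void *data,
		     struct demo_options *options)
{
	size_t i;

	assert(options && options->walk);
	for (i = 0; i < nr; i++) {
		int result = options->walk(nodes[i], data, options);
		if (result < 0)
			return result;
	}
	return 0;
}

/* Example callback: counts nodes, rejecting negative ones in strict mode. */
static int count_node(int node, void *data, struct demo_options *options)
{
	int *count = data;

	if (options->strict && node < 0)
		return -1;
	(*count)++;
	return 0;
}
```

A caller sets options.walk = count_node once and then passes &options to
demo_walk(); adding a further setting later means adding a struct member
rather than changing every function signature again.
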
 builtin/fsck.c           |  20 +++++--
 builtin/index-pack.c     |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c                   | 150 +++++++++++++++++++++++------------------------
 fsck.h                   |  17 +++++-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2679793..981dca5 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static struct object_id head_oid;
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-	mark_object(obj, OBJ_ANY, NULL);
+	mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
 		if (parse_tree(tree) < 0)
 			return 1; /* error already displayed */
 	}
-	result = fsck_walk(obj, mark_object, obj);
+	result = fsck_walk(obj, obj, &fsck_walk_options);
 	if (tree)
 		free_tree_buffer(tree);
 	return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
 	return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
 		fprintf(stderr, "Checking %s %s\n",
 			typename(obj->type), sha1_to_hex(obj->sha1));
 
-	if (fsck_walk(obj, mark_used, NULL))
+	if (fsck_walk(obj, NULL, &fsck_obj_options))
 		objerror(obj, "broken links");
-	if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+	if (fsck_object(obj, NULL, 0, &fsck_obj_options))
 		return -1;
 
 	if (obj->type == OBJ_TREE) {
@@ -638,6 +640,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+	fsck_walk_options.walk = mark_object;
+	fsck_obj_options.walk = mark_used;
+	fsck_obj_options.error_func = fsck_error_func;
+	if (check_strict)
+		fsck_obj_options.strict = 1;
+
 	if (show_progress == -1)
 		show_progress = isatty(2);
 	if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 48fa472..f8b0c64 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -75,6 +75,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -192,7 +193,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -838,10 +839,9 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 			if (!obj)
 				die(_("invalid %s"), typename(type));
 			if (do_fsck_object &&
-			    fsck_object(obj, buf, size, 1,
-				    fsck_error_function))
+			    fsck_object(obj, buf, size, &fsck_options))
 				die(_("Error in object"));
-			if (fsck_walk(obj, mark_link, NULL))
+			if (fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
 			if (obj->type == OBJ_TREE) {
@@ -1615,6 +1615,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		usage(index_pack_usage);
 
 	check_replace_refs = 0;
+	fsck_options.walk = mark_link;
 
 	reset_pack_idx_option(&opts);
 	git_config(git_index_pack_config, &opts);
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ac66672..6d17040 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -178,7 +179,7 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf)
  * that have reachability requirements and calls this function.
  * Verify its reachability and validity recursively and write it out.
  */
-static int check_object(struct object *obj, int type, void *data)
+static int check_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
 	struct obj_buffer *obj_buf;
 
@@ -203,10 +204,10 @@ static int check_object(struct object *obj, int type, void *data)
 	obj_buf = lookup_object_buffer(obj);
 	if (!obj_buf)
 		die("Whoops! Cannot find object '%s'", sha1_to_hex(obj->sha1));
-	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, 1,
-			fsck_error_function))
+	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("Error in object");
-	if (fsck_walk(obj, check_object, NULL))
+	fsck_options.walk = check_object;
+	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", sha1_to_hex(obj->sha1));
 	write_cached_object(obj, obj_buf);
 	return 0;
@@ -217,7 +218,7 @@ static void write_rest(void)
 	unsigned i;
 	for (i = 0; i < nr_objects; i++) {
 		if (obj_list[i].obj)
-			check_object(obj_list[i].obj, OBJ_ANY, NULL);
+			check_object(obj_list[i].obj, OBJ_ANY, NULL, NULL);
 	}
 }
 
diff --git a/fsck.c b/fsck.c
index 10bcb65..d83b811 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 
-static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
+static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -25,9 +25,9 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 		if (S_ISGITLINK(entry.mode))
 			continue;
 		if (S_ISDIR(entry.mode))
-			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);
+			result = options->walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data, options);
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode))
-			result = walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data);
+			result = options->walk(&lookup_blob(entry.sha1)->object, OBJ_BLOB, data, options);
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
 					sha1_to_hex(tree->object.sha1), entry.path, entry.mode);
@@ -40,7 +40,7 @@ static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
 	return res;
 }
 
-static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *data)
+static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
 	struct commit_list *parents;
 	int res;
@@ -49,14 +49,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	if (parse_commit(commit))
 		return -1;
 
-	result = walk((struct object *)commit->tree, OBJ_TREE, data);
+	result = options->walk((struct object *)commit->tree, OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
 
 	parents = commit->parents;
 	while (parents) {
-		result = walk((struct object *)parents->item, OBJ_COMMIT, data);
+		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -66,14 +66,14 @@ static int fsck_walk_commit(struct commit *commit, fsck_walk_func walk, void *da
 	return res;
 }
 
-static int fsck_walk_tag(struct tag *tag, fsck_walk_func walk, void *data)
+static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	if (parse_tag(tag))
 		return -1;
-	return walk(tag->tagged, OBJ_ANY, data);
+	return options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
 {
 	if (!obj)
 		return -1;
@@ -81,11 +81,11 @@ int fsck_walk(struct object *obj, fsck_walk_func walk, void *data)
 	case OBJ_BLOB:
 		return 0;
 	case OBJ_TREE:
-		return fsck_walk_tree((struct tree *)obj, walk, data);
+		return fsck_walk_tree((struct tree *)obj, data, options);
 	case OBJ_COMMIT:
-		return fsck_walk_commit((struct commit *)obj, walk, data);
+		return fsck_walk_commit((struct commit *)obj, data, options);
 	case OBJ_TAG:
-		return fsck_walk_tag((struct tag *)obj, walk, data);
+		return fsck_walk_tag((struct tag *)obj, data, options);
 	default:
 		error("Unknown object type for %s", sha1_to_hex(obj->sha1));
 		return -1;
@@ -138,7 +138,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con
 	return c1 < c2 ? 0 : TREE_UNORDERED;
 }
 
-static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
+static int fsck_tree(struct tree *item, struct fsck_options *options)
 {
 	int retval;
 	int has_null_sha1 = 0;
@@ -194,7 +194,7 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 		 * bits..
 		 */
 		case S_IFREG | 0664:
-			if (!strict)
+			if (!options->strict)
 				break;
 		default:
 			has_bad_modes = 1;
@@ -219,30 +219,30 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
 	if (has_empty_name)
-		retval += error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
 	if (has_dot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
 	if (has_dotdot)
-		retval += error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
 	if (has_dotgit)
-		retval += error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
 	if (has_zero_pad)
-		retval += error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
 	if (has_dup_entries)
-		retval += error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
 	return retval;
 }
 
 static int require_end_of_header(const void *data, unsigned long size,
-	struct object *obj, fsck_error error_func)
+	struct object *obj, struct fsck_options *options)
 {
 	const char *buffer = (const char *)data;
 	unsigned long i;
@@ -250,7 +250,7 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return error_func(obj, FSCK_ERROR,
+			return options->error_func(obj, FSCK_ERROR,
 				"unterminated header: NUL at offset %d", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
@@ -258,36 +258,36 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	return options->error_func(obj, FSCK_ERROR, "unterminated header");
 }
 
-static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
+static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
 	char *end;
 
 	if (**ident == '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,30 +295,30 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
 
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(buffer, size, &commit->object, options))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,39 +328,39 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
-	err = fsck_ident(&buffer, &commit->object, error_func);
+		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
 
 static int fsck_commit(struct commit *commit, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	const char *buffer = data ?  data : get_commit_buffer(commit, &size);
-	int ret = fsck_commit_buffer(commit, buffer, size, error_func);
+	int ret = fsck_commit_buffer(commit, buffer, size, options);
 	if (!data)
 		unuse_commit_buffer(commit, buffer);
 	return ret;
 }
 
 static int fsck_tag_buffer(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	unsigned char sha1[20];
 	int ret = 0;
@@ -376,65 +376,65 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return error_func(&tag->object, FSCK_ERROR,
+			return options->error_func(&tag->object, FSCK_ERROR,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = error_func(&tag->object, FSCK_ERROR,
+			ret = options->error_func(&tag->object, FSCK_ERROR,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(buffer, size, &tag->object, options))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
 	else
-		ret = fsck_ident(&buffer, &tag->object, error_func);
+		ret = fsck_ident(&buffer, &tag->object, options);
 
 done:
 	strbuf_release(&sb);
@@ -443,34 +443,34 @@ done:
 }
 
 static int fsck_tag(struct tag *tag, const char *data,
-	unsigned long size, fsck_error error_func)
+	unsigned long size, struct fsck_options *options)
 {
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
 
-	return fsck_tag_buffer(tag, data, size, error_func);
+	return fsck_tag_buffer(tag, data, size, options);
 }
 
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func)
+	struct fsck_options *options)
 {
 	if (!obj)
-		return error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
 	if (obj->type == OBJ_TREE)
-		return fsck_tree((struct tree *) obj, strict, error_func);
+		return fsck_tree((struct tree *) obj, options);
 	if (obj->type == OBJ_COMMIT)
 		return fsck_commit((struct commit *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 	if (obj->type == OBJ_TAG)
 		return fsck_tag((struct tag *) obj, (const char *) data,
-			size, error_func);
+			size, options);
 
-	return error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
diff --git a/fsck.h b/fsck.h
index d1e6387..07d0ab2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -4,6 +4,8 @@
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
 
+struct fsck_options;
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -12,7 +14,7 @@
  *     <0	error signaled and abort
  *     >0	error signaled and do not abort
  */
-typedef int (*fsck_walk_func)(struct object *obj, int type, void *data);
+typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
@@ -20,6 +22,15 @@ typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
 __attribute__((format (printf, 3, 4)))
 int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
 
+struct fsck_options {
+	fsck_walk_func walk;
+	fsck_error error_func;
+	unsigned strict:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+
 /* descend in all linked child objects
  * the return value is:
  *    -1	error in processing the object
@@ -27,9 +38,9 @@ int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
  *    >0	return value of the first signaled error >0 (in the case of no other errors)
  *    0		everything OK
  */
-int fsck_walk(struct object *obj, fsck_walk_func walk, void *data);
+int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 /* If NULL is passed for data, we assume the object is local and read it. */
 int fsck_object(struct object *obj, void *data, unsigned long size,
-	int strict, fsck_error error_func);
+	struct fsck_options *options);
 
 #endif
-- 
2.3.1.windows.1.9.g8c01ab4

* [PATCH v7 02/19] fsck: Introduce identifiers for fsck messages
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 01/19] fsck: Introduce fsck options Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
                             ` (17 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Instead of specifying whether a message emitted by the fsck machinery
constitutes an error or a warning, let's specify an identifier for the
concrete problem that was encountered. This is necessary for the upcoming
support for demoting certain errors to warnings.

In the process, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.

We could use a simple enum for the message IDs here, but we want to
guarantee that the enum values are associated with the appropriate
message types (i.e. error or warning). Besides, the next commit
introduces a parser that maps the string representation to the enum
value, hence we use the slightly ugly preprocessor construct, which is
extensible for use with said parser.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
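
The callback simplification described in the commit message can be
sketched in isolation as follows. This is only an illustration under
stated assumptions: every demo_* name and DEMO_* constant is
hypothetical, and a fixed-size buffer with vsnprintf() stands in for the
strbuf that the patch itself uses. The point is that the variadic
formatting happens once, in the reporting helper, so the callback only
ever sees a finished string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severities; the values mirror FSCK_ERROR and FSCK_WARN. */
#define DEMO_ERROR 1
#define DEMO_WARN 2

/* The callback receives an already formatted message, no varargs. */
typedef int (*demo_error_fn)(int msg_type, const char *message);

/* Keeps the last reported line so that callers can inspect it. */
static char demo_last_message[256];

/* Example callback: warnings return 0, errors return 1. */
static int demo_callback(int msg_type, const char *message)
{
	snprintf(demo_last_message, sizeof(demo_last_message), "%s: %s",
		 msg_type == DEMO_WARN ? "warning" : "error", message);
	return msg_type == DEMO_WARN ? 0 : 1;
}

/* Formats once, then hands the finished string to the callback. */
static int demo_report(demo_error_fn fn, int msg_type, const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	assert(fn != NULL);
	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	return fn(msg_type, buf);
}
```

A caller would write something like demo_report(demo_callback,
DEMO_ERROR, "NUL at offset %lu", 7UL); the callback itself needs neither
va_list handling nor a format attribute.
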
 builtin/fsck.c |  26 +++-----
 fsck.c         | 201 +++++++++++++++++++++++++++++++++++++++++----------------
 fsck.h         |   5 +-
 3 files changed, 154 insertions(+), 78 deletions(-)
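
The preprocessor construct mentioned in the commit message is the usual
"list macro expanded several times" idiom. A reduced sketch, assuming
only three message IDs and hypothetical demo_*/DEMO_* names, might look
like the block below. The name table and demo_parse_msg_id() are
assumptions added for illustration; they are not the parser that the
following commit introduces.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severities; the values mirror FSCK_ERROR and FSCK_WARN. */
#define DEMO_ERROR 1
#define DEMO_WARN 2

/* A single list: each entry pairs an ID with its default severity. */
#define DEMO_FOREACH_MSG_ID(FUNC) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(MISSING_EMAIL, ERROR) \
	FUNC(HAS_DOTGIT, WARN)

/* First expansion: one enum value per entry. */
#define MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	DEMO_FOREACH_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: a table in the same order as the enum. */
#define MSG_ID(id, msg_type) { #id, DEMO_##msg_type },
static const struct {
	const char *name;
	int msg_type;
} demo_msg_info[DEMO_MSG_MAX] = {
	DEMO_FOREACH_MSG_ID(MSG_ID)
};
#undef MSG_ID

/* Looks up the default severity of a message ID. */
static int demo_msg_type(enum demo_msg_id id)
{
	assert(id < DEMO_MSG_MAX);
	return demo_msg_info[id].msg_type;
}

/* Maps a name back to its enum value; returns -1 if it is unknown. */
static int demo_parse_msg_id(const char *text)
{
	int i;

	for (i = 0; i < DEMO_MSG_MAX; i++)
		if (!strcmp(text, demo_msg_info[i].name))
			return i;
	return -1;
}
```

Because the enum and the table are generated from the same list, adding
an entry in one place keeps the ID, its severity and its name in sync,
which a hand-maintained enum plus a separate array would not guarantee.
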

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 981dca5..fff38fe 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,33 +46,23 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
-static void objreport(struct object *obj, const char *severity,
-                      const char *err, va_list params)
+static void objreport(struct object *obj, const char *msg_type,
+			const char *err)
 {
-	fprintf(stderr, "%s in %s %s: ",
-	        severity, typename(obj->type), sha1_to_hex(obj->sha1));
-	vfprintf(stderr, err, params);
-	fputs("\n", stderr);
+	fprintf(stderr, "%s in %s %s: %s\n",
+		msg_type, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-	va_list params;
-	va_start(params, err);
 	errors_found |= ERROR_OBJECT;
-	objreport(obj, "error", err, params);
-	va_end(params);
+	objreport(obj, "error", err);
 	return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-	va_list params;
-	va_start(params, err);
-	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-	va_end(params);
+	objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d83b811..ab24618 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,98 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_DATE_OVERFLOW, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_OBJECT_SHA1, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TAG_OBJECT, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(BAD_TYPE, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_GRAFT, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_PARENT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TAG_OBJECT, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(NUL_IN_HEADER, ERROR) \
+	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
+	FUNC(TREE_NOT_SORTED, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(UNTERMINATED_HEADER, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(BAD_TAG_NAME, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(MISSING_TAGGER_ENTRY, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN)
+
+#define MSG_ID(id, msg_type) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
+#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+static struct {
+	int msg_type;
+} msg_id_info[FSCK_MSG_MAX + 1] = {
+	FOREACH_MSG_ID(MSG_ID)
+	{ -1 }
+};
+#undef MSG_ID
+
+static int fsck_msg_type(enum fsck_msg_id msg_id,
+	struct fsck_options *options)
+{
+	int msg_type;
+
+	msg_type = msg_id_info[msg_id].msg_type;
+	if (options->strict && msg_type == FSCK_WARN)
+		msg_type = FSCK_ERROR;
+
+	return msg_type;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+	enum fsck_msg_id id, const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	int msg_type = fsck_msg_type(id, options), result;
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	result = options->error_func(object, msg_type, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
 	struct tree_desc desc;
@@ -219,25 +311,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
 	retval = 0;
 	if (has_null_sha1)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+		retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains full pathnames");
+		retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames");
 	if (has_empty_name)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains empty pathname");
+		retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname");
 	if (has_dot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'");
 	if (has_dotdot)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '..'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'");
 	if (has_dotgit)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains '.git'");
+		retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'");
 	if (has_zero_pad)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains zero-padded file modes");
+		retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes");
 	if (has_bad_modes)
-		retval += options->error_func(&item->object, FSCK_WARN, "contains bad file modes");
+		retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes");
 	if (has_dup_entries)
-		retval += options->error_func(&item->object, FSCK_ERROR, "contains duplicate file entries");
+		retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries");
 	if (not_properly_sorted)
-		retval += options->error_func(&item->object, FSCK_ERROR, "not properly sorted");
+		retval += report(options, &item->object, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted");
 	return retval;
 }
 
@@ -250,15 +342,17 @@ static int require_end_of_header(const void *data, unsigned long size,
 	for (i = 0; i < size; i++) {
 		switch (buffer[i]) {
 		case '\0':
-			return options->error_func(obj, FSCK_ERROR,
-				"unterminated header: NUL at offset %d", i);
+			return report(options, obj,
+				FSCK_MSG_NUL_IN_HEADER,
+				"unterminated header: NUL at offset %lu", i);
 		case '\n':
 			if (i + 1 < size && buffer[i + 1] == '\n')
 				return 0;
 		}
 	}
 
-	return options->error_func(obj, FSCK_ERROR, "unterminated header");
+	return report(options, obj,
+		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
@@ -266,28 +360,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	char *end;
 
 	if (**ident == '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident == '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad name");
+		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
 	if (**ident != '<')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing email");
+		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
 	if ((*ident)[-1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before email");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
 	(*ident)++;
 	*ident += strcspn(*ident, "<>\n");
 	if (**ident != '>')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad email");
+		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
 	(*ident)++;
 	if (**ident != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - missing space before date");
+		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	(*ident)++;
 	if (**ident == '0' && (*ident)[1] != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - zero-padded date");
+		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(strtoul(*ident, &end, 10)))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - date causes integer overflow");
+		return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
 	if (end == *ident || *end != ' ')
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad date");
+		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
 	*ident = end + 1;
 	if ((**ident != '+' && **ident != '-') ||
 	    !isdigit((*ident)[1]) ||
@@ -295,7 +389,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option
 	    !isdigit((*ident)[3]) ||
 	    !isdigit((*ident)[4]) ||
 	    ((*ident)[5] != '\n'))
-		return options->error_func(obj, FSCK_ERROR, "invalid author/committer line - bad time zone");
+		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
 	(*ident) += 6;
 	return 0;
 }
@@ -312,13 +406,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'tree' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid 'tree' line format - bad sha1");
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
 		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return options->error_func(&commit->object, FSCK_ERROR, "invalid 'parent' line format - bad sha1");
+			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -328,23 +422,23 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
 		else if (graft->nr_parent != parent_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "graft objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
 	} else {
 		if (parent_count != parent_line_count)
-			return options->error_func(&commit->object, FSCK_ERROR, "parent objects missing");
+			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'author' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-		return options->error_func(&commit->object, FSCK_ERROR, "invalid format - expected 'committer' line");
+		return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, &commit->object, options);
 	if (err)
 		return err;
 	if (!commit->tree)
-		return options->error_func(&commit->object, FSCK_ERROR, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
+		return report(options, &commit->object, FSCK_MSG_BAD_TREE, "could not load commit's tree %s", sha1_to_hex(tree_sha1));
 
 	return 0;
 }
@@ -376,11 +470,13 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		buffer = to_free =
 			read_sha1_file(tag->object.sha1, &type, &size);
 		if (!buffer)
-			return options->error_func(&tag->object, FSCK_ERROR,
+			return report(options, &tag->object,
+				FSCK_MSG_MISSING_TAG_OBJECT,
 				"cannot read tag object");
 
 		if (type != OBJ_TAG) {
-			ret = options->error_func(&tag->object, FSCK_ERROR,
+			ret = report(options, &tag->object,
+				FSCK_MSG_TAG_OBJECT_NOT_TAG,
 				"expected tag got %s",
 			    typename(type));
 			goto done;
@@ -391,48 +487,49 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
 		goto done;
 	}
 	buffer += 41;
 
 	if (!skip_prefix(buffer, "type ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TYPE, "invalid 'type' value");
 	if (ret)
 		goto done;
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tag ", &buffer)) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
 	eol = strchr(buffer, '\n');
 	if (!eol) {
-		ret = options->error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
 	if (check_refname_format(sb.buf, 0))
-		options->error_func(&tag->object, FSCK_WARN, "invalid 'tag' name: %.*s",
+		report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
+			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
 	buffer = eol + 1;
 
 	if (!skip_prefix(buffer, "tagger ", &buffer))
 		/* early tags do not contain 'tagger' lines; warn only */
-		options->error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");
+		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
@@ -448,7 +545,7 @@ static int fsck_tag(struct tag *tag, const char *data,
 	struct object *tagged = tag->tagged;
 
 	if (!tagged)
-		return options->error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+		return report(options, &tag->object, FSCK_MSG_BAD_TAG_OBJECT, "could not load tagged object");
 
 	return fsck_tag_buffer(tag, data, size, options);
 }
@@ -457,7 +554,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options)
 {
 	if (!obj)
-		return options->error_func(obj, FSCK_ERROR, "no valid object to fsck");
+		return report(options, obj, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck");
 
 	if (obj->type == OBJ_BLOB)
 		return 0;
@@ -470,22 +567,12 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 		return fsck_tag((struct tag *) obj, (const char *) data,
 			size, options);
 
-	return options->error_func(obj, FSCK_ERROR, "unknown type '%d' (internal fsck error)",
+	return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)",
 			  obj->type);
 }
 
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...)
+int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
-	va_list ap;
-	struct strbuf sb = STRBUF_INIT;
-
-	strbuf_addf(&sb, "object %s:", sha1_to_hex(obj->sha1));
-
-	va_start(ap, fmt);
-	strbuf_vaddf(&sb, fmt, ap);
-	va_end(ap);
-
-	error("%s", sb.buf);
-	strbuf_release(&sb);
+	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index 07d0ab2..f6f268a 100644
--- a/fsck.h
+++ b/fsck.h
@@ -17,10 +17,9 @@ struct fsck_options;
 typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct object *obj, int type, const char *err, ...);
+typedef int (*fsck_error)(struct object *obj, int type, const char *message);
 
-__attribute__((format (printf, 3, 4)))
-int fsck_error_function(struct object *obj, int type, const char *fmt, ...);
+int fsck_error_function(struct object *obj, int type, const char *message);
 
 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 03/19] fsck: Provide a function to parse fsck message IDs
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 01/19] fsck: Introduce fsck options Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
                             ` (16 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

These functions will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. The upcoming `fsck_set_msg_types()` function has to
handle partial strings because we would like to be able to parse, say,
'missingemail=warn,missingtaggerentry=warn' command line parameters
(which will be passed by receive-pack to index-pack and unpack-objects).

To make the parsing robust, we generate strings from the enum keys,
strip the underscores and convert them to lower case, and then match
the given strings against the result to find the corresponding enum
values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)
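
A minimal sketch of the normalization described above, for readers who
want to try the idea in isolation. It is not the code from this patch:
the `demo_` names and the three-entry ID table are hypothetical, and the
caller is assumed to pass an already lower-cased string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of the message IDs, for illustration only. */
static const char *demo_ids[] = {
	"MISSING_EMAIL", "BAD_NAME", "MISSING_TAGGER_ENTRY"
};
#define DEMO_ID_COUNT (sizeof(demo_ids) / sizeof(demo_ids[0]))

/* Return a newly allocated lower-case copy of id, underscores dropped. */
static char *demo_normalize(const char *id)
{
	/* +1 leaves room for the NUL even if id contains no underscore. */
	char *out = malloc(strlen(id) + 1), *q = out;

	if (!out)
		return NULL;
	for (; *id; id++)
		if (*id != '_')
			*q++ = (char)tolower((unsigned char)*id);
	*q = '\0';
	return out;
}

/* Return the index of the ID matching text, or -1 if there is none. */
static int demo_parse_msg_id(const char *text)
{
	size_t i;

	for (i = 0; i < DEMO_ID_COUNT; i++) {
		char *norm = demo_normalize(demo_ids[i]);
		int match = norm && !strcmp(text, norm);

		free(norm);
		if (match)
			return (int)i;
	}
	return -1;
}
```

The sketch normalizes on every lookup to stay short; caching the
normalized strings, as the patch does, avoids the repeated allocation.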

diff --git a/fsck.c b/fsck.c
index ab24618..1a3f7ce 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,15 +63,47 @@ enum fsck_msg_id {
 };
 #undef MSG_ID
 
-#define MSG_ID(id, msg_type) { FSCK_##msg_type },
+#define STR(x) #x
+#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
 static struct {
+	const char *id_string;
 	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ -1 }
+	{ NULL, -1 }
 };
 #undef MSG_ID
 
+static int parse_msg_id(const char *text)
+{
+	static char **lowercased;
+	int i;
+
+	if (!lowercased) {
+		/* convert id_string to lower case, without underscores. */
+		lowercased = xmalloc(FSCK_MSG_MAX * sizeof(*lowercased));
+		for (i = 0; i < FSCK_MSG_MAX; i++) {
+			const char *p = msg_id_info[i].id_string;
+			int len = strlen(p);
+			char *q = xmalloc(len);
+
+			lowercased[i] = q;
+			while (*p)
+				if (*p == '_')
+					p++;
+				else
+					*(q)++ = tolower(*(p)++);
+			*q = '\0';
+		}
+	}
+
+	for (i = 0; i < FSCK_MSG_MAX; i++)
+		if (!strcmp(text, lowercased[i]))
+			return i;
+
+	return -1;
+}
+
 static int fsck_msg_type(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (2 preceding siblings ...)
  2015-06-22 15:25           ` [PATCH v7 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 17:37             ` Junio C Hamano
  2015-06-22 15:25           ` [PATCH v7 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
                             ` (15 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The `fsck_set_msg_types()` function added by this commit parses a list
of settings in the form:

	missingemail=warn,badname=warn,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we need
to take extra care to set all message types to FSCK_ERROR by default in
those cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 fsck.h |  9 +++++--
 2 files changed, 85 insertions(+), 12 deletions(-)
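
To illustrate the list format described above, here is a rough sketch of
splitting such a string into id/type pairs. It is only an approximation
of what `fsck_set_msg_types()` does: the `demo_` names and the small ID
table are made up for this note, only '=' is accepted between ID and
type, and errors are returned instead of calling die().

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

/* Hypothetical, already-normalized message IDs. */
static const char *demo_ids[] = { "missingemail", "badname", "zeropaddeddate" };
#define DEMO_ID_COUNT 3

/* Store the parsed type for one ID; return -1 for unknown ID or type. */
static int demo_set_one(int *types, const char *id, const char *type)
{
	int value, i;

	if (!strcmp(type, "error"))
		value = DEMO_ERROR;
	else if (!strcmp(type, "warn"))
		value = DEMO_WARN;
	else
		return -1;
	for (i = 0; i < DEMO_ID_COUNT; i++)
		if (!strcmp(id, demo_ids[i])) {
			types[i] = value;
			return 0;
		}
	return -1;
}

/* Parse "id=type" items separated by ' ', ',' or '|'; 0 on success. */
static int demo_set_msg_types(int *types, const char *values)
{
	size_t n = strlen(values);
	char *copy = malloc(n + 1), *buf = copy;
	int ret = 0;

	if (!copy)
		return -1;
	memcpy(copy, values, n + 1);
	while (!ret && *buf) {
		size_t len = strcspn(buf, " ,|");
		int last = !buf[len];
		char *eq, *p;

		if (!len) {		/* skip an empty item */
			buf++;
			continue;
		}
		buf[len] = '\0';
		eq = strchr(buf, '=');
		if (!eq)
			ret = -1;	/* missing '=' */
		else {
			*eq = '\0';
			for (p = buf; *p; p++)
				*p = (char)tolower((unsigned char)*p);
			ret = demo_set_one(types, buf, eq + 1);
		}
		if (last)
			break;
		buf += len + 1;
	}
	free(copy);
	return ret;
}
```

Only the ID part is lower-cased, so "missingEmail=warn" and
"missingemail=warn" are treated alike while the type stays exact.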

diff --git a/fsck.c b/fsck.c
index 1a3f7ce..e81a342 100644
--- a/fsck.c
+++ b/fsck.c
@@ -64,30 +64,29 @@ enum fsck_msg_id {
 #undef MSG_ID
 
 #define STR(x) #x
-#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
+#define MSG_ID(id, msg_type) { STR(id), NULL, FSCK_##msg_type },
 static struct {
 	const char *id_string;
+	const char *lowercased;
 	int msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
-	{ NULL, -1 }
+	{ NULL, NULL, -1 }
 };
 #undef MSG_ID
 
 static int parse_msg_id(const char *text)
 {
-	static char **lowercased;
 	int i;
 
-	if (!lowercased) {
+	if (!msg_id_info[0].lowercased) {
 		/* convert id_string to lower case, without underscores. */
-		lowercased = xmalloc(FSCK_MSG_MAX * sizeof(*lowercased));
 		for (i = 0; i < FSCK_MSG_MAX; i++) {
 			const char *p = msg_id_info[i].id_string;
 			int len = strlen(p);
 			char *q = xmalloc(len);
 
-			lowercased[i] = q;
+			msg_id_info[i].lowercased = q;
 			while (*p)
 				if (*p == '_')
 					p++;
@@ -98,7 +97,7 @@ static int parse_msg_id(const char *text)
 	}
 
 	for (i = 0; i < FSCK_MSG_MAX; i++)
-		if (!strcmp(text, lowercased[i]))
+		if (!strcmp(text, msg_id_info[i].lowercased))
 			return i;
 
 	return -1;
@@ -109,13 +108,78 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 {
 	int msg_type;
 
-	msg_type = msg_id_info[msg_id].msg_type;
-	if (options->strict && msg_type == FSCK_WARN)
-		msg_type = FSCK_ERROR;
+	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);
+
+	if (options->msg_type)
+		msg_type = options->msg_type[msg_id];
+	else {
+		msg_type = msg_id_info[msg_id].msg_type;
+		if (options->strict && msg_type == FSCK_WARN)
+			msg_type = FSCK_ERROR;
+	}
 
 	return msg_type;
 }
 
+static int parse_msg_type(const char *str)
+{
+	if (!strcmp(str, "error"))
+		return FSCK_ERROR;
+	else if (!strcmp(str, "warn"))
+		return FSCK_WARN;
+	else
+		die("Unknown fsck message type: '%s'", str);
+}
+
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, const char *msg_type)
+{
+	int id = parse_msg_id(msg_id), type;
+
+	if (id < 0)
+		die("Unhandled message id: %s", msg_id);
+	type = parse_msg_type(msg_type);
+
+	if (!options->msg_type) {
+		int i;
+		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			msg_type[i] = fsck_msg_type(i, options);
+		options->msg_type = msg_type;
+	}
+
+	options->msg_type[id] = type;
+}
+
+void fsck_set_msg_types(struct fsck_options *options, const char *values)
+{
+	char *buf = xstrdup(values), *to_free = buf;
+	int done = 0;
+
+	while (!done) {
+		int len = strcspn(buf, " ,|"), equal;
+
+		done = !buf[len];
+		if (!len) {
+			buf++;
+			continue;
+		}
+		buf[len] = '\0';
+
+		for (equal = 0; equal < len &&
+				buf[equal] != '=' && buf[equal] != ':'; equal++)
+			buf[equal] = tolower(buf[equal]);
+		buf[equal] = '\0';
+
+		if (equal == len)
+			die("Missing '=': '%s'", buf);
+
+		fsck_set_msg_type(options, buf, buf + equal + 1);
+		buf += len + 1;
+	}
+	free(to_free);
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -605,6 +669,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int msg_type, const char *message)
 {
+	if (msg_type == FSCK_WARN) {
+		warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+		return 0;
+	}
 	error("object %s: %s", sha1_to_hex(obj->sha1), message);
 	return 1;
 }
diff --git a/fsck.h b/fsck.h
index f6f268a..af3c84e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,10 @@
 
 struct fsck_options;
 
+void fsck_set_msg_type(struct fsck_options *options,
+		const char *msg_id, const char *msg_type);
+void fsck_set_msg_types(struct fsck_options *options, const char *values);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +29,11 @@ struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	int *msg_type;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 05/19] fsck (receive-pack): Allow demoting errors to warnings
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (3 preceding siblings ...)
  2015-06-22 15:25           ` [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 15:25           ` [PATCH v7 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
                             ` (14 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missingemail warn

The value is actually a comma-separated list.

If the same key is listed in multiple receive.fsck.<msg-id> lines in the
config, the last setting wins (this can happen, for example, when both
$HOME/.gitconfig and .git/config contain message type settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/index-pack.c     |  4 ++++
 builtin/receive-pack.c   | 17 +++++++++++++++--
 builtin/unpack-objects.c |  5 +++++
 fsck.c                   |  8 ++++++++
 fsck.h                   |  1 +
 5 files changed, 33 insertions(+), 2 deletions(-)
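
The hand-off described above boils down to string concatenation: the
first setting is prefixed with '=', later ones with ',', and the result
is appended to "--strict". A small sketch of that idea, using fixed-size
buffers and hypothetical `demo_` helpers rather than the strbuf and
argv_array APIs used in the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one "id=type" setting to buf (a NUL-terminated string in a
 * buffer of the given size). Returns 0, or -1 if it does not fit.
 */
static int demo_add_setting(char *buf, size_t size,
			    const char *id, const char *type)
{
	size_t used = strlen(buf);
	int n = snprintf(buf + used, size - used, "%c%s=%s",
			 used ? ',' : '=', id, type);

	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo the partial append */
		return -1;
	}
	return 0;
}

/* Build the option passed on to the child process; 0 on success. */
static int demo_strict_arg(char *out, size_t size, const char *settings)
{
	int n = snprintf(out, size, "--strict%s", settings);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With no settings at all, the argument stays a plain "--strict", so the
behavior without any receive.fsck.<msg-id> configuration is unchanged.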

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f8b0c64..f0d283b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1633,6 +1633,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!strcmp(arg, "--strict")) {
 				strict = 1;
 				do_fsck_object = 1;
+			} else if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
+				fsck_set_msg_types(&fsck_options, arg);
 			} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
 				strict = 1;
 				check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 94d0571..3afe8f8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -19,6 +19,7 @@
 #include "tag.h"
 #include "gpg-interface.h"
 #include "sigchain.h"
+#include "fsck.h"
 
 static const char receive_pack_usage[] = "git receive-pack <git-dir>";
 
@@ -36,6 +37,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
@@ -115,6 +117,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (skip_prefix(var, "receive.fsck.", &var)) {
+		if (is_valid_msg_type(var, value))
+			strbuf_addf(&fsck_msg_types, "%c%s=%s",
+				fsck_msg_types.len ? ',' : '=', var, value);
+		else
+			warning("Skipping unknown msg id '%s'", var);
+		return 0;
+	}
+
 	if (strcmp(var, "receive.fsckobjects") == 0) {
 		receive_fsck_objects = git_config_bool(var, value);
 		return 0;
@@ -1490,7 +1501,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (quiet)
 			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1508,7 +1520,8 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&child.args, "--strict");
+			argv_array_pushf(&child.args, "--strict%s",
+				fsck_msg_types.buf);
 		if (fix_thin)
 			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d17040..7cc086f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 				strict = 1;
 				continue;
 			}
+			if (skip_prefix(arg, "--strict=", &arg)) {
+				strict = 1;
+				fsck_set_msg_types(&fsck_options, arg);
+				continue;
+			}
 			if (starts_with(arg, "--pack_header=")) {
 				struct pack_header *hdr;
 				char *c;
diff --git a/fsck.c b/fsck.c
index e81a342..02af3ed 100644
--- a/fsck.c
+++ b/fsck.c
@@ -131,6 +131,14 @@ static int parse_msg_type(const char *str)
 		die("Unknown fsck message type: '%s'", str);
 }
 
+int is_valid_msg_type(const char *msg_id, const char *msg_type)
+{
+	if (parse_msg_id(msg_id) < 0)
+		return 0;
+	parse_msg_type(msg_type);
+	return 1;
+}
+
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, const char *msg_type)
 {
diff --git a/fsck.h b/fsck.h
index af3c84e..3ef92a3 100644
--- a/fsck.h
+++ b/fsck.h
@@ -9,6 +9,7 @@ struct fsck_options;
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id, const char *msg_type);
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
+int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
 /*
  * callback function for fsck_walk
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 06/19] fsck: Report the ID of the error/warning
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (4 preceding siblings ...)
  2015-06-22 15:25           ` [PATCH v7 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
@ 2015-06-22 15:25           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
                             ` (13 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:25 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some legacy code has objects with non-fatal fsck issues; to enable the
user to ignore those issues, let's print out the ID (e.g. when
encountering "missingEmail", the user might want to call `git config
--add receive.fsck.missingEmail warn`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c          | 20 ++++++++++++++++++++
 t/t1450-fsck.sh |  4 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)
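
For illustration, the conversion from an enum key such as MISSING_EMAIL
to the printed ID "missingEmail" can be sketched as follows. This is a
standalone approximation with a hypothetical `demo_camel_case()` helper
writing into a caller-supplied buffer; the patch itself appends to a
strbuf.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Write the camelCased form of id into out (at most size bytes,
 * including the NUL). Underscores are dropped and the character
 * following each underscore is upper-cased.
 */
static void demo_camel_case(char *out, size_t size, const char *id)
{
	size_t n = 0;
	int upper_next = 0;

	if (!size)
		return;
	for (; *id && n + 1 < size; id++) {
		unsigned char c = (unsigned char)*id;

		if (c == '_') {
			upper_next = 1;
			continue;
		}
		out[n++] = (char)(upper_next ? toupper(c) : tolower(c));
		upper_next = 0;
	}
	out[n] = '\0';
}
```

Since the lookup side drops underscores and ignores case, the printed
camelCase ID can be pasted into the configuration as-is.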

diff --git a/fsck.c b/fsck.c
index 02af3ed..0ec02b2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -188,6 +188,24 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	free(to_free);
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+	for (;;) {
+		char c = *(msg_id)++;
+
+		if (!c)
+			break;
+		if (c != '_')
+			strbuf_addch(sb, tolower(c));
+		else {
+			assert(*msg_id);
+			strbuf_addch(sb, *(msg_id)++);
+		}
+	}
+
+	strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
 	enum fsck_msg_id id, const char *fmt, ...)
@@ -196,6 +214,8 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	append_msg_id(&sb, msg_id_info[id].id_string);
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(object, msg_type, sb.buf);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cfb32b6..7c5b3d5 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name & missing tagger' '
 	git fsck --tags 2>out &&
 
 	cat >expect <<-EOF &&
-	warning in tag $tag: invalid '\''tag'\'' name: wrong name format
-	warning in tag $tag: invalid format - expected '\''tagger'\'' line
+	warning in tag $tag: badTagName: invalid '\''tag'\'' name: wrong name format
+	warning in tag $tag: missingTaggerEntry: invalid format - expected '\''tagger'\'' line
 	EOF
 	test_cmp expect out
 '
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 07/19] fsck: Make fsck_ident() warn-friendly
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (5 preceding siblings ...)
  2015-06-22 15:25           ` [PATCH v7 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
                             ` (12 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 49 +++++++++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 22 deletions(-)
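
The idea, in a reduced form: advance the caller's pointer to the next
line first, then run the checks on a local copy, so that every early
return leaves the caller positioned correctly. The sketch below uses a
hypothetical `demo_check_ident()` that only checks the "name <email>"
shape and returns made-up codes instead of calling report().

```c
#include <assert.h>
#include <string.h>

/*
 * Check one ident line. Returns 0 if it has the "name <email>" shape,
 * a nonzero code otherwise. In both cases *line is advanced past the
 * end of the line, so the caller can go on with the next header.
 */
static int demo_check_ident(const char **line)
{
	const char *p = *line;
	const char *eol = strchr(p, '\n');

	/* Advance the caller's pointer before any check can bail out. */
	*line = eol ? eol + 1 : p + strlen(p);

	if (*p == '<')
		return 1;	/* missing name before email */
	p += strcspn(p, "<>\n");
	if (*p != '<')
		return 2;	/* missing email */
	p++;
	p += strcspn(p, "<>\n");
	if (*p != '>')
		return 3;	/* bad email */
	return 0;
}
```

A caller that treats the returned code as a mere warning can continue
parsing at *line without resynchronizing.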

diff --git a/fsck.c b/fsck.c
index 0ec02b2..d0a7282 100644
--- a/fsck.c
+++ b/fsck.c
@@ -481,40 +481,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+	const char *p = *ident;
 	char *end;
 
-	if (**ident == '<')
+	*ident = strchrnul(*ident, '\n');
+	if (**ident == '\n')
+		(*ident)++;
+
+	if (*p == '<')
 		return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident == '>')
+	p += strcspn(p, "<>\n");
+	if (*p == '>')
 		return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-	if (**ident != '<')
+	if (*p != '<')
 		return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-	if ((*ident)[-1] != ' ')
+	if (p[-1] != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-	(*ident)++;
-	*ident += strcspn(*ident, "<>\n");
-	if (**ident != '>')
+	p++;
+	p += strcspn(p, "<>\n");
+	if (*p != '>')
 		return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-	(*ident)++;
-	if (**ident != ' ')
+	p++;
+	if (*p != ' ')
 		return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-	(*ident)++;
-	if (**ident == '0' && (*ident)[1] != ' ')
+	p++;
+	if (*p == '0' && p[1] != ' ')
 		return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(strtoul(*ident, &end, 10)))
+	if (date_overflows(strtoul(p, &end, 10)))
 		return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (end == *ident || *end != ' ')
+	if ((end == p || *end != ' '))
 		return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	*ident = end + 1;
-	if ((**ident != '+' && **ident != '-') ||
-	    !isdigit((*ident)[1]) ||
-	    !isdigit((*ident)[2]) ||
-	    !isdigit((*ident)[3]) ||
-	    !isdigit((*ident)[4]) ||
-	    ((*ident)[5] != '\n'))
+	p = end + 1;
+	if ((*p != '+' && *p != '-') ||
+	    !isdigit(p[1]) ||
+	    !isdigit(p[2]) ||
+	    !isdigit(p[3]) ||
+	    !isdigit(p[4]) ||
+	    (p[5] != '\n'))
 		return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-	(*ident) += 6;
+	p += 6;
 	return 0;
 }
 
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 08/19] fsck: Make fsck_commit() warn-friendly
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (6 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
                             ` (11 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too problematic to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missingauthor error to
a warning would hide a problematic committer line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)
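
The pattern applied throughout this patch is "report, and return only if
the report says so". A self-contained sketch of that control flow, under
the assumption that the report function returns 0 for warnings and
nonzero for errors; the `demo_` types and functions are hypothetical
stand-ins, not the fsck API.

```c
#include <assert.h>
#include <string.h>

enum { DEMO_ERROR = 1, DEMO_WARN = 2 };

struct demo_options {
	int bad_tree_type;	/* DEMO_ERROR, or DEMO_WARN when demoted */
	int reported;		/* number of problems reported so far */
};

/* Stand-in for report(): warnings return 0, errors return 1. */
static int demo_report(struct demo_options *o, int type)
{
	o->reported++;
	return type == DEMO_ERROR;
}

/* Check the 'tree' line, then the presence of an 'author' line. */
static int demo_check_commit(struct demo_options *o, const char *buffer)
{
	int err;

	if (strncmp(buffer, "tree ", 5))
		return demo_report(o, DEMO_ERROR);
	buffer += 5;
	if (strspn(buffer, "0123456789abcdef") != 40 || buffer[40] != '\n') {
		err = demo_report(o, o->bad_tree_type);
		if (err)
			return err;	/* still an error: stop here */
	}
	/* Demoted to a warning: keep checking the rest of the header. */
	buffer = strchr(buffer, '\n');
	if (!buffer)
		return demo_report(o, DEMO_ERROR);
	buffer++;
	if (strncmp(buffer, "author ", 7))
		return demo_report(o, DEMO_ERROR);
	return 0;
}
```

Once the first problem is demoted, the second one (here, a missing
author line) becomes visible instead of being masked by the early
return.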

diff --git a/fsck.c b/fsck.c
index d0a7282..ef3bf68 100644
--- a/fsck.c
+++ b/fsck.c
@@ -536,12 +536,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-		return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+	if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+		err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+		if (err)
+			return err;
+	}
 	buffer += 41;
 	while (skip_prefix(buffer, "parent ", &buffer)) {
-		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-			return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+			err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+			if (err)
+				return err;
+		}
 		buffer += 41;
 		parent_line_count++;
 	}
@@ -550,11 +556,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 	if (graft) {
 		if (graft->nr_parent == -1 && !parent_count)
 			; /* shallow commit */
-		else if (graft->nr_parent != parent_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+		else if (graft->nr_parent != parent_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+			if (err)
+				return err;
+		}
 	} else {
-		if (parent_count != parent_line_count)
-			return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+		if (parent_count != parent_line_count) {
+			err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+			if (err)
+				return err;
+		}
 	}
 	if (!skip_prefix(buffer, "author ", &buffer))
 		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 09/19] fsck: Handle multiple authors in commits specially
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (7 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
                             ` (10 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
receive.fsck.missingCommitter to warn, but that could hide missing tree
objects in the same commit because we cannot continue verifying any
commit object after encountering a missing committer line, while we can
continue in the case of multiple author lines.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
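
As a rough illustration of the counting approach: consume every
consecutive 'author' line, then classify the count. The helper below is
hypothetical and skips the per-line ident validation that the patch
performs via fsck_ident().

```c
#include <assert.h>
#include <string.h>

/* Count consecutive "author " lines at the start of buffer. */
static int demo_count_authors(const char *buffer)
{
	int count = 0;

	while (!strncmp(buffer, "author ", 7)) {
		const char *eol = strchr(buffer, '\n');

		count++;
		if (!eol)
			break;
		buffer = eol + 1;
	}
	return count;
}
```

A count of 0 would map to the existing missing-author problem and a
count above 1 to the new multiple-authors problem, which can then be
demoted on its own.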

diff --git a/fsck.c b/fsck.c
index ef3bf68..daa07ad 100644
--- a/fsck.c
+++ b/fsck.c
@@ -38,6 +38,7 @@
 	FUNC(MISSING_TREE, ERROR) \
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
 	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(TREE_NOT_SORTED, ERROR) \
@@ -528,7 +529,7 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 {
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
-	unsigned parent_count, parent_line_count = 0;
+	unsigned parent_count, parent_line_count = 0, author_count;
 	int err;
 
 	if (require_end_of_header(buffer, size, &commit->object, options))
@@ -568,9 +569,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 				return err;
 		}
 	}
-	if (!skip_prefix(buffer, "author ", &buffer))
-		return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-	err = fsck_ident(&buffer, &commit->object, options);
+	author_count = 0;
+	while (skip_prefix(buffer, "author ", &buffer)) {
+		author_count++;
+		err = fsck_ident(&buffer, &commit->object, options);
+		if (err)
+			return err;
+	}
+	if (author_count < 1)
+		err = report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
+	else if (author_count > 1)
+		err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
 	if (err)
 		return err;
 	if (!skip_prefix(buffer, "committer ", &buffer))
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 10/19] fsck: Make fsck_tag() warn-friendly
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (8 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
                             ` (9 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

Just like fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index daa07ad..3f76f99 100644
--- a/fsck.c
+++ b/fsck.c
@@ -642,7 +642,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 	}
 	if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
 		ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-		goto done;
+		if (ret)
+			goto done;
 	}
 	buffer += 41;
 
-- 
2.3.1.windows.1.9.g8c01ab4

^ permalink raw reply related	[flat|nested] 275+ messages in thread

* [PATCH v7 11/19] fsck: Add a simple test for receive.fsck.<msg-id>
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (9 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
                             ` (8 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5504-fetch-receive-strict.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..36024fc 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,25 @@ test_expect_success 'push with transfer.fsckobjects' '
 	test_cmp exp act
 '
 
+cat >bogus-commit <<\EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +0000
+committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.missingEmail=warn' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config \
+		receive.fsck.missingEmail warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	grep "missingEmail" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 12/19] fsck: Disallow demoting grave fsck errors to warnings
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (10 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:26           ` [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
                             ` (7 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.
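
As an illustration only (not part of the patch), the guard can be sketched
with a two-entry table. The table contents, the `SEV_*` names and
`set_msg_type` are hypothetical; unlike the real code, which calls die(),
this sketch returns -1:

```c
#include <stdio.h>
#include <string.h>

enum { SEV_FATAL = -1, SEV_ERROR = 1, SEV_WARN = 2 };

struct msg_info {
	const char *id;
	int default_type;
};

/* Hypothetical, much reduced message table. */
static const struct msg_info msg_table[] = {
	{ "unterminatedheader", SEV_FATAL },
	{ "missingemail", SEV_ERROR },
};

/*
 * Override the type of message `id` in `types` (one slot per table entry).
 * Returns 0 on success, -1 for an unknown id or an attempt to turn a
 * fatal message into anything other than an error.
 */
static int set_msg_type(int *types, const char *id, int new_type)
{
	size_t i;

	for (i = 0; i < sizeof(msg_table) / sizeof(msg_table[0]); i++) {
		if (strcmp(msg_table[i].id, id))
			continue;
		if (new_type != SEV_ERROR &&
		    msg_table[i].default_type == SEV_FATAL) {
			fprintf(stderr, "Cannot demote %s\n", id);
			return -1;
		}
		types[i] = new_type;
		return 0;
	}
	return -1;
}
```

The check consults the default type in the table rather than the current
override, so a fatal message stays protected even after it has been set to
"error" explicitly.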

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 13 +++++++++++--
 t/t5504-fetch-receive-strict.sh | 11 +++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 3f76f99..85535b1 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,7 +9,12 @@
 #include "refs.h"
 #include "utf8.h"
 
+#define FSCK_FATAL -1
+
 #define FOREACH_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
 	/* errors */ \
 	FUNC(BAD_DATE, ERROR) \
 	FUNC(BAD_DATE_OVERFLOW, ERROR) \
@@ -39,11 +44,9 @@
 	FUNC(MISSING_TYPE, ERROR) \
 	FUNC(MISSING_TYPE_ENTRY, ERROR) \
 	FUNC(MULTIPLE_AUTHORS, ERROR) \
-	FUNC(NUL_IN_HEADER, ERROR) \
 	FUNC(TAG_OBJECT_NOT_TAG, ERROR) \
 	FUNC(TREE_NOT_SORTED, ERROR) \
 	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(UNTERMINATED_HEADER, ERROR) \
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
@@ -149,6 +152,9 @@ void fsck_set_msg_type(struct fsck_options *options,
 		die("Unhandled message id: %s", msg_id);
 	type = parse_msg_type(msg_type);
 
+	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
+		die("Cannot demote %s to %s", msg_id, msg_type);
+
 	if (!options->msg_type) {
 		int i;
 		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
@@ -215,6 +221,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
 	va_start(ap, fmt);
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 36024fc..f5d6d0d 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -136,4 +136,15 @@ test_expect_success 'push with receive.fsck.missingEmail=warn' '
 	grep "missingEmail" act
 '
 
+test_expect_success \
+	'receive.fsck.unterminatedHeader=warn triggers error' '
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckobjects true &&
+	git --git-dir=dst/.git config \
+		receive.fsck.unterminatedheader warn &&
+	test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+	grep "Cannot demote unterminatedheader" act
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (11 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 18:04             ` Junio C Hamano
  2015-06-22 15:26           ` [PATCH v7 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
                             ` (6 subsequent siblings)
  19 siblings, 1 reply; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective message type to "ignore".

This change "abuses" the missingEmail=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).
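
As an illustration only (not part of the patch), looking up one message ID
in such a comma-separated list can be sketched as below. `parse_type` and
`type_for` are hypothetical helpers, and "the last entry wins" is an
assumption suggested by the test's use of `config --add`, not something the
patch states:

```c
#include <string.h>

enum { SEV_ERROR = 1, SEV_WARN = 2, SEV_IGNORE = 3 };

/* Map "error", "warn" or "ignore" to a type; 0 means unrecognized. */
static int parse_type(const char *s, size_t len)
{
	if (len == 5 && !strncmp(s, "error", 5))
		return SEV_ERROR;
	if (len == 4 && !strncmp(s, "warn", 4))
		return SEV_WARN;
	if (len == 6 && !strncmp(s, "ignore", 6))
		return SEV_IGNORE;
	return 0;
}

/*
 * Search a list such as "missingemail=ignore,baddate=warn" for `id`.
 * Returns the type of the last matching entry, or 0 if there is none.
 */
static int type_for(const char *list, const char *id)
{
	size_t idlen = strlen(id);
	int found = 0;

	while (*list) {
		size_t len = strcspn(list, ",");

		if (len > idlen && !strncmp(list, id, idlen) &&
		    list[idlen] == '=') {
			int t = parse_type(list + idlen + 1, len - idlen - 1);
			if (t)
				found = t;
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return found;
}
```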

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                          | 5 +++++
 fsck.h                          | 1 +
 t/t5504-fetch-receive-strict.sh | 9 ++++++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 85535b1..cbfff1f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -131,6 +131,8 @@ static int parse_msg_type(const char *str)
 		return FSCK_ERROR;
 	else if (!strcmp(str, "warn"))
 		return FSCK_WARN;
+	else if (!strcmp(str, "ignore"))
+		return FSCK_IGNORE;
 	else
 		die("Unknown fsck message type: '%s'", str);
 }
@@ -221,6 +223,9 @@ static int report(struct fsck_options *options, struct object *object,
 	struct strbuf sb = STRBUF_INIT;
 	int msg_type = fsck_msg_type(id, options), result;
 
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 
diff --git a/fsck.h b/fsck.h
index 3ef92a3..1dab276 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index f5d6d0d..af373ba 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -133,7 +133,14 @@ test_expect_success 'push with receive.fsck.missingEmail=warn' '
 	git --git-dir=dst/.git config \
 		receive.fsck.missingEmail warn &&
 	git push --porcelain dst bogus >act 2>&1 &&
-	grep "missingEmail" act
+	grep "missingEmail" act &&
+	git --git-dir=dst/.git branch -D bogus &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.missingEmail ignore &&
+	git  --git-dir=dst/.git config --add \
+		receive.fsck.badDate warn &&
+	git push --porcelain dst bogus >act 2>&1 &&
+	test_must_fail grep "missingEmail" act
 '
 
 test_expect_success \
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 14/19] fsck: Allow upgrading fsck warnings to errors
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (12 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-06-22 15:26           ` Johannes Schindelin
  2015-06-22 15:27           ` [PATCH v7 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
                             ` (5 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:26 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalidTagName` and
`missingTaggerEntry` in the receive.fsck.<msg-id> config setting.

Incidentally, the missing tagger warning is now really shown as a warning
(rather than being reported with the "error:" prefix, as was the case
before this commit).
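
As an illustration only (not part of the patch), the mapping from internal
message types to what the user sees can be sketched as follows.
`effective_type` and `prefix_for` are hypothetical names; in the real code
the prefix is produced by the error callback, not by report() itself:

```c
enum {
	SEV_INFO = -2,
	SEV_FATAL = -1,
	SEV_ERROR = 1,
	SEV_WARN = 2,
	SEV_IGNORE = 3
};

/* Collapse the internal types to ERROR or WARN; 0 means "say nothing". */
static int effective_type(int msg_type)
{
	if (msg_type == SEV_IGNORE)
		return 0;
	if (msg_type == SEV_FATAL)
		return SEV_ERROR;
	if (msg_type == SEV_INFO)
		return SEV_WARN;
	return msg_type;
}

/* Prefix a reader would see for a message of the given internal type. */
static const char *prefix_for(int msg_type)
{
	switch (effective_type(msg_type)) {
	case SEV_ERROR:
		return "error";
	case SEV_WARN:
		return "warning";
	default:
		return "";
	}
}
```

Because INFO is only a default, configuring such a message as "error"
replaces the type before this mapping is applied, which is what makes the
upgrade possible.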

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 fsck.c                | 24 +++++++++++++++++-------
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index cbfff1f..f6fc384 100644
--- a/fsck.c
+++ b/fsck.c
@@ -10,6 +10,7 @@
 #include "utf8.h"
 
 #define FSCK_FATAL -1
+#define FSCK_INFO -2
 
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
@@ -50,15 +51,16 @@
 	FUNC(ZERO_PADDED_DATE, ERROR) \
 	/* warnings */ \
 	FUNC(BAD_FILEMODE, WARN) \
-	FUNC(BAD_TAG_NAME, WARN) \
 	FUNC(EMPTY_NAME, WARN) \
 	FUNC(FULL_PATHNAME, WARN) \
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(MISSING_TAGGER_ENTRY, WARN) \
 	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN)
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(BAD_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO)
 
 #define MSG_ID(id, msg_type) FSCK_MSG_##id,
 enum fsck_msg_id {
@@ -228,6 +230,8 @@ static int report(struct fsck_options *options, struct object *object,
 
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
 
 	append_msg_id(&sb, msg_id_info[id].id_string);
 
@@ -686,15 +690,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		goto done;
 	}
 	strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-	if (check_refname_format(sb.buf, 0))
-		report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
+	if (check_refname_format(sb.buf, 0)) {
+		ret = report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME,
 			   "invalid 'tag' name: %.*s",
 			   (int)(eol - buffer), buffer);
+		if (ret)
+			goto done;
+	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer))
+	if (!skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
-		report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+		if (ret)
+			goto done;
+	}
 	else
 		ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
     thirtyeight=${tag#??} &&
     rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
     git index-pack --strict tag-test-${pack1}.pack 2>err &&
-    grep "^error:.* expected .tagger. line" err
+    grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 15/19] fsck: Document the new receive.fsck.<msg-id> options
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (13 preceding siblings ...)
  2015-06-22 15:26           ` [PATCH v7 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
@ 2015-06-22 15:27           ` Johannes Schindelin
  2015-06-22 15:27           ` [PATCH v7 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
                             ` (4 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:27 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 3e37b93..4e5fbea 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2205,6 +2205,20 @@ receive.fsckObjects::
 	Defaults to false. If not set, the value of `transfer.fsckObjects`
 	is used instead.
 
+receive.fsck.<msg-id>::
+	When `receive.fsckObjects` is set to true, errors can be switched
+	to warnings and vice versa by configuring the `receive.fsck.<msg-id>`
+	setting where the `<msg-id>` is the fsck message ID and the value
+	is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
+	the error/warning with the message ID, e.g. "missingEmail: invalid
+	author/committer line - missing email" means that setting
+	`receive.fsck.missingEmail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`, allowing
+the host to accept repositories with certain known issues but still catch
+other issues.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 16/19] fsck: Support demoting errors to warnings
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (14 preceding siblings ...)
  2015-06-22 15:27           ` [PATCH v7 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
@ 2015-06-22 15:27           ` Johannes Schindelin
  2015-06-22 15:27           ` [PATCH v7 17/19] fsck: Introduce `git fsck --connectivity-only` Johannes Schindelin
                             ` (3 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:27 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missingEmail=ignore fsck` would hide
problems with missing emails in author, committer and tagger lines.

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.<msg-id> from the
receive.fsck.<msg-id> settings.
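
As an illustration only (not part of the patch), the separation of the two
namespaces falls out of a plain prefix match on the config key.
`skip_prefix_stub` imitates the behaviour of git's skip_prefix() as used in
this series, and `fsck_key` is a hypothetical helper:

```c
#include <string.h>

/*
 * If `str` starts with `prefix`, store a pointer to the remainder in *out
 * and return 1; otherwise return 0 and leave *out untouched.
 */
static int skip_prefix_stub(const char *str, const char *prefix,
			    const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Return the <msg-id> part of an "fsck.<msg-id>" key, or NULL. */
static const char *fsck_key(const char *var)
{
	const char *rest;

	if (skip_prefix_stub(var, "fsck.", &rest))
		return rest;
	return NULL;
}
```

A key such as receive.fsck.missingemail does not begin with "fsck.", so a
handler written this way never sees the receive-pack settings.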

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt | 11 +++++++++++
 builtin/fsck.c           | 12 ++++++++++++
 t/t1450-fsck.sh          | 11 +++++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4e5fbea..bfccd2b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1250,6 +1250,17 @@ filter.<driver>.smudge::
 	object to a worktree file upon checkout.  See
 	linkgit:gitattributes[5] for details.
 
+fsck.<msg-id>::
+	Allows overriding the message type (error, warn or ignore) of a
+	specific message ID such as `missingEmail`.
++
+For convenience, fsck prefixes the error/warning with the message ID,
+e.g.  "missingEmail: invalid author/committer line - missing email" means
+that setting `fsck.missingEmail = ignore` will hide that issue.
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index fff38fe..adaa802 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,16 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+	if (skip_prefix(var, "fsck.", &var)) {
+		fsck_set_msg_type(&fsck_obj_options, var, value);
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *msg_type,
 			const char *err)
 {
@@ -646,6 +656,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		include_reflogs = 0;
 	}
 
+	git_config(fsck_config, NULL);
+
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 7c5b3d5..1727129 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
 	grep -q "error: sha1 mismatch 63ffffffffffffffffffffffffffffffffffffff" out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+	git cat-file commit HEAD >basis &&
+	sed "s/^author .*/&,&/" <basis | tr , \\n >multiple-authors &&
+	new=$(git hash-object -t commit -w --stdin <multiple-authors) &&
+	test_when_finished "remove_object $new" &&
+	git update-ref refs/heads/bogus "$new" &&
+	test_when_finished "git update-ref -d refs/heads/bogus" &&
+	test_must_fail git fsck &&
+	git -c fsck.multipleAuthors=ignore fsck
+'
+
 _bz='\0'
 _bz5="$_bz$_bz$_bz$_bz$_bz"
 _bz20="$_bz5$_bz5$_bz5$_bz5"
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 17/19] fsck: Introduce `git fsck --connectivity-only`
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (15 preceding siblings ...)
  2015-06-22 15:27           ` [PATCH v7 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
@ 2015-06-22 15:27           ` Johannes Schindelin
  2015-06-22 15:27           ` [PATCH v7 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
                             ` (2 subsequent siblings)
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:27 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

This option avoids unpacking each and every blob object, and verifies
only the connectivity. With large repositories in particular, this
speeds up the operation, at the expense of missing corrupt blobs,
ignoring unreachable objects and other fsck issues, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-fsck.txt |  7 ++++++-
 builtin/fsck.c             |  7 ++++++-
 t/t1450-fsck.sh            | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..84ee92e 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
 	 [--[no-]full] [--strict] [--verbose] [--lost-found]
-	 [--[no-]dangling] [--[no-]progress] [<object>*]
+	 [--[no-]dangling] [--[no-]progress] [--connectivity-only] [<object>*]
 
 DESCRIPTION
 -----------
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
 	object pools.  This is now default; you can turn it off
 	with --no-full.
 
+--connectivity-only::
+	Check only the connectivity of tags, commits and tree objects. By
+	avoiding to unpack blobs, this speeds up the operation, at the
+	expense of missing corrupt objects or other problematic issues.
+
 --strict::
 	Enable more strict checking, namely to catch a file mode
 	recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index adaa802..2d14298 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int connectivity_only;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -181,6 +182,8 @@ static void check_reachable_object(struct object *obj)
 	if (!(obj->flags & HAS_OBJ)) {
 		if (has_sha1_pack(obj->sha1))
 			return; /* it is in pack - forget about it */
+		if (connectivity_only && has_sha1_file(obj->sha1))
+			return;
 		printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
 		errors_found |= ERROR_REACHABLE;
 		return;
@@ -623,6 +626,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
 	OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
 	OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+	OPT_BOOL(0, "connectivity-only", &connectivity_only, N_("check only connectivity")),
 	OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
 	OPT_BOOL(0, "lost-found", &write_lost_and_found,
 				N_("write dangling objects in .git/lost-found")),
@@ -659,7 +663,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	git_config(fsck_config, NULL);
 
 	fsck_head_link();
-	fsck_object_dir(get_object_directory());
+	if (!connectivity_only)
+		fsck_object_dir(get_object_directory());
 
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 1727129..956673b 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --connectivity-only' '
+	rm -rf connectivity-only &&
+	git init connectivity-only &&
+	(
+		cd connectivity-only &&
+		touch empty &&
+		git add empty &&
+		test_commit empty &&
+		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+		rm -f $empty &&
+		echo invalid >$empty &&
+		test_must_fail git fsck --strict &&
+		git fsck --strict --connectivity-only &&
+		tree=$(git rev-parse HEAD:) &&
+		suffix=${tree#??} &&
+		tree=.git/objects/${tree%$suffix}/$suffix &&
+		rm -f $tree &&
+		echo invalid >$tree &&
+		test_must_fail git fsck --strict --connectivity-only
+	)
+'
+
 test_done
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (16 preceding siblings ...)
  2015-06-22 15:27           ` [PATCH v7 17/19] fsck: Introduce `git fsck --connectivity-only` Johannes Schindelin
@ 2015-06-22 15:27           ` Johannes Schindelin
  2015-06-22 15:27           ` [PATCH v7 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
  2015-06-22 18:02           ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:27 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

The optional new config option `receive.fsck.skipList` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.
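
As an illustration only (not part of the patch), a skip list of this shape
can be sketched with the standard library. The sketch keeps lowercase hex
names as strings instead of binary SHA-1s, parses from memory instead of a
file, and sorts lazily on the first lookup if the input was unsorted; all of
these, and the names `load_skiplist` and `is_skipped`, are assumptions made
for the example:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define HEXLEN 40
#define MAX_ENTRIES 16

struct skiplist {
	char names[MAX_ENTRIES][HEXLEN + 1];
	size_t nr;
	int sorted;
};

/*
 * Parse records of 40 hex digits followed by a newline.
 * Returns 0 on success, -1 on a malformed record or too many entries.
 */
static int load_skiplist(struct skiplist *list, const char *text)
{
	list->nr = 0;
	list->sorted = 1;
	while (*text) {
		size_t i;

		if (list->nr == MAX_ENTRIES ||
		    strlen(text) < HEXLEN + 1 || text[HEXLEN] != '\n')
			return -1;
		for (i = 0; i < HEXLEN; i++)
			if (!isxdigit((unsigned char)text[i]))
				return -1;
		memcpy(list->names[list->nr], text, HEXLEN);
		list->names[list->nr][HEXLEN] = '\0';
		/* Remember whether the input arrived in sorted order. */
		if (list->nr && strcmp(list->names[list->nr - 1],
				       list->names[list->nr]) > 0)
			list->sorted = 0;
		list->nr++;
		text += HEXLEN + 1;
	}
	return 0;
}

static int cmp_name(const void *a, const void *b)
{
	return strcmp(a, b);
}

/* Binary search for `name`, sorting first if the list was unsorted. */
static int is_skipped(struct skiplist *list, const char *name)
{
	if (!list->sorted) {
		qsort(list->names, list->nr, HEXLEN + 1, cmp_name);
		list->sorted = 1;
	}
	return bsearch(name, list->names, list->nr, HEXLEN + 1,
		       cmp_name) != NULL;
}
```

Tracking sortedness while loading is what makes a "preferably sorted" file
cheaper: a list that is already in order can be searched without any
sorting step.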

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt        |  8 +++++++
 builtin/receive-pack.c          | 11 +++++++++
 fsck.c                          | 50 +++++++++++++++++++++++++++++++++++++++++
 fsck.h                          |  1 +
 t/t5504-fetch-receive-strict.sh | 12 ++++++++++
 5 files changed, 82 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bfccd2b..ed7f37f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2230,6 +2230,14 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing
 the host to accept repositories with certain known issues but still catch
 other issues.
 
+receive.fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+	Note: corrupt objects cannot be skipped with this setting.
+
 receive.unpackLimit::
 	If the number of objects received in a push is below this
 	limit then the objects will be unpacked into loose object
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3afe8f8..3fbed23 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -117,6 +117,17 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.fsck.skiplist") == 0) {
+		const char *path;
+
+		if (git_config_pathname(&path, var, value))
+			return 1;
+		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
+			fsck_msg_types.len ? ',' : '=', path);
+		free((char *) path);
+		return 0;
+	}
+
 	if (skip_prefix(var, "receive.fsck.", &var)) {
 		if (is_valid_msg_type(var, value))
 			strbuf_addf(&fsck_msg_types, "%c%s=%s",
diff --git a/fsck.c b/fsck.c
index f6fc384..a677b50 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,7 @@
 #include "fsck.h"
 #include "refs.h"
 #include "utf8.h"
+#include "sha1-array.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -127,6 +128,43 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 	return msg_type;
 }
 
+static void init_skiplist(struct fsck_options *options, const char *path)
+{
+	static struct sha1_array skiplist = SHA1_ARRAY_INIT;
+	int sorted, fd;
+	char buffer[41];
+	unsigned char sha1[20];
+
+	if (options->skiplist)
+		sorted = options->skiplist->sorted;
+	else {
+		sorted = 1;
+		options->skiplist = &skiplist;
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		die("Could not open skip list: %s", path);
+	for (;;) {
+		int result = read_in_full(fd, buffer, sizeof(buffer));
+		if (result < 0)
+			die_errno("Could not read '%s'", path);
+		if (!result)
+			break;
+		if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+			die("Invalid SHA-1: %s", buffer);
+		sha1_array_append(&skiplist, sha1);
+		if (sorted && skiplist.nr > 1 &&
+				hashcmp(skiplist.sha1[skiplist.nr - 2],
+					sha1) > 0)
+			sorted = 0;
+	}
+	close(fd);
+
+	if (sorted)
+		skiplist.sorted = 1;
+}
+
 static int parse_msg_type(const char *str)
 {
 	if (!strcmp(str, "error"))
@@ -190,6 +228,14 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 			buf[equal] = tolower(buf[equal]);
 		buf[equal] = '\0';
 
+		if (!strcmp(buf, "skiplist")) {
+			if (equal == len)
+				die("skiplist requires a path");
+			init_skiplist(options, buf + equal + 1);
+			buf += len + 1;
+			continue;
+		}
+
 		if (equal == len)
 			die("Missing '=': '%s'", buf);
 
@@ -228,6 +274,10 @@ static int report(struct fsck_options *options, struct object *object,
 	if (msg_type == FSCK_IGNORE)
 		return 0;
 
+	if (options->skiplist && object &&
+			sha1_array_lookup(options->skiplist, object->sha1) >= 0)
+		return 0;
+
 	if (msg_type == FSCK_FATAL)
 		msg_type = FSCK_ERROR;
 	else if (msg_type == FSCK_INFO)
diff --git a/fsck.h b/fsck.h
index 1dab276..dded84b 100644
--- a/fsck.h
+++ b/fsck.h
@@ -32,6 +32,7 @@ struct fsck_options {
 	fsck_error error_func;
 	unsigned strict:1;
 	int *msg_type;
+	struct sha1_array *skiplist;
 };
 
 #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index af373ba..6a1f89e 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -123,6 +123,18 @@ committer Bugs Bunny <bugs@bun.ni> 1234567890 +0000
 This commit object intentionally broken
 EOF
 
+test_expect_success 'push with receive.fsck.skipList' '
+	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
+	git push . $commit:refs/heads/bogus &&
+	rm -rf dst &&
+	git init dst &&
+	git --git-dir=dst/.git config receive.fsckObjects true &&
+	test_must_fail git push --porcelain dst bogus &&
+	git --git-dir=dst/.git config receive.fsck.skipList SKIP &&
+	echo $commit >dst/.git/SKIP &&
+	git push --porcelain dst bogus
+'
+
 test_expect_success 'push with receive.fsck.missingEmail=warn' '
 	commit="$(git hash-object -t commit -w --stdin <bogus-commit)" &&
 	git push . $commit:refs/heads/bogus &&
-- 
2.3.1.windows.1.9.g8c01ab4


* [PATCH v7 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (17 preceding siblings ...)
  2015-06-22 15:27           ` [PATCH v7 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
@ 2015-06-22 15:27           ` Johannes Schindelin
  2015-06-22 18:02           ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
  19 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 15:27 UTC (permalink / raw)
  To: gitster; +Cc: git, mhagger, peff

Analogous to the support in `git receive-pack` for the config option
`receive.fsck.skiplist`, `git fsck` now supports ignoring given objects
altogether via `fsck.skiplist`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/config.txt |  8 ++++++++
 builtin/fsck.c           | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ed7f37f..69dda93 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1261,6 +1261,14 @@ that setting `fsck.missingEmail = ignore` will hide that issue.
 This feature is intended to support working with legacy repositories
 which cannot be repaired without disruptive changes.
 
+fsck.skipList::
+	The path to a sorted list of object names (i.e. one SHA-1 per
+	line) that are known to be broken in a non-fatal way and should
+	be ignored. This feature is useful when an established project
+	should be accepted despite early commits containing errors that
+	can be safely ignored such as invalid committer email addresses.
+	Note: corrupt objects cannot be skipped with this setting.
+
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
 	algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2d14298..7e3df20 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -49,6 +49,19 @@ static int show_dangling = 1;
 
 static int fsck_config(const char *var, const char *value, void *cb)
 {
+	if (strcmp(var, "fsck.skiplist") == 0) {
+		const char *path;
+		struct strbuf sb = STRBUF_INIT;
+
+		if (git_config_pathname(&path, var, value))
+			return 1;
+		strbuf_addf(&sb, "skiplist=%s", path);
+		free((char *) path);
+		fsck_set_msg_types(&fsck_obj_options, sb.buf);
+		strbuf_release(&sb);
+		return 0;
+	}
+
 	if (skip_prefix(var, "fsck.", &var)) {
 		fsck_set_msg_type(&fsck_obj_options, var, value);
 		return 0;
-- 
2.3.1.windows.1.9.g8c01ab4
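
The documentation in this patch asks for a sorted list of object names. One plausible reason, which is an assumption here because the patch text does not say so, is that a sorted list can be searched with a binary search. A minimal sketch in C follows; cmp_hex() and in_skiplist() are hypothetical names used only for illustration, not functions from the series:

```c
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40 /* length of a SHA-1 written in hexadecimal */

/* Comparator for bsearch(): both arguments point at a `const char *`. */
int cmp_hex(const void *a, const void *b)
{
	const char *const *x = a;
	const char *const *y = b;

	return strncmp(*x, *y, HEX_LEN);
}

/*
 * Hypothetical helper: return 1 if `name` appears in the sorted array
 * `list` of `nr` hexadecimal object names, 0 otherwise.
 */
int in_skiplist(const char *const *list, size_t nr, const char *name)
{
	return bsearch(&name, list, nr, sizeof(*list), cmp_hex) != NULL;
}
```

The lookup is only correct if the array really is sorted in the order the comparator defines, which is why an unsorted list file would be a problem for an approach like this one.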


* Re: [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-22 15:25           ` [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
@ 2015-06-22 17:37             ` Junio C Hamano
  2015-06-22 21:00               ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-22 17:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/fsck.c b/fsck.c
> index 1a3f7ce..e81a342 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -64,30 +64,29 @@ enum fsck_msg_id {
>  #undef MSG_ID
>  
>  #define STR(x) #x
> -#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
> +#define MSG_ID(id, msg_type) { STR(id), NULL, FSCK_##msg_type },
>  static struct {
>  	const char *id_string;
> +	const char *lowercased;
>  	int msg_type;
>  } msg_id_info[FSCK_MSG_MAX + 1] = {
>  	FOREACH_MSG_ID(MSG_ID)
> -	{ NULL, -1 }
> +	{ NULL, NULL, -1 }
>  };
>  #undef MSG_ID
>  
>  static int parse_msg_id(const char *text)
>  {
> -	static char **lowercased;
>  	int i;
>  
> -	if (!lowercased) {
> +	if (!msg_id_info[0].lowercased) {
>  		/* convert id_string to lower case, without underscores. */
> -		lowercased = xmalloc(FSCK_MSG_MAX * sizeof(*lowercased));
>  		for (i = 0; i < FSCK_MSG_MAX; i++) {
>  			const char *p = msg_id_info[i].id_string;
>  			int len = strlen(p);
>  			char *q = xmalloc(len);
>  
> -			lowercased[i] = q;
> +			msg_id_info[i].lowercased = q;
>  			while (*p)
>  				if (*p == '_')
>  					p++;
> @@ -98,7 +97,7 @@ static int parse_msg_id(const char *text)
>  	}
>  
>  	for (i = 0; i < FSCK_MSG_MAX; i++)
> -		if (!strcmp(text, lowercased[i]))
> +		if (!strcmp(text, msg_id_info[i].lowercased))
>  			return i;
>  
>  	return -1;

Heh, this was the first thing that came to my mind when I saw 03/19,
which lazily prepares the downcased version (which is good) but does
so in a separately allocated buffer (which is improved by this
change) ;-)

IOW, I think all of the above should have been part of 03/19, not an
"oops I belatedly realized that this way is better" fixup here.

The end result looks good, so let's keep reading.
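
The idea discussed above, caching the downcased name next to each table entry and filling it in on first use, can be sketched on its own as follows. This is a simplified illustration and not the code from the series: the three-entry table, msg_ids, prepare_lowercased() and lookup_msg_id() are made-up names, and the sketch allocates strlen() + 1 bytes so that the terminating NUL always fits.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative table; the series generates its table with a macro. */
static struct {
	const char *id_string;	/* e.g. "MISSING_EMAIL" */
	char *lowercased;	/* filled in lazily, e.g. "missingemail" */
} msg_ids[] = {
	{ "BAD_DATE", NULL },
	{ "MISSING_EMAIL", NULL },
	{ "MULTIPLE_AUTHORS", NULL },
};
#define NR_MSG_IDS (sizeof(msg_ids) / sizeof(msg_ids[0]))

/* Fill in the lowercased, underscore-free names on first use. */
static void prepare_lowercased(void)
{
	size_t i;

	if (msg_ids[0].lowercased)
		return;
	for (i = 0; i < NR_MSG_IDS; i++) {
		const char *p = msg_ids[i].id_string;
		char *q = malloc(strlen(p) + 1); /* room for the NUL */

		if (!q)
			abort();
		msg_ids[i].lowercased = q;
		for (; *p; p++)
			if (*p != '_')
				*q++ = tolower((unsigned char)*p);
		*q = '\0';
	}
}

/* Return the index of `text` in the table, or -1 if it is unknown. */
int lookup_msg_id(const char *text)
{
	size_t i;

	prepare_lowercased();
	for (i = 0; i < NR_MSG_IDS; i++)
		if (!strcmp(text, msg_ids[i].lowercased))
			return (int)i;
	return -1;
}
```

Keeping the cached string in the same struct as the original name means there is no second array to allocate or keep in sync with the table.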

> +void fsck_set_msg_types(struct fsck_options *options, const char *values)
> +{
> +	char *buf = xstrdup(values), *to_free = buf;
> +	int done = 0;
> +
> +	while (!done) {
> +		int len = strcspn(buf, " ,|"), equal;
> +
> +		done = !buf[len];
> +		if (!len) {
> +			buf++;
> +			continue;
> +		}
> +		buf[len] = '\0';
> +
> +		for (equal = 0; equal < len &&
> +				buf[equal] != '=' && buf[equal] != ':'; equal++)

Style.  I'd format this more like so:

		for (equal = 0;
                     equal < len && buf[equal] != '=' && buf[equal] != ':';
		     equal++)

> +			buf[equal] = tolower(buf[equal]);
> +		buf[equal] = '\0';
> +
> +		if (equal == len)
> +			die("Missing '=': '%s'", buf);
> +
> +		fsck_set_msg_type(options, buf, buf + equal + 1);
> +		buf += len + 1;
> +	}
> +	free(to_free);
> +}

Overall, the change is good (and it was good in v6, too), and I
think it has become simpler to follow the logic with the upfront
downcasing.

Thanks.
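
For readers who want to try the tokenizing loop reviewed above outside of Git, here is a self-contained sketch of the same approach: split on ' ', ',' or '|', downcase each key up to the first '=' or ':', and hand the pair to a callback. The names for_each_pair(), pair_fn, struct last_pair and remember_pair() are hypothetical, and the sketch returns -1 where the patch calls die().

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: receives one lowercased key and its value. */
typedef void (*pair_fn)(const char *key, const char *value, void *data);

/*
 * Split `values` on ' ', ',' or '|', and each item on its first '=' or ':'.
 * Returns 0 on success, -1 if allocation fails or an item has no separator.
 */
int for_each_pair(const char *values, pair_fn fn, void *data)
{
	size_t n = strlen(values) + 1;
	char *to_free = malloc(n), *buf = to_free;
	int done = 0, ret = 0;

	if (!buf)
		return -1;
	memcpy(buf, values, n);

	while (!done) {
		size_t len = strcspn(buf, " ,|"), equal;

		done = !buf[len];
		if (!len) {
			buf++;	/* skip an empty item */
			continue;
		}
		buf[len] = '\0';

		for (equal = 0;
		     equal < len && buf[equal] != '=' && buf[equal] != ':';
		     equal++)
			buf[equal] = tolower((unsigned char)buf[equal]);
		if (equal == len) {
			ret = -1;	/* missing '=' or ':' */
			break;
		}
		buf[equal] = '\0';

		fn(buf, buf + equal + 1, data);
		buf += len + 1;
	}
	free(to_free);
	return ret;
}

/* Example callback: count the pairs and remember the last one seen. */
struct last_pair {
	int count;
	char key[64];
	char value[64];
};

void remember_pair(const char *key, const char *value, void *data)
{
	struct last_pair *lp = data;

	lp->count++;
	snprintf(lp->key, sizeof(lp->key), "%s", key);
	snprintf(lp->value, sizeof(lp->value), "%s", value);
}
```

Calling for_each_pair("missingEmail=ignore,badDate:warn", remember_pair, &lp) invokes the callback twice, with the keys already downcased and the values left untouched.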


* Re: [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
                             ` (18 preceding siblings ...)
  2015-06-22 15:27           ` [PATCH v7 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
@ 2015-06-22 18:02           ` Junio C Hamano
  2015-06-22 21:07             ` Johannes Schindelin
  19 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-22 18:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> Changes since v6:
> 
> - camelCased message IDs
>
> - multiple author checking now as suggested by Junio
>
> - renamed `--quick` to `--connectivity-only`, better commit message
>
> - `fsck.skipList` is now handled correctly (and not mistaken for a message
>   type setting)
>
> - `fsck.skipList` can handle user paths now
>
> - index-pack configures the walk function in a more logical place now
>
> - simplified code by avoiding working on partial strings (i.e. removed
>   `substrcmp()`). This saves 10 lines. To accommodate parsing config
>   variables directly, we now work on lowercased message IDs; unfortunately
>   this means that we cannot use them in append_msg_id() because that
>   function wants to append camelCased message IDs.
>
> Interdiff below diffstat.

Except for the minor nits I sent in separate messages, this round
looks very nicely done (I admit, however, that I haven't read the
skiplist parsing code carefully at all, expecting that you wouldn't
screw up something simple like that ;-))

Thanks, will replace what is queued.  Let's start thinking about
moving it down to 'next' (meaning: we _could_ still accept a reroll,
but I think we are in good shape and minor incremental refinements
would suffice), cooking it for the remainder of the cycle and having
it graduate to 'master' at the beginning of the next cycle.

Thanks.
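
The change list quoted above mentions that message IDs are matched in lowercased form but reported in camelCased form. As an illustration of the second half, here is a small sketch that turns an UPPER_SNAKE_CASE identifier into camelCase. to_camel_case() is a hypothetical helper written for this example and is not claimed to match append_msg_id() in the series.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: write the camelCased form of an UPPER_SNAKE_CASE
 * identifier into `out`, using at most `size` bytes including the NUL.
 * Output that does not fit is silently truncated.
 */
void to_camel_case(const char *id, char *out, size_t size)
{
	size_t n = 0;
	int upper_next = 0;

	if (!size)
		return;
	for (; *id && n + 1 < size; id++) {
		unsigned char c = (unsigned char)*id;

		if (c == '_') {
			upper_next = 1;	/* capitalize the next letter */
			continue;
		}
		out[n++] = (char)(upper_next ? toupper(c) : tolower(c));
		upper_next = 0;
	}
	out[n] = '\0';
}
```

The conversion drops the word boundaries' underscores, so the fully lowercased form cannot be turned back into camelCase afterwards. That is consistent with the remark that the lowercased IDs cannot be reused for output.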


* Re: [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-06-22 15:26           ` [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
@ 2015-06-22 18:04             ` Junio C Hamano
  2015-06-22 21:11               ` Johannes Schindelin
  0 siblings, 1 reply; 275+ messages in thread
From: Junio C Hamano @ 2015-06-22 18:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, mhagger, peff

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +	git --git-dir=dst/.git branch -D bogus &&
> +	git  --git-dir=dst/.git config --add \
> +		receive.fsck.missingEmail ignore &&
> +	git  --git-dir=dst/.git config --add \
> +		receive.fsck.badDate warn &&

Funny double-SP (will locally fix).

There are a few other minor style nits (not in this patch but in
other patches in the series) like

	s/free((char *) var)/free((char *)var)/;

that I locally added SQUASH, and I may also have tweaked some log
messages.  Please check what you will see in 'pu' when I push the
day's integration out later.

Thanks.
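
The quoted test sets receive.fsck.missingEmail to "ignore" and receive.fsck.badDate to "warn". A minimal sketch of how such values could be mapped to severities is shown below. The enum, its constants and parse_severity() are assumptions made for this example, and the "error" value is inferred from the series' stated goal of allowing warnings to be upgraded to errors.

```c
#include <string.h>

/* Hypothetical severity levels for one fsck message ID. */
enum severity { SEV_ERROR, SEV_WARN, SEV_IGNORE, SEV_UNKNOWN };

/* Map a config value to a severity; unknown strings are reported as such. */
enum severity parse_severity(const char *value)
{
	if (!strcmp(value, "error"))
		return SEV_ERROR;
	if (!strcmp(value, "warn"))
		return SEV_WARN;
	if (!strcmp(value, "ignore"))
		return SEV_IGNORE;
	return SEV_UNKNOWN;
}
```

Returning a distinct SEV_UNKNOWN value leaves it to the caller to decide whether a misspelled setting should be fatal.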


* Re: [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings
  2015-06-22 17:37             ` Junio C Hamano
@ 2015-06-22 21:00               ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 21:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-22 19:37, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> diff --git a/fsck.c b/fsck.c
>> index 1a3f7ce..e81a342 100644
>> --- a/fsck.c
>> +++ b/fsck.c
>> @@ -64,30 +64,29 @@ enum fsck_msg_id {
>>  #undef MSG_ID
>>
>>  #define STR(x) #x
>> -#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
>> +#define MSG_ID(id, msg_type) { STR(id), NULL, FSCK_##msg_type },
>>  static struct {
>>  	const char *id_string;
>> +	const char *lowercased;
>>  	int msg_type;
>>  } msg_id_info[FSCK_MSG_MAX + 1] = {
>>  	FOREACH_MSG_ID(MSG_ID)
>> -	{ NULL, -1 }
>> +	{ NULL, NULL, -1 }
>>  };
>>  #undef MSG_ID
>>
>>  static int parse_msg_id(const char *text)
>>  {
>> -	static char **lowercased;
>>  	int i;
>>
>> -	if (!lowercased) {
>> +	if (!msg_id_info[0].lowercased) {
>>  		/* convert id_string to lower case, without underscores. */
>> -		lowercased = xmalloc(FSCK_MSG_MAX * sizeof(*lowercased));
>>  		for (i = 0; i < FSCK_MSG_MAX; i++) {
>>  			const char *p = msg_id_info[i].id_string;
>>  			int len = strlen(p);
>>  			char *q = xmalloc(len);
>>
>> -			lowercased[i] = q;
>> +			msg_id_info[i].lowercased = q;
>>  			while (*p)
>>  				if (*p == '_')
>>  					p++;
>> @@ -98,7 +97,7 @@ static int parse_msg_id(const char *text)
>>  	}
>>
>>  	for (i = 0; i < FSCK_MSG_MAX; i++)
>> -		if (!strcmp(text, lowercased[i]))
>> +		if (!strcmp(text, msg_id_info[i].lowercased))
>>  			return i;
>>
>>  	return -1;
> 
> Heh, this was the first thing that came to my mind when I saw 03/19,
> which lazily prepares the downcased version (which is good) but does
> so in a separately allocated buffer (which is improved by this
> change) ;-)
> 
> IOW, I think all of the above should have been part of 03/19, not an
> "oops I belatedly realized that this way is better" fixup here.

Gaaaah. Wrong commit fixed up. Sorry. Will be fixed in v8.

>> +void fsck_set_msg_types(struct fsck_options *options, const char *values)
>> +{
>> +	char *buf = xstrdup(values), *to_free = buf;
>> +	int done = 0;
>> +
>> +	while (!done) {
>> +		int len = strcspn(buf, " ,|"), equal;
>> +
>> +		done = !buf[len];
>> +		if (!len) {
>> +			buf++;
>> +			continue;
>> +		}
>> +		buf[len] = '\0';
>> +
>> +		for (equal = 0; equal < len &&
>> +				buf[equal] != '=' && buf[equal] != ':'; equal++)
> 
> Style.  I'd format this more like so:
> 
> 		for (equal = 0;
>                      equal < len && buf[equal] != '=' && buf[equal] != ':';
> 		     equal++)

Will be fixed.

>> +			buf[equal] = tolower(buf[equal]);
>> +		buf[equal] = '\0';
>> +
>> +		if (equal == len)
>> +			die("Missing '=': '%s'", buf);
>> +
>> +		fsck_set_msg_type(options, buf, buf + equal + 1);
>> +		buf += len + 1;
>> +	}
>> +	free(to_free);
>> +}
> 
> Overall, the change is good (and it was good in v6, too), and I
> think it has become simpler to follow the logic with the upfront
> downcasing.

Yep, I agree. I did not expect that, but it was worth the effort to compare the two versions.

Ciao,
Dscho


* Re: [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery
  2015-06-22 18:02           ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
@ 2015-06-22 21:07             ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 21:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-22 20:02, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> Changes since v6:
>>
>> - camelCased message IDs
>>
>> - multiple author checking now as suggested by Junio
>>
>> - renamed `--quick` to `--connectivity-only`, better commit message
>>
>> - `fsck.skipList` is now handled correctly (and not mistaken for a message
>>   type setting)
>>
>> - `fsck.skipList` can handle user paths now
>>
>> - index-pack configures the walk function in a more logical place now
>>
>> - simplified code by avoiding working on partial strings (i.e. removed
>>   `substrcmp()`). This saves 10 lines. To accommodate parsing config
>>   variables directly, we now work on lowercased message IDs; unfortunately
>>   this means that we cannot use them in append_msg_id() because that
>>   function wants to append camelCased message IDs.
>>
>> Interdiff below diffstat.
> 
> Except for the minor nits I sent in separate messages, this round
> looks very nicely done (I admit, however, that I haven't read the
> skiplist parsing code carefully at all, expecting that you wouldn't
> screw up something simple like that ;-))
> 
> Thanks, will replace what is queued.  Let's start thinking about
> moving it down to 'next' (meaning: we _could_ still accept a reroll,
> but I think we are in good shape and minor incremental refinements
> would suffice), cooking it for the remainder of the cycle and having
> it graduate to 'master' at the beginning of the next cycle.

Let me submit a v8 with the borked fixup fixed (i.e. part of 04/19 moved to 03/19, where it really belongs), the `for` style fix, the fixed double space and the cast style, too.

Ciao,
Dscho


* Re: [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely
  2015-06-22 18:04             ` Junio C Hamano
@ 2015-06-22 21:11               ` Johannes Schindelin
  0 siblings, 0 replies; 275+ messages in thread
From: Johannes Schindelin @ 2015-06-22 21:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, mhagger, peff

Hi Junio,

On 2015-06-22 20:04, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> +	git --git-dir=dst/.git branch -D bogus &&
>> +	git  --git-dir=dst/.git config --add \
>> +		receive.fsck.missingEmail ignore &&
>> +	git  --git-dir=dst/.git config --add \
>> +		receive.fsck.badDate warn &&
> 
> Funny double-SP (will locally fix).
> 
> There are a few other minor style nits (not in this patch but in
> other patches in the series) like
> 
> 	s/free((char *) var)/free((char *)var)/;
> 
> that I locally added SQUASH,

I fixed all of this locally, ready to push out v8, but...

> and I may also have tweaked some log messages.  Please check what you will see in 'pu' when I push the day's integration out later.

... I did not see that yet.

For the record, my current state is available at

    https://github.com/dscho/git/compare/fsck-api~19...fsck-api

I could wait for tomorrow to adjust my commit messages... or what do you want me to do?

Ciao,
Dscho


Thread overview: 275+ messages
2014-12-08 16:13 [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
2014-12-08 16:14 ` [PATCH 01/18] Introduce fsck options Johannes Schindelin
2014-12-10 15:33   ` Junio C Hamano
2014-12-22 17:26     ` Johannes Schindelin
2014-12-22 17:32       ` Junio C Hamano
2014-12-08 16:14 ` [PATCH 02/18] Introduce identifiers for fsck messages Johannes Schindelin
2014-12-08 16:14 ` [PATCH 03/18] Provide a function to parse fsck message IDs Johannes Schindelin
2014-12-10 17:56   ` Junio C Hamano
2014-12-22 21:27     ` Johannes Schindelin
2014-12-08 16:14 ` [PATCH 04/18] Offer a function to demote fsck errors to warnings Johannes Schindelin
2014-12-10 18:00   ` Junio C Hamano
2014-12-22 21:43     ` Johannes Schindelin
2014-12-22 21:59       ` Junio C Hamano
2014-12-22 22:32         ` Johannes Schindelin
2014-12-22 22:40           ` Junio C Hamano
2014-12-22 22:55             ` Johannes Schindelin
2014-12-22 23:15               ` Junio C Hamano
2014-12-23 10:53                 ` Johannes Schindelin
2014-12-23 16:18                   ` Junio C Hamano
2014-12-23 16:30                     ` Johannes Schindelin
2014-12-23 17:20                       ` Junio C Hamano
2014-12-23 17:28                         ` Johannes Schindelin
2014-12-23 18:14                           ` Junio C Hamano
2014-12-23 18:23                             ` Johannes Schindelin
2014-12-08 16:14 ` [PATCH 05/18] Allow demoting errors to warnings via receive.fsck.<key> = warn Johannes Schindelin
2014-12-10 17:52   ` Junio C Hamano
2014-12-22 21:44     ` Johannes Schindelin
2014-12-08 16:14 ` [PATCH 06/18] fsck: report the ID of the error/warning Johannes Schindelin
2014-12-08 16:14 ` [PATCH 07/18] Make fsck_ident() warn-friendly Johannes Schindelin
2014-12-08 16:14 ` [PATCH 08/18] Make fsck_commit() warn-friendly Johannes Schindelin
2014-12-08 16:15 ` [PATCH 09/18] fsck: handle multiple authors in commits specially Johannes Schindelin
2014-12-10 18:04   ` Junio C Hamano
2014-12-22 21:53     ` Johannes Schindelin
2014-12-08 16:15 ` [PATCH 10/18] Make fsck_tag() warn-friendly Johannes Schindelin
2014-12-08 16:15 ` [PATCH 11/18] Add a simple test for receive.fsck.* Johannes Schindelin
2014-12-08 16:15 ` [PATCH 12/18] Disallow demoting grave fsck errors to warnings Johannes Schindelin
2014-12-10 18:06   ` Junio C Hamano
2014-12-22 21:56     ` Johannes Schindelin
2014-12-08 16:15 ` [PATCH 13/18] Optionally ignore specific fsck issues completely Johannes Schindelin
2014-12-10 18:07   ` Junio C Hamano
2014-12-08 16:15 ` [PATCH 14/18] fsck: allow upgrading fsck warnings to errors Johannes Schindelin
2014-12-10 18:08   ` Junio C Hamano
2014-12-22 22:01     ` Johannes Schindelin
2014-12-22 22:15       ` Junio C Hamano
2014-12-22 22:39         ` Johannes Schindelin
2014-12-08 16:15 ` [PATCH 15/18] Document the new receive.fsck.* options Johannes Schindelin
2014-12-08 16:15 ` [PATCH 16/18] fsck: support demoting errors to warnings Johannes Schindelin
2014-12-10 18:15   ` Junio C Hamano
2014-12-22 22:25     ` Johannes Schindelin
2014-12-22 22:34       ` Junio C Hamano
2014-12-22 22:46         ` Johannes Schindelin
2014-12-22 22:50           ` Junio C Hamano
2014-12-22 22:57             ` Johannes Schindelin
2014-12-22 23:13               ` Junio C Hamano
2014-12-23  9:50                 ` Johannes Schindelin
2014-12-23 16:32                   ` Junio C Hamano
2014-12-23 16:47                     ` Johannes Schindelin
2014-12-23 17:14                       ` Junio C Hamano
2014-12-23 17:41                         ` Johannes Schindelin
2014-12-23 17:56                           ` Junio C Hamano
2014-12-23 18:06                             ` Johannes Schindelin
2014-12-23 18:09                             ` Junio C Hamano
2014-12-23 18:14                               ` Johannes Schindelin
2014-12-23 18:56                                 ` Junio C Hamano
2014-12-23 20:12                                   ` Johannes Schindelin
2014-12-23 21:17                                     ` Junio C Hamano
2015-01-22 15:49                         ` Michael Haggerty
2015-01-22 17:17                           ` Johannes Schindelin
2015-01-31 20:41                             ` Johannes Schindelin
2014-12-23 17:07                     ` Junio C Hamano
2014-12-08 16:15 ` [PATCH 17/18] Introduce `git fsck --quick` Johannes Schindelin
2014-12-08 16:15 ` [PATCH 18/18] git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2014-12-10 18:23   ` Junio C Hamano
2014-12-22 22:19     ` Johannes Schindelin
2014-12-10 18:34 ` [PATCH 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
2015-01-19 15:49   ` [PATCH v2 " Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 01/18] fsck: Introduce fsck options Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 02/18] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 03/18] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 04/18] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
2015-01-21  8:49       ` Junio C Hamano
2015-01-21 17:42         ` Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 05/18] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
2015-01-21  8:54       ` Junio C Hamano
2015-01-21 18:01         ` Johannes Schindelin
2015-01-21 21:47           ` Junio C Hamano
2015-01-22  9:35             ` Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 06/18] fsck: Report the ID of the error/warning Johannes Schindelin
2015-01-19 15:50     ` [PATCH v2 07/18] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-01-21  8:56       ` Junio C Hamano
2015-01-19 15:50     ` [PATCH v2 08/18] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 09/18] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 10/18] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 11/18] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
2015-01-21  8:59       ` Junio C Hamano
2015-01-21 18:14         ` Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 12/18] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 13/18] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 14/18] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 15/18] fsck: Document the new receive.fsck.* options Johannes Schindelin
2015-01-19 22:44       ` Eric Sunshine
2015-01-20  7:24         ` Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 16/18] fsck: Support demoting errors to warnings Johannes Schindelin
2015-01-19 15:51     ` [PATCH v2 17/18] fsck: Introduce `git fsck --quick` Johannes Schindelin
2015-01-19 15:52     ` [PATCH v2 18/18] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-01-21  9:02       ` Junio C Hamano
2015-01-21 18:17         ` Johannes Schindelin
2015-01-21  9:17     ` [PATCH v2 00/18] Introduce an internal API to interact with the fsck machinery Junio C Hamano
2015-01-21 18:24       ` Johannes Schindelin
2015-01-21 19:23   ` [PATCH v3 00/19] " Johannes Schindelin
2015-01-21 19:24     ` [PATCH v3 01/19] fsck: Introduce fsck options Johannes Schindelin
2015-01-21 19:24     ` [PATCH v3 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-01-21 19:24     ` [PATCH v3 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-01-21 19:24     ` [PATCH v3 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
2015-01-21 19:24     ` [PATCH v3 05/19] fsck: Allow demoting errors to warnings via receive.fsck.warn = <key> Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-01-21 19:25     ` [PATCH v3 11/19] fsck: Add a simple test for receive.fsck.* Johannes Schindelin
2015-01-21 19:26     ` [PATCH v3 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-01-21 19:26     ` [PATCH v3 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-01-21 19:26     ` [PATCH v3 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-01-21 19:27     ` [PATCH v3 15/19] fsck: Document the new receive.fsck.* options Johannes Schindelin
2015-01-21 19:27     ` [PATCH v3 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
2015-01-21 19:27     ` [PATCH v3 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
2015-01-21 19:27     ` [PATCH v3 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-01-21 19:27     ` [PATCH v3 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
2015-01-31 21:04   ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
2015-01-31 21:04     ` [PATCH v4 01/19] fsck: Introduce fsck options Johannes Schindelin
2015-01-31 21:04     ` [PATCH v4 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-01-31 21:04     ` [PATCH v4 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 05/19] fsck: Allow demoting errors to warnings Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 04/19] fsck: Offer a function to demote fsck " Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 11/19] fsck: Add a simple test for receive.fsck.severity Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-01-31 21:05     ` [PATCH v4 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 15/19] fsck: Document the new receive.fsck.severity options Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
2015-01-31 21:06     ` [PATCH v4 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-01-31 21:07     ` [PATCH v4 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
2015-02-02 11:41     ` [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
2015-02-02 12:43       ` Michael Haggerty
2015-02-02 16:48         ` Johannes Schindelin
2015-02-03 15:11           ` Michael Haggerty
2015-02-03 16:33             ` Johannes Schindelin
2015-02-04  3:50               ` Junio C Hamano
2015-02-04 11:02                 ` Johannes Schindelin
2015-06-18 20:07     ` [PATCH v5 " Johannes Schindelin
2015-06-18 20:07       ` [PATCH v5 01/19] fsck: Introduce fsck options Johannes Schindelin
2015-06-18 20:07       ` [PATCH v5 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-06-18 20:07       ` [PATCH v5 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-06-18 20:08       ` [PATCH v5 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
2015-06-18 20:09       ` [PATCH v5 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
2015-06-18 20:10       ` [PATCH v5 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-06-18 20:10       ` [PATCH v5 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
2015-06-18 22:11       ` [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
2015-06-19  0:04         ` Johannes Schindelin
2015-06-19 17:33           ` Junio C Hamano
2015-06-19 19:43             ` Johannes Schindelin
2015-06-19 13:32       ` [PATCH v6 " Johannes Schindelin
2015-06-19 13:32         ` [PATCH v6 01/19] fsck: Introduce fsck options Johannes Schindelin
2015-06-19 19:03           ` Junio C Hamano
2015-06-20 12:33             ` Johannes Schindelin
2015-06-19 13:32         ` [PATCH v6 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-06-19 19:06           ` Junio C Hamano
2015-06-19 13:32         ` [PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-06-19 19:13           ` Junio C Hamano
2015-06-21 13:46             ` Johannes Schindelin
2015-06-19 13:33         ` [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
2015-06-19 19:26           ` Junio C Hamano
2015-06-21 13:59             ` Johannes Schindelin
2015-06-21 17:36               ` Junio C Hamano
2015-06-21 18:23                 ` Johannes Schindelin
2015-06-21 18:47                   ` Junio C Hamano
2015-06-22 15:24             ` Johannes Schindelin
2015-06-19 13:33         ` [PATCH v6 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
2015-06-19 13:33         ` [PATCH v6 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
2015-06-19 19:28           ` Junio C Hamano
2015-06-19 21:34             ` Johannes Schindelin
2015-06-19 23:26               ` Junio C Hamano
2015-06-19 13:33         ` [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-06-19 19:48           ` Junio C Hamano
2015-06-19 13:33         ` [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-06-19 20:12           ` Junio C Hamano
2015-06-19 20:52             ` Johannes Schindelin
2015-06-19 21:01               ` Junio C Hamano
2015-06-19 23:43                 ` Junio C Hamano
2015-06-19 13:34         ` [PATCH v6 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-06-19 20:16           ` Junio C Hamano
2015-06-19 21:04             ` Johannes Schindelin
2015-06-19 13:34         ` [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-06-19 20:18           ` Junio C Hamano
2015-06-19 21:06             ` Johannes Schindelin
2015-06-19 13:34         ` [PATCH v6 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
2015-06-19 13:34         ` [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-06-19 20:21           ` Junio C Hamano
2015-06-19 21:09             ` Johannes Schindelin
2015-06-19 23:30               ` Junio C Hamano
2015-06-19 13:34         ` [PATCH v6 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-06-19 13:34         ` [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-06-19 20:22           ` Junio C Hamano
2015-06-19 21:10             ` Johannes Schindelin
2015-06-19 13:35         ` [PATCH v6 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
2015-06-19 13:35         ` [PATCH v6 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
2015-06-19 13:35         ` [PATCH v6 17/19] fsck: Introduce `git fsck --quick` Johannes Schindelin
2015-06-19 20:32           ` Junio C Hamano
2015-06-19 20:42             ` Johannes Schindelin
2015-06-19 20:53               ` Junio C Hamano
2015-06-19 23:57                 ` Scott Schmit
2015-06-20  3:24                   ` Junio C Hamano
2015-06-21  4:55                 ` Michael Haggerty
2015-06-21  5:09                   ` Randall S. Becker
2015-06-21 14:40                     ` Johannes Schindelin
2015-06-21 12:01                   ` Johannes Schindelin
2015-06-21 17:15                   ` Junio C Hamano
2015-06-21 18:27                     ` Johannes Schindelin
2015-06-21 20:35                       ` Junio C Hamano
2015-06-21 20:46                         ` Junio C Hamano
2015-06-22 13:01                         ` Johannes Schindelin
2015-06-20  3:26               ` Junio C Hamano
2015-06-19 13:35         ` [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-06-19 20:39           ` Junio C Hamano
2015-06-20 12:45             ` Johannes Schindelin
2015-06-20 17:28               ` Junio C Hamano
2015-06-22  4:21           ` Junio C Hamano
2015-06-22  8:49             ` Johannes Schindelin
2015-06-19 13:35         ` [PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
2015-06-19 20:40           ` Junio C Hamano
2015-06-22 15:24         ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 01/19] fsck: Introduce fsck options Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 02/19] fsck: Introduce identifiers for fsck messages Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 03/19] fsck: Provide a function to parse fsck message IDs Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 04/19] fsck: Offer a function to demote fsck errors to warnings Johannes Schindelin
2015-06-22 17:37             ` Junio C Hamano
2015-06-22 21:00               ` Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 05/19] fsck (receive-pack): Allow demoting " Johannes Schindelin
2015-06-22 15:25           ` [PATCH v7 06/19] fsck: Report the ID of the error/warning Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 07/19] fsck: Make fsck_ident() warn-friendly Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 08/19] fsck: Make fsck_commit() warn-friendly Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 09/19] fsck: Handle multiple authors in commits specially Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 10/19] fsck: Make fsck_tag() warn-friendly Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 11/19] fsck: Add a simple test for receive.fsck.<msg-id> Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 12/19] fsck: Disallow demoting grave fsck errors to warnings Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 13/19] fsck: Optionally ignore specific fsck issues completely Johannes Schindelin
2015-06-22 18:04             ` Junio C Hamano
2015-06-22 21:11               ` Johannes Schindelin
2015-06-22 15:26           ` [PATCH v7 14/19] fsck: Allow upgrading fsck warnings to errors Johannes Schindelin
2015-06-22 15:27           ` [PATCH v7 15/19] fsck: Document the new receive.fsck.<msg-id> options Johannes Schindelin
2015-06-22 15:27           ` [PATCH v7 16/19] fsck: Support demoting errors to warnings Johannes Schindelin
2015-06-22 15:27           ` [PATCH v7 17/19] fsck: Introduce `git fsck --connectivity-only` Johannes Schindelin
2015-06-22 15:27           ` [PATCH v7 18/19] fsck: git receive-pack: support excluding objects from fsck'ing Johannes Schindelin
2015-06-22 15:27           ` [PATCH v7 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist Johannes Schindelin
2015-06-22 18:02           ` [PATCH v7 00/19] Introduce an internal API to interact with the fsck machinery Junio C Hamano
2015-06-22 21:07             ` Johannes Schindelin
