All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] grep with textconv
@ 2013-04-19 16:44 Michael J Gruber
  2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber
                   ` (6 more replies)
  0 siblings, 7 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

This series teaches show and grep to obey textconv:
show by default (like diff), grep only on request (--textconv).
We might switch the default for the latter also, of course.
I'd actually like that.

Compared to an earlier (historic) series this one comes with tests.
Besides, it has been in use since.

Jeff King (1):
  grep: allow to use textconv filters

Michael J Gruber (5):
  t4030: demonstrate behavior of show with textconv
  show: obey --textconv for blobs
  cat-file: do not die on --textconv without textconv filters
  t7008: demonstrate behavior of grep with textconv
  grep: obey --textconv for the case rev:path

 builtin/cat-file.c           |   9 ++--
 builtin/grep.c               |  13 +++---
 builtin/log.c                |  24 +++++++++--
 grep.c                       | 100 +++++++++++++++++++++++++++++++++++++------
 grep.h                       |   1 +
 object.c                     |  26 ++++++++---
 object.h                     |   2 +
 t/t4030-diff-textconv.sh     |  18 ++++++++
 t/t7008-grep-binary.sh       |  39 +++++++++++++++++
 t/t8007-cat-file-textconv.sh |  20 +++------
 10 files changed, 205 insertions(+), 47 deletions(-)

-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH 1/6] t4030: demonstrate behavior of show with textconv
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-20  4:04   ` Jeff King
  2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

"git show <commit>" obeys the textconc setting while "git show <blob>"
does not. Demonstrate this in the test.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t4030-diff-textconv.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index 53ec330..f314ced 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
 	test_cmp expect.text actual
 '
 
+test_expect_success 'show commit produces text' '
+	git show HEAD >diff &&
+	find_diff <diff >actual &&
+	test_cmp expect.text actual
+'
+
 test_expect_success 'diff-tree produces binary' '
 	git diff-tree -p HEAD^ HEAD >diff &&
 	find_diff <diff >actual &&
@@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' '
 	git reset --soft HEAD@{1}
 '
 
+test_expect_success 'show blob produces binary' '
+	git show HEAD:file >actual &&
+	printf "\\0\\n\\1\\n" >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
 	echo one >expect &&
 	git log --root --format=%s -G0 >actual &&
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 2/6] show: obey --textconv for blobs
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
  2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-20  4:06   ` Jeff King
  2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

Currently, "diff" and "cat-file" for blobs obey "--textconv" options
(with the former defaulting to "--textconv" and the latter to
"--no-textconv") whereas "show" does not obey this option, even though
it takes diff options.

Make "show" on blobs behave like "diff", i.e. obey "--textconv" by
default and "--no-textconv" when given.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/log.c            | 24 +++++++++++++++++++++---
 t/t4030-diff-textconv.sh |  8 +++++++-
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 5f3ed77..fe0275e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev)
 	strbuf_release(&out);
 }
 
-static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
+static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
 {
+	unsigned char sha1c[20];
+	struct object_context obj_context;
+	char *buf;
+	unsigned long size;
+
 	fflush(stdout);
-	return stream_blob_to_fd(1, sha1, NULL, 0);
+	if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
+		die("Not a valid object name %s", obj_name);
+	if (!obj_context.path[0] ||
+	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (!buf)
+		die("git show %s: bad file", obj_name);
+
+	write_or_die(1, buf, size);
+	return 0;
 }
 
 static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
@@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 		const char *name = objects[i].name;
 		switch (o->type) {
 		case OBJ_BLOB:
-			ret = show_blob_object(o->sha1, NULL);
+			ret = show_blob_object(o->sha1, &rev, name);
 			break;
 		case OBJ_TAG: {
 			struct tag *t = (struct tag *)o;
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index f314ced..f9d55e1 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -90,8 +90,14 @@ test_expect_success 'status -v produces text' '
 	git reset --soft HEAD@{1}
 '
 
-test_expect_success 'show blob produces binary' '
+test_expect_success 'show blob produces text' '
 	git show HEAD:file >actual &&
+	printf "0\\n1\\n" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show --no-textconv blob produces binary' '
+	git show --no-textconv HEAD:file >actual &&
 	printf "\\0\\n\\1\\n" >expect &&
 	test_cmp expect actual
 '
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 3/6] cat-file: do not die on --textconv without textconv filters
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
  2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber
  2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-19 18:15   ` Junio C Hamano
  2013-04-20  4:17   ` Jeff King
  2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

When a command is supposed to use textconv filters (by default or with
"--textconv") and none are configured then the blob is output without
conversion; the only exception to this rule is "cat-file --textconv".

Make it behave like the rest of textconv aware commands.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/cat-file.c           |  9 +++++----
 t/t8007-cat-file-textconv.sh | 20 +++++---------------
 2 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 40f87b4..dd4e063 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -146,10 +146,11 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 			die("git cat-file --textconv %s: <object> must be <sha1:path>",
 			    obj_name);
 
-		if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
-			die("git cat-file --textconv: unable to run textconv on %s",
-			    obj_name);
-		break;
+		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
+			break;
+
+		/* otherwise expect a blob */
+		exp_type = "blob";
 
 	case 0:
 		if (type_from_string(exp_type) == OBJ_BLOB) {
diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh
index 78a0085..83c6636 100755
--- a/t/t8007-cat-file-textconv.sh
+++ b/t/t8007-cat-file-textconv.sh
@@ -22,11 +22,11 @@ test_expect_success 'setup ' '
 '
 
 cat >expected <<EOF
-fatal: git cat-file --textconv: unable to run textconv on :one.bin
+bin: test version 2
 EOF
 
 test_expect_success 'no filter specified' '
-	git cat-file --textconv :one.bin 2>result
+	git cat-file --textconv :one.bin >result &&
 	test_cmp expected result
 '
 
@@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' '
 	git config diff.test.cachetextconv false
 '
 
-cat >expected <<EOF
-bin: test version 2
-EOF
-
 test_expect_success 'cat-file without --textconv' '
 	git cat-file blob :one.bin >result &&
 	test_cmp expected result
@@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' '
 '
 
 test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' '
+	printf "%s" "one.bin" >expected &&
 	git cat-file blob :symlink.bin >result &&
-	printf "%s" "one.bin" >expected
 	test_cmp expected result
 '
 
 
 test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' '
-	! git cat-file --textconv :symlink.bin 2>result &&
-	cat >expected <<\EOF &&
-fatal: git cat-file --textconv: unable to run textconv on :symlink.bin
-EOF
+	git cat-file --textconv :symlink.bin >result &&
 	test_cmp expected result
 '
 
 test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' '
-	! git cat-file --textconv HEAD:symlink.bin 2>result &&
-	cat >expected <<EOF &&
-fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin
-EOF
+	git cat-file --textconv HEAD:symlink.bin >result &&
 	test_cmp expected result
 '
 
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 4/6] t7008: demonstrate behavior of grep with textconv
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
                   ` (2 preceding siblings ...)
  2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

Currently, "git grep" does not invoke any textconv filters. Demonstrate
this in the tests.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t7008-grep-binary.sh | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 26f8319..a1fd0b2 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -145,4 +145,23 @@ test_expect_success 'grep respects not-binary diff attribute' '
 	test_cmp expect actual
 '
 
+cat >nul_to_q_textconv <<'EOF'
+#!/bin/sh
+"$PERL_PATH" -pe 'y/\000/Q/' < "$1"
+EOF
+chmod +x nul_to_q_textconv
+
+test_expect_success 'setup textconv filters' '
+	echo a diff=foo >.gitattributes &&
+	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
+'
+
+test_expect_success 'grep does not obey textconv' '
+	test_must_fail git grep Qfile
+'
+
+test_expect_success 'grep blob does not obey textconv' '
+	test_must_fail git grep Qfile HEAD:a
+'
+
 test_done
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 5/6] grep: allow to use textconv filters
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
                   ` (3 preceding siblings ...)
  2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-20  4:31   ` Jeff King
  2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber
  2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git; +Cc: Jeff King

From: Jeff King <peff@peff.net>

Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:

ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)

"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping.

Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/grep.c         |   2 +
 grep.c                 | 100 ++++++++++++++++++++++++++++++++++++++++++-------
 grep.h                 |   1 +
 t/t7008-grep-binary.sh |  18 +++++++++
 4 files changed, 107 insertions(+), 14 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 159e65d..00ee57d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT('I', NULL, &opt.binary,
 			N_("don't match patterns in binary files"),
 			GREP_BINARY_NOMATCH),
+		OPT_BOOL(0, "textconv", &opt.allow_textconv,
+			 N_("process binary files with textconv filters")),
 		{ OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"),
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
diff --git a/grep.c b/grep.c
index bb548ca..c668034 100644
--- a/grep.c
+++ b/grep.c
@@ -2,6 +2,8 @@
 #include "grep.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "diff.h"
+#include "diffcore.h"
 
 static int grep_source_load(struct grep_source *gs);
 static int grep_source_is_binary(struct grep_source *gs);
@@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
+static int fill_textconv_grep(struct userdiff_driver *driver,
+			      struct grep_source *gs)
+{
+	struct diff_filespec *df;
+	char *buf;
+	size_t size;
+
+	if (!driver || !driver->textconv)
+		return grep_source_load(gs);
+
+	/*
+	 * The textconv interface is intimately tied to diff_filespecs, so we
+	 * have to pretend to be one. If we could unify the grep_source
+	 * and diff_filespec structs, this mess could just go away.
+	 */
+	df = alloc_filespec(gs->path);
+	switch (gs->type) {
+	case GREP_SOURCE_SHA1:
+		fill_filespec(df, gs->identifier, 1, 0100644);
+		break;
+	case GREP_SOURCE_FILE:
+		fill_filespec(df, null_sha1, 0, 0100644);
+		break;
+	default:
+		die("BUG: attempt to textconv something without a path?");
+	}
+
+	/*
+	 * fill_textconv is not remotely thread-safe; it may load objects
+	 * behind the scenes, and it modifies the global diff tempfile
+	 * structure.
+	 */
+	grep_read_lock();
+	size = fill_textconv(driver, df, &buf);
+	grep_read_unlock();
+	free_filespec(df);
+
+	/*
+	 * The normal fill_textconv usage by the diff machinery would just keep
+	 * the textconv'd buf separate from the diff_filespec. But much of the
+	 * grep code passes around a grep_source and assumes that its "buf"
+	 * pointer is the beginning of the thing we are searching. So let's
+	 * install our textconv'd version into the grep_source, taking care not
+	 * to leak any existing buffer.
+	 */
+	grep_source_clear_data(gs);
+	gs->buf = buf;
+	gs->size = size;
+
+	return 0;
+}
+
 static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits)
 {
 	char *bol;
@@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	unsigned count = 0;
 	int try_lookahead = 0;
 	int show_function = 0;
+	struct userdiff_driver *textconv = NULL;
 	enum grep_context ctx = GREP_CONTEXT_HEAD;
 	xdemitconf_t xecfg;
 
@@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	}
 	opt->last_shown = 0;
 
-	switch (opt->binary) {
-	case GREP_BINARY_DEFAULT:
-		if (grep_source_is_binary(gs))
-			binary_match_only = 1;
-		break;
-	case GREP_BINARY_NOMATCH:
-		if (grep_source_is_binary(gs))
-			return 0; /* Assume unmatch */
-		break;
-	case GREP_BINARY_TEXT:
-		break;
-	default:
-		die("bug: unknown binary handling mode");
+	if (opt->allow_textconv) {
+		grep_source_load_driver(gs);
+		/*
+		 * We might set up the shared textconv cache data here, which
+		 * is not thread-safe.
+		 */
+		grep_attr_lock();
+		textconv = userdiff_get_textconv(gs->driver);
+		grep_attr_unlock();
+	}
+
+	/*
+	 * We know the result of a textconv is text, so we only have to care
+	 * about binary handling if we are not using it.
+	 */
+	if (!textconv) {
+		switch (opt->binary) {
+		case GREP_BINARY_DEFAULT:
+			if (grep_source_is_binary(gs))
+				binary_match_only = 1;
+			break;
+		case GREP_BINARY_NOMATCH:
+			if (grep_source_is_binary(gs))
+				return 0; /* Assume unmatch */
+			break;
+		case GREP_BINARY_TEXT:
+			break;
+		default:
+			die("bug: unknown binary handling mode");
+		}
 	}
 
 	memset(&xecfg, 0, sizeof(xecfg));
@@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	try_lookahead = should_lookahead(opt);
 
-	if (grep_source_load(gs) < 0)
+	if (fill_textconv_grep(textconv, gs) < 0)
 		return 0;
 
 	bol = gs->buf;
diff --git a/grep.h b/grep.h
index e4a1df5..eaaced1 100644
--- a/grep.h
+++ b/grep.h
@@ -107,6 +107,7 @@ struct grep_opt {
 #define GREP_BINARY_NOMATCH	1
 #define GREP_BINARY_TEXT	2
 	int binary;
+	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
 	int pcre;
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index a1fd0b2..a7fe94a 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -160,8 +160,26 @@ test_expect_success 'grep does not obey textconv' '
 	test_must_fail git grep Qfile
 '
 
+test_expect_success 'grep --textconv does obey textconv' '
+	echo "a:binaryQfile" >expect &&
+	git grep --textconv Qfile >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-textconv does not obey textconv' '
+	test_must_fail git grep Qfile
+'
+
 test_expect_success 'grep blob does not obey textconv' '
 	test_must_fail git grep Qfile HEAD:a
 '
 
+test_expect_success 'grep --textconv blob does not obey textconv' '
+	test_must_fail git grep --textconv Qfile HEAD:a
+'
+
+test_expect_success 'grep --no-textconv blob does not obey textconv' '
+	test_must_fail git grep --no-textconv Qfile HEAD:a
+'
+
 test_done
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 6/6] grep: obey --textconv for the case rev:path
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
                   ` (4 preceding siblings ...)
  2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber
@ 2013-04-19 16:44 ` Michael J Gruber
  2013-04-20  4:24   ` Jeff King
  2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw)
  To: git

Make "grep" obey the "--textconv" option also for the object case, i.e.
when used with an argument "rev:path".

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/grep.c         | 11 ++++++-----
 object.c               | 26 ++++++++++++++++++++------
 object.h               |  2 ++
 t/t7008-grep-binary.sh |  6 ++++--
 4 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 00ee57d..bb7f970 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name)
+		       struct object *obj, const char *name, struct object_context *oc)
 {
 	if (obj->type == OBJ_BLOB)
-		return grep_sha1(opt, obj->sha1, name, 0, NULL);
+		return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL);
 	if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) {
 		struct tree_desc tree;
 		void *data;
@@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 		real_obj = deref_tag(list->objects[i].item, NULL, 0);
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
 		unsigned char sha1[20];
+		struct object_context oc;
 		/* Is it a rev? */
-		if (!get_sha1(arg, sha1)) {
+		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
 			struct object *object = parse_object_or_die(sha1, arg);
 			if (!seen_dashdash)
 				verify_non_filename(prefix, arg);
-			add_object_array(object, arg, &list);
+			add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context)));
 			continue;
 		}
 		if (!strcmp(arg, "--")) {
diff --git a/object.c b/object.c
index 20703f5..c8ffc9e 100644
--- a/object.c
+++ b/object.c
@@ -255,12 +255,7 @@ int object_list_contains(struct object_list *list, struct object *obj)
 	return 0;
 }
 
-void add_object_array(struct object *obj, const char *name, struct object_array *array)
-{
-	add_object_array_with_mode(obj, name, array, S_IFINVALID);
-}
-
-void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context)
 {
 	unsigned nr = array->nr;
 	unsigned alloc = array->alloc;
@@ -275,9 +270,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
 	objects[nr].item = obj;
 	objects[nr].name = name;
 	objects[nr].mode = mode;
+	objects[nr].context = context;
 	array->nr = ++nr;
 }
 
+void add_object_array(struct object *obj, const char *name, struct object_array *array)
+{
+	add_object_array_with_mode(obj, name, array, S_IFINVALID);
+}
+
+void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+{
+	add_object_array_with_mode_context(obj, name, array, mode, NULL);
+}
+
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context)
+{
+	if (context)
+		add_object_array_with_mode_context(obj, name, array, context->mode, context);
+	else
+		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
+}
+
 void object_array_remove_duplicates(struct object_array *array)
 {
 	unsigned int ref, src, dst;
diff --git a/object.h b/object.h
index 97d384b..695847d 100644
--- a/object.h
+++ b/object.h
@@ -13,6 +13,7 @@ struct object_array {
 		struct object *item;
 		const char *name;
 		unsigned mode;
+		struct object_context *context;
 	} *objects;
 };
 
@@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj);
 /* Object array handling .. */
 void add_object_array(struct object *obj, const char *name, struct object_array *array);
 void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode);
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context);
 void object_array_remove_duplicates(struct object_array *);
 
 void clear_object_flags(unsigned flags);
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index a7fe94a..ef78abe 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -174,8 +174,10 @@ test_expect_success 'grep blob does not obey textconv' '
 	test_must_fail git grep Qfile HEAD:a
 '
 
-test_expect_success 'grep --textconv blob does not obey textconv' '
-	test_must_fail git grep --textconv Qfile HEAD:a
+test_expect_success 'grep --textconv blob does obey textconv' '
+	echo "HEAD:a:binaryQfile" >expect &&
+	git grep --textconv Qfile HEAD:a >actual &&
+	test_cmp expect actual
 '
 
 test_expect_success 'grep --no-textconv blob does not obey textconv' '
-- 
1.8.2.1.728.ge98e8b0

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters
  2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber
@ 2013-04-19 18:15   ` Junio C Hamano
  2013-04-20  4:17   ` Jeff King
  1 sibling, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-19 18:15 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

Michael J Gruber <git@drmicha.warpmail.net> writes:

> When a command is supposed to use textconv filters (by default or with
> "--textconv") and none are configured then the blob is output without
> conversion; the only exception to this rule is "cat-file --textconv".

I am of two minds.  Because cat-file is mostly a low-level plumbing,
I do not necessarily think it is a bad behaviour for it to error out
when it was asked to apply textconv where there is no filter or when
the filter fails to produce an output.  On the other hand, it
certainly makes it more convenient for callers that do not care too
deeply, taking textconv as a mere hint just like Porcelains do.

But assuming that this is the direction we would want to go...

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 40f87b4..dd4e063 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -146,10 +146,11 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  			die("git cat-file --textconv %s: <object> must be <sha1:path>",
>  			    obj_name);
>  
> -		if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
> -			die("git cat-file --textconv: unable to run textconv on %s",
> -			    obj_name);
> -		break;
> +		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
> +			break;
> +
> +		/* otherwise expect a blob */
> +		exp_type = "blob";

Please use the constant string blob_type that is available for all
callers including this one.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 0/6] grep with textconv
  2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
                   ` (5 preceding siblings ...)
  2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber
@ 2013-04-19 18:24 ` Junio C Hamano
  2013-04-20  4:26   ` Jeff King
  2013-04-20 13:32   ` Michael J Gruber
  6 siblings, 2 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-19 18:24 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

Michael J Gruber <git@drmicha.warpmail.net> writes:

> This series teaches show and grep to obey textconv: show by
> default (like diff), grep only on request (--textconv).  We might
> switch the default for the latter also, of course.  I'd actually
> like that.
>
> Compared to an earlier (historic) series this one comes with tests.

It would have been nicer if you referred to the previous thread

cf.

  http://thread.gmane.org/gmane.comp.version-control.git/215385

>   grep: allow to use textconv filters

This looked mostly sensible except for one minor "eh, do we really
need to assume textconv output is text, or wouldn't using the same
codepath for raw blob and textconv result to make them consistently
honor opt->binary easier to explain?".

>   t4030: demonstrate behavior of show with textconv
>   t7008: demonstrate behavior of grep with textconv

It somehow felt they are better together in the patches that
implement the features they exercise.

>   show: obey --textconv for blobs
>   cat-file: do not die on --textconv without textconv filters
>   grep: obey --textconv for the case rev:path

I just let my eyes coast over these but didn't see anything
obviously wrong.

By the way, "git log --no-merges | grep obey | wc -l" shows that we
say "honor an option" a lot more than "obey an option".  We may want
to be consistent here.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/6] t4030: demonstrate behavior of show with textconv
  2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber
@ 2013-04-20  4:04   ` Jeff King
  2013-04-20 13:35     ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:04 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

On Fri, Apr 19, 2013 at 06:44:44PM +0200, Michael J Gruber wrote:

> "git show <commit>" obeys the textconc setting while "git show <blob>"
> does not. Demonstrate this in the test.

s/textconc/textconv

> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
> index 53ec330..f314ced 100755
> --- a/t/t4030-diff-textconv.sh
> +++ b/t/t4030-diff-textconv.sh
> @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
>  	test_cmp expect.text actual
>  '
>  
> +test_expect_success 'show commit produces text' '
> +	git show HEAD >diff &&
> +	find_diff <diff >actual &&
> +	test_cmp expect.text actual
> +'

Makes sense.

> +test_expect_success 'show blob produces binary' '
> +	git show HEAD:file >actual &&
> +	printf "\\0\\n\\1\\n" >expect &&
> +	test_cmp expect actual
> +'

I think this is probably the right thing. I can see instances where one
would want the converted contents, but we have "cat-file --textconv" for
that.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber
@ 2013-04-20  4:06   ` Jeff King
  2013-04-20 13:38     ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:06 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

On Fri, Apr 19, 2013 at 06:44:45PM +0200, Michael J Gruber wrote:

> Currently, "diff" and "cat-file" for blobs obey "--textconv" options
> (with the former defaulting to "--textconv" and the latter to
> "--no-textconv") whereas "show" does not obey this option, even though
> it takes diff options.
> 
> Make "show" on blobs behave like "diff", i.e. obey "--textconv" by
> default and "--no-textconv" when given.

Wait, this does the opposite of the last patch. If we do want to do
this, shouldn't the last one have been an "expect_failure"?

I'm not convinced this is the right thing to do, though. It would break:

  git show HEAD:file.c >file.c

Admittedly, such people should be using "checkout" or "cat-file", so I
do not mind too much breaking them if there is a good reason. But I am
not sure what that reason is.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters
  2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber
  2013-04-19 18:15   ` Junio C Hamano
@ 2013-04-20  4:17   ` Jeff King
  2013-04-20 14:27     ` Michael J Gruber
  1 sibling, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:17 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

On Fri, Apr 19, 2013 at 06:44:46PM +0200, Michael J Gruber wrote:

> -			die("git cat-file --textconv: unable to run textconv on %s",
> -			    obj_name);
> -		break;
> +		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
> +			break;
> +
> +		/* otherwise expect a blob */
> +		exp_type = "blob";
>  
>  	case 0:
>  		if (type_from_string(exp_type) == OBJ_BLOB) {

I'm not sure this is right. What happens with:

  git cat-file --textconv HEAD:Documentation

We have failed to textconv, but should we be expecting a blob?

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path
  2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber
@ 2013-04-20  4:24   ` Jeff King
  2013-04-20 14:42     ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:24 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

On Fri, Apr 19, 2013 at 06:44:49PM +0200, Michael J Gruber wrote:

> @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  	for (i = 0; i < argc; i++) {
>  		const char *arg = argv[i];
>  		unsigned char sha1[20];
> +		struct object_context oc;
>  		/* Is it a rev? */
> -		if (!get_sha1(arg, sha1)) {
> +		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
>  			struct object *object = parse_object_or_die(sha1, arg);
>  			if (!seen_dashdash)
>  				verify_non_filename(prefix, arg);
> -			add_object_array(object, arg, &list);
> +			add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context)));

Hrm. I'm not excited about the extra allocation here. Who frees it?

> +void add_object_array(struct object *obj, const char *name, struct object_array *array)
> +{
> +	add_object_array_with_mode(obj, name, array, S_IFINVALID);
> +}
> +
> +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
> +{
> +	add_object_array_with_mode_context(obj, name, array, mode, NULL);
> +}
> +
> +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context)
> +{
> +	if (context)
> +		add_object_array_with_mode_context(obj, name, array, context->mode, context);
> +	else
> +		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
> +}

And this mass of almost-the-same functions is gross, too, especially
given that the object_context contains a mode itself.

Unfortunately, I'm not sure if I have a more pleasant suggestion. I seem
to recall wrestling with this issue during the last round, too.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 0/6] grep with textconv
  2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano
@ 2013-04-20  4:26   ` Jeff King
  2013-04-20 13:32   ` Michael J Gruber
  1 sibling, 0 replies; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git

On Fri, Apr 19, 2013 at 11:24:16AM -0700, Junio C Hamano wrote:

> >   grep: allow to use textconv filters
> 
> This looked mostly sensible except for one minor "eh, do we really
> need to assume textconv output is text, or wouldn't using the same
> codepath for raw blob and textconv result to make them consistently
> honor opt->binary easier to explain?".

I don't mind re-checking the textconv output for binary-ness. But I did
it that way for consistency with the diff code-path, which also assumes
that textconv output is not binary.

> By the way, "git log --no-merges | grep obey | wc -l" shows that we
> say "honor an option" a lot more than "obey an option".  We may want
> to be consistent here.

Yeah, it is a pretty minor thing, but I agree that "honor" sounds
better.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 5/6] grep: allow to use textconv filters
  2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber
@ 2013-04-20  4:31   ` Jeff King
  0 siblings, 0 replies; 78+ messages in thread
From: Jeff King @ 2013-04-20  4:31 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

On Fri, Apr 19, 2013 at 06:44:48PM +0200, Michael J Gruber wrote:

> From: Jeff King <peff@peff.net>
> 
> Recently and not so recently, we made sure that log/grep type operations
> use textconv filters when a userfacing diff would do the same:
> 
> ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
> b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
> 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)
> 
> "git grep" currently does not use textconv filters at all, that is
> neither for displaying the match and context nor for the actual grepping.
> 
> Introduce an option "--textconv" which makes git grep use any configured
> textconv filters for grepping and output purposes. It is off by default.
> 
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  builtin/grep.c         |   2 +
>  grep.c                 | 100 ++++++++++++++++++++++++++++++++++++++++++-------
>  grep.h                 |   1 +
>  t/t7008-grep-binary.sh |  18 +++++++++
>  4 files changed, 107 insertions(+), 14 deletions(-)

This patch, of course, is flawless. :)

Feel free to add:

  Signed-off-by: Jeff King <peff@peff.net>

> +	/*
> +	 * We know the result of a textconv is text, so we only have to care
> +	 * about binary handling if we are not using it.
> +	 */
> +	if (!textconv) {
> +		switch (opt->binary) {
> +		case GREP_BINARY_DEFAULT:
> +			if (grep_source_is_binary(gs))
> +				binary_match_only = 1;
> +			break;
> +		case GREP_BINARY_NOMATCH:
> +			if (grep_source_is_binary(gs))
> +				return 0; /* Assume unmatch */
> +			break;
> +		case GREP_BINARY_TEXT:
> +			break;
> +		default:
> +			die("bug: unknown binary handling mode");
> +		}
>  	}

Junio mentioned checking the textconv output for binary-ness. Doing that
would involve removing the outer conditional here. But it's not quite so
simple, as we don't load the textconv results until later, and
grep_source_is_binary will lazily load the contents. I think it would be
sufficient to fill_textconv_grep() right before, and then
grep_source_is_binary would rely on the cached buffer.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 0/6] grep with textconv
  2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano
  2013-04-20  4:26   ` Jeff King
@ 2013-04-20 13:32   ` Michael J Gruber
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
  1 sibling, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-20 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Junio C Hamano venit, vidit, dixit 19.04.2013 20:24:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> This series teaches show and grep to obey textconv: show by
>> default (like diff), grep only on request (--textconv).  We might
>> switch the default for the latter also, of course.  I'd actually
>> like that.
>>
>> Compared to an earlier (historic) series this one comes with tests.
> 
> It would have been nicer if you referred to the previous thread
> 
> cf.
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/215385

Yes, sorry, I was on a slow mobile connection due to DSL breakage...

>>   grep: allow to use textconv filters
> 
> This looked mostly sensible except for one minor "eh, do we really
> need to assume textconv output is text, or wouldn't using the same
> codepath for raw blob and textconv result to make them consistently
> honor opt->binary easier to explain?".
>

I think we assume in general that textconv produces text, which is maybe
not completely surprising given its name ;)

>>   t4030: demonstrate behavior of show with textconv
>>   t7008: demonstrate behavior of grep with textconv
> 
> It somehow felt they are better together in the patches that
> implement the features they exercise.

I added them after the fact. They can be squashed in, of course. On the
other hand you don't see the change in behavior that the latter patches
introduce any more if you that; which is why I left them separate at
least for review purposes and for camparing to the previous series which
I had failed to reference.

>>   show: obey --textconv for blobs
>>   cat-file: do not die on --textconv without textconv filters
>>   grep: obey --textconv for the case rev:path
> 
> I just let my eyes coast over these but didn't see anything
> obviously wrong.
> 
> By the way, "git log --no-merges | grep obey | wc -l" shows that we
> say "honor an option" a lot more than "obey an option".  We may want
> to be consistent here.

Okay, let's be honorable rather than obedient.

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/6] t4030: demonstrate behavior of show with textconv
  2013-04-20  4:04   ` Jeff King
@ 2013-04-20 13:35     ` Michael J Gruber
  0 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-20 13:35 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 20.04.2013 06:04:
> On Fri, Apr 19, 2013 at 06:44:44PM +0200, Michael J Gruber wrote:
> 
>> "git show <commit>" obeys the textconc setting while "git show <blob>"
>> does not. Demonstrate this in the test.
> 
> s/textconc/textconv

Thanks, plus s/obey/honor/

>> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
>> index 53ec330..f314ced 100755
>> --- a/t/t4030-diff-textconv.sh
>> +++ b/t/t4030-diff-textconv.sh
>> @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
>>  	test_cmp expect.text actual
>>  '
>>  
>> +test_expect_success 'show commit produces text' '
>> +	git show HEAD >diff &&
>> +	find_diff <diff >actual &&
>> +	test_cmp expect.text actual
>> +'
> 
> Makes sense.
> 
>> +test_expect_success 'show blob produces binary' '
>> +	git show HEAD:file >actual &&
>> +	printf "\\0\\n\\1\\n" >expect &&
>> +	test_cmp expect actual
>> +'
> 
> I think this is probably the right thing. I can see instances where one
> would want the converted contents, but we have "cat-file --textconv" for
> that.
> 

By that you mean that this behavior is to stay as is?

My reasoning is twofold:

- consistency between "git show commit" and "git show blob"

- "git show" is a user facing command, and as such should produce output
consumable by humans; whereas "git cat-file" is plumbing and should
produce raw data unless told otherwise (-p, --textconv).

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-20  4:06   ` Jeff King
@ 2013-04-20 13:38     ` Michael J Gruber
  2013-04-21  3:37       ` Jeff King
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-20 13:38 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 20.04.2013 06:06:
> On Fri, Apr 19, 2013 at 06:44:45PM +0200, Michael J Gruber wrote:
> 
>> Currently, "diff" and "cat-file" for blobs obey "--textconv" options
>> (with the former defaulting to "--textconv" and the latter to
>> "--no-textconv") whereas "show" does not obey this option, even though
>> it takes diff options.
>>
>> Make "show" on blobs behave like "diff", i.e. obey "--textconv" by
>> default and "--no-textconv" when given.
> 
> Wait, this does the opposite of the last patch. If we do want to do
> this, shouldn't the last one have been an "expect_failure"?

The last patch just documents the status quo, which is not a bug per se.
Therefore, no failure, but change in the definition of "success".

> I'm not convinced this is the right thing to do, though. It would break:
> 
>   git show HEAD:file.c >file.c
> 
> Admittedly, such people should be using "checkout" or "cat-file", so I
> do not mind too much breaking them if there is a good reason. But I am
> not sure what that reason is.

My reasoning is twofold:

- consistency between "git show commit" and "git show blob"

- "git show" is a user facing command, and as such should produce output
consumable by humans; whereas "git cat-file" is plumbing and should
produce raw data unless told otherwise (-p, --textconv).

(Sorry for the repeat.)

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters
  2013-04-20  4:17   ` Jeff King
@ 2013-04-20 14:27     ` Michael J Gruber
  0 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-20 14:27 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 20.04.2013 06:17:
> On Fri, Apr 19, 2013 at 06:44:46PM +0200, Michael J Gruber wrote:
> 
>> -			die("git cat-file --textconv: unable to run textconv on %s",
>> -			    obj_name);
>> -		break;
>> +		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
>> +			break;
>> +
>> +		/* otherwise expect a blob */
>> +		exp_type = "blob";
>>  
>>  	case 0:
>>  		if (type_from_string(exp_type) == OBJ_BLOB) {
> 
> I'm not sure this is right. What happens with:
> 
>   git cat-file --textconv HEAD:Documentation
> 
> We have failed to textconv, but should we be expecting a blob?

Very true, thanks. I'll reorder so that the --textconv case continues
(without break) into the -p case. I think it makes sense to consider
"--textconv" to be "at least as pretty as -p".

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path
  2013-04-20  4:24   ` Jeff King
@ 2013-04-20 14:42     ` Michael J Gruber
  2013-04-21  3:41       ` Jeff King
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-20 14:42 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 20.04.2013 06:24:
> On Fri, Apr 19, 2013 at 06:44:49PM +0200, Michael J Gruber wrote:
> 
>> @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  	for (i = 0; i < argc; i++) {
>>  		const char *arg = argv[i];
>>  		unsigned char sha1[20];
>> +		struct object_context oc;
>>  		/* Is it a rev? */
>> -		if (!get_sha1(arg, sha1)) {
>> +		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
>>  			struct object *object = parse_object_or_die(sha1, arg);
>>  			if (!seen_dashdash)
>>  				verify_non_filename(prefix, arg);
>> -			add_object_array(object, arg, &list);
>> +			add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context)));
> 
> Hrm. I'm not excited about the extra allocation here. Who frees it?
> 
>> +void add_object_array(struct object *obj, const char *name, struct object_array *array)
>> +{
>> +	add_object_array_with_mode(obj, name, array, S_IFINVALID);
>> +}
>> +
>> +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
>> +{
>> +	add_object_array_with_mode_context(obj, name, array, mode, NULL);
>> +}
>> +
>> +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context)
>> +{
>> +	if (context)
>> +		add_object_array_with_mode_context(obj, name, array, context->mode, context);
>> +	else
>> +		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
>> +}
> 
> And this mass of almost-the-same functions is gross, too, especially
> given that the object_context contains a mode itself.

Well, it's just providing different ways to call into the one and only
function, in order to satisfy different callers' needs. It's not unheard
of (or rather: unseen) in our code, is it?

> Unfortunately, I'm not sure if I have a more pleasant suggestion. I seem
> to recall wrestling with this issue during the last round, too.

Yes, I think that's what we ended up with. At least it's just one
context struct per argument to grep, so it's not that bad after all.

I vaguely seem to recall we had some more general framework cooking but
I may be wrong (I was offline due to sickness for a while). It was about
attaching some additional info to something. Yes, I said "vaguely" ...

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-20 13:38     ` Michael J Gruber
@ 2013-04-21  3:37       ` Jeff King
  2013-04-22 10:29         ` Michael J Gruber
  2013-04-22 15:25         ` Junio C Hamano
  0 siblings, 2 replies; 78+ messages in thread
From: Jeff King @ 2013-04-21  3:37 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Junio C Hamano

On Sat, Apr 20, 2013 at 03:38:53PM +0200, Michael J Gruber wrote:

> > Wait, this does the opposite of the last patch. If we do want to do
> > this, shouldn't the last one have been an "expect_failure"?
> 
> The last patch just documents the status quo, which is not a bug per se.
> Therefore, no failure, but change in the definition of "success".

IMHO, the series is easier to review if you it does not go back and
forth. If you have one patch that says "X is the right behavior", and
then another patch that flips it to say "Y is the right behavior", the
reviewer would read each in sequence and want to be convinced by your
arguments for X and Y. But you probably cannot make a good argument for
X if you are trying to end up at Y. :)

So I'd much rather see the test introduced with the desired end
behavior, marked as expect_failure, and the commit message contain an
argument about why Y is a good thing (and squashing the tests in with
the actual fix is often even better, because the fix itself would want
to contain the same argument).

Just my two cents as a reviewer.

> My reasoning is twofold:
> 
> - consistency between "git show commit" and "git show blob"

I'm not sure I agree with this line of reasoning. "git show commit" is
showing a diff, not the file contents; textconv has always been about
munging the contents to produce a textual diff. It may be reasonable to
extend its definition to "this is the preferred human view of this
content, and that happens to be what you would want to produce a diff".
But I do not think it is necessarily inconsistent not to apply it for
the blob case.

> - "git show" is a user facing command, and as such should produce output
> consumable by humans; whereas "git cat-file" is plumbing and should
> produce raw data unless told otherwise (-p, --textconv).

That holds if the textconv is the only (or best) human-readable version
of the file. And maybe that is reasonable. But is it possible that
somebody uses "textconv" to produce a better diff of some already
human-readable format? For example, imagine I define a textconv for XML
files that normalizes the formatting to make diffs less noisy. When I am
not looking at a diff, what is the best human-readable version? The
original, or the normalized one? I'm not sure.

Note that I'm somewhat playing devil's advocate here. For the cases
where I have used textconv in the real world, I think I would probably
prefer seeing the converted contents, and I am happy to call it user
error if I use "git show HEAD:foo.jpg >bar.jpg" accidentally. But I also
want to make sure we are not regressing somebody else unnecessarily.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path
  2013-04-20 14:42     ` Michael J Gruber
@ 2013-04-21  3:41       ` Jeff King
  0 siblings, 0 replies; 78+ messages in thread
From: Jeff King @ 2013-04-21  3:41 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Junio C Hamano

On Sat, Apr 20, 2013 at 04:42:49PM +0200, Michael J Gruber wrote:

> > And this mass of almost-the-same functions is gross, too, especially
> > given that the object_context contains a mode itself.
> 
> Well, it's just providing different ways to call into the one and only
> function, in order to satisfy different callers' needs. It's not unheard
> of (or rather: unseen) in our code, is it?

No, we have instances of it already. And they're ugly, too. :) I think
when we hit more than 2 or 3 wrappers it is time to start thinking
whether they can be consolidated.  I think it is mostly the overlap in
context and mode that makes me find this one particularly ugly. But it's
probably not solvable without some pretty heavy refactoring.

> I vaguely seem to recall we had some more general framework cooking but
> I may be wrong (I was offline due to sickness for a while). It was about
> attaching some additional info to something. Yes, I said "vaguely" ...

Yeah, I really wanted to keep the context inside the object_array, but
it means either wasting a lot of space (due to over-large buffers) or
having the array elements be variable-sized (with a flex-array for the
pathname). And object_array entries already have a memory-leak problem
from the "name" field, which I think we just punt on elsewhere. So I
think this is probably the lesser of the possible evils.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-21  3:37       ` Jeff King
@ 2013-04-22 10:29         ` Michael J Gruber
  2013-04-22 15:25         ` Junio C Hamano
  1 sibling, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-22 10:29 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 21.04.2013 05:37:
> On Sat, Apr 20, 2013 at 03:38:53PM +0200, Michael J Gruber wrote:
> 
>>> Wait, this does the opposite of the last patch. If we do want to do
>>> this, shouldn't the last one have been an "expect_failure"?
>>
>> The last patch just documents the status quo, which is not a bug per se.
>> Therefore, no failure, but change in the definition of "success".
> 
> IMHO, the series is easier to review if you it does not go back and
> forth. If you have one patch that says "X is the right behavior", and
> then another patch that flips it to say "Y is the right behavior", the
> reviewer would read each in sequence and want to be convinced by your
> arguments for X and Y. But you probably cannot make a good argument for
> X if you are trying to end up at Y. :)
> 
> So I'd much rather see the test introduced with the desired end
> behavior, marked as expect_failure, and the commit message contain an
> argument about why Y is a good thing (and squashing the tests in with
> the actual fix is often even better, because the fix itself would want
> to contain the same argument).
> 
> Just my two cents as a reviewer.
> 
>> My reasoning is twofold:
>>
>> - consistency between "git show commit" and "git show blob"
> 
> I'm not sure I agree with this line of reasoning. "git show commit" is
> showing a diff, not the file contents; textconv has always been about
> munging the contents to produce a textual diff. It may be reasonable to
> extend its definition to "this is the preferred human view of this
> content, and that happens to be what you would want to produce a diff".
> But I do not think it is necessarily inconsistent not to apply it for
> the blob case.
> 
>> - "git show" is a user facing command, and as such should produce output
>> consumable by humans; whereas "git cat-file" is plumbing and should
>> produce raw data unless told otherwise (-p, --textconv).
> 
> That holds if the textconv is the only (or best) human-readable version
> of the file. And maybe that is reasonable. But is it possible that
> somebody uses "textconv" to produce a better diff of some already
> human-readable format? For example, imagine I define a textconv for XML
> files that normalizes the formatting to make diffs less noisy. When I am
> not looking at a diff, what is the best human-readable version? The
> original, or the normalized one? I'm not sure.
> 
> Note that I'm somewhat playing devil's advocate here. For the cases
> where I have used textconv in the real world, I think I would probably
> prefer seeing the converted contents, and I am happy to call it user
> error if I use "git show HEAD:foo.jpg >bar.jpg" accidentally. But I also
> want to make sure we are not regressing somebody else unnecessarily.

Yes, the thing is that textconv helps diff by converting content (to
text) before the (textual) diff. So it's somehow a double-faced beast.

It's clearly activated by a "diff" attribute; so that would be a strong
argument against my patch, at least against defaulting to --textconv for
blobs.

OTOH, textconv does have this aspect of converting text to a form
digestable by humans (pre-diff, granted), which is the argument for
defaulting to --textconv in porcellain.

We could use a separate attribute "show" in addition to "diff", but I
don't think it's worth going there, unless there is a strong use case
for "diff-specific textconv" which one would not want to apply when
showing just the content.

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-21  3:37       ` Jeff King
  2013-04-22 10:29         ` Michael J Gruber
@ 2013-04-22 15:25         ` Junio C Hamano
  2013-04-22 15:29           ` Jeff King
  1 sibling, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-22 15:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael J Gruber, git

Jeff King <peff@peff.net> writes:

> Just my two cents as a reviewer.
>
>> My reasoning is twofold:
>> 
>> - consistency between "git show commit" and "git show blob"
>
> I'm not sure I agree with this line of reasoning. "git show commit" is
> showing a diff, not the file contents; textconv has always been about
> munging the contents to produce a textual diff. It may be reasonable to
> extend its definition to "this is the preferred human view of this
> content, and that happens to be what you would want to produce a diff".
> But I do not think it is necessarily inconsistent not to apply it for
> the blob case.

True.  Applying textconv to otherwise unreadable blobs is often
useful, but I agree that it is unexpected if it is done by default,
especially given that many people have learned to do:

	git show HEAD~4:binary-gob >old-binary-gob

to recover old version of binary contents to a temporary file when
checking the sanity of or restoring the breakage in the new one.

It of course does _not_ forbid

	git show --textconv HEAD~4:binary-gob | less

but I doubt it is a good idea to turn it on by default this late in
the game.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-22 15:25         ` Junio C Hamano
@ 2013-04-22 15:29           ` Jeff King
  2013-04-22 15:37             ` Jeremy Rosen
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-04-22 15:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git

On Mon, Apr 22, 2013 at 08:25:41AM -0700, Junio C Hamano wrote:

> True.  Applying textconv to otherwise unreadable blobs is often
> useful, but I agree that it is unexpected if it is done by default,
> especially given that many people have learned to do:
> 
> 	git show HEAD~4:binary-gob >old-binary-gob
> 
> to recover old version of binary contents to a temporary file when
> checking the sanity of or restoring the breakage in the new one.
> 
> It of course does _not_ forbid
> 
> 	git show --textconv HEAD~4:binary-gob | less
> 
> but I doubt it is a good idea to turn it on by default this late in
> the game.

Exactly. I certainly do not mind it as an option, and I am on the fence
regarding it as a default (I think it might have been a sane thing to do
from the start, but at this point the change-of-behavior makes me
hesitate). So I am perfectly willing to go either way, depending on what
others think.

I'm going to be out of email contact the rest of the week, so I'll let
you two talk it out. :)

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-22 15:29           ` Jeff King
@ 2013-04-22 15:37             ` Jeremy Rosen
  2013-04-22 15:54               ` Matthieu Moy
  0 siblings, 1 reply; 78+ messages in thread
From: Jeremy Rosen @ 2013-04-22 15:37 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael J Gruber, git, Junio C Hamano

> > 	git show --textconv HEAD~4:binary-gob | less
> > 
> > but I doubt it is a good idea to turn it on by default this late in
> > the game.
> 
> Exactly. I certainly do not mind it as an option, and I am on the
> fence
> regarding it as a default (I think it might have been a sane thing to
> do
> from the start, but at this point the change-of-behavior makes me
> hesitate). So I am perfectly willing to go either way, depending on
> what
> others think.
> 


some features detect if they are piping to a terminal... couldn't we do
something like that ?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-22 15:37             ` Jeremy Rosen
@ 2013-04-22 15:54               ` Matthieu Moy
  2013-04-23  8:58                 ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Matthieu Moy @ 2013-04-22 15:54 UTC (permalink / raw)
  To: Jeremy Rosen; +Cc: Jeff King, Michael J Gruber, git, Junio C Hamano

Jeremy Rosen <jeremy.rosen@openwide.fr> writes:

> some features detect if they are piping to a terminal... couldn't we do
> something like that ?

That's OK for convenience features like colors or so, but that would be
really, really unexpected to have

$ git show HEAD:file
foo
$ git show HEAD:file > tmp
$ cat tmp
bar

IMHO.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 2/6] show: obey --textconv for blobs
  2013-04-22 15:54               ` Matthieu Moy
@ 2013-04-23  8:58                 ` Michael J Gruber
  0 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23  8:58 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Jeremy Rosen, Jeff King, git, Junio C Hamano

Matthieu Moy venit, vidit, dixit 22.04.2013 17:54:
> Jeremy Rosen <jeremy.rosen@openwide.fr> writes:
> 
>> some features detect if they are piping to a terminal... couldn't we do
>> something like that ?
> 
> That's OK for convenience features like colors or so, but that would be
> really, really unexpected to have
> 
> $ git show HEAD:file
> foo
> $ git show HEAD:file > tmp
> $ cat tmp
> bar
> 
> IMHO.

Yes, I'd either do it by default in general (my preference) or on
request, but not based on tty.

Another point of input: You can do

git show commit <blob> <commit> <blob>

and also with other object types, of course. On the other hand, there is
a single rev.diffopt. Besides the nuisance of having to track whether
textconv has been specified explicitely and flipping the bit in
rev.diffopt per argument (or adding a parameter), which is an
implementation detail, it would mean that the default for different
arguments in the argument list is different, depending on type. And that
is a usablility issue, I would argue:

Is textconv on by default for git show? Yes and no, for some arguments
yes, for others no.

That's what I want to cure ;)

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCHv2 0/7] grep with textconv
  2013-04-20 13:32   ` Michael J Gruber
@ 2013-04-23 12:11     ` Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber
                         ` (6 more replies)
  0 siblings, 7 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

Here's a reroll, with the following changes:

* Use "honor" for obey".

* Fixed the issue with --textconv and non-blobs.

* Restructured tests as per Jeff's preference.

* Added 7/ which flips the default for git grep to textconv.

Jeff King (1):
  grep: allow to use textconv filters

Michael J Gruber (6):
  t4030: demonstrate behavior of show with textconv
  show: obey --textconv for blobs
  cat-file: do not die on --textconv without textconv filters
  t7008: demonstrate behavior of grep with textconv
  grep: honor --textconv for the case rev:path
  git grep: honor textconv by default

 Documentation/git-grep.txt   |   9 +++-
 builtin/cat-file.c           |  18 ++++----
 builtin/grep.c               |  13 +++---
 builtin/log.c                |  24 ++++++++--
 grep.c                       | 102 +++++++++++++++++++++++++++++++++++++------
 grep.h                       |   1 +
 object.c                     |  26 ++++++++---
 object.h                     |   2 +
 t/t4030-diff-textconv.sh     |  18 ++++++++
 t/t7008-grep-binary.sh       |  43 ++++++++++++++++++
 t/t8007-cat-file-textconv.sh |  20 +++------
 11 files changed, 222 insertions(+), 54 deletions(-)

-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCHv2 1/7] t4030: demonstrate behavior of show with textconv
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 15:11         ` Junio C Hamano
  2013-04-23 12:11       ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber
                         ` (5 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

"git show <commit>" honors the textconv setting while "git show <blob>"
does not. Demonstrate this in the test.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t4030-diff-textconv.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index 53ec330..260ea92 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
 	test_cmp expect.text actual
 '
 
+test_expect_success 'show commit produces text' '
+	git show HEAD >diff &&
+	find_diff <diff >actual &&
+	test_cmp expect.text actual
+'
+
 test_expect_success 'diff-tree produces binary' '
 	git diff-tree -p HEAD^ HEAD >diff &&
 	find_diff <diff >actual &&
@@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' '
 	git reset --soft HEAD@{1}
 '
 
+test_expect_failure 'show blob produces text' '
+	git show HEAD:file >actual &&
+	printf "0\\n1\\n" >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
 	echo one >expect &&
 	git log --root --format=%s -G0 >actual &&
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 2/7] show: obey --textconv for blobs
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 15:14         ` Junio C Hamano
  2013-04-23 12:11       ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
                         ` (4 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

Currently, "diff" and "cat-file" for blobs honor "--textconv" options
(with the former defaulting to "--textconv" and the latter to
"--no-textconv") whereas "show" does not honor this option, even though
it takes diff options.

Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
default and "--no-textconv" when given.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/log.c            | 24 +++++++++++++++++++++---
 t/t4030-diff-textconv.sh |  8 +++++++-
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 5f3ed77..fe0275e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev)
 	strbuf_release(&out);
 }
 
-static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
+static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
 {
+	unsigned char sha1c[20];
+	struct object_context obj_context;
+	char *buf;
+	unsigned long size;
+
 	fflush(stdout);
-	return stream_blob_to_fd(1, sha1, NULL, 0);
+	if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
+		die("Not a valid object name %s", obj_name);
+	if (!obj_context.path[0] ||
+	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (!buf)
+		die("git show %s: bad file", obj_name);
+
+	write_or_die(1, buf, size);
+	return 0;
 }
 
 static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
@@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 		const char *name = objects[i].name;
 		switch (o->type) {
 		case OBJ_BLOB:
-			ret = show_blob_object(o->sha1, NULL);
+			ret = show_blob_object(o->sha1, &rev, name);
 			break;
 		case OBJ_TAG: {
 			struct tag *t = (struct tag *)o;
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index 260ea92..f9d55e1 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' '
 	git reset --soft HEAD@{1}
 '
 
-test_expect_failure 'show blob produces text' '
+test_expect_success 'show blob produces text' '
 	git show HEAD:file >actual &&
 	printf "0\\n1\\n" >expect &&
 	test_cmp expect actual
 '
 
+test_expect_success 'show --no-textconv blob produces binary' '
+	git show --no-textconv HEAD:file >actual &&
+	printf "\\0\\n\\1\\n" >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
 	echo one >expect &&
 	git log --root --format=%s -G0 >actual &&
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 15:15         ` Junio C Hamano
  2013-04-23 12:11       ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
                         ` (3 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

When a command is supposed to use textconv filters (by default or with
"--textconv") and none are configured then the blob is output without
conversion; the only exception to this rule is "cat-file --textconv".

Make it behave like the rest of textconv aware commands.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/cat-file.c           | 18 ++++++++----------
 t/t8007-cat-file-textconv.sh | 20 +++++---------------
 2 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 045cee7..bd62373 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	case 'e':
 		return !has_sha1_file(sha1);
 
+	case 'c':
+		if (!obj_context.path[0])
+			die("git cat-file --textconv %s: <object> must be <sha1:path>",
+			    obj_name);
+
+		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
+			break;
+
 	case 'p':
 		type = sha1_object_info(sha1, NULL);
 		if (type < 0)
@@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 		/* otherwise just spit out the data */
 		break;
 
-	case 'c':
-		if (!obj_context.path[0])
-			die("git cat-file --textconv %s: <object> must be <sha1:path>",
-			    obj_name);
-
-		if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
-			die("git cat-file --textconv: unable to run textconv on %s",
-			    obj_name);
-		break;
-
 	case 0:
 		if (type_from_string(exp_type) == OBJ_BLOB) {
 			unsigned char blob_sha1[20];
diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh
index 78a0085..83c6636 100755
--- a/t/t8007-cat-file-textconv.sh
+++ b/t/t8007-cat-file-textconv.sh
@@ -22,11 +22,11 @@ test_expect_success 'setup ' '
 '
 
 cat >expected <<EOF
-fatal: git cat-file --textconv: unable to run textconv on :one.bin
+bin: test version 2
 EOF
 
 test_expect_success 'no filter specified' '
-	git cat-file --textconv :one.bin 2>result
+	git cat-file --textconv :one.bin >result &&
 	test_cmp expected result
 '
 
@@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' '
 	git config diff.test.cachetextconv false
 '
 
-cat >expected <<EOF
-bin: test version 2
-EOF
-
 test_expect_success 'cat-file without --textconv' '
 	git cat-file blob :one.bin >result &&
 	test_cmp expected result
@@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' '
 '
 
 test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' '
+	printf "%s" "one.bin" >expected &&
 	git cat-file blob :symlink.bin >result &&
-	printf "%s" "one.bin" >expected
 	test_cmp expected result
 '
 
 
 test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' '
-	! git cat-file --textconv :symlink.bin 2>result &&
-	cat >expected <<\EOF &&
-fatal: git cat-file --textconv: unable to run textconv on :symlink.bin
-EOF
+	git cat-file --textconv :symlink.bin >result &&
 	test_cmp expected result
 '
 
 test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' '
-	! git cat-file --textconv HEAD:symlink.bin 2>result &&
-	cat >expected <<EOF &&
-fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin
-EOF
+	git cat-file --textconv HEAD:symlink.bin >result &&
 	test_cmp expected result
 '
 
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
                         ` (2 preceding siblings ...)
  2013-04-23 12:11       ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 15:16         ` Junio C Hamano
  2013-04-23 12:11       ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber
                         ` (2 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

Currently, "git grep" does not honor any textconv filters. Demonstrate
this in the tests.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 26f8319..126fe4c 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' '
 	test_cmp expect actual
 '
 
+cat >nul_to_q_textconv <<'EOF'
+#!/bin/sh
+"$PERL_PATH" -pe 'y/\000/Q/' < "$1"
+EOF
+chmod +x nul_to_q_textconv
+
+test_expect_success 'setup textconv filters' '
+	echo a diff=foo >.gitattributes &&
+	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
+'
+
+test_expect_failure 'grep does not honor textconv' '
+	echo "a:binaryQfile" >expect &&
+	git grep Qfile >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep blob does not honor textconv' '
+	echo "HEAD:a:binaryQfile" >expect &&
+	git grep Qfile HEAD:a >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 5/7] grep: allow to use textconv filters
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
                         ` (3 preceding siblings ...)
  2013-04-23 12:11       ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

From: Jeff King <peff@peff.net>

Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:

ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)

"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping.

Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 Documentation/git-grep.txt |   9 +++-
 builtin/grep.c             |   2 +
 grep.c                     | 100 ++++++++++++++++++++++++++++++++++++++-------
 grep.h                     |   1 +
 t/t7008-grep-binary.sh     |  20 +++++++++
 5 files changed, 117 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 50d46e1..a5c5a27 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -9,7 +9,7 @@ git-grep - Print lines matching a pattern
 SYNOPSIS
 --------
 [verse]
-'git grep' [-a | --text] [-I] [-i | --ignore-case] [-w | --word-regexp]
+'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp]
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
@@ -80,6 +80,13 @@ OPTIONS
 --text::
 	Process binary files as if they were text.
 
+--textconv::
+	Honor textconv filter settings.
+
+--no-textconv::
+	Do not honor textconv filter settings.
+	This is the default.
+
 -i::
 --ignore-case::
 	Ignore case differences between the patterns and the
diff --git a/builtin/grep.c b/builtin/grep.c
index 159e65d..00ee57d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT('I', NULL, &opt.binary,
 			N_("don't match patterns in binary files"),
 			GREP_BINARY_NOMATCH),
+		OPT_BOOL(0, "textconv", &opt.allow_textconv,
+			 N_("process binary files with textconv filters")),
 		{ OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"),
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
diff --git a/grep.c b/grep.c
index bb548ca..c668034 100644
--- a/grep.c
+++ b/grep.c
@@ -2,6 +2,8 @@
 #include "grep.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "diff.h"
+#include "diffcore.h"
 
 static int grep_source_load(struct grep_source *gs);
 static int grep_source_is_binary(struct grep_source *gs);
@@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
+static int fill_textconv_grep(struct userdiff_driver *driver,
+			      struct grep_source *gs)
+{
+	struct diff_filespec *df;
+	char *buf;
+	size_t size;
+
+	if (!driver || !driver->textconv)
+		return grep_source_load(gs);
+
+	/*
+	 * The textconv interface is intimately tied to diff_filespecs, so we
+	 * have to pretend to be one. If we could unify the grep_source
+	 * and diff_filespec structs, this mess could just go away.
+	 */
+	df = alloc_filespec(gs->path);
+	switch (gs->type) {
+	case GREP_SOURCE_SHA1:
+		fill_filespec(df, gs->identifier, 1, 0100644);
+		break;
+	case GREP_SOURCE_FILE:
+		fill_filespec(df, null_sha1, 0, 0100644);
+		break;
+	default:
+		die("BUG: attempt to textconv something without a path?");
+	}
+
+	/*
+	 * fill_textconv is not remotely thread-safe; it may load objects
+	 * behind the scenes, and it modifies the global diff tempfile
+	 * structure.
+	 */
+	grep_read_lock();
+	size = fill_textconv(driver, df, &buf);
+	grep_read_unlock();
+	free_filespec(df);
+
+	/*
+	 * The normal fill_textconv usage by the diff machinery would just keep
+	 * the textconv'd buf separate from the diff_filespec. But much of the
+	 * grep code passes around a grep_source and assumes that its "buf"
+	 * pointer is the beginning of the thing we are searching. So let's
+	 * install our textconv'd version into the grep_source, taking care not
+	 * to leak any existing buffer.
+	 */
+	grep_source_clear_data(gs);
+	gs->buf = buf;
+	gs->size = size;
+
+	return 0;
+}
+
 static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits)
 {
 	char *bol;
@@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	unsigned count = 0;
 	int try_lookahead = 0;
 	int show_function = 0;
+	struct userdiff_driver *textconv = NULL;
 	enum grep_context ctx = GREP_CONTEXT_HEAD;
 	xdemitconf_t xecfg;
 
@@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	}
 	opt->last_shown = 0;
 
-	switch (opt->binary) {
-	case GREP_BINARY_DEFAULT:
-		if (grep_source_is_binary(gs))
-			binary_match_only = 1;
-		break;
-	case GREP_BINARY_NOMATCH:
-		if (grep_source_is_binary(gs))
-			return 0; /* Assume unmatch */
-		break;
-	case GREP_BINARY_TEXT:
-		break;
-	default:
-		die("bug: unknown binary handling mode");
+	if (opt->allow_textconv) {
+		grep_source_load_driver(gs);
+		/*
+		 * We might set up the shared textconv cache data here, which
+		 * is not thread-safe.
+		 */
+		grep_attr_lock();
+		textconv = userdiff_get_textconv(gs->driver);
+		grep_attr_unlock();
+	}
+
+	/*
+	 * We know the result of a textconv is text, so we only have to care
+	 * about binary handling if we are not using it.
+	 */
+	if (!textconv) {
+		switch (opt->binary) {
+		case GREP_BINARY_DEFAULT:
+			if (grep_source_is_binary(gs))
+				binary_match_only = 1;
+			break;
+		case GREP_BINARY_NOMATCH:
+			if (grep_source_is_binary(gs))
+				return 0; /* Assume unmatch */
+			break;
+		case GREP_BINARY_TEXT:
+			break;
+		default:
+			die("bug: unknown binary handling mode");
+		}
 	}
 
 	memset(&xecfg, 0, sizeof(xecfg));
@@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	try_lookahead = should_lookahead(opt);
 
-	if (grep_source_load(gs) < 0)
+	if (fill_textconv_grep(textconv, gs) < 0)
 		return 0;
 
 	bol = gs->buf;
diff --git a/grep.h b/grep.h
index e4a1df5..eaaced1 100644
--- a/grep.h
+++ b/grep.h
@@ -107,6 +107,7 @@ struct grep_opt {
 #define GREP_BINARY_NOMATCH	1
 #define GREP_BINARY_TEXT	2
 	int binary;
+	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
 	int pcre;
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 126fe4c..1eae6a4 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -162,10 +162,30 @@ test_expect_failure 'grep does not honor textconv' '
 	test_cmp expect actual
 '
 
+test_expect_success 'grep --textconv does honor textconv' '
+	echo "a:binaryQfile" >expect &&
+	git grep --textconv Qfile >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-textconv does not honor textconv' '
+	test_must_fail git grep --no-textconv Qfile
+'
+
 test_expect_failure 'grep blob does not honor textconv' '
 	echo "HEAD:a:binaryQfile" >expect &&
 	git grep Qfile HEAD:a >actual &&
 	test_cmp expect actual
 '
 
+test_expect_failure 'grep --textconv blob does not honor textconv' '
+	echo "HEAD:a:binaryQfile" >expect &&
+	git grep --textconv Qfile HEAD:a >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-textconv blob does not honor textconv' '
+	test_must_fail git grep --no-textconv Qfile HEAD:a
+'
+
 test_done
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 6/7] grep: honor --textconv for the case rev:path
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
                         ` (4 preceding siblings ...)
  2013-04-23 12:11       ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 12:11       ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

Make "grep" honor the "--textconv" option also for the object case, i.e.
when used with an argument "rev:path".

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/grep.c         | 11 ++++++-----
 object.c               | 26 ++++++++++++++++++++------
 object.h               |  2 ++
 t/t7008-grep-binary.sh |  2 +-
 4 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 00ee57d..bb7f970 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name)
+		       struct object *obj, const char *name, struct object_context *oc)
 {
 	if (obj->type == OBJ_BLOB)
-		return grep_sha1(opt, obj->sha1, name, 0, NULL);
+		return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL);
 	if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) {
 		struct tree_desc tree;
 		void *data;
@@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 		real_obj = deref_tag(list->objects[i].item, NULL, 0);
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
 		unsigned char sha1[20];
+		struct object_context oc;
 		/* Is it a rev? */
-		if (!get_sha1(arg, sha1)) {
+		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
 			struct object *object = parse_object_or_die(sha1, arg);
 			if (!seen_dashdash)
 				verify_non_filename(prefix, arg);
-			add_object_array(object, arg, &list);
+			add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context)));
 			continue;
 		}
 		if (!strcmp(arg, "--")) {
diff --git a/object.c b/object.c
index 20703f5..c8ffc9e 100644
--- a/object.c
+++ b/object.c
@@ -255,12 +255,7 @@ int object_list_contains(struct object_list *list, struct object *obj)
 	return 0;
 }
 
-void add_object_array(struct object *obj, const char *name, struct object_array *array)
-{
-	add_object_array_with_mode(obj, name, array, S_IFINVALID);
-}
-
-void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context)
 {
 	unsigned nr = array->nr;
 	unsigned alloc = array->alloc;
@@ -275,9 +270,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
 	objects[nr].item = obj;
 	objects[nr].name = name;
 	objects[nr].mode = mode;
+	objects[nr].context = context;
 	array->nr = ++nr;
 }
 
+void add_object_array(struct object *obj, const char *name, struct object_array *array)
+{
+	add_object_array_with_mode(obj, name, array, S_IFINVALID);
+}
+
+void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+{
+	add_object_array_with_mode_context(obj, name, array, mode, NULL);
+}
+
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context)
+{
+	if (context)
+		add_object_array_with_mode_context(obj, name, array, context->mode, context);
+	else
+		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
+}
+
 void object_array_remove_duplicates(struct object_array *array)
 {
 	unsigned int ref, src, dst;
diff --git a/object.h b/object.h
index 97d384b..695847d 100644
--- a/object.h
+++ b/object.h
@@ -13,6 +13,7 @@ struct object_array {
 		struct object *item;
 		const char *name;
 		unsigned mode;
+		struct object_context *context;
 	} *objects;
 };
 
@@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj);
 /* Object array handling .. */
 void add_object_array(struct object *obj, const char *name, struct object_array *array);
 void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode);
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context);
 void object_array_remove_duplicates(struct object_array *);
 
 void clear_object_flags(unsigned flags);
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 1eae6a4..10b2c8b 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -178,7 +178,7 @@ test_expect_failure 'grep blob does not honor textconv' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'grep --textconv blob does not honor textconv' '
+test_expect_success 'grep --textconv blob does honor textconv' '
 	echo "HEAD:a:binaryQfile" >expect &&
 	git grep --textconv Qfile HEAD:a >actual &&
 	test_cmp expect actual
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
                         ` (5 preceding siblings ...)
  2013-04-23 12:11       ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber
@ 2013-04-23 12:11       ` Michael J Gruber
  2013-04-23 15:20         ` Junio C Hamano
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw)
  To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano

Currently, "git grep" does not honor textconv settings by default.
Make it honor them by default just like "git log --grep" does.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 Documentation/git-grep.txt | 2 +-
 grep.c                     | 2 ++
 t/t7008-grep-binary.sh     | 4 ++--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index a5c5a27..f54ac0c 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -82,10 +82,10 @@ OPTIONS
 
 --textconv::
 	Honor textconv filter settings.
+	This is the default.
 
 --no-textconv::
 	Do not honor textconv filter settings.
-	This is the default.
 
 -i::
 --ignore-case::
diff --git a/grep.c b/grep.c
index c668034..161d3f0 100644
--- a/grep.c
+++ b/grep.c
@@ -31,6 +31,7 @@ void init_grep_defaults(void)
 	opt->max_depth = -1;
 	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
 	opt->extended_regexp_option = 0;
+	opt->allow_textconv = 1;
 	strcpy(opt->color_context, "");
 	strcpy(opt->color_filename, "");
 	strcpy(opt->color_function, "");
@@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	opt->pathname = def->pathname;
 	opt->regflags = def->regflags;
 	opt->relative = def->relative;
+	opt->allow_textconv = def->allow_textconv;
 
 	strcpy(opt->color_context, def->color_context);
 	strcpy(opt->color_filename, def->color_filename);
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 10b2c8b..2fc9d9c 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' '
 	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
 '
 
-test_expect_failure 'grep does not honor textconv' '
+test_expect_success 'grep does honor textconv' '
 	echo "a:binaryQfile" >expect &&
 	git grep Qfile >actual &&
 	test_cmp expect actual
@@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
 	test_must_fail git grep --no-textconv Qfile
 '
 
-test_expect_failure 'grep blob does not honor textconv' '
+test_expect_success 'grep blob does honor textconv' '
 	echo "HEAD:a:binaryQfile" >expect &&
 	git grep Qfile HEAD:a >actual &&
 	test_cmp expect actual
-- 
1.8.2.1.799.g1ac2534

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 1/7] t4030: demonstrate behavior of show with textconv
  2013-04-23 12:11       ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber
@ 2013-04-23 15:11         ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-23 15:11 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> "git show <commit>" honors the textconv setting while "git show <blob>"
> does not. Demonstrate this in the test.

Should "git show <blob>" run textconv by default?  I somehow had an
impression that people were against it during the discussion on the
previous round.

>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  t/t4030-diff-textconv.sh | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
> index 53ec330..260ea92 100755
> --- a/t/t4030-diff-textconv.sh
> +++ b/t/t4030-diff-textconv.sh
> @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
>  	test_cmp expect.text actual
>  '
>  
> +test_expect_success 'show commit produces text' '
> +	git show HEAD >diff &&
> +	find_diff <diff >actual &&
> +	test_cmp expect.text actual
> +'
> +
>  test_expect_success 'diff-tree produces binary' '
>  	git diff-tree -p HEAD^ HEAD >diff &&
>  	find_diff <diff >actual &&
> @@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' '
>  	git reset --soft HEAD@{1}
>  '
>  
> +test_expect_failure 'show blob produces text' '
> +	git show HEAD:file >actual &&
> +	printf "0\\n1\\n" >expect &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
>  	echo one >expect &&
>  	git log --root --format=%s -G0 >actual &&

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 2/7] show: obey --textconv for blobs
  2013-04-23 12:11       ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber
@ 2013-04-23 15:14         ` Junio C Hamano
  2013-04-24 10:09           ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-23 15:14 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

>> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs

s/obey/honor/;

> Currently, "diff" and "cat-file" for blobs honor "--textconv" options
> (with the former defaulting to "--textconv" and the latter to
> "--no-textconv") whereas "show" does not honor this option, even though
> it takes diff options.
>
> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
> default and "--no-textconv" when given.

It is the right thing to do to teach it to react to --[no-]textconv;
I am not sure if the default is right, though.

> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  builtin/log.c            | 24 +++++++++++++++++++++---
>  t/t4030-diff-textconv.sh |  8 +++++++-
>  2 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 5f3ed77..fe0275e 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev)
>  	strbuf_release(&out);
>  }
>  
> -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
>  {
> +	unsigned char sha1c[20];
> +	struct object_context obj_context;
> +	char *buf;
> +	unsigned long size;
> +
>  	fflush(stdout);
> -	return stream_blob_to_fd(1, sha1, NULL, 0);
> +	if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
> +		return stream_blob_to_fd(1, sha1, NULL, 0);
> +
> +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
> +		die("Not a valid object name %s", obj_name);
> +	if (!obj_context.path[0] ||
> +	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
> +		return stream_blob_to_fd(1, sha1, NULL, 0);
> +
> +	if (!buf)
> +		die("git show %s: bad file", obj_name);
> +
> +	write_or_die(1, buf, size);
> +	return 0;
>  }
>  
>  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
> @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>  		const char *name = objects[i].name;
>  		switch (o->type) {
>  		case OBJ_BLOB:
> -			ret = show_blob_object(o->sha1, NULL);
> +			ret = show_blob_object(o->sha1, &rev, name);
>  			break;
>  		case OBJ_TAG: {
>  			struct tag *t = (struct tag *)o;
> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
> index 260ea92..f9d55e1 100755
> --- a/t/t4030-diff-textconv.sh
> +++ b/t/t4030-diff-textconv.sh
> @@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' '
>  	git reset --soft HEAD@{1}
>  '
>  
> -test_expect_failure 'show blob produces text' '
> +test_expect_success 'show blob produces text' '
>  	git show HEAD:file >actual &&
>  	printf "0\\n1\\n" >expect &&
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'show --no-textconv blob produces binary' '
> +	git show --no-textconv HEAD:file >actual &&
> +	printf "\\0\\n\\1\\n" >expect &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
>  	echo one >expect &&
>  	git log --root --format=%s -G0 >actual &&

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters
  2013-04-23 12:11       ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
@ 2013-04-23 15:15         ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-23 15:15 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> When a command is supposed to use textconv filters (by default or with
> "--textconv") and none are configured then the blob is output without
> conversion; the only exception to this rule is "cat-file --textconv".
>
> Make it behave like the rest of textconv aware commands.
>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  builtin/cat-file.c           | 18 ++++++++----------
>  t/t8007-cat-file-textconv.sh | 20 +++++---------------
>  2 files changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 045cee7..bd62373 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  	case 'e':
>  		return !has_sha1_file(sha1);
>  
> +	case 'c':
> +		if (!obj_context.path[0])
> +			die("git cat-file --textconv %s: <object> must be <sha1:path>",
> +			    obj_name);
> +
> +		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
> +			break;
> +
>  	case 'p':

Yeah, falling back to the 'p' case is a lot more sensible.

>  		type = sha1_object_info(sha1, NULL);
>  		if (type < 0)
> @@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		/* otherwise just spit out the data */
>  		break;
>  
> -	case 'c':
> -		if (!obj_context.path[0])
> -			die("git cat-file --textconv %s: <object> must be <sha1:path>",
> -			    obj_name);
> -
> -		if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
> -			die("git cat-file --textconv: unable to run textconv on %s",
> -			    obj_name);
> -		break;
> -
>  	case 0:
>  		if (type_from_string(exp_type) == OBJ_BLOB) {
>  			unsigned char blob_sha1[20];
> diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh
> index 78a0085..83c6636 100755
> --- a/t/t8007-cat-file-textconv.sh
> +++ b/t/t8007-cat-file-textconv.sh
> @@ -22,11 +22,11 @@ test_expect_success 'setup ' '
>  '
>  
>  cat >expected <<EOF
> -fatal: git cat-file --textconv: unable to run textconv on :one.bin
> +bin: test version 2
>  EOF
>  
>  test_expect_success 'no filter specified' '
> -	git cat-file --textconv :one.bin 2>result
> +	git cat-file --textconv :one.bin >result &&
>  	test_cmp expected result
>  '
>  
> @@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' '
>  	git config diff.test.cachetextconv false
>  '
>  
> -cat >expected <<EOF
> -bin: test version 2
> -EOF
> -
>  test_expect_success 'cat-file without --textconv' '
>  	git cat-file blob :one.bin >result &&
>  	test_cmp expected result
> @@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' '
>  '
>  
>  test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' '
> +	printf "%s" "one.bin" >expected &&
>  	git cat-file blob :symlink.bin >result &&
> -	printf "%s" "one.bin" >expected
>  	test_cmp expected result
>  '
>  
>  
>  test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' '
> -	! git cat-file --textconv :symlink.bin 2>result &&
> -	cat >expected <<\EOF &&
> -fatal: git cat-file --textconv: unable to run textconv on :symlink.bin
> -EOF
> +	git cat-file --textconv :symlink.bin >result &&
>  	test_cmp expected result
>  '
>  
>  test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' '
> -	! git cat-file --textconv HEAD:symlink.bin 2>result &&
> -	cat >expected <<EOF &&
> -fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin
> -EOF
> +	git cat-file --textconv HEAD:symlink.bin >result &&
>  	test_cmp expected result
>  '

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv
  2013-04-23 12:11       ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
@ 2013-04-23 15:16         ` Junio C Hamano
  2013-04-24 10:09           ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-23 15:16 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Currently, "git grep" does not honor any textconv filters. Demonstrate
> this in the tests.
>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
> index 26f8319..126fe4c 100755
> --- a/t/t7008-grep-binary.sh
> +++ b/t/t7008-grep-binary.sh
> @@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' '
>  	test_cmp expect actual
>  '
>  
> +cat >nul_to_q_textconv <<'EOF'
> +#!/bin/sh
> +"$PERL_PATH" -pe 'y/\000/Q/' < "$1"
> +EOF
> +chmod +x nul_to_q_textconv
> +
> +test_expect_success 'setup textconv filters' '
> +	echo a diff=foo >.gitattributes &&
> +	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
> +'
> +
> +test_expect_failure 'grep does not honor textconv' '
> +	echo "a:binaryQfile" >expect &&
> +	git grep Qfile >actual &&

This should pass --textconv to "git grep".

> +	test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep blob does not honor textconv' '
> +	echo "HEAD:a:binaryQfile" >expect &&
> +	git grep Qfile HEAD:a >actual &&

Likewise.

> +	test_cmp expect actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-23 12:11       ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber
@ 2013-04-23 15:20         ` Junio C Hamano
  2013-04-24 10:05           ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-23 15:20 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Currently, "git grep" does not honor textconv settings by default.
> Make it honor them by default just like "git log --grep" does.

"git log --grep" looks for strings in the log message which never
goes through textconv filters.

Puzzled....

If you meant -S/-G, it justifies use of textconv because we are
generating diff and the user defines textconv to get a reasonable
output for otherwise undiffable contents.

I do not know if it is sensible to apply textconv by default for
"grep" (or for that matter "git show" that gives blob contents).

>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
> ---
>  Documentation/git-grep.txt | 2 +-
>  grep.c                     | 2 ++
>  t/t7008-grep-binary.sh     | 4 ++--
>  3 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index a5c5a27..f54ac0c 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -82,10 +82,10 @@ OPTIONS
>  
>  --textconv::
>  	Honor textconv filter settings.
> +	This is the default.
>  
>  --no-textconv::
>  	Do not honor textconv filter settings.
> -	This is the default.
>  
>  -i::
>  --ignore-case::
> diff --git a/grep.c b/grep.c
> index c668034..161d3f0 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -31,6 +31,7 @@ void init_grep_defaults(void)
>  	opt->max_depth = -1;
>  	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
>  	opt->extended_regexp_option = 0;
> +	opt->allow_textconv = 1;
>  	strcpy(opt->color_context, "");
>  	strcpy(opt->color_filename, "");
>  	strcpy(opt->color_function, "");
> @@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
>  	opt->pathname = def->pathname;
>  	opt->regflags = def->regflags;
>  	opt->relative = def->relative;
> +	opt->allow_textconv = def->allow_textconv;
>  
>  	strcpy(opt->color_context, def->color_context);
>  	strcpy(opt->color_filename, def->color_filename);
> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
> index 10b2c8b..2fc9d9c 100755
> --- a/t/t7008-grep-binary.sh
> +++ b/t/t7008-grep-binary.sh
> @@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' '
>  	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
>  '
>  
> -test_expect_failure 'grep does not honor textconv' '
> +test_expect_success 'grep does honor textconv' '
>  	echo "a:binaryQfile" >expect &&
>  	git grep Qfile >actual &&
>  	test_cmp expect actual
> @@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
>  	test_must_fail git grep --no-textconv Qfile
>  '
>  
> -test_expect_failure 'grep blob does not honor textconv' '
> +test_expect_success 'grep blob does honor textconv' '
>  	echo "HEAD:a:binaryQfile" >expect &&
>  	git grep Qfile HEAD:a >actual &&
>  	test_cmp expect actual

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-23 15:20         ` Junio C Hamano
@ 2013-04-24 10:05           ` Michael J Gruber
  2013-04-24 17:35             ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-24 10:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Junio C Hamano venit, vidit, dixit 23.04.2013 17:20:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> Currently, "git grep" does not honor textconv settings by default.
>> Make it honor them by default just like "git log --grep" does.
> 
> "git log --grep" looks for strings in the log message which never
> goes through textconv filters.
> 
> Puzzled....
> 
> If you meant -S/-G, it justifies use of textconv because we are
> generating diff and the user defines textconv to get a reasonable
> output for otherwise undiffable contents.

Sorry, yes, I meant "log grep diff", aka "log -S/-G".

> I do not know if it is sensible to apply textconv by default for
> "grep" (or for that matter "git show" that gives blob contents).

Well, that is the discussion that we were having, with no real end
result, which is why I haven't implemented this differently yet.

My point is that we apply textconv on "log diff greps" already, so why
should't we on content greps?

The question is really whether we should treat "content" similar to
"diff", that's question both when comparing "git log -S" to "git grep"
and "git show <commit>" to "git show <blob>".

My choice is clear, but others seem torn.

For "git grep", implementing a "no-textconv" default is simple, but for
"git show <blob>" this appears to be cumbersome to me.

>> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
>> ---
>>  Documentation/git-grep.txt | 2 +-
>>  grep.c                     | 2 ++
>>  t/t7008-grep-binary.sh     | 4 ++--
>>  3 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
>> index a5c5a27..f54ac0c 100644
>> --- a/Documentation/git-grep.txt
>> +++ b/Documentation/git-grep.txt
>> @@ -82,10 +82,10 @@ OPTIONS
>>  
>>  --textconv::
>>  	Honor textconv filter settings.
>> +	This is the default.
>>  
>>  --no-textconv::
>>  	Do not honor textconv filter settings.
>> -	This is the default.
>>  
>>  -i::
>>  --ignore-case::
>> diff --git a/grep.c b/grep.c
>> index c668034..161d3f0 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -31,6 +31,7 @@ void init_grep_defaults(void)
>>  	opt->max_depth = -1;
>>  	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
>>  	opt->extended_regexp_option = 0;
>> +	opt->allow_textconv = 1;
>>  	strcpy(opt->color_context, "");
>>  	strcpy(opt->color_filename, "");
>>  	strcpy(opt->color_function, "");
>> @@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
>>  	opt->pathname = def->pathname;
>>  	opt->regflags = def->regflags;
>>  	opt->relative = def->relative;
>> +	opt->allow_textconv = def->allow_textconv;
>>  
>>  	strcpy(opt->color_context, def->color_context);
>>  	strcpy(opt->color_filename, def->color_filename);
>> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
>> index 10b2c8b..2fc9d9c 100755
>> --- a/t/t7008-grep-binary.sh
>> +++ b/t/t7008-grep-binary.sh
>> @@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' '
>>  	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
>>  '
>>  
>> -test_expect_failure 'grep does not honor textconv' '
>> +test_expect_success 'grep does honor textconv' '
>>  	echo "a:binaryQfile" >expect &&
>>  	git grep Qfile >actual &&
>>  	test_cmp expect actual
>> @@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
>>  	test_must_fail git grep --no-textconv Qfile
>>  '
>>  
>> -test_expect_failure 'grep blob does not honor textconv' '
>> +test_expect_success 'grep blob does honor textconv' '
>>  	echo "HEAD:a:binaryQfile" >expect &&
>>  	git grep Qfile HEAD:a >actual &&
>>  	test_cmp expect actual

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv
  2013-04-23 15:16         ` Junio C Hamano
@ 2013-04-24 10:09           ` Michael J Gruber
  2013-04-24 17:29             ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-24 10:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Junio C Hamano venit, vidit, dixit 23.04.2013 17:16:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> Currently, "git grep" does not honor any textconv filters. Demonstrate
>> this in the tests.
>>
>> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
>> ---
>>  t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++
>>  1 file changed, 23 insertions(+)
>>
>> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
>> index 26f8319..126fe4c 100755
>> --- a/t/t7008-grep-binary.sh
>> +++ b/t/t7008-grep-binary.sh
>> @@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' '
>>  	test_cmp expect actual
>>  '
>>  
>> +cat >nul_to_q_textconv <<'EOF'
>> +#!/bin/sh
>> +"$PERL_PATH" -pe 'y/\000/Q/' < "$1"
>> +EOF
>> +chmod +x nul_to_q_textconv
>> +
>> +test_expect_success 'setup textconv filters' '
>> +	echo a diff=foo >.gitattributes &&
>> +	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
>> +'
>> +
>> +test_expect_failure 'grep does not honor textconv' '
>> +	echo "a:binaryQfile" >expect &&
>> +	git grep Qfile >actual &&
> 
> This should pass --textconv to "git grep".

But "git grep" does not know that option yet, so the test would fail for
the wrong reason.

The point ist that I expect "git grep" to apply textconv filters by
default, which it does not. (I know I might be the only one with this
expectation.)

Or do we want to document the absence of that option?

>> +	test_cmp expect actual
>> +'
>> +
>> +test_expect_failure 'grep blob does not honor textconv' '
>> +	echo "HEAD:a:binaryQfile" >expect &&
>> +	git grep Qfile HEAD:a >actual &&
> 
> Likewise.
> 
>> +	test_cmp expect actual
>> +'
>> +
>>  test_done

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 2/7] show: obey --textconv for blobs
  2013-04-23 15:14         ` Junio C Hamano
@ 2013-04-24 10:09           ` Michael J Gruber
  2013-04-24 17:30             ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-24 10:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Junio C Hamano venit, vidit, dixit 23.04.2013 17:14:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>>> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs
> 
> s/obey/honor/;

I missed that one, thanks.

>> Currently, "diff" and "cat-file" for blobs honor "--textconv" options
>> (with the former defaulting to "--textconv" and the latter to
>> "--no-textconv") whereas "show" does not honor this option, even though
>> it takes diff options.
>>
>> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
>> default and "--no-textconv" when given.
> 
> It is the right thing to do to teach it to react to --[no-]textconv;
> I am not sure if the default is right, though.

That is the question ;)

>> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
>> ---
>>  builtin/log.c            | 24 +++++++++++++++++++++---
>>  t/t4030-diff-textconv.sh |  8 +++++++-
>>  2 files changed, 28 insertions(+), 4 deletions(-)
>>
>> diff --git a/builtin/log.c b/builtin/log.c
>> index 5f3ed77..fe0275e 100644
>> --- a/builtin/log.c
>> +++ b/builtin/log.c
>> @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev)
>>  	strbuf_release(&out);
>>  }
>>  
>> -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
>> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
>>  {
>> +	unsigned char sha1c[20];
>> +	struct object_context obj_context;
>> +	char *buf;
>> +	unsigned long size;
>> +
>>  	fflush(stdout);
>> -	return stream_blob_to_fd(1, sha1, NULL, 0);
>> +	if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
>> +
>> +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
>> +		die("Not a valid object name %s", obj_name);
>> +	if (!obj_context.path[0] ||
>> +	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
>> +
>> +	if (!buf)
>> +		die("git show %s: bad file", obj_name);
>> +
>> +	write_or_die(1, buf, size);
>> +	return 0;
>>  }
>>  
>>  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
>> @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>>  		const char *name = objects[i].name;
>>  		switch (o->type) {
>>  		case OBJ_BLOB:
>> -			ret = show_blob_object(o->sha1, NULL);
>> +			ret = show_blob_object(o->sha1, &rev, name);
>>  			break;
>>  		case OBJ_TAG: {
>>  			struct tag *t = (struct tag *)o;
>> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
>> index 260ea92..f9d55e1 100755
>> --- a/t/t4030-diff-textconv.sh
>> +++ b/t/t4030-diff-textconv.sh
>> @@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' '
>>  	git reset --soft HEAD@{1}
>>  '
>>  
>> -test_expect_failure 'show blob produces text' '
>> +test_expect_success 'show blob produces text' '
>>  	git show HEAD:file >actual &&
>>  	printf "0\\n1\\n" >expect &&
>>  	test_cmp expect actual
>>  '
>>  
>> +test_expect_success 'show --no-textconv blob produces binary' '
>> +	git show --no-textconv HEAD:file >actual &&
>> +	printf "\\0\\n\\1\\n" >expect &&
>> +	test_cmp expect actual
>> +'
>> +
>>  test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
>>  	echo one >expect &&
>>  	git log --root --format=%s -G0 >actual &&

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv
  2013-04-24 10:09           ` Michael J Gruber
@ 2013-04-24 17:29             ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-24 17:29 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

>>> +test_expect_failure 'grep does not honor textconv' '
>>> +	echo "a:binaryQfile" >expect &&
>>> +	git grep Qfile >actual &&
>> 
>> This should pass --textconv to "git grep".
>
> But "git grep" does not know that option yet, so the test would fail for
> the wrong reason.
>
> The point ist that I expect "git grep" to apply textconv filters by
> default, which it does not. (I know I might be the only one with this
> expectation.)
>
> Or do we want to document the absence of that option?

First, whether you write expect_failure or expec_success, please
label the test to say what is expected to happen in the ideal world.
The test in question says "grep does not honor textconv", but if you
want it to honor textconv in the ideal world, it should be "grep
honors textconv (when it should)".

Now, from the point of view of testing "git grep honors textconv"
missing support at the command line parser level and a buggy
implementation of the command line parser that accepts but does not
trigger the feature are the same thing.  The command would not honor
textconv either way.

Marking the above as "failure" without explicitly asking for the
feature with "--textconv" means we want it to use textconv by
default, but that is *not* what the test title says is testing.

In your patch, what the body of the text is really expecting is
"grep uses textconv by default".  If that is what it tests, then
passing --textconv from the command line as I suggested would be
wrong, but I was going by the title of the patch.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 2/7] show: obey --textconv for blobs
  2013-04-24 10:09           ` Michael J Gruber
@ 2013-04-24 17:30             ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-04-24 17:30 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Junio C Hamano venit, vidit, dixit 23.04.2013 17:14:
>> Michael J Gruber <git@drmicha.warpmail.net> writes:
>> 
>>>> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs
>> 
>> s/obey/honor/;
>
> I missed that one, thanks.
>
>>> Currently, "diff" and "cat-file" for blobs honor "--textconv" options
>>> (with the former defaulting to "--textconv" and the latter to
>>> "--no-textconv") whereas "show" does not honor this option, even though
>>> it takes diff options.
>>>
>>> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
>>> default and "--no-textconv" when given.
>> 
>> It is the right thing to do to teach it to react to --[no-]textconv;
>> I am not sure if the default is right, though.
>
> That is the question ;)

Then let me make it easier.  It is not just "I am not sure if", but "I
do not think that".

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-24 10:05           ` Michael J Gruber
@ 2013-04-24 17:35             ` Junio C Hamano
  2013-04-24 17:57               ` Matthieu Moy
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-24 17:35 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> My point is that we apply textconv on "log diff greps" already, so why
> should't we on content greps?

I think you are going in circles.  If you start from "textconv is
about mangling blob contents", then it would look natural to you
that "show <blob>", "diff A B", and "grep <pattern> <blob>" would
all first mangle the blob contents using textconv and then work on
them.

But diff.<driver>.textconv is to mangle blob contents in preparation
for comparing with another.

That is why you explicitly ask "cat-file --textconv" to use the same
mangling even when you are not comparing it with anything else.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-24 17:35             ` Junio C Hamano
@ 2013-04-24 17:57               ` Matthieu Moy
  2013-04-24 18:55                 ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Matthieu Moy @ 2013-04-24 17:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git, jeremy.rosen, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> But diff.<driver>.textconv is to mangle blob contents in preparation
> for comparing with another.

and also in preparation for "blame".

In both cases (diff and blame), we're preparing to show the file content
to the user, and showing the binary makes no sense.

Grepping through the binary, on the other hand, can very well make
sense, like:

$ git grep foo
file.txt: some instance of foo
binary file bar.bin matches

One reason not to run the filter is performance: "git grep" is fast, and
it's cool. My textconv filters are usually slow, and it's not a big
problem because the diff machinery will only invoke the textconv filter
when the files are modified (i.e. hopefully not often for tracked binary
files). OTOH, "git grep" would need to run the textconv filters for each
binary files being searched for.

I tend to agree with Junio that it makes sense to keep it disabled by
default. Perhaps a grep.textconv config option?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-24 17:57               ` Matthieu Moy
@ 2013-04-24 18:55                 ` Junio C Hamano
  2013-04-26 11:59                   ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-24 18:55 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Michael J Gruber, git, jeremy.rosen, Jeff King

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> Grepping through the binary, on the other hand, can very well make
> sense, like:
>
> $ git grep foo
> file.txt: some instance of foo
> binary file bar.bin matches

Yes, 

I am moderately negative on making it the default, mostly because it
goes against established expectations, but I did not mean to say
that an ability to pass blob contents through textconv before
running grep should not exist.  It would be a good option to have.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-24 18:55                 ` Junio C Hamano
@ 2013-04-26 11:59                   ` Michael J Gruber
  2013-04-26 13:23                     ` Matthieu Moy
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-26 11:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthieu Moy, git, jeremy.rosen, Jeff King

Junio C Hamano venit, vidit, dixit 24.04.2013 20:55:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
> 
>> Grepping through the binary, on the other hand, can very well make
>> sense, like:
>>
>> $ git grep foo
>> file.txt: some instance of foo
>> binary file bar.bin matches

BTW, textconv does not have to be slow - just use textconv-cache.

> Yes, 
> 
> I am moderately negative on making it the default, mostly because it
> goes against established expectations, but I did not mean to say
> that an ability to pass blob contents through textconv before
> running grep should not exist.  It would be a good option to have.

I'm still looking for a way to at least treat "git grep" and "git show
blob" the same way. I understand that I cannot convince you to change
the default here. The two options that I see are:

- Implement the --textconv option but leave the default as is. I did
that for "git grep" already (just drop 7/7) but it seems to be
cumbersome for "git show blob". I have to recheck.

- Implement a new attribute "show" analogous to "diff" which applies to
the blob case ("git grep" is a blob case, and so is "git show blob")
which can specify a "show" driver, which is like a "diff" driver but
understands textconv and cachetextconv options only.
Here, the default would be "--textconv" in any case, but unless you
specify a "show" attribute and driver there is no change in current
behavior.

The second case is a bit like clean/smudge, so, alternatively, one could
add a textconv and cachetextconv option to "filter" rather than
introducing "show". I'm not sure how much the textconv machinery needs
to change, though.

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-26 11:59                   ` Michael J Gruber
@ 2013-04-26 13:23                     ` Matthieu Moy
  2013-04-29  9:04                       ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Matthieu Moy @ 2013-04-26 13:23 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Junio C Hamano, git, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> BTW, textconv does not have to be slow - just use textconv-cache.

Right, thanks for reminding me about this, I had forgotten its existance ;-).

> I'm still looking for a way to at least treat "git grep" and "git show
> blob" the same way.

I agree they should be treated similarly.

> - Implement the --textconv option but leave the default as is. I did
> that for "git grep" already (just drop 7/7)

That seems sensible.

> but it seems to be cumbersome for "git show blob". I have to recheck.

It should be possible to have a tri-state for the --[no-]textconv
option: unset, set to true or set to false. But the code sharing between
log, show and diff might make that non-trivial.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-26 13:23                     ` Matthieu Moy
@ 2013-04-29  9:04                       ` Michael J Gruber
  2013-04-29 15:04                         ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-04-29  9:04 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Junio C Hamano, git, jeremy.rosen, Jeff King

Matthieu Moy venit, vidit, dixit 26.04.2013 15:23:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> BTW, textconv does not have to be slow - just use textconv-cache.
> 
> Right, thanks for reminding me about this, I had forgotten its existance ;-).
> 
>> I'm still looking for a way to at least treat "git grep" and "git show
>> blob" the same way.
> 
> I agree they should be treated similarly.
> 
>> - Implement the --textconv option but leave the default as is. I did
>> that for "git grep" already (just drop 7/7)
> 
> That seems sensible.
> 
>> but it seems to be cumbersome for "git show blob". I have to recheck.
> 
> It should be possible to have a tri-state for the --[no-]textconv
> option: unset, set to true or set to false. But the code sharing between
> log, show and diff might make that non-trivial.

Right now it's a diffopt bit...

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv2 7/7] git grep: honor textconv by default
  2013-04-29  9:04                       ` Michael J Gruber
@ 2013-04-29 15:04                         ` Junio C Hamano
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-04-29 15:04 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Matthieu Moy, git, jeremy.rosen, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

>> It should be possible to have a tri-state for the --[no-]textconv
>> option: unset, set to true or set to false. But the code sharing between
>> log, show and diff might make that non-trivial.
>
> Right now it's a diffopt bit...

I wonder if you can do something along the lines of the attached
patch.  The following discussion assumes that your default wants
textconv for generating patches, and no textconv for showing blobs,
which is the case your "it is a bit" becomes an issue.

The basic structure is that:

 * There is an extra "opt->touched_flags" that keeps track of all
   the fields that have been touched by DIFF_OPT_SET and
   DIFF_OPT_CLR;

 * You may continue setting the default values to the flags, like
   commands in the "log" family do in cmd_log_init_defaults(), but
   after you finished setting the defaults, you clear the
   touched_flags field;

 * And then you let the usual callchain to call diff_opt_parse(),
   allowing the opt->flags be set or unset, while keeping track of
   which bits the user touched;

 * There is an optional callback "opt->set_default" that is called
   at the very beginning to lets you inspect touched_flags and
   update opt->flags appropriately, before the remainder of the
   diffcore machinery is set up, taking the opt->flags value into
   account.

Your "git show" could start out with ALLOW_TEXTCONV set, but notice
explicit requests to --[no-]textconv from the command line in your
set_default() callback.  And then when it deals with a blob, check
if the user touched ALLOW_TEXTCONV and appropriately act on that
knowledge.

There would be three cases in your set_default callback:

 * flags has ALLOW_TEXTCONV set, and the bit was touched: the user
   explicitly said --textconv because she wants blobs to be mangled;

 * flags has ALLOW_TEXTCONV set, and the bit was not touched: the
   user did not say --textconv; do not mangle blobs;

 * flags has ALLOW_TEXTCONV unset; the user did not say --textconv,
   or explicitly said --no-textconv; do not mangle blobs.

The set_default callback can also be used to adjust defaults for
fields that are not handled by the DIFF_OPT_SET/CLR/TST, by the way.
You can remember the address of the default value you fed to a
string field before entering the callchain to diff_opt_parse(), and
in your set_default callback see if the value is still pointing at
the same piece of memory (in which case the user did not touch it).

 builtin/log.c | 1 +
 diff.c        | 3 +++
 diff.h        | 7 +++++--
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 6e56a50..c62ecd1 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -91,6 +91,7 @@ static void cmd_log_init_defaults(struct rev_info *rev)
 
 	if (default_date_mode)
 		rev->date_mode = parse_date_format(default_date_mode);
+	rev->diffopt.touched_flags = 0;
 }
 
 static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
diff --git a/diff.c b/diff.c
index f0b3e7c..7c24872 100644
--- a/diff.c
+++ b/diff.c
@@ -3213,6 +3213,9 @@ void diff_setup_done(struct diff_options *options)
 {
 	int count = 0;
 
+	if (options->set_default)
+		options->set_default(options);
+
 	if (options->output_format & DIFF_FORMAT_NAME)
 		count++;
 	if (options->output_format & DIFF_FORMAT_NAME_STATUS)
diff --git a/diff.h b/diff.h
index 78b4091..5c2f878 100644
--- a/diff.h
+++ b/diff.h
@@ -87,8 +87,8 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_PICKAXE_IGNORE_CASE (1 << 30)
 
 #define DIFF_OPT_TST(opts, flag)    ((opts)->flags & DIFF_OPT_##flag)
-#define DIFF_OPT_SET(opts, flag)    ((opts)->flags |= DIFF_OPT_##flag)
-#define DIFF_OPT_CLR(opts, flag)    ((opts)->flags &= ~DIFF_OPT_##flag)
+#define DIFF_OPT_SET(opts, flag)    (((opts)->flags |= DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag))
+#define DIFF_OPT_CLR(opts, flag)    (((opts)->flags &= ~DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag))
 #define DIFF_XDL_TST(opts, flag)    ((opts)->xdl_opts & XDF_##flag)
 #define DIFF_XDL_SET(opts, flag)    ((opts)->xdl_opts |= XDF_##flag)
 #define DIFF_XDL_CLR(opts, flag)    ((opts)->xdl_opts &= ~XDF_##flag)
@@ -109,6 +109,7 @@ struct diff_options {
 	const char *single_follow;
 	const char *a_prefix, *b_prefix;
 	unsigned flags;
+	unsigned touched_flags;
 	int use_color;
 	int context;
 	int interhunkcontext;
@@ -145,6 +146,8 @@ struct diff_options {
 	/* to support internal diff recursion by --follow hack*/
 	int found_follow;
 
+	void (*set_default)(struct diff_options *);
+
 	FILE *file;
 	int close_file;
 

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 0/7] textconv with grep and show
@ 2013-05-10 15:08                           ` Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber
                                               ` (6 more replies)
  0 siblings, 7 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:08 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

This is the "Git Merge edition" of the textconv series. Great
conference, the series struggles to match that.

v3 keeps all defaults as they are (no textconv for blobs by default) and
incorporates Junio's touched_flags patch. I do not need the callback but
left it in.

As for beeing able to choose textconv as a default for blobs, I'm
wondering how fine grained that needs to be: one for all, differentiate
between "show blob" and "grep", or even driver specific
(diff.driver.blobtextconv boolean).

In the latter case, it's not clear to me how "--textconv" with a "false"
config should behave.

Jeff King (1):
  grep: allow to use textconv filters

Junio C Hamano (1):
  diff_opt: track whether flags have been set explicitly

Michael J Gruber (5):
  t4030: demonstrate behavior of show with textconv
  show: honor --textconv for blobs
  cat-file: do not die on --textconv without textconv filters
  t7008: demonstrate behavior of grep with textconv
  grep: honor --textconv for the case rev:path

 Documentation/git-grep.txt           |   9 +++-
 Documentation/technical/api-diff.txt |  10 +++-
 builtin/cat-file.c                   |  18 +++----
 builtin/grep.c                       |  13 +++--
 builtin/log.c                        |  26 +++++++--
 diff.c                               |   3 ++
 diff.h                               |   8 ++-
 grep.c                               | 100 ++++++++++++++++++++++++++++++-----
 grep.h                               |   1 +
 object.c                             |  26 ++++++---
 object.h                             |   2 +
 t/t4030-diff-textconv.sh             |  24 +++++++++
 t/t7008-grep-binary.sh               |  31 +++++++++++
 t/t8007-cat-file-textconv.sh         |  20 ++-----
 14 files changed, 234 insertions(+), 57 deletions(-)

-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber
                                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

"git show <commit>" honors the --textconv option while "git show <blob>"
does not. Demonstrate this in the test.

Since the current behavior is supposed to stay as is, we expect the
default for "git show <blob>" to remain --no-textconv.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t4030-diff-textconv.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index 53ec330..3950fc9 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -58,6 +58,12 @@ test_expect_success 'diff produces text' '
 	test_cmp expect.text actual
 '
 
+test_expect_success 'show commit produces text' '
+	git show HEAD >diff &&
+	find_diff <diff >actual &&
+	test_cmp expect.text actual
+'
+
 test_expect_success 'diff-tree produces binary' '
 	git diff-tree -p HEAD^ HEAD >diff &&
 	find_diff <diff >actual &&
@@ -84,6 +90,24 @@ test_expect_success 'status -v produces text' '
 	git reset --soft HEAD@{1}
 '
 
+test_expect_success 'show blob produces binary' '
+	git show HEAD:file >actual &&
+	printf "\\0\\n\\01\\n" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'show --textconv blob produces text' '
+	git show --textconv HEAD:file >actual &&
+	printf "0\\n1\\n" >expect &&
+	test_cmp expect actual
+'
+
+test_success 'show --no-textconv blob produces binary' '
+	git show --textconv HEAD:file >actual &&
+	printf "\\0\\n\\01\\n" >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
 	echo one >expect &&
 	git log --root --format=%s -G0 >actual &&
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 15:31                               ` Eric Sunshine
  2013-05-10 15:10                             ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber
                                               ` (4 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

From: Junio C Hamano <gitster@pobox.com>

The diff_opt infrastructure sets flags based on defaults and command
line options. Currently, it is impossible to detect whether a flag has
been set as a default or on explicit request.

Amend the structure so that this detection is possible:

 * There is an extra "opt->touched_flags" that keeps track of all
   the fields that have been touched by DIFF_OPT_SET and
   DIFF_OPT_CLR;

 * You may continue setting the default values to the flags, like
   commands in the "log" family do in cmd_log_init_defaults(), but
   after you finished setting the defaults, you clear the
   touched_flags field;

 * And then you let the usual callchain call diff_opt_parse(),
   allowing the opt->flags be set or unset, while keeping track of
   which bits the user touched;

 * There is an optional callback "opt->set_default" that is called
   at the very beginning to lets you inspect touched_flags and
   update opt->flags appropriately, before the remainder of the
   diffcore machinery is set up, taking the opt->flags value into
   account.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 Documentation/technical/api-diff.txt | 10 +++++++++-
 builtin/log.c                        |  1 +
 diff.c                               |  3 +++
 diff.h                               |  8 ++++++--
 4 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/api-diff.txt b/Documentation/technical/api-diff.txt
index 2d2ebc0..8b001de 100644
--- a/Documentation/technical/api-diff.txt
+++ b/Documentation/technical/api-diff.txt
@@ -28,7 +28,8 @@ Calling sequence
 
 * Call `diff_setup_done()`; this inspects the options set up so far for
   internal consistency and make necessary tweaking to it (e.g. if
-  textual patch output was asked, recursive behaviour is turned on).
+  textual patch output was asked, recursive behaviour is turned on);
+  the callback set_default in diff_options can be used to tweak this more.
 
 * As you find different pairs of files, call `diff_change()` to feed
   modified files, `diff_addremove()` to feed created or deleted files,
@@ -115,6 +116,13 @@ Notable members are:
 	operation, but some do not have anything to do with the diffcore
 	library.
 
+`touched_flags`::
+	Records whether a flag has been changed due to user request
+	(rather than just set/unset by default).
+
+`set_default`::
+	Callback which allows tweaking the options in diff_setup_done().
+
 BINARY, TEXT;;
 	Affects the way how a file that is seemingly binary is treated.
 
diff --git a/builtin/log.c b/builtin/log.c
index 9e21232..f19d779 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -111,6 +111,7 @@ static void cmd_log_init_defaults(struct rev_info *rev)
 
 	if (default_date_mode)
 		rev->date_mode = parse_date_format(default_date_mode);
+	rev->diffopt.touched_flags = 0;
 }
 
 static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
diff --git a/diff.c b/diff.c
index f0b3e7c..7c24872 100644
--- a/diff.c
+++ b/diff.c
@@ -3213,6 +3213,9 @@ void diff_setup_done(struct diff_options *options)
 {
 	int count = 0;
 
+	if (options->set_default)
+		options->set_default(options);
+
 	if (options->output_format & DIFF_FORMAT_NAME)
 		count++;
 	if (options->output_format & DIFF_FORMAT_NAME_STATUS)
diff --git a/diff.h b/diff.h
index 78b4091..e995ae1 100644
--- a/diff.h
+++ b/diff.h
@@ -87,8 +87,9 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_PICKAXE_IGNORE_CASE (1 << 30)
 
 #define DIFF_OPT_TST(opts, flag)    ((opts)->flags & DIFF_OPT_##flag)
-#define DIFF_OPT_SET(opts, flag)    ((opts)->flags |= DIFF_OPT_##flag)
-#define DIFF_OPT_CLR(opts, flag)    ((opts)->flags &= ~DIFF_OPT_##flag)
+#define DIFF_OPT_TOUCHED(opts, flag)    ((opts)->touched_flags & DIFF_OPT_##flag)
+#define DIFF_OPT_SET(opts, flag)    (((opts)->flags |= DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag))
+#define DIFF_OPT_CLR(opts, flag)    (((opts)->flags &= ~DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag))
 #define DIFF_XDL_TST(opts, flag)    ((opts)->xdl_opts & XDF_##flag)
 #define DIFF_XDL_SET(opts, flag)    ((opts)->xdl_opts |= XDF_##flag)
 #define DIFF_XDL_CLR(opts, flag)    ((opts)->xdl_opts &= ~XDF_##flag)
@@ -109,6 +110,7 @@ struct diff_options {
 	const char *single_follow;
 	const char *a_prefix, *b_prefix;
 	unsigned flags;
+	unsigned touched_flags;
 	int use_color;
 	int context;
 	int interhunkcontext;
@@ -145,6 +147,8 @@ struct diff_options {
 	/* to support internal diff recursion by --follow hack*/
 	int found_follow;
 
+	void (*set_default)(struct diff_options *);
+
 	FILE *file;
 	int close_file;
 
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 17:02                               ` Junio C Hamano
  2013-05-10 15:10                             ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
                                               ` (3 subsequent siblings)
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

Currently, "diff" and "cat-file" for blobs honor "--textconv" options
(with the former defaulting to "--textconv" and the latter to
"--no-textconv") whereas "show" does not honor this option, even though
it takes diff options.

Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
default and "--no-textconv" when given.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/log.c            | 25 ++++++++++++++++++++++---
 t/t4030-diff-textconv.sh |  6 +++---
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index f19d779..dd3f108 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -437,10 +437,29 @@ static void show_tagger(char *buf, int len, struct rev_info *rev)
 	strbuf_release(&out);
 }
 
-static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
+static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
 {
+	unsigned char sha1c[20];
+	struct object_context obj_context;
+	char *buf;
+	unsigned long size;
+
 	fflush(stdout);
-	return stream_blob_to_fd(1, sha1, NULL, 0);
+	if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
+	    !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
+		die("Not a valid object name %s", obj_name);
+	if (!obj_context.path[0] ||
+	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
+		return stream_blob_to_fd(1, sha1, NULL, 0);
+
+	if (!buf)
+		die("git show %s: bad file", obj_name);
+
+	write_or_die(1, buf, size);
+	return 0;
 }
 
 static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
@@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 		const char *name = objects[i].name;
 		switch (o->type) {
 		case OBJ_BLOB:
-			ret = show_blob_object(o->sha1, NULL);
+			ret = show_blob_object(o->sha1, &rev, name);
 			break;
 		case OBJ_TAG: {
 			struct tag *t = (struct tag *)o;
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index 3950fc9..0ebb028 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'show --textconv blob produces text' '
+test_expect_success 'show --textconv blob produces text' '
 	git show --textconv HEAD:file >actual &&
 	printf "0\\n1\\n" >expect &&
 	test_cmp expect actual
 '
 
-test_success 'show --no-textconv blob produces binary' '
-	git show --textconv HEAD:file >actual &&
+test_expect_success 'show --no-textconv blob produces binary' '
+	git show --no-textconv HEAD:file >actual &&
 	printf "\\0\\n\\01\\n" >expect &&
 	test_cmp expect actual
 '
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
                                               ` (2 preceding siblings ...)
  2013-05-10 15:10                             ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
                                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

When a command is supposed to use textconv filters (by default or with
"--textconv") and none are configured then the blob is output without
conversion; the only exception to this rule is "cat-file --textconv".

Make it behave like the rest of textconv aware commands.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/cat-file.c           | 18 ++++++++----------
 t/t8007-cat-file-textconv.sh | 20 +++++---------------
 2 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 045cee7..bd62373 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	case 'e':
 		return !has_sha1_file(sha1);
 
+	case 'c':
+		if (!obj_context.path[0])
+			die("git cat-file --textconv %s: <object> must be <sha1:path>",
+			    obj_name);
+
+		if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
+			break;
+
 	case 'p':
 		type = sha1_object_info(sha1, NULL);
 		if (type < 0)
@@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 		/* otherwise just spit out the data */
 		break;
 
-	case 'c':
-		if (!obj_context.path[0])
-			die("git cat-file --textconv %s: <object> must be <sha1:path>",
-			    obj_name);
-
-		if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size))
-			die("git cat-file --textconv: unable to run textconv on %s",
-			    obj_name);
-		break;
-
 	case 0:
 		if (type_from_string(exp_type) == OBJ_BLOB) {
 			unsigned char blob_sha1[20];
diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh
index 78a0085..83c6636 100755
--- a/t/t8007-cat-file-textconv.sh
+++ b/t/t8007-cat-file-textconv.sh
@@ -22,11 +22,11 @@ test_expect_success 'setup ' '
 '
 
 cat >expected <<EOF
-fatal: git cat-file --textconv: unable to run textconv on :one.bin
+bin: test version 2
 EOF
 
 test_expect_success 'no filter specified' '
-	git cat-file --textconv :one.bin 2>result
+	git cat-file --textconv :one.bin >result &&
 	test_cmp expected result
 '
 
@@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' '
 	git config diff.test.cachetextconv false
 '
 
-cat >expected <<EOF
-bin: test version 2
-EOF
-
 test_expect_success 'cat-file without --textconv' '
 	git cat-file blob :one.bin >result &&
 	test_cmp expected result
@@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' '
 '
 
 test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' '
+	printf "%s" "one.bin" >expected &&
 	git cat-file blob :symlink.bin >result &&
-	printf "%s" "one.bin" >expected
 	test_cmp expected result
 '
 
 
 test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' '
-	! git cat-file --textconv :symlink.bin 2>result &&
-	cat >expected <<\EOF &&
-fatal: git cat-file --textconv: unable to run textconv on :symlink.bin
-EOF
+	git cat-file --textconv :symlink.bin >result &&
 	test_cmp expected result
 '
 
 test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' '
-	! git cat-file --textconv HEAD:symlink.bin 2>result &&
-	cat >expected <<EOF &&
-fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin
-EOF
+	git cat-file --textconv HEAD:symlink.bin >result &&
 	test_cmp expected result
 '
 
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
                                               ` (3 preceding siblings ...)
  2013-05-10 15:10                             ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

Currently, "git grep" does not honor any textconv filters, with nor
without --textconv. Demonstrate this in the tests.

The default is expected to remain unchanged.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 t/t7008-grep-binary.sh | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 26f8319..1c0946f 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -145,4 +145,35 @@ test_expect_success 'grep respects not-binary diff attribute' '
 	test_cmp expect actual
 '
 
+cat >nul_to_q_textconv <<'EOF'
+#!/bin/sh
+"$PERL_PATH" -pe 'y/\000/Q/' < "$1"
+EOF
+chmod +x nul_to_q_textconv
+
+test_expect_success 'setup textconv filters' '
+	echo a diff=foo >.gitattributes &&
+	git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv
+'
+
+test_expect_success 'grep does not honor textconv' '
+	test_must_fail git grep Qfile
+'
+
+test_expect_failure 'grep --textconv honors textconv' '
+	echo "a:binaryQfile" >expect &&
+	git grep --textconv Qfile >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-textconv does not honor textconv' '
+	test_must_fail git grep --no-textconv Qfile
+'
+
+test_expect_failure 'grep --textconv blob honors textconv' '
+	echo "HEAD:a:binaryQfile" >expect &&
+	git grep --textconv Qfile HEAD:a >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 6/7] grep: allow to use textconv filters
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
                                               ` (4 preceding siblings ...)
  2013-05-10 15:10                             ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 15:10                             ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber
  6 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

From: Jeff King <peff@peff.net>

Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:

ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)

"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping,
even when requested by --textconv.

Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 Documentation/git-grep.txt |   9 +++-
 builtin/grep.c             |   2 +
 grep.c                     | 100 ++++++++++++++++++++++++++++++++++++++-------
 grep.h                     |   1 +
 t/t7008-grep-binary.sh     |   6 ++-
 5 files changed, 102 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 50d46e1..a5c5a27 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -9,7 +9,7 @@ git-grep - Print lines matching a pattern
 SYNOPSIS
 --------
 [verse]
-'git grep' [-a | --text] [-I] [-i | --ignore-case] [-w | --word-regexp]
+'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp]
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
@@ -80,6 +80,13 @@ OPTIONS
 --text::
 	Process binary files as if they were text.
 
+--textconv::
+	Honor textconv filter settings.
+
+--no-textconv::
+	Do not honor textconv filter settings.
+	This is the default.
+
 -i::
 --ignore-case::
 	Ignore case differences between the patterns and the
diff --git a/builtin/grep.c b/builtin/grep.c
index 159e65d..00ee57d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT('I', NULL, &opt.binary,
 			N_("don't match patterns in binary files"),
 			GREP_BINARY_NOMATCH),
+		OPT_BOOL(0, "textconv", &opt.allow_textconv,
+			 N_("process binary files with textconv filters")),
 		{ OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"),
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
diff --git a/grep.c b/grep.c
index bb548ca..c668034 100644
--- a/grep.c
+++ b/grep.c
@@ -2,6 +2,8 @@
 #include "grep.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "diff.h"
+#include "diffcore.h"
 
 static int grep_source_load(struct grep_source *gs);
 static int grep_source_is_binary(struct grep_source *gs);
@@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
+static int fill_textconv_grep(struct userdiff_driver *driver,
+			      struct grep_source *gs)
+{
+	struct diff_filespec *df;
+	char *buf;
+	size_t size;
+
+	if (!driver || !driver->textconv)
+		return grep_source_load(gs);
+
+	/*
+	 * The textconv interface is intimately tied to diff_filespecs, so we
+	 * have to pretend to be one. If we could unify the grep_source
+	 * and diff_filespec structs, this mess could just go away.
+	 */
+	df = alloc_filespec(gs->path);
+	switch (gs->type) {
+	case GREP_SOURCE_SHA1:
+		fill_filespec(df, gs->identifier, 1, 0100644);
+		break;
+	case GREP_SOURCE_FILE:
+		fill_filespec(df, null_sha1, 0, 0100644);
+		break;
+	default:
+		die("BUG: attempt to textconv something without a path?");
+	}
+
+	/*
+	 * fill_textconv is not remotely thread-safe; it may load objects
+	 * behind the scenes, and it modifies the global diff tempfile
+	 * structure.
+	 */
+	grep_read_lock();
+	size = fill_textconv(driver, df, &buf);
+	grep_read_unlock();
+	free_filespec(df);
+
+	/*
+	 * The normal fill_textconv usage by the diff machinery would just keep
+	 * the textconv'd buf separate from the diff_filespec. But much of the
+	 * grep code passes around a grep_source and assumes that its "buf"
+	 * pointer is the beginning of the thing we are searching. So let's
+	 * install our textconv'd version into the grep_source, taking care not
+	 * to leak any existing buffer.
+	 */
+	grep_source_clear_data(gs);
+	gs->buf = buf;
+	gs->size = size;
+
+	return 0;
+}
+
 static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits)
 {
 	char *bol;
@@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	unsigned count = 0;
 	int try_lookahead = 0;
 	int show_function = 0;
+	struct userdiff_driver *textconv = NULL;
 	enum grep_context ctx = GREP_CONTEXT_HEAD;
 	xdemitconf_t xecfg;
 
@@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	}
 	opt->last_shown = 0;
 
-	switch (opt->binary) {
-	case GREP_BINARY_DEFAULT:
-		if (grep_source_is_binary(gs))
-			binary_match_only = 1;
-		break;
-	case GREP_BINARY_NOMATCH:
-		if (grep_source_is_binary(gs))
-			return 0; /* Assume unmatch */
-		break;
-	case GREP_BINARY_TEXT:
-		break;
-	default:
-		die("bug: unknown binary handling mode");
+	if (opt->allow_textconv) {
+		grep_source_load_driver(gs);
+		/*
+		 * We might set up the shared textconv cache data here, which
+		 * is not thread-safe.
+		 */
+		grep_attr_lock();
+		textconv = userdiff_get_textconv(gs->driver);
+		grep_attr_unlock();
+	}
+
+	/*
+	 * We know the result of a textconv is text, so we only have to care
+	 * about binary handling if we are not using it.
+	 */
+	if (!textconv) {
+		switch (opt->binary) {
+		case GREP_BINARY_DEFAULT:
+			if (grep_source_is_binary(gs))
+				binary_match_only = 1;
+			break;
+		case GREP_BINARY_NOMATCH:
+			if (grep_source_is_binary(gs))
+				return 0; /* Assume unmatch */
+			break;
+		case GREP_BINARY_TEXT:
+			break;
+		default:
+			die("bug: unknown binary handling mode");
+		}
 	}
 
 	memset(&xecfg, 0, sizeof(xecfg));
@@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	try_lookahead = should_lookahead(opt);
 
-	if (grep_source_load(gs) < 0)
+	if (fill_textconv_grep(textconv, gs) < 0)
 		return 0;
 
 	bol = gs->buf;
diff --git a/grep.h b/grep.h
index e4a1df5..eaaced1 100644
--- a/grep.h
+++ b/grep.h
@@ -107,6 +107,7 @@ struct grep_opt {
 #define GREP_BINARY_NOMATCH	1
 #define GREP_BINARY_TEXT	2
 	int binary;
+	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
 	int pcre;
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 1c0946f..a91260a 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -160,7 +160,7 @@ test_expect_success 'grep does not honor textconv' '
 	test_must_fail git grep Qfile
 '
 
-test_expect_failure 'grep --textconv honors textconv' '
+test_expect_success 'grep --textconv honors textconv' '
 	echo "a:binaryQfile" >expect &&
 	git grep --textconv Qfile >actual &&
 	test_cmp expect actual
@@ -176,4 +176,8 @@ test_expect_failure 'grep --textconv blob honors textconv' '
 	test_cmp expect actual
 '
 
+test_expect_success 'grep --no-textconv blob does not honor textconv' '
+	test_must_fail git grep --no-textconv Qfile HEAD:a
+'
+
 test_done
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCHv3 7/7] grep: honor --textconv for the case rev:path
  2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
                                               ` (5 preceding siblings ...)
  2013-05-10 15:10                             ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber
@ 2013-05-10 15:10                             ` Michael J Gruber
  2013-05-10 18:11                               ` Junio C Hamano
  6 siblings, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy

Make "grep" honor the "--textconv" option also for the object case, i.e.
when used with an argument "rev:path".

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 builtin/grep.c         | 11 ++++++-----
 object.c               | 26 ++++++++++++++++++++------
 object.h               |  2 ++
 t/t7008-grep-binary.sh |  6 +-----
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 00ee57d..bb7f970 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name)
+		       struct object *obj, const char *name, struct object_context *oc)
 {
 	if (obj->type == OBJ_BLOB)
-		return grep_sha1(opt, obj->sha1, name, 0, NULL);
+		return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL);
 	if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) {
 		struct tree_desc tree;
 		void *data;
@@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 		real_obj = deref_tag(list->objects[i].item, NULL, 0);
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
 		unsigned char sha1[20];
+		struct object_context oc;
 		/* Is it a rev? */
-		if (!get_sha1(arg, sha1)) {
+		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
 			struct object *object = parse_object_or_die(sha1, arg);
 			if (!seen_dashdash)
 				verify_non_filename(prefix, arg);
-			add_object_array(object, arg, &list);
+			add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context)));
 			continue;
 		}
 		if (!strcmp(arg, "--")) {
diff --git a/object.c b/object.c
index 88d0bec..264b6df 100644
--- a/object.c
+++ b/object.c
@@ -265,12 +265,7 @@ int object_list_contains(struct object_list *list, struct object *obj)
 	return 0;
 }
 
-void add_object_array(struct object *obj, const char *name, struct object_array *array)
-{
-	add_object_array_with_mode(obj, name, array, S_IFINVALID);
-}
-
-void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context)
 {
 	unsigned nr = array->nr;
 	unsigned alloc = array->alloc;
@@ -285,9 +280,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
 	objects[nr].item = obj;
 	objects[nr].name = name;
 	objects[nr].mode = mode;
+	objects[nr].context = context;
 	array->nr = ++nr;
 }
 
+void add_object_array(struct object *obj, const char *name, struct object_array *array)
+{
+	add_object_array_with_mode(obj, name, array, S_IFINVALID);
+}
+
+void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
+{
+	add_object_array_with_mode_context(obj, name, array, mode, NULL);
+}
+
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context)
+{
+	if (context)
+		add_object_array_with_mode_context(obj, name, array, context->mode, context);
+	else
+		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
+}
+
 void object_array_remove_duplicates(struct object_array *array)
 {
 	unsigned int ref, src, dst;
diff --git a/object.h b/object.h
index 97d384b..695847d 100644
--- a/object.h
+++ b/object.h
@@ -13,6 +13,7 @@ struct object_array {
 		struct object *item;
 		const char *name;
 		unsigned mode;
+		struct object_context *context;
 	} *objects;
 };
 
@@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj);
 /* Object array handling .. */
 void add_object_array(struct object *obj, const char *name, struct object_array *array);
 void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode);
+void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context);
 void object_array_remove_duplicates(struct object_array *);
 
 void clear_object_flags(unsigned flags);
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index a91260a..b146406 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -170,14 +170,10 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
 	test_must_fail git grep --no-textconv Qfile
 '
 
-test_expect_failure 'grep --textconv blob honors textconv' '
+test_expect_success 'grep --textconv blob honors textconv' '
 	echo "HEAD:a:binaryQfile" >expect &&
 	git grep --textconv Qfile HEAD:a >actual &&
 	test_cmp expect actual
 '
 
-test_expect_success 'grep --no-textconv blob does not honor textconv' '
-	test_must_fail git grep --no-textconv Qfile HEAD:a
-'
-
 test_done
-- 
1.8.3.rc1.406.gf4dce7e

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly
  2013-05-10 15:10                             ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber
@ 2013-05-10 15:31                               ` Eric Sunshine
  0 siblings, 0 replies; 78+ messages in thread
From: Eric Sunshine @ 2013-05-10 15:31 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Git List, Jeff King, Junio C Hamano, Matthieu Moy

On Fri, May 10, 2013 at 11:10 AM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> From: Junio C Hamano <gitster@pobox.com>
>
> The diff_opt infrastructure sets flags based on defaults and command
> line options. Currently, it is impossible to detect whether a flag has
> been set as a default or on explicit request.
>
> Amend the structure so that this detection is possible:
>
>  * There is an extra "opt->touched_flags" that keeps track of all
>    the fields that have been touched by DIFF_OPT_SET and
>    DIFF_OPT_CLR;
>
>  * You may continue setting the default values to the flags, like
>    commands in the "log" family do in cmd_log_init_defaults(), but
>    after you finished setting the defaults, you clear the
>    touched_flags field;
>
>  * And then you let the usual callchain call diff_opt_parse(),
>    allowing the opt->flags be set or unset, while keeping track of
>    which bits the user touched;
>
>  * There is an optional callback "opt->set_default" that is called
>    at the very beginning to lets you inspect touched_flags and

s/lets/let/

>    update opt->flags appropriately, before the remainder of the
>    diffcore machinery is set up, taking the opt->flags value into
>    account.
>
> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 15:10                             ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber
@ 2013-05-10 17:02                               ` Junio C Hamano
  2013-05-10 17:34                                 ` Jeff King
  2013-05-11  8:54                                 ` Michael J Gruber
  0 siblings, 2 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-05-10 17:02 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Currently, "diff" and "cat-file" for blobs honor "--textconv" options
> (with the former defaulting to "--textconv" and the latter to
> "--no-textconv") whereas "show" does not honor this option, even though
> it takes diff options.
>
> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
> default and "--no-textconv" when given.

Hmm...

> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
>  {
> +	unsigned char sha1c[20];
> +	struct object_context obj_context;
> +	char *buf;
> +	unsigned long size;
> +
>  	fflush(stdout);
> -	return stream_blob_to_fd(1, sha1, NULL, 0);
> +	if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
> +	    !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
> +		return stream_blob_to_fd(1, sha1, NULL, 0);

It is surprising that the necessary change is only this, but I think
it is correct ;-).  We ignore textconv when the command line did not
mention --[no-]textconv, or the command line said --no-textconv
explicitly.

This (especially the first condition) may deserve an in-code comment
for anybody who wonders where this default behaviour is implemented.

So "show" on blobs does show the raw contents by default, but the
user can explicitly ask to enable textconv with --[no-]textconv.  Is
the second paragraph in the log message still valid?

> +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
> +		die("Not a valid object name %s", obj_name);

This looks somewhat unfortunate.

We already have sha1[]; actually we not just know sha1[] but have
the struct object for it.  How did we obtain it before we got here?

Will we always have a valid name in rev.pending.objects->name?  Will
that name convert back to the same sha1 we got in sha1[]?

I think the answers are "Yes (it is a command line argument), Yes
(that is what setup_revisions() got by feeding the name to give us
sha1[])".

I wonder if enriching rev_info->pending with the context information
might be a clean solution to avoid this redundant but unavoidable
conversion, but that is a separate and future topic, I think.

> +	if (!obj_context.path[0] ||
> +	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
> +		return stream_blob_to_fd(1, sha1, NULL, 0);
> +
> +	if (!buf)
> +		die("git show %s: bad file", obj_name);
> +
> +	write_or_die(1, buf, size);
> +	return 0;
>  }
>  
>  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
> @@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>  		const char *name = objects[i].name;
>  		switch (o->type) {
>  		case OBJ_BLOB:
> -			ret = show_blob_object(o->sha1, NULL);
> +			ret = show_blob_object(o->sha1, &rev, name);
>  			break;
>  		case OBJ_TAG: {
>  			struct tag *t = (struct tag *)o;
> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
> index 3950fc9..0ebb028 100755
> --- a/t/t4030-diff-textconv.sh
> +++ b/t/t4030-diff-textconv.sh
> @@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' '
>  	test_cmp expect actual
>  '
>  
> -test_expect_failure 'show --textconv blob produces text' '
> +test_expect_success 'show --textconv blob produces text' '
>  	git show --textconv HEAD:file >actual &&
>  	printf "0\\n1\\n" >expect &&
>  	test_cmp expect actual
>  '
>  
> -test_success 'show --no-textconv blob produces binary' '
> -	git show --textconv HEAD:file >actual &&
> +test_expect_success 'show --no-textconv blob produces binary' '
> +	git show --no-textconv HEAD:file >actual &&
>  	printf "\\0\\n\\01\\n" >expect &&
>  	test_cmp expect actual
>  '

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 17:02                               ` Junio C Hamano
@ 2013-05-10 17:34                                 ` Jeff King
  2013-05-10 18:04                                   ` Junio C Hamano
  2013-05-11  8:54                                 ` Michael J Gruber
  1 sibling, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-05-10 17:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy

On Fri, May 10, 2013 at 10:02:51AM -0700, Junio C Hamano wrote:

> > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
> > default and "--no-textconv" when given.
> [...]
> So "show" on blobs does show the raw contents by default, but the
> user can explicitly ask to enable textconv with --[no-]textconv.  Is
> the second paragraph in the log message still valid?

Yes, I had the same thought...

> > +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
> > +		die("Not a valid object name %s", obj_name);
> 
> This looks somewhat unfortunate.
> [...]
> I wonder if enriching rev_info->pending with the context information
> might be a clean solution to avoid this redundant but unavoidable
> conversion, but that is a separate and future topic, I think.

It would be, and indeed, that is similar to what the final patch does.
The problem is that it requires an extra allocation (we do not want to
unconditionally put the object_context into the object_array because it
is too big, so we add only a pointer). So having rev_info->pending store
that information would mean that callers would have to know to free it
when freeing the pending array. We would have to either teach each
existing caller to do so, or perhaps enable the behavior only when a
certain flag is set (e.g., rev->keep_object_context or something).

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 17:34                                 ` Jeff King
@ 2013-05-10 18:04                                   ` Junio C Hamano
  2013-05-11  0:25                                     ` Jeff King
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-05-10 18:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael J Gruber, git, Matthieu Moy

Jeff King <peff@peff.net> writes:

> On Fri, May 10, 2013 at 10:02:51AM -0700, Junio C Hamano wrote:
>
>> > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
>> > default and "--no-textconv" when given.
>> [...]
>> So "show" on blobs does show the raw contents by default, but the
>> user can explicitly ask to enable textconv with --[no-]textconv.  Is
>> the second paragraph in the log message still valid?
>
> Yes, I had the same thought...

I'd rewrite the paragraph to something like:

    Make "show" on blobs honor "--textconv" when it is asked.  The default
    is not to apply textconv, which is in line with what "cat-file" does.

>> > +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
>> > +		die("Not a valid object name %s", obj_name);
>> 
>> This looks somewhat unfortunate.
>> [...]
>> I wonder if enriching rev_info->pending with the context information
>> might be a clean solution to avoid this redundant but unavoidable
>> conversion, but that is a separate and future topic, I think.
>
> It would be, and indeed, that is similar to what the final patch does.

OK, I wasn't paying attention ;-)

> The problem is that it requires an extra allocation (we do not want to
> unconditionally put the object_context into the object_array because it
> is too big, so we add only a pointer). So having rev_info->pending store
> that information would mean that callers would have to know to free it
> when freeing the pending array. We would have to either teach each
> existing caller to do so, or perhaps enable the behavior only when a
> certain flag is set (e.g., rev->keep_object_context or something).

One thing to notice is that those accessing rev->pending before
calling prepare_revision_walk(), as opposed to those receiving
objects in rev->commits via get_revision(), are the only ones that
care about the context and wants to act differently depending on
where these came from and how they were specified.

That suggests at least two possibilities to me:

 - Perhaps we can place the context in rev->pending and clear them
   when prepare_revision_walk() moves them to rev->commits, without
   introducing rev->keep_object_context?

 - Perhaps instead of extending object-array, we can move this kind
   of information to rev_cmdline and enrich that structure?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 7/7] grep: honor --textconv for the case rev:path
  2013-05-10 15:10                             ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber
@ 2013-05-10 18:11                               ` Junio C Hamano
  2013-05-10 18:31                                 ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-05-10 18:11 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy

Michael J Gruber <git@drmicha.warpmail.net> writes:

> diff --git a/object.h b/object.h
> index 97d384b..695847d 100644
> --- a/object.h
> +++ b/object.h
> @@ -13,6 +13,7 @@ struct object_array {
>  		struct object *item;
>  		const char *name;
>  		unsigned mode;
> +		struct object_context *context;
>  	} *objects;
>  };

fsck has to hold this for each and every objects in the repository
it has found but hasn't inspected (i.e. pending), doesn't it? Do we
really want to add 8 bytes for each of them?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 7/7] grep: honor --textconv for the case rev:path
  2013-05-10 18:11                               ` Junio C Hamano
@ 2013-05-10 18:31                                 ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-05-10 18:31 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy

Junio C Hamano <gitster@pobox.com> writes:

> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> diff --git a/object.h b/object.h
>> index 97d384b..695847d 100644
>> --- a/object.h
>> +++ b/object.h
>> @@ -13,6 +13,7 @@ struct object_array {
>>  		struct object *item;
>>  		const char *name;
>>  		unsigned mode;
>> +		struct object_context *context;
>>  	} *objects;
>>  };
>
> fsck has to hold this for each and every objects in the repository
> it has found but hasn't inspected (i.e. pending), doesn't it? Do we
> really want to add 8 bytes for each of them?

Perhaps fsck does not even want "name" and "mode" for that matter.

I wonder what improvement, if any, we would see with a change like
this patch in a large-ish repository.

 builtin/fsck.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index bb9a2cd..c1de2a9 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -73,7 +73,32 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
-static struct object_array pending;
+static struct pending_object {
+	unsigned int nr;
+	unsigned int alloc;
+	struct object **objects;
+} pending;
+
+static int max_pending;
+
+static void add_pending(struct object *object)
+{
+	unsigned nr = pending.nr;
+	unsigned alloc = pending.alloc;
+	struct object **objects = pending.objects;
+
+	if (nr >= alloc) {
+		alloc = (alloc + 32) * 2;
+		objects = xrealloc(objects, alloc * sizeof(*objects));
+		pending.alloc = alloc;
+		pending.objects = objects;
+	}
+	objects[nr] = object;
+	pending.nr = ++nr;
+
+	if (max_pending < nr)
+		max_pending = nr;
+}
 
 static int mark_object(struct object *obj, int type, void *data)
 {
@@ -112,7 +137,7 @@ static int mark_object(struct object *obj, int type, void *data)
 		return 1;
 	}
 
-	add_object_array(obj, (void *) parent, &pending);
+	add_pending(obj);
 	return 0;
 }
 
@@ -148,15 +173,15 @@ static int traverse_reachable(void)
 	if (show_progress)
 		progress = start_progress_delay("Checking connectivity", 0, 0, 2);
 	while (pending.nr) {
-		struct object_array_entry *entry;
-		struct object *obj;
+		struct object **entry, *obj;
 
 		entry = pending.objects + --pending.nr;
-		obj = entry->item;
+		obj = *entry;
 		result |= traverse_one_object(obj);
 		display_progress(progress, ++nr);
 	}
 	stop_progress(&progress);
+	fprintf(stderr, "max# pending objects = %d\n", max_pending);
 	return !!result;
 }
 

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 18:04                                   ` Junio C Hamano
@ 2013-05-11  0:25                                     ` Jeff King
  2013-05-11 19:54                                       ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-05-11  0:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy

On Fri, May 10, 2013 at 11:04:01AM -0700, Junio C Hamano wrote:

> One thing to notice is that those accessing rev->pending before
> calling prepare_revision_walk(), as opposed to those receiving
> objects in rev->commits via get_revision(), are the only ones that
> care about the context and wants to act differently depending on
> where these came from and how they were specified.
> 
> That suggests at least two possibilities to me:
> 
>  - Perhaps we can place the context in rev->pending and clear them
>    when prepare_revision_walk() moves them to rev->commits, without
>    introducing rev->keep_object_context?
> 
>  - Perhaps instead of extending object-array, we can move this kind
>    of information to rev_cmdline and enrich that structure?

Without looking too closely to see whether it is feasible, I would think
the latter would end up being much more elegant, since I think it
already deals with some allocation issues already.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-10 17:02                               ` Junio C Hamano
  2013-05-10 17:34                                 ` Jeff King
@ 2013-05-11  8:54                                 ` Michael J Gruber
  2013-05-11 10:00                                   ` Michael J Gruber
  2013-05-11 17:36                                   ` Junio C Hamano
  1 sibling, 2 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-11  8:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Matthieu Moy

Junio C Hamano venit, vidit, dixit 10.05.2013 19:02:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> Currently, "diff" and "cat-file" for blobs honor "--textconv" options
>> (with the former defaulting to "--textconv" and the latter to
>> "--no-textconv") whereas "show" does not honor this option, even though
>> it takes diff options.
>>
>> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by
>> default and "--no-textconv" when given.
> 
> Hmm...

Sorry, I overlooked that ;)

>> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name)
>>  {
>> +	unsigned char sha1c[20];
>> +	struct object_context obj_context;
>> +	char *buf;
>> +	unsigned long size;
>> +
>>  	fflush(stdout);
>> -	return stream_blob_to_fd(1, sha1, NULL, 0);
>> +	if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
>> +	    !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
> 
> It is surprising that the necessary change is only this, but I think
> it is correct ;-).  We ignore textconv when the command line did not
> mention --[no-]textconv, or the command line said --no-textconv
> explicitly.
> 
> This (especially the first condition) may deserve an in-code comment
> for anybody who wonders where this default behaviour is implemented.

It's not as if we would document behavior by in-code comments in
general, do we? The usual answer is "git log -S" or "git blame".

> So "show" on blobs does show the raw contents by default, but the
> user can explicitly ask to enable textconv with --[no-]textconv.  Is
> the second paragraph in the log message still valid?
> 
>> +	if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context))
>> +		die("Not a valid object name %s", obj_name);
> 
> This looks somewhat unfortunate.
> 
> We already have sha1[]; actually we not just know sha1[] but have
> the struct object for it.  How did we obtain it before we got here?
> 
> Will we always have a valid name in rev.pending.objects->name?  Will
> that name convert back to the same sha1 we got in sha1[]?
> 
> I think the answers are "Yes (it is a command line argument), Yes
> (that is what setup_revisions() got by feeding the name to give us
> sha1[])".
> 
> I wonder if enriching rev_info->pending with the context information
> might be a clean solution to avoid this redundant but unavoidable
> conversion, but that is a separate and future topic, I think.

Yes, I think both Jeff and I have thought it and came to the same
conclusion - "later" ;)

>> +	if (!obj_context.path[0] ||
>> +	    !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size))
>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
>> +
>> +	if (!buf)
>> +		die("git show %s: bad file", obj_name);
>> +
>> +	write_or_die(1, buf, size);
>> +	return 0;
>>  }
>>  
>>  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
>> @@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>>  		const char *name = objects[i].name;
>>  		switch (o->type) {
>>  		case OBJ_BLOB:
>> -			ret = show_blob_object(o->sha1, NULL);
>> +			ret = show_blob_object(o->sha1, &rev, name);
>>  			break;
>>  		case OBJ_TAG: {
>>  			struct tag *t = (struct tag *)o;
>> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
>> index 3950fc9..0ebb028 100755
>> --- a/t/t4030-diff-textconv.sh
>> +++ b/t/t4030-diff-textconv.sh
>> @@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' '
>>  	test_cmp expect actual
>>  '
>>  
>> -test_expect_failure 'show --textconv blob produces text' '
>> +test_expect_success 'show --textconv blob produces text' '
>>  	git show --textconv HEAD:file >actual &&
>>  	printf "0\\n1\\n" >expect &&
>>  	test_cmp expect actual
>>  '
>>  
>> -test_success 'show --no-textconv blob produces binary' '
>> -	git show --textconv HEAD:file >actual &&
>> +test_expect_success 'show --no-textconv blob produces binary' '
>> +	git show --no-textconv HEAD:file >actual &&
>>  	printf "\\0\\n\\01\\n" >expect &&
>>  	test_cmp expect actual
>>  '

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-11  8:54                                 ` Michael J Gruber
@ 2013-05-11 10:00                                   ` Michael J Gruber
  2013-05-13  5:01                                     ` Junio C Hamano
  2013-05-11 17:36                                   ` Junio C Hamano
  1 sibling, 1 reply; 78+ messages in thread
From: Michael J Gruber @ 2013-05-11 10:00 UTC (permalink / raw)
  Cc: Junio C Hamano, git, Jeff King, Matthieu Moy

Adding to that:

Somehow I still feel I should introduce a new attribute "show" (or a
better name) similar to "diff" so that you can specifiy a diff driver to
use for showing a blob (or grepping it), which may or may not be the
same you use for "diff". This would be a much more fine-grained and
systematic way of setting a default for "--textconv" for blobs.

Of course, some driver attributes would just not matter for coverting
blobs, but that doesn't hurt.

I'm just wondering whether it's worth the effort and whether I should
distinguish between "show" and grep".

So, the sructure would be:

"--textconv" is on by default for diff, show, grep.

diff looks for a textconv driver using the "diff" attribute.
show/grep look for a textconv driver using the "show" attribute.

That way, turning on "--textconv" by default does not affect anyone
unless a user specifies the new attribute!

Also, all commands would behave "the same way" if you have both a diff
and a show attribute set on the same files..

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-11  8:54                                 ` Michael J Gruber
  2013-05-11 10:00                                   ` Michael J Gruber
@ 2013-05-11 17:36                                   ` Junio C Hamano
  2013-05-12 12:13                                     ` Michael J Gruber
  1 sibling, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-05-11 17:36 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy

Michael J Gruber <git@drmicha.warpmail.net> writes:

>>> +	if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
>>> +	    !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
>>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
>> 
>> It is surprising that the necessary change is only this, but I think
>> it is correct ;-).  We ignore textconv when the command line did not
>> mention --[no-]textconv, or the command line said --no-textconv
>> explicitly.
>> 
>> This (especially the first condition) may deserve an in-code comment
>> for anybody who wonders where this default behaviour is implemented.
>
> It's not as if we would document behavior by in-code comments in
> general, do we? The usual answer is "git log -S" or "git blame".

The comment and the future reader I had in mind was more like

	Default to --no-textconv, even though cmd_log_init_defaults()
        sets the bit, when the user did not explicitly ask for it.

sought by somebody who wonders _where_ in the code we ignore
ALLOW_TEXTCONV that is set in cmd_log_init_defaults().

That is not something you can find with "log -S" or "blame", is it?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-11  0:25                                     ` Jeff King
@ 2013-05-11 19:54                                       ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-05-11 19:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael J Gruber, git, Matthieu Moy

Jeff King <peff@peff.net> writes:

> On Fri, May 10, 2013 at 11:04:01AM -0700, Junio C Hamano wrote:
>
>> One thing to notice is that those accessing rev->pending before
>> calling prepare_revision_walk(), as opposed to those receiving
>> objects in rev->commits via get_revision(), are the only ones that
>> care about the context and wants to act differently depending on
>> where these came from and how they were specified.
>> 
>> That suggests at least two possibilities to me:
>> 
>>  - Perhaps we can place the context in rev->pending and clear them
>>    when prepare_revision_walk() moves them to rev->commits, without
>>    introducing rev->keep_object_context?
>> 
>>  - Perhaps instead of extending object-array, we can move this kind
>>    of information to rev_cmdline and enrich that structure?
>
> Without looking too closely to see whether it is feasible, I would think
> the latter would end up being much more elegant, since I think it
> already deals with some allocation issues already.

Yeah. I am fairly reluctant to apply a change that makes entries in
object-array larger.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-11 17:36                                   ` Junio C Hamano
@ 2013-05-12 12:13                                     ` Michael J Gruber
  0 siblings, 0 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-12 12:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Matthieu Moy

Junio C Hamano venit, vidit, dixit 11.05.2013 19:36:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>>>> +	if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
>>>> +	    !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
>>>> +		return stream_blob_to_fd(1, sha1, NULL, 0);
>>>
>>> It is surprising that the necessary change is only this, but I think
>>> it is correct ;-).  We ignore textconv when the command line did not
>>> mention --[no-]textconv, or the command line said --no-textconv
>>> explicitly.
>>>
>>> This (especially the first condition) may deserve an in-code comment
>>> for anybody who wonders where this default behaviour is implemented.
>>
>> It's not as if we would document behavior by in-code comments in
>> general, do we? The usual answer is "git log -S" or "git blame".
> 
> The comment and the future reader I had in mind was more like
> 
> 	Default to --no-textconv, even though cmd_log_init_defaults()
>         sets the bit, when the user did not explicitly ask for it.
> 
> sought by somebody who wonders _where_ in the code we ignore
> ALLOW_TEXTCONV that is set in cmd_log_init_defaults().
> 
> That is not something you can find with "log -S" or "blame", is it?
> 

I'll refactor and restructure anyways. That will also get this whole
default discussion out of the way:

I'll try out the "show attribute" route as indicated. I'm not sure what
to do about the object_array/context discussion, though.

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-11 10:00                                   ` Michael J Gruber
@ 2013-05-13  5:01                                     ` Junio C Hamano
  2013-05-13 11:55                                       ` Jeff King
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2013-05-13  5:01 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Adding to that:
>
> Somehow I still feel I should introduce a new attribute "show" (or a
> better name) similar to "diff" so that you can specifiy a diff driver to
> use for showing a blob (or grepping it), which may or may not be the
> same you use for "diff". This would be a much more fine-grained and
> systematic way of setting a default for "--textconv" for blobs.
>
> Of course, some driver attributes would just not matter for coverting
> blobs, but that doesn't hurt.
>
> I'm just wondering whether it's worth the effort and whether I should
> distinguish between "show" and grep".

Haven't thought things through, but my gut feeling is that it is on
the other side of the line. We could of course add more features and
over-engineered mechanisms, and the implementation may end up to be
even modular and clean, but I cannot answer "Yes" with a confidence
to the question "Does such a fine grained control help the users?"
and cannot answer "If so in what way?" myself.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-13  5:01                                     ` Junio C Hamano
@ 2013-05-13 11:55                                       ` Jeff King
  2013-05-13 14:57                                         ` Michael J Gruber
  0 siblings, 1 reply; 78+ messages in thread
From: Jeff King @ 2013-05-13 11:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy

On Sun, May 12, 2013 at 10:01:38PM -0700, Junio C Hamano wrote:

> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
> > Adding to that:
> >
> > Somehow I still feel I should introduce a new attribute "show" (or a
> > better name) similar to "diff" so that you can specifiy a diff driver to
> > use for showing a blob (or grepping it), which may or may not be the
> > same you use for "diff". This would be a much more fine-grained and
> > systematic way of setting a default for "--textconv" for blobs.
> >
> > Of course, some driver attributes would just not matter for coverting
> > blobs, but that doesn't hurt.
> >
> > I'm just wondering whether it's worth the effort and whether I should
> > distinguish between "show" and grep".
> 
> Haven't thought things through, but my gut feeling is that it is on
> the other side of the line. We could of course add more features and
> over-engineered mechanisms, and the implementation may end up to be
> even modular and clean, but I cannot answer "Yes" with a confidence
> to the question "Does such a fine grained control help the users?"
> and cannot answer "If so in what way?" myself.

Yeah, I think the _most_ flexible thing is going to look something like:

  $ cat .gitattributes
  *.pdf diff=pdf show=pdf

  $ cat ~/.gitconfig
  [diff "pdf"]
          textconv = ...
  [show "pdf"]
          textconv = ...

But that obviously sucks, because in the common case that you want to
use the same command, you are repeating yourself in the config. You
could assume that the "show" attribute points us at a "diff" block. And
that makes sense for textconv, but what does it mean if you have
"show=foo" and "diff.foo.command" set?

If the _only_ thing you would want to do with such a "show" mechanism is
to display converted contents on show/grep, then we could lose the
flexibility and say that "show" is a single-bit: do we respect diff
textconv for show/grep in this case, or not? And that leaves only the
question of where to put it: is it a gitattribute, or does it go in the
config?

I don't think that it is a property of the file itself. That is, you do
not say "foo files are inherently uninteresting to git-show, and
therefore we always convert them, whereas bar files do not have that
property'. You say "in my workflows, I expect to see converted results
from grep/show". And the latter points to using config, like either
"diff.*.showConverted" (to allow per-type setting), or even
"grep.useTextconv" and "show.textConv" (to allow setting it per-user for
all types).

And of course for any workflow-oriented config, you will sometimes want
to override it for a particular operation. But that is why we have a
command-line escape hatch, and that part is already implemented.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-13 11:55                                       ` Jeff King
@ 2013-05-13 14:57                                         ` Michael J Gruber
  2013-05-13 15:41                                           ` Junio C Hamano
  2013-05-16  3:31                                           ` Jeff King
  0 siblings, 2 replies; 78+ messages in thread
From: Michael J Gruber @ 2013-05-13 14:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Matthieu Moy

Jeff King venit, vidit, dixit 13.05.2013 13:55:
> On Sun, May 12, 2013 at 10:01:38PM -0700, Junio C Hamano wrote:
> 
>> Michael J Gruber <git@drmicha.warpmail.net> writes:
>>
>>> Adding to that:
>>>
>>> Somehow I still feel I should introduce a new attribute "show" (or a
>>> better name) similar to "diff" so that you can specifiy a diff driver to
>>> use for showing a blob (or grepping it), which may or may not be the
>>> same you use for "diff". This would be a much more fine-grained and
>>> systematic way of setting a default for "--textconv" for blobs.
>>>
>>> Of course, some driver attributes would just not matter for coverting
>>> blobs, but that doesn't hurt.
>>>
>>> I'm just wondering whether it's worth the effort and whether I should
>>> distinguish between "show" and grep".
>>
>> Haven't thought things through, but my gut feeling is that it is on
>> the other side of the line. We could of course add more features and
>> over-engineered mechanisms, and the implementation may end up to be
>> even modular and clean, but I cannot answer "Yes" with a confidence
>> to the question "Does such a fine grained control help the users?"
>> and cannot answer "If so in what way?" myself.
> 
> Yeah, I think the _most_ flexible thing is going to look something like:
> 
>   $ cat .gitattributes
>   *.pdf diff=pdf show=pdf
> 
>   $ cat ~/.gitconfig
>   [diff "pdf"]
>           textconv = ...
>   [show "pdf"]
>           textconv = ...
> 
> But that obviously sucks, because in the common case that you want to
> use the same command, you are repeating yourself in the config. You
> could assume that the "show" attribute points us at a "diff" block. And
> that makes sense for textconv, but what does it mean if you have
> "show=foo" and "diff.foo.command" set?

I don't propose "show drivers". In your example above, you would point
to the same diff driver.

If you use a diff driver just with the "show" attribute then only its
textconv config will be relevant.

But you do have the possibility to use different drivers for diff and
show. For example, for showing a file some sort of automatic pagination
or line numbering can be helpful whereas it would hurt the diff case.

> If the _only_ thing you would want to do with such a "show" mechanism is
> to display converted contents on show/grep, then we could lose the
> flexibility and say that "show" is a single-bit: do we respect diff
> textconv for show/grep in this case, or not? And that leaves only the
> question of where to put it: is it a gitattribute, or does it go in the
> config?
> 
> I don't think that it is a property of the file itself. That is, you do
> not say "foo files are inherently uninteresting to git-show, and
> therefore we always convert them, whereas bar files do not have that
> property'. You say "in my workflows, I expect to see converted results
> from grep/show". And the latter points to using config, like either
> "diff.*.showConverted" (to allow per-type setting), or even
> "grep.useTextconv" and "show.textConv" (to allow setting it per-user for
> all types).

I strongly disagree here. I have textconv filters for pdf, gpg, odf,
xls, doc, xoj... I know, ugly. At least some of them would benefit from
different filteres or different settings.

The way I propose it, a user would just have to add "show=foo" to the
"diff=foo" lines without having to ad an extra filter, but with the
flexibility to do so.

> And of course for any workflow-oriented config, you will sometimes want
> to override it for a particular operation. But that is why we have a
> command-line escape hatch, and that part is already implemented.

One may ask what a purely ui output oriented setting like "show" has to
do in .gitattributes, of course, but that applies to "diff" as well.
Separating the two (one in attributes, one in config) looks artificial
to me.

Michael

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-13 14:57                                         ` Michael J Gruber
@ 2013-05-13 15:41                                           ` Junio C Hamano
  2013-05-16  3:31                                           ` Jeff King
  1 sibling, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2013-05-13 15:41 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Jeff King, git, Matthieu Moy

Michael J Gruber <git@drmicha.warpmail.net> writes:

> But you do have the possibility to use different drivers for diff and
> show. For example, for showing a file some sort of automatic pagination
> or line numbering can be helpful whereas it would hurt the diff case.

I do not find the example convincing (yet); it looks more like you
are grasping for straws.

You would certainly do not want "line numbering" in grep.  My gut
feeling is that normal users would expect to have a single "text
version" and pass that to "pr" (if they want pagination) or "cat -n"
(if they want line numbering), regardless of where it comes from, be
it "git show --textconv" or some other program output, but you seem
to want to have different "text version"s for different purposes out
of a single binary file....

> I strongly disagree here. I have textconv filters for pdf, gpg, odf,
> xls, doc, xoj... I know, ugly. At least some of them would benefit from
> different filteres or different settings.

.... and an example to show why it is useful would help here.  I do
not feel that I have seen anything to substantiate "at least some of
them would benefit" yet.

Would it follow that "grep" and "cat-file" should be controlled by
yet two other knobs so that optionally the user can use different
"text version"s meant for them?

> The way I propose it, a user would just have to add "show=foo" to the
> "diff=foo" lines without having to ad an extra filter, but with the
> flexibility to do so.
>
>> And of course for any workflow-oriented config, you will sometimes want
>> to override it for a particular operation. But that is why we have a
>> command-line escape hatch, and that part is already implemented.
>
> One may ask what a purely ui output oriented setting like "show" has to
> do in .gitattributes, of course, but that applies to "diff" as well.
> Separating the two (one in attributes, one in config) looks artificial
> to me.

I am not sure what you mean by "artificial", but the separation of
the roles between attribute and config is not artificial at all. It
is very much deliberate and done for a good reason.

The attribute specifies what the type of the file is project wide
and is meant to go in in-tree .gitattrbute file, shared among people
on different platforms.  It says things like "These files are PDF".

The config specifies what should happen to the type of a file on a
particular platform each user uses to work in the copy of the
project, i.e. repository.  It says things like "Pass PDF files
through /opt/bin/pdf2txt", which obviously cannot be shared across
platforms.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCHv3 3/7] show: honor --textconv for blobs
  2013-05-13 14:57                                         ` Michael J Gruber
  2013-05-13 15:41                                           ` Junio C Hamano
@ 2013-05-16  3:31                                           ` Jeff King
  1 sibling, 0 replies; 78+ messages in thread
From: Jeff King @ 2013-05-16  3:31 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Junio C Hamano, git, Matthieu Moy

On Mon, May 13, 2013 at 04:57:55PM +0200, Michael J Gruber wrote:

> > I don't think that it is a property of the file itself. That is, you do
> > not say "foo files are inherently uninteresting to git-show, and
> > therefore we always convert them, whereas bar files do not have that
> > property'. You say "in my workflows, I expect to see converted results
> > from grep/show". And the latter points to using config, like either
> > "diff.*.showConverted" (to allow per-type setting), or even
> > "grep.useTextconv" and "show.textConv" (to allow setting it per-user for
> > all types).
> 
> I strongly disagree here. I have textconv filters for pdf, gpg, odf,
> xls, doc, xoj... I know, ugly. At least some of them would benefit from
> different filteres or different settings.

OK. I was speaking mostly from intuition, and I suspect you have more
real-world experience here. So I am willing to admit that my "you do not
say..." above was a strawman. :)

> The way I propose it, a user would just have to add "show=foo" to the
> "diff=foo" lines without having to ad an extra filter, but with the
> flexibility to do so.

Yes, I think that would work OK. The only problem is that it is a bit
weird to pointing "show=foo" to "diff.foo.*", especially when most of
the driver options are ignored. But if we can accept that wrinkle in the
UI, I think it would otherwise do what users want.

> One may ask what a purely ui output oriented setting like "show" has to
> do in .gitattributes, of course, but that applies to "diff" as well.
> Separating the two (one in attributes, one in config) looks artificial
> to me.

I think the point is that the attribute says "a property of this path is
that it has type X". And then the config says "when you see type X, do
this thing with it".

So arguably "diff=X" is wrong in the first place. It should be "type=X",
and we should have "diff.X", "merge.X", etc in the config. And
diff.*.textconv is potentially misplaced; it is not really about diffing
at all, but rather about creating a human-readable presentation for the
file. I don't think it is so bad that it is worth the pain of fixing it
now, though. It is a historical weirdness that "diff=X" means "present
the path according to the rules in X", but we can live with that.

But if we think of it that way, then automatically respecting textconv
for "git show" is a sensible thing to do. Hmph. Now I may have convinced
myself that flipping the default is the right thing. :)

So if it is not clear, I am pretty on the fence about how the defaults
should be handled, or what would surprise users the least. Either way,
though, it would probably make sense to have a configurable option. And
with the reasoning above for the split between attributes/config, it
would make sense to me for that option to be a boolean
"diff.X.showtextconv". Which seems totally odd and broken (we are not
doing a diff at all!), but that is where the textconv config lives, for
historical reasons.

-Peff

^ permalink raw reply	[flat|nested] 78+ messages in thread

end of thread, other threads:[~2013-05-16  3:31 UTC | newest]

Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber
2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber
2013-04-20  4:04   ` Jeff King
2013-04-20 13:35     ` Michael J Gruber
2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber
2013-04-20  4:06   ` Jeff King
2013-04-20 13:38     ` Michael J Gruber
2013-04-21  3:37       ` Jeff King
2013-04-22 10:29         ` Michael J Gruber
2013-04-22 15:25         ` Junio C Hamano
2013-04-22 15:29           ` Jeff King
2013-04-22 15:37             ` Jeremy Rosen
2013-04-22 15:54               ` Matthieu Moy
2013-04-23  8:58                 ` Michael J Gruber
2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber
2013-04-19 18:15   ` Junio C Hamano
2013-04-20  4:17   ` Jeff King
2013-04-20 14:27     ` Michael J Gruber
2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber
2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber
2013-04-20  4:31   ` Jeff King
2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber
2013-04-20  4:24   ` Jeff King
2013-04-20 14:42     ` Michael J Gruber
2013-04-21  3:41       ` Jeff King
2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano
2013-04-20  4:26   ` Jeff King
2013-04-20 13:32   ` Michael J Gruber
2013-04-23 12:11     ` [PATCHv2 0/7] " Michael J Gruber
2013-04-23 12:11       ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber
2013-04-23 15:11         ` Junio C Hamano
2013-04-23 12:11       ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber
2013-04-23 15:14         ` Junio C Hamano
2013-04-24 10:09           ` Michael J Gruber
2013-04-24 17:30             ` Junio C Hamano
2013-04-23 12:11       ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
2013-04-23 15:15         ` Junio C Hamano
2013-04-23 12:11       ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
2013-04-23 15:16         ` Junio C Hamano
2013-04-24 10:09           ` Michael J Gruber
2013-04-24 17:29             ` Junio C Hamano
2013-04-23 12:11       ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber
2013-04-23 12:11       ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber
2013-04-23 12:11       ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber
2013-04-23 15:20         ` Junio C Hamano
2013-04-24 10:05           ` Michael J Gruber
2013-04-24 17:35             ` Junio C Hamano
2013-04-24 17:57               ` Matthieu Moy
2013-04-24 18:55                 ` Junio C Hamano
2013-04-26 11:59                   ` Michael J Gruber
2013-04-26 13:23                     ` Matthieu Moy
2013-04-29  9:04                       ` Michael J Gruber
2013-04-29 15:04                         ` Junio C Hamano
2013-05-10 15:08                           ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber
2013-05-10 15:31                               ` Eric Sunshine
2013-05-10 15:10                             ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber
2013-05-10 17:02                               ` Junio C Hamano
2013-05-10 17:34                                 ` Jeff King
2013-05-10 18:04                                   ` Junio C Hamano
2013-05-11  0:25                                     ` Jeff King
2013-05-11 19:54                                       ` Junio C Hamano
2013-05-11  8:54                                 ` Michael J Gruber
2013-05-11 10:00                                   ` Michael J Gruber
2013-05-13  5:01                                     ` Junio C Hamano
2013-05-13 11:55                                       ` Jeff King
2013-05-13 14:57                                         ` Michael J Gruber
2013-05-13 15:41                                           ` Junio C Hamano
2013-05-16  3:31                                           ` Jeff King
2013-05-11 17:36                                   ` Junio C Hamano
2013-05-12 12:13                                     ` Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber
2013-05-10 15:10                             ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber
2013-05-10 18:11                               ` Junio C Hamano
2013-05-10 18:31                                 ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.