* [PATCH 0/6] grep with textconv @ 2013-04-19 16:44 Michael J Gruber 2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber ` (6 more replies) 0 siblings, 7 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git This series teaches show and grep to obey textconv: show by default (like diff), grep only on request (--textconv). We might switch the default for the latter also, of course. I'd actually like that. Compared to an earlier (historic) series this one comes with tests. Besides, it has been in use since. Jeff King (1): grep: allow to use textconv filters Michael J Gruber (5): t4030: demonstrate behavior of show with textconv show: obey --textconv for blobs cat-file: do not die on --textconv without textconv filters t7008: demonstrate behavior of grep with textconv grep: obey --textconv for the case rev:path builtin/cat-file.c | 9 ++-- builtin/grep.c | 13 +++--- builtin/log.c | 24 +++++++++-- grep.c | 100 +++++++++++++++++++++++++++++++++++++------ grep.h | 1 + object.c | 26 ++++++++--- object.h | 2 + t/t4030-diff-textconv.sh | 18 ++++++++ t/t7008-grep-binary.sh | 39 +++++++++++++++++ t/t8007-cat-file-textconv.sh | 20 +++------ 10 files changed, 205 insertions(+), 47 deletions(-) -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCH 1/6] t4030: demonstrate behavior of show with textconv 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-20 4:04 ` Jeff King 2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber ` (5 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git "git show <commit>" obeys the textconc setting while "git show <blob>" does not. Demonstrate this in the test. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t4030-diff-textconv.sh | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index 53ec330..f314ced 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' test_cmp expect.text actual ' +test_expect_success 'show commit produces text' ' + git show HEAD >diff && + find_diff <diff >actual && + test_cmp expect.text actual +' + test_expect_success 'diff-tree produces binary' ' git diff-tree -p HEAD^ HEAD >diff && find_diff <diff >actual && @@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' ' git reset --soft HEAD@{1} ' +test_expect_success 'show blob produces binary' ' + git show HEAD:file >actual && + printf "\\0\\n\\1\\n" >expect && + test_cmp expect actual +' + test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' echo one >expect && git log --root --format=%s -G0 >actual && -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCH 1/6] t4030: demonstrate behavior of show with textconv 2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber @ 2013-04-20 4:04 ` Jeff King 2013-04-20 13:35 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-04-20 4:04 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Fri, Apr 19, 2013 at 06:44:44PM +0200, Michael J Gruber wrote: > "git show <commit>" obeys the textconc setting while "git show <blob>" > does not. Demonstrate this in the test. s/textconc/textconv > diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh > index 53ec330..f314ced 100755 > --- a/t/t4030-diff-textconv.sh > +++ b/t/t4030-diff-textconv.sh > @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' > test_cmp expect.text actual > ' > > +test_expect_success 'show commit produces text' ' > + git show HEAD >diff && > + find_diff <diff >actual && > + test_cmp expect.text actual > +' Makes sense. > +test_expect_success 'show blob produces binary' ' > + git show HEAD:file >actual && > + printf "\\0\\n\\1\\n" >expect && > + test_cmp expect actual > +' I think this is probably the right thing. I can see instances where one would want the converted contents, but we have "cat-file --textconv" for that. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 1/6] t4030: demonstrate behavior of show with textconv 2013-04-20 4:04 ` Jeff King @ 2013-04-20 13:35 ` Michael J Gruber 0 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-20 13:35 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King venit, vidit, dixit 20.04.2013 06:04: > On Fri, Apr 19, 2013 at 06:44:44PM +0200, Michael J Gruber wrote: > >> "git show <commit>" obeys the textconc setting while "git show <blob>" >> does not. Demonstrate this in the test. > > s/textconc/textconv Thanks, plus s/obey/honor/ >> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh >> index 53ec330..f314ced 100755 >> --- a/t/t4030-diff-textconv.sh >> +++ b/t/t4030-diff-textconv.sh >> @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' >> test_cmp expect.text actual >> ' >> >> +test_expect_success 'show commit produces text' ' >> + git show HEAD >diff && >> + find_diff <diff >actual && >> + test_cmp expect.text actual >> +' > > Makes sense. > >> +test_expect_success 'show blob produces binary' ' >> + git show HEAD:file >actual && >> + printf "\\0\\n\\1\\n" >expect && >> + test_cmp expect actual >> +' > > I think this is probably the right thing. I can see instances where one > would want the converted contents, but we have "cat-file --textconv" for > that. > By that you mean that this behavior is to stay as is? My reasoning is twofold: - consistency between "git show commit" and "git show blob" - "git show" is a user facing command, and as such should produce output consumable by humans; whereas "git cat-file" is plumbing and should produce raw data unless told otherwise (-p, --textconv). Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCH 2/6] show: obey --textconv for blobs 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber 2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-20 4:06 ` Jeff King 2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber ` (4 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git Currently, "diff" and "cat-file" for blobs obey "--textconv" options (with the former defaulting to "--textconv" and the latter to "--no-textconv") whereas "show" does not obey this option, even though it takes diff options. Make "show" on blobs behave like "diff", i.e. obey "--textconv" by default and "--no-textconv" when given. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/log.c | 24 +++++++++++++++++++++--- t/t4030-diff-textconv.sh | 8 +++++++- 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 5f3ed77..fe0275e 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(&out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) { + unsigned char sha1c[20]; + struct object_context obj_context; + char *buf; + unsigned long size; + fflush(stdout); - return stream_blob_to_fd(1, sha1, NULL, 0); + if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) + die("Not a valid object name %s", obj_name); + if (!obj_context.path[0] || + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (!buf) + die("git show %s: bad file", obj_name); + + write_or_die(1, buf, size); + return 0; } static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) const char *name = objects[i].name; switch (o->type) { case OBJ_BLOB: - ret = show_blob_object(o->sha1, NULL); + ret = show_blob_object(o->sha1, &rev, name); break; case OBJ_TAG: { struct tag *t = (struct tag *)o; diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index f314ced..f9d55e1 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -90,8 +90,14 @@ test_expect_success 'status -v produces text' ' git reset --soft HEAD@{1} ' -test_expect_success 'show blob produces binary' ' +test_expect_success 'show blob produces text' ' git show HEAD:file >actual && + printf "0\\n1\\n" >expect && + test_cmp expect actual +' + +test_expect_success 'show --no-textconv blob produces binary' ' + git show --no-textconv HEAD:file >actual && printf "\\0\\n\\1\\n" >expect && test_cmp expect actual ' -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber @ 2013-04-20 4:06 ` Jeff King 2013-04-20 13:38 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-04-20 4:06 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Fri, Apr 19, 2013 at 06:44:45PM +0200, Michael J Gruber wrote: > Currently, "diff" and "cat-file" for blobs obey "--textconv" options > (with the former defaulting to "--textconv" and the latter to > "--no-textconv") whereas "show" does not obey this option, even though > it takes diff options. > > Make "show" on blobs behave like "diff", i.e. obey "--textconv" by > default and "--no-textconv" when given. Wait, this does the opposite of the last patch. If we do want to do this, shouldn't the last one have been an "expect_failure"? I'm not convinced this is the right thing to do, though. It would break: git show HEAD:file.c >file.c Admittedly, such people should be using "checkout" or "cat-file", so I do not mind too much breaking them if there is a good reason. But I am not sure what that reason is. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-20 4:06 ` Jeff King @ 2013-04-20 13:38 ` Michael J Gruber 2013-04-21 3:37 ` Jeff King 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-20 13:38 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King venit, vidit, dixit 20.04.2013 06:06: > On Fri, Apr 19, 2013 at 06:44:45PM +0200, Michael J Gruber wrote: > >> Currently, "diff" and "cat-file" for blobs obey "--textconv" options >> (with the former defaulting to "--textconv" and the latter to >> "--no-textconv") whereas "show" does not obey this option, even though >> it takes diff options. >> >> Make "show" on blobs behave like "diff", i.e. obey "--textconv" by >> default and "--no-textconv" when given. > > Wait, this does the opposite of the last patch. If we do want to do > this, shouldn't the last one have been an "expect_failure"? The last patch just documents the status quo, which is not a bug per se. Therefore, no failure, but change in the definition of "success". > I'm not convinced this is the right thing to do, though. It would break: > > git show HEAD:file.c >file.c > > Admittedly, such people should be using "checkout" or "cat-file", so I > do not mind too much breaking them if there is a good reason. But I am > not sure what that reason is. My reasoning is twofold: - consistency between "git show commit" and "git show blob" - "git show" is a user facing command, and as such should produce output consumable by humans; whereas "git cat-file" is plumbing and should produce raw data unless told otherwise (-p, --textconv). (Sorry for the repeat.) Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-20 13:38 ` Michael J Gruber @ 2013-04-21 3:37 ` Jeff King 2013-04-22 10:29 ` Michael J Gruber 2013-04-22 15:25 ` Junio C Hamano 0 siblings, 2 replies; 78+ messages in thread From: Jeff King @ 2013-04-21 3:37 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Junio C Hamano On Sat, Apr 20, 2013 at 03:38:53PM +0200, Michael J Gruber wrote: > > Wait, this does the opposite of the last patch. If we do want to do > > this, shouldn't the last one have been an "expect_failure"? > > The last patch just documents the status quo, which is not a bug per se. > Therefore, no failure, but change in the definition of "success". IMHO, the series is easier to review if you it does not go back and forth. If you have one patch that says "X is the right behavior", and then another patch that flips it to say "Y is the right behavior", the reviewer would read each in sequence and want to be convinced by your arguments for X and Y. But you probably cannot make a good argument for X if you are trying to end up at Y. :) So I'd much rather see the test introduced with the desired end behavior, marked as expect_failure, and the commit message contain an argument about why Y is a good thing (and squashing the tests in with the actual fix is often even better, because the fix itself would want to contain the same argument). Just my two cents as a reviewer. > My reasoning is twofold: > > - consistency between "git show commit" and "git show blob" I'm not sure I agree with this line of reasoning. "git show commit" is showing a diff, not the file contents; textconv has always been about munging the contents to produce a textual diff. It may be reasonable to extend its definition to "this is the preferred human view of this content, and that happens to be what you would want to produce a diff". But I do not think it is necessarily inconsistent not to apply it for the blob case. > - "git show" is a user facing command, and as such should produce output > consumable by humans; whereas "git cat-file" is plumbing and should > produce raw data unless told otherwise (-p, --textconv). That holds if the textconv is the only (or best) human-readable version of the file. And maybe that is reasonable. But is it possible that somebody uses "textconv" to produce a better diff of some already human-readable format? For example, imagine I define a textconv for XML files that normalizes the formatting to make diffs less noisy. When I am not looking at a diff, what is the best human-readable version? The original, or the normalized one? I'm not sure. Note that I'm somewhat playing devil's advocate here. For the cases where I have used textconv in the real world, I think I would probably prefer seeing the converted contents, and I am happy to call it user error if I use "git show HEAD:foo.jpg >bar.jpg" accidentally. But I also want to make sure we are not regressing somebody else unnecessarily. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-21 3:37 ` Jeff King @ 2013-04-22 10:29 ` Michael J Gruber 2013-04-22 15:25 ` Junio C Hamano 1 sibling, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-22 10:29 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King venit, vidit, dixit 21.04.2013 05:37: > On Sat, Apr 20, 2013 at 03:38:53PM +0200, Michael J Gruber wrote: > >>> Wait, this does the opposite of the last patch. If we do want to do >>> this, shouldn't the last one have been an "expect_failure"? >> >> The last patch just documents the status quo, which is not a bug per se. >> Therefore, no failure, but change in the definition of "success". > > IMHO, the series is easier to review if you it does not go back and > forth. If you have one patch that says "X is the right behavior", and > then another patch that flips it to say "Y is the right behavior", the > reviewer would read each in sequence and want to be convinced by your > arguments for X and Y. But you probably cannot make a good argument for > X if you are trying to end up at Y. :) > > So I'd much rather see the test introduced with the desired end > behavior, marked as expect_failure, and the commit message contain an > argument about why Y is a good thing (and squashing the tests in with > the actual fix is often even better, because the fix itself would want > to contain the same argument). > > Just my two cents as a reviewer. > >> My reasoning is twofold: >> >> - consistency between "git show commit" and "git show blob" > > I'm not sure I agree with this line of reasoning. "git show commit" is > showing a diff, not the file contents; textconv has always been about > munging the contents to produce a textual diff. It may be reasonable to > extend its definition to "this is the preferred human view of this > content, and that happens to be what you would want to produce a diff". > But I do not think it is necessarily inconsistent not to apply it for > the blob case. > >> - "git show" is a user facing command, and as such should produce output >> consumable by humans; whereas "git cat-file" is plumbing and should >> produce raw data unless told otherwise (-p, --textconv). > > That holds if the textconv is the only (or best) human-readable version > of the file. And maybe that is reasonable. But is it possible that > somebody uses "textconv" to produce a better diff of some already > human-readable format? For example, imagine I define a textconv for XML > files that normalizes the formatting to make diffs less noisy. When I am > not looking at a diff, what is the best human-readable version? The > original, or the normalized one? I'm not sure. > > Note that I'm somewhat playing devil's advocate here. For the cases > where I have used textconv in the real world, I think I would probably > prefer seeing the converted contents, and I am happy to call it user > error if I use "git show HEAD:foo.jpg >bar.jpg" accidentally. But I also > want to make sure we are not regressing somebody else unnecessarily. Yes, the thing is that textconv helps diff by converting content (to text) before the (textual) diff. So it's somehow a double-faced beast. It's clearly activated by a "diff" attribute; so that would be a strong argument against my patch, at least against defaulting to --textconv for blobs. OTOH, textconv does have this aspect of converting text to a form digestable by humans (pre-diff, granted), which is the argument for defaulting to --textconv in porcellain. We could use a separate attribute "show" in addition to "diff", but I don't think it's worth going there, unless there is a strong use case for "diff-specific textconv" which one would not want to apply when showing just the content. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-21 3:37 ` Jeff King 2013-04-22 10:29 ` Michael J Gruber @ 2013-04-22 15:25 ` Junio C Hamano 2013-04-22 15:29 ` Jeff King 1 sibling, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-22 15:25 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git Jeff King <peff@peff.net> writes: > Just my two cents as a reviewer. > >> My reasoning is twofold: >> >> - consistency between "git show commit" and "git show blob" > > I'm not sure I agree with this line of reasoning. "git show commit" is > showing a diff, not the file contents; textconv has always been about > munging the contents to produce a textual diff. It may be reasonable to > extend its definition to "this is the preferred human view of this > content, and that happens to be what you would want to produce a diff". > But I do not think it is necessarily inconsistent not to apply it for > the blob case. True. Applying textconv to otherwise unreadable blobs is often useful, but I agree that it is unexpected if it is done by default, especially given that many people have learned to do: git show HEAD~4:binary-gob >old-binary-gob to recover old version of binary contents to a temporary file when checking the sanity of or restoring the breakage in the new one. It of course does _not_ forbid git show --textconv HEAD~4:binary-gob | less but I doubt it is a good idea to turn it on by default this late in the game. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-22 15:25 ` Junio C Hamano @ 2013-04-22 15:29 ` Jeff King 2013-04-22 15:37 ` Jeremy Rosen 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-04-22 15:29 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git On Mon, Apr 22, 2013 at 08:25:41AM -0700, Junio C Hamano wrote: > True. Applying textconv to otherwise unreadable blobs is often > useful, but I agree that it is unexpected if it is done by default, > especially given that many people have learned to do: > > git show HEAD~4:binary-gob >old-binary-gob > > to recover old version of binary contents to a temporary file when > checking the sanity of or restoring the breakage in the new one. > > It of course does _not_ forbid > > git show --textconv HEAD~4:binary-gob | less > > but I doubt it is a good idea to turn it on by default this late in > the game. Exactly. I certainly do not mind it as an option, and I am on the fence regarding it as a default (I think it might have been a sane thing to do from the start, but at this point the change-of-behavior makes me hesitate). So I am perfectly willing to go either way, depending on what others think. I'm going to be out of email contact the rest of the week, so I'll let you two talk it out. :) -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-22 15:29 ` Jeff King @ 2013-04-22 15:37 ` Jeremy Rosen 2013-04-22 15:54 ` Matthieu Moy 0 siblings, 1 reply; 78+ messages in thread From: Jeremy Rosen @ 2013-04-22 15:37 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git, Junio C Hamano > > git show --textconv HEAD~4:binary-gob | less > > > > but I doubt it is a good idea to turn it on by default this late in > > the game. > > Exactly. I certainly do not mind it as an option, and I am on the > fence > regarding it as a default (I think it might have been a sane thing to > do > from the start, but at this point the change-of-behavior makes me > hesitate). So I am perfectly willing to go either way, depending on > what > others think. > some features detect if they are piping to a terminal... couldn't we do something like that ? ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-22 15:37 ` Jeremy Rosen @ 2013-04-22 15:54 ` Matthieu Moy 2013-04-23 8:58 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Matthieu Moy @ 2013-04-22 15:54 UTC (permalink / raw) To: Jeremy Rosen; +Cc: Jeff King, Michael J Gruber, git, Junio C Hamano Jeremy Rosen <jeremy.rosen@openwide.fr> writes: > some features detect if they are piping to a terminal... couldn't we do > something like that ? That's OK for convenience features like colors or so, but that would be really, really unexpected to have $ git show HEAD:file foo $ git show HEAD:file > tmp $ cat tmp bar IMHO. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 2/6] show: obey --textconv for blobs 2013-04-22 15:54 ` Matthieu Moy @ 2013-04-23 8:58 ` Michael J Gruber 0 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 8:58 UTC (permalink / raw) To: Matthieu Moy; +Cc: Jeremy Rosen, Jeff King, git, Junio C Hamano Matthieu Moy venit, vidit, dixit 22.04.2013 17:54: > Jeremy Rosen <jeremy.rosen@openwide.fr> writes: > >> some features detect if they are piping to a terminal... couldn't we do >> something like that ? > > That's OK for convenience features like colors or so, but that would be > really, really unexpected to have > > $ git show HEAD:file > foo > $ git show HEAD:file > tmp > $ cat tmp > bar > > IMHO. Yes, I'd either do it by default in general (my preference) or on request, but not based on tty. Another point of input: You can do git show commit <blob> <commit> <blob> and also with other object types, of course. On the other hand, there is a single rev.diffopt. Besides the nuisance of having to track whether textconv has been specified explicitely and flipping the bit in rev.diffopt per argument (or adding a parameter), which is an implementation detail, it would mean that the default for different arguments in the argument list is different, depending on type. And that is a usablility issue, I would argue: Is textconv on by default for git show? Yes and no, for some arguments yes, for others no. That's what I want to cure ;) Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCH 3/6] cat-file: do not die on --textconv without textconv filters 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber 2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber 2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-19 18:15 ` Junio C Hamano 2013-04-20 4:17 ` Jeff King 2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber ` (3 subsequent siblings) 6 siblings, 2 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git When a command is supposed to use textconv filters (by default or with "--textconv") and none are configured then the blob is output without conversion; the only exception to this rule is "cat-file --textconv". Make it behave like the rest of textconv aware commands. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/cat-file.c | 9 +++++---- t/t8007-cat-file-textconv.sh | 20 +++++--------------- 2 files changed, 10 insertions(+), 19 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 40f87b4..dd4e063 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -146,10 +146,11 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) die("git cat-file --textconv %s: <object> must be <sha1:path>", obj_name); - if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) - die("git cat-file --textconv: unable to run textconv on %s", - obj_name); - break; + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) + break; + + /* otherwise expect a blob */ + exp_type = "blob"; case 0: if (type_from_string(exp_type) == OBJ_BLOB) { diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh index 78a0085..83c6636 100755 --- a/t/t8007-cat-file-textconv.sh +++ b/t/t8007-cat-file-textconv.sh @@ -22,11 +22,11 @@ test_expect_success 'setup ' ' ' cat >expected <<EOF -fatal: git cat-file --textconv: unable to run textconv on :one.bin +bin: test version 2 EOF test_expect_success 'no filter specified' ' - git cat-file --textconv :one.bin 2>result + git cat-file --textconv :one.bin >result && test_cmp expected result ' @@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' ' git config diff.test.cachetextconv false ' -cat >expected <<EOF -bin: test version 2 -EOF - test_expect_success 'cat-file without --textconv' ' git cat-file blob :one.bin >result && test_cmp expected result @@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' ' ' test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' ' + printf "%s" "one.bin" >expected && git cat-file blob :symlink.bin >result && - printf "%s" "one.bin" >expected test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' ' - ! git cat-file --textconv :symlink.bin 2>result && - cat >expected <<\EOF && -fatal: git cat-file --textconv: unable to run textconv on :symlink.bin -EOF + git cat-file --textconv :symlink.bin >result && test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' ' - ! git cat-file --textconv HEAD:symlink.bin 2>result && - cat >expected <<EOF && -fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin -EOF + git cat-file --textconv HEAD:symlink.bin >result && test_cmp expected result ' -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters 2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber @ 2013-04-19 18:15 ` Junio C Hamano 2013-04-20 4:17 ` Jeff King 1 sibling, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-19 18:15 UTC (permalink / raw) To: Michael J Gruber; +Cc: git Michael J Gruber <git@drmicha.warpmail.net> writes: > When a command is supposed to use textconv filters (by default or with > "--textconv") and none are configured then the blob is output without > conversion; the only exception to this rule is "cat-file --textconv". I am of two minds. Because cat-file is mostly a low-level plumbing, I do not necessarily think it is a bad behaviour for it to error out when it was asked to apply textconv where there is no filter or when the filter fails to produce an output. On the other hand, it certainly makes it more convenient for callers that do not care too deeply, taking textconv as a mere hint just like Porcelains do. But assuming that this is the direction we would want to go... > diff --git a/builtin/cat-file.c b/builtin/cat-file.c > index 40f87b4..dd4e063 100644 > --- a/builtin/cat-file.c > +++ b/builtin/cat-file.c > @@ -146,10 +146,11 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) > die("git cat-file --textconv %s: <object> must be <sha1:path>", > obj_name); > > - if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) > - die("git cat-file --textconv: unable to run textconv on %s", > - obj_name); > - break; > + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) > + break; > + > + /* otherwise expect a blob */ > + exp_type = "blob"; Please use the constant string blob_type that is available for all callers including this one. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters 2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber 2013-04-19 18:15 ` Junio C Hamano @ 2013-04-20 4:17 ` Jeff King 2013-04-20 14:27 ` Michael J Gruber 1 sibling, 1 reply; 78+ messages in thread From: Jeff King @ 2013-04-20 4:17 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Fri, Apr 19, 2013 at 06:44:46PM +0200, Michael J Gruber wrote: > - die("git cat-file --textconv: unable to run textconv on %s", > - obj_name); > - break; > + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) > + break; > + > + /* otherwise expect a blob */ > + exp_type = "blob"; > > case 0: > if (type_from_string(exp_type) == OBJ_BLOB) { I'm not sure this is right. What happens with: git cat-file --textconv HEAD:Documentation We have failed to textconv, but should we be expecting a blob? -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 3/6] cat-file: do not die on --textconv without textconv filters 2013-04-20 4:17 ` Jeff King @ 2013-04-20 14:27 ` Michael J Gruber 0 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-20 14:27 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King venit, vidit, dixit 20.04.2013 06:17: > On Fri, Apr 19, 2013 at 06:44:46PM +0200, Michael J Gruber wrote: > >> - die("git cat-file --textconv: unable to run textconv on %s", >> - obj_name); >> - break; >> + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) >> + break; >> + >> + /* otherwise expect a blob */ >> + exp_type = "blob"; >> >> case 0: >> if (type_from_string(exp_type) == OBJ_BLOB) { > > I'm not sure this is right. What happens with: > > git cat-file --textconv HEAD:Documentation > > We have failed to textconv, but should we be expecting a blob? Very true, thanks. I'll reorder so that the --textconv case continues (without break) into the -p case. I think it makes sense to consider "--textconv" to be "at least as pretty as -p". Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCH 4/6] t7008: demonstrate behavior of grep with textconv 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber ` (2 preceding siblings ...) 2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber ` (2 subsequent siblings) 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git Currently, "git grep" does not invoke any textconv filters. Demonstrate this in the tests. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t7008-grep-binary.sh | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 26f8319..a1fd0b2 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -145,4 +145,23 @@ test_expect_success 'grep respects not-binary diff attribute' ' test_cmp expect actual ' +cat >nul_to_q_textconv <<'EOF' +#!/bin/sh +"$PERL_PATH" -pe 'y/\000/Q/' < "$1" +EOF +chmod +x nul_to_q_textconv + +test_expect_success 'setup textconv filters' ' + echo a diff=foo >.gitattributes && + git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv +' + +test_expect_success 'grep does not obey textconv' ' + test_must_fail git grep Qfile +' + +test_expect_success 'grep blob does not obey textconv' ' + test_must_fail git grep Qfile HEAD:a +' + test_done -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCH 5/6] grep: allow to use textconv filters 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber ` (3 preceding siblings ...) 2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-20 4:31 ` Jeff King 2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber 2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git; +Cc: Jeff King From: Jeff King <peff@peff.net> Recently and not so recently, we made sure that log/grep type operations use textconv filters when a userfacing diff would do the same: ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28) b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23) "git grep" currently does not use textconv filters at all, that is neither for displaying the match and context nor for the actual grepping. Introduce an option "--textconv" which makes git grep use any configured textconv filters for grepping and output purposes. It is off by default. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/grep.c | 2 + grep.c | 100 ++++++++++++++++++++++++++++++++++++++++++------- grep.h | 1 + t/t7008-grep-binary.sh | 18 +++++++++ 4 files changed, 107 insertions(+), 14 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 159e65d..00ee57d 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) OPT_SET_INT('I', NULL, &opt.binary, N_("don't match patterns in binary files"), GREP_BINARY_NOMATCH), + OPT_BOOL(0, "textconv", &opt.allow_textconv, + N_("process binary files with textconv filters")), { OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"), N_("descend at most <depth> levels"), PARSE_OPT_NONEG, NULL, 1 }, diff --git a/grep.c b/grep.c index bb548ca..c668034 100644 --- a/grep.c +++ b/grep.c @@ -2,6 +2,8 @@ #include "grep.h" #include "userdiff.h" #include "xdiff-interface.h" +#include "diff.h" +#include "diffcore.h" static int grep_source_load(struct grep_source *gs); static int grep_source_is_binary(struct grep_source *gs); @@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size) fwrite(buf, size, 1, stdout); } +static int fill_textconv_grep(struct userdiff_driver *driver, + struct grep_source *gs) +{ + struct diff_filespec *df; + char *buf; + size_t size; + + if (!driver || !driver->textconv) + return grep_source_load(gs); + + /* + * The textconv interface is intimately tied to diff_filespecs, so we + * have to pretend to be one. If we could unify the grep_source + * and diff_filespec structs, this mess could just go away. + */ + df = alloc_filespec(gs->path); + switch (gs->type) { + case GREP_SOURCE_SHA1: + fill_filespec(df, gs->identifier, 1, 0100644); + break; + case GREP_SOURCE_FILE: + fill_filespec(df, null_sha1, 0, 0100644); + break; + default: + die("BUG: attempt to textconv something without a path?"); + } + + /* + * fill_textconv is not remotely thread-safe; it may load objects + * behind the scenes, and it modifies the global diff tempfile + * structure. + */ + grep_read_lock(); + size = fill_textconv(driver, df, &buf); + grep_read_unlock(); + free_filespec(df); + + /* + * The normal fill_textconv usage by the diff machinery would just keep + * the textconv'd buf separate from the diff_filespec. But much of the + * grep code passes around a grep_source and assumes that its "buf" + * pointer is the beginning of the thing we are searching. So let's + * install our textconv'd version into the grep_source, taking care not + * to leak any existing buffer. + */ + grep_source_clear_data(gs); + gs->buf = buf; + gs->size = size; + + return 0; +} + static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits) { char *bol; @@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle unsigned count = 0; int try_lookahead = 0; int show_function = 0; + struct userdiff_driver *textconv = NULL; enum grep_context ctx = GREP_CONTEXT_HEAD; xdemitconf_t xecfg; @@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle } opt->last_shown = 0; - switch (opt->binary) { - case GREP_BINARY_DEFAULT: - if (grep_source_is_binary(gs)) - binary_match_only = 1; - break; - case GREP_BINARY_NOMATCH: - if (grep_source_is_binary(gs)) - return 0; /* Assume unmatch */ - break; - case GREP_BINARY_TEXT: - break; - default: - die("bug: unknown binary handling mode"); + if (opt->allow_textconv) { + grep_source_load_driver(gs); + /* + * We might set up the shared textconv cache data here, which + * is not thread-safe. + */ + grep_attr_lock(); + textconv = userdiff_get_textconv(gs->driver); + grep_attr_unlock(); + } + + /* + * We know the result of a textconv is text, so we only have to care + * about binary handling if we are not using it. + */ + if (!textconv) { + switch (opt->binary) { + case GREP_BINARY_DEFAULT: + if (grep_source_is_binary(gs)) + binary_match_only = 1; + break; + case GREP_BINARY_NOMATCH: + if (grep_source_is_binary(gs)) + return 0; /* Assume unmatch */ + break; + case GREP_BINARY_TEXT: + break; + default: + die("bug: unknown binary handling mode"); + } } memset(&xecfg, 0, sizeof(xecfg)); @@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle try_lookahead = should_lookahead(opt); - if (grep_source_load(gs) < 0) + if (fill_textconv_grep(textconv, gs) < 0) return 0; bol = gs->buf; diff --git a/grep.h b/grep.h index e4a1df5..eaaced1 100644 --- a/grep.h +++ b/grep.h @@ -107,6 +107,7 @@ struct grep_opt { #define GREP_BINARY_NOMATCH 1 #define GREP_BINARY_TEXT 2 int binary; + int allow_textconv; int extended; int use_reflog_filter; int pcre; diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index a1fd0b2..a7fe94a 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -160,8 +160,26 @@ test_expect_success 'grep does not obey textconv' ' test_must_fail git grep Qfile ' +test_expect_success 'grep --textconv does obey textconv' ' + echo "a:binaryQfile" >expect && + git grep --textconv Qfile >actual && + test_cmp expect actual +' + +test_expect_success 'grep --no-textconv does not obey textconv' ' + test_must_fail git grep Qfile +' + test_expect_success 'grep blob does not obey textconv' ' test_must_fail git grep Qfile HEAD:a ' +test_expect_success 'grep --textconv blob does not obey textconv' ' + test_must_fail git grep --textconv Qfile HEAD:a +' + +test_expect_success 'grep --no-textconv blob does not obey textconv' ' + test_must_fail git grep --no-textconv Qfile HEAD:a +' + test_done -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCH 5/6] grep: allow to use textconv filters 2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber @ 2013-04-20 4:31 ` Jeff King 0 siblings, 0 replies; 78+ messages in thread From: Jeff King @ 2013-04-20 4:31 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Fri, Apr 19, 2013 at 06:44:48PM +0200, Michael J Gruber wrote: > From: Jeff King <peff@peff.net> > > Recently and not so recently, we made sure that log/grep type operations > use textconv filters when a userfacing diff would do the same: > > ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28) > b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) > 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23) > > "git grep" currently does not use textconv filters at all, that is > neither for displaying the match and context nor for the actual grepping. > > Introduce an option "--textconv" which makes git grep use any configured > textconv filters for grepping and output purposes. It is off by default. > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > builtin/grep.c | 2 + > grep.c | 100 ++++++++++++++++++++++++++++++++++++++++++------- > grep.h | 1 + > t/t7008-grep-binary.sh | 18 +++++++++ > 4 files changed, 107 insertions(+), 14 deletions(-) This patch, of course, is flawless. :) Feel free to add: Signed-off-by: Jeff King <peff@peff.net> > + /* > + * We know the result of a textconv is text, so we only have to care > + * about binary handling if we are not using it. > + */ > + if (!textconv) { > + switch (opt->binary) { > + case GREP_BINARY_DEFAULT: > + if (grep_source_is_binary(gs)) > + binary_match_only = 1; > + break; > + case GREP_BINARY_NOMATCH: > + if (grep_source_is_binary(gs)) > + return 0; /* Assume unmatch */ > + break; > + case GREP_BINARY_TEXT: > + break; > + default: > + die("bug: unknown binary handling mode"); > + } > } Junio mentioned checking the textconv output for binary-ness. Doing that would involve removing the outer conditional here. But it's not quite so simple, as we don't load the textconv results until later, and grep_source_is_binary will lazily load the contents. I think it would be sufficient to fill_textconv_grep() right before, and then grep_source_is_binary would rely on the cached buffer. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCH 6/6] grep: obey --textconv for the case rev:path 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber ` (4 preceding siblings ...) 2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber @ 2013-04-19 16:44 ` Michael J Gruber 2013-04-20 4:24 ` Jeff King 2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-19 16:44 UTC (permalink / raw) To: git Make "grep" obey the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/grep.c | 11 ++++++----- object.c | 26 ++++++++++++++++++++------ object.h | 2 ++ t/t7008-grep-binary.sh | 6 ++++-- 4 files changed, 32 insertions(+), 13 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 00ee57d..bb7f970 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, } static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, - struct object *obj, const char *name) + struct object *obj, const char *name, struct object_context *oc) { if (obj->type == OBJ_BLOB) - return grep_sha1(opt, obj->sha1, name, 0, NULL); + return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL); if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) { struct tree_desc tree; void *data; @@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; real_obj = deref_tag(list->objects[i].item, NULL, 0); - if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) { + if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) { hit = 1; if (opt->status_only) break; @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix) for (i = 0; i < argc; i++) { const char *arg = argv[i]; unsigned char sha1[20]; + struct object_context oc; /* Is it a rev? */ - if (!get_sha1(arg, sha1)) { + if (!get_sha1_with_context(arg, 0, sha1, &oc)) { struct object *object = parse_object_or_die(sha1, arg); if (!seen_dashdash) verify_non_filename(prefix, arg); - add_object_array(object, arg, &list); + add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context))); continue; } if (!strcmp(arg, "--")) { diff --git a/object.c b/object.c index 20703f5..c8ffc9e 100644 --- a/object.c +++ b/object.c @@ -255,12 +255,7 @@ int object_list_contains(struct object_list *list, struct object *obj) return 0; } -void add_object_array(struct object *obj, const char *name, struct object_array *array) -{ - add_object_array_with_mode(obj, name, array, S_IFINVALID); -} - -void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context) { unsigned nr = array->nr; unsigned alloc = array->alloc; @@ -275,9 +270,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj objects[nr].item = obj; objects[nr].name = name; objects[nr].mode = mode; + objects[nr].context = context; array->nr = ++nr; } +void add_object_array(struct object *obj, const char *name, struct object_array *array) +{ + add_object_array_with_mode(obj, name, array, S_IFINVALID); +} + +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +{ + add_object_array_with_mode_context(obj, name, array, mode, NULL); +} + +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context) +{ + if (context) + add_object_array_with_mode_context(obj, name, array, context->mode, context); + else + add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); +} + void object_array_remove_duplicates(struct object_array *array) { unsigned int ref, src, dst; diff --git a/object.h b/object.h index 97d384b..695847d 100644 --- a/object.h +++ b/object.h @@ -13,6 +13,7 @@ struct object_array { struct object *item; const char *name; unsigned mode; + struct object_context *context; } *objects; }; @@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj); /* Object array handling .. */ void add_object_array(struct object *obj, const char *name, struct object_array *array); void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode); +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context); void object_array_remove_duplicates(struct object_array *); void clear_object_flags(unsigned flags); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index a7fe94a..ef78abe 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -174,8 +174,10 @@ test_expect_success 'grep blob does not obey textconv' ' test_must_fail git grep Qfile HEAD:a ' -test_expect_success 'grep --textconv blob does not obey textconv' ' - test_must_fail git grep --textconv Qfile HEAD:a +test_expect_success 'grep --textconv blob does obey textconv' ' + echo "HEAD:a:binaryQfile" >expect && + git grep --textconv Qfile HEAD:a >actual && + test_cmp expect actual ' test_expect_success 'grep --no-textconv blob does not obey textconv' ' -- 1.8.2.1.728.ge98e8b0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path 2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber @ 2013-04-20 4:24 ` Jeff King 2013-04-20 14:42 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-04-20 4:24 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Fri, Apr 19, 2013 at 06:44:49PM +0200, Michael J Gruber wrote: > @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > for (i = 0; i < argc; i++) { > const char *arg = argv[i]; > unsigned char sha1[20]; > + struct object_context oc; > /* Is it a rev? */ > - if (!get_sha1(arg, sha1)) { > + if (!get_sha1_with_context(arg, 0, sha1, &oc)) { > struct object *object = parse_object_or_die(sha1, arg); > if (!seen_dashdash) > verify_non_filename(prefix, arg); > - add_object_array(object, arg, &list); > + add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context))); Hrm. I'm not excited about the extra allocation here. Who frees it? > +void add_object_array(struct object *obj, const char *name, struct object_array *array) > +{ > + add_object_array_with_mode(obj, name, array, S_IFINVALID); > +} > + > +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) > +{ > + add_object_array_with_mode_context(obj, name, array, mode, NULL); > +} > + > +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context) > +{ > + if (context) > + add_object_array_with_mode_context(obj, name, array, context->mode, context); > + else > + add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); > +} And this mass of almost-the-same functions is gross, too, especially given that the object_context contains a mode itself. Unfortunately, I'm not sure if I have a more pleasant suggestion. I seem to recall wrestling with this issue during the last round, too. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path 2013-04-20 4:24 ` Jeff King @ 2013-04-20 14:42 ` Michael J Gruber 2013-04-21 3:41 ` Jeff King 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-20 14:42 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King venit, vidit, dixit 20.04.2013 06:24: > On Fri, Apr 19, 2013 at 06:44:49PM +0200, Michael J Gruber wrote: > >> @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix) >> for (i = 0; i < argc; i++) { >> const char *arg = argv[i]; >> unsigned char sha1[20]; >> + struct object_context oc; >> /* Is it a rev? */ >> - if (!get_sha1(arg, sha1)) { >> + if (!get_sha1_with_context(arg, 0, sha1, &oc)) { >> struct object *object = parse_object_or_die(sha1, arg); >> if (!seen_dashdash) >> verify_non_filename(prefix, arg); >> - add_object_array(object, arg, &list); >> + add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context))); > > Hrm. I'm not excited about the extra allocation here. Who frees it? > >> +void add_object_array(struct object *obj, const char *name, struct object_array *array) >> +{ >> + add_object_array_with_mode(obj, name, array, S_IFINVALID); >> +} >> + >> +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) >> +{ >> + add_object_array_with_mode_context(obj, name, array, mode, NULL); >> +} >> + >> +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context) >> +{ >> + if (context) >> + add_object_array_with_mode_context(obj, name, array, context->mode, context); >> + else >> + add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); >> +} > > And this mass of almost-the-same functions is gross, too, especially > given that the object_context contains a mode itself. Well, it's just providing different ways to call into the one and only function, in order to satisfy different callers' needs. It's not unheard of (or rather: unseen) in our code, is it? > Unfortunately, I'm not sure if I have a more pleasant suggestion. I seem > to recall wrestling with this issue during the last round, too. Yes, I think that's what we ended up with. At least it's just one context struct per argument to grep, so it's not that bad after all. I vaguely seem to recall we had some more general framework cooking but I may be wrong (I was offline due to sickness for a while). It was about attaching some additional info to something. Yes, I said "vaguely" ... Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 6/6] grep: obey --textconv for the case rev:path 2013-04-20 14:42 ` Michael J Gruber @ 2013-04-21 3:41 ` Jeff King 0 siblings, 0 replies; 78+ messages in thread From: Jeff King @ 2013-04-21 3:41 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Junio C Hamano On Sat, Apr 20, 2013 at 04:42:49PM +0200, Michael J Gruber wrote: > > And this mass of almost-the-same functions is gross, too, especially > > given that the object_context contains a mode itself. > > Well, it's just providing different ways to call into the one and only > function, in order to satisfy different callers' needs. It's not unheard > of (or rather: unseen) in our code, is it? No, we have instances of it already. And they're ugly, too. :) I think when we hit more than 2 or 3 wrappers it is time to start thinking whether they can be consolidated. I think it is mostly the overlap in context and mode that makes me find this one particularly ugly. But it's probably not solvable without some pretty heavy refactoring. > I vaguely seem to recall we had some more general framework cooking but > I may be wrong (I was offline due to sickness for a while). It was about > attaching some additional info to something. Yes, I said "vaguely" ... Yeah, I really wanted to keep the context inside the object_array, but it means either wasting a lot of space (due to over-large buffers) or having the array elements be variable-sized (with a flex-array for the pathname). And object_array entries already have a memory-leak problem from the "name" field, which I think we just punt on elsewhere. So I think this is probably the lesser of the possible evils. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 0/6] grep with textconv 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber ` (5 preceding siblings ...) 2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber @ 2013-04-19 18:24 ` Junio C Hamano 2013-04-20 4:26 ` Jeff King 2013-04-20 13:32 ` Michael J Gruber 6 siblings, 2 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-19 18:24 UTC (permalink / raw) To: Michael J Gruber; +Cc: git Michael J Gruber <git@drmicha.warpmail.net> writes: > This series teaches show and grep to obey textconv: show by > default (like diff), grep only on request (--textconv). We might > switch the default for the latter also, of course. I'd actually > like that. > > Compared to an earlier (historic) series this one comes with tests. It would have been nicer if you referred to the previous thread cf. http://thread.gmane.org/gmane.comp.version-control.git/215385 > grep: allow to use textconv filters This looked mostly sensible except for one minor "eh, do we really need to assume textconv output is text, or wouldn't using the same codepath for raw blob and textconv result to make them consistently honor opt->binary easier to explain?". > t4030: demonstrate behavior of show with textconv > t7008: demonstrate behavior of grep with textconv It somehow felt they are better together in the patches that implement the features they exercise. > show: obey --textconv for blobs > cat-file: do not die on --textconv without textconv filters > grep: obey --textconv for the case rev:path I just let my eyes coast over these but didn't see anything obviously wrong. By the way, "git log --no-merges | grep obey | wc -l" shows that we say "honor an option" a lot more than "obey an option". We may want to be consistent here. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 0/6] grep with textconv 2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano @ 2013-04-20 4:26 ` Jeff King 2013-04-20 13:32 ` Michael J Gruber 1 sibling, 0 replies; 78+ messages in thread From: Jeff King @ 2013-04-20 4:26 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git On Fri, Apr 19, 2013 at 11:24:16AM -0700, Junio C Hamano wrote: > > grep: allow to use textconv filters > > This looked mostly sensible except for one minor "eh, do we really > need to assume textconv output is text, or wouldn't using the same > codepath for raw blob and textconv result to make them consistently > honor opt->binary easier to explain?". I don't mind re-checking the textconv output for binary-ness. But I did it that way for consistency with the diff code-path, which also assumes that textconv output is not binary. > By the way, "git log --no-merges | grep obey | wc -l" shows that we > say "honor an option" a lot more than "obey an option". We may want > to be consistent here. Yeah, it is a pretty minor thing, but I agree that "honor" sounds better. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCH 0/6] grep with textconv 2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano 2013-04-20 4:26 ` Jeff King @ 2013-04-20 13:32 ` Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber 1 sibling, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-20 13:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King Junio C Hamano venit, vidit, dixit 19.04.2013 20:24: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> This series teaches show and grep to obey textconv: show by >> default (like diff), grep only on request (--textconv). We might >> switch the default for the latter also, of course. I'd actually >> like that. >> >> Compared to an earlier (historic) series this one comes with tests. > > It would have been nicer if you referred to the previous thread > > cf. > > http://thread.gmane.org/gmane.comp.version-control.git/215385 Yes, sorry, I was on a slow mobile connection due to DSL breakage... >> grep: allow to use textconv filters > > This looked mostly sensible except for one minor "eh, do we really > need to assume textconv output is text, or wouldn't using the same > codepath for raw blob and textconv result to make them consistently > honor opt->binary easier to explain?". > I think we assume in general that textconv produces text, which is maybe not completely surprising given its name ;) >> t4030: demonstrate behavior of show with textconv >> t7008: demonstrate behavior of grep with textconv > > It somehow felt they are better together in the patches that > implement the features they exercise. I added them after the fact. They can be squashed in, of course. On the other hand you don't see the change in behavior that the latter patches introduce any more if you that; which is why I left them separate at least for review purposes and for camparing to the previous series which I had failed to reference. >> show: obey --textconv for blobs >> cat-file: do not die on --textconv without textconv filters >> grep: obey --textconv for the case rev:path > > I just let my eyes coast over these but didn't see anything > obviously wrong. > > By the way, "git log --no-merges | grep obey | wc -l" shows that we > say "honor an option" a lot more than "obey an option". We may want > to be consistent here. Okay, let's be honorable rather than obedient. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 0/7] grep with textconv 2013-04-20 13:32 ` Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber ` (6 more replies) 0 siblings, 7 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano Here's a reroll, with the following changes: * Use "honor" for obey". * Fixed the issue with --textconv and non-blobs. * Restructured tests as per Jeff's preference. * Added 7/ which flips the default for git grep to textconv. Jeff King (1): grep: allow to use textconv filters Michael J Gruber (6): t4030: demonstrate behavior of show with textconv show: obey --textconv for blobs cat-file: do not die on --textconv without textconv filters t7008: demonstrate behavior of grep with textconv grep: honor --textconv for the case rev:path git grep: honor textconv by default Documentation/git-grep.txt | 9 +++- builtin/cat-file.c | 18 ++++---- builtin/grep.c | 13 +++--- builtin/log.c | 24 ++++++++-- grep.c | 102 +++++++++++++++++++++++++++++++++++++------ grep.h | 1 + object.c | 26 ++++++++--- object.h | 2 + t/t4030-diff-textconv.sh | 18 ++++++++ t/t7008-grep-binary.sh | 43 ++++++++++++++++++ t/t8007-cat-file-textconv.sh | 20 +++------ 11 files changed, 222 insertions(+), 54 deletions(-) -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 1/7] t4030: demonstrate behavior of show with textconv 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 15:11 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber ` (5 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano "git show <commit>" honors the textconv setting while "git show <blob>" does not. Demonstrate this in the test. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t4030-diff-textconv.sh | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index 53ec330..260ea92 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' test_cmp expect.text actual ' +test_expect_success 'show commit produces text' ' + git show HEAD >diff && + find_diff <diff >actual && + test_cmp expect.text actual +' + test_expect_success 'diff-tree produces binary' ' git diff-tree -p HEAD^ HEAD >diff && find_diff <diff >actual && @@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' ' git reset --soft HEAD@{1} ' +test_expect_failure 'show blob produces text' ' + git show HEAD:file >actual && + printf "0\\n1\\n" >expect && + test_cmp expect actual +' + test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' echo one >expect && git log --root --format=%s -G0 >actual && -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv2 1/7] t4030: demonstrate behavior of show with textconv 2013-04-23 12:11 ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber @ 2013-04-23 15:11 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-23 15:11 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > "git show <commit>" honors the textconv setting while "git show <blob>" > does not. Demonstrate this in the test. Should "git show <blob>" run textconv by default? I somehow had an impression that people were against it during the discussion on the previous round. > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > t/t4030-diff-textconv.sh | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh > index 53ec330..260ea92 100755 > --- a/t/t4030-diff-textconv.sh > +++ b/t/t4030-diff-textconv.sh > @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' > test_cmp expect.text actual > ' > > +test_expect_success 'show commit produces text' ' > + git show HEAD >diff && > + find_diff <diff >actual && > + test_cmp expect.text actual > +' > + > test_expect_success 'diff-tree produces binary' ' > git diff-tree -p HEAD^ HEAD >diff && > find_diff <diff >actual && > @@ -84,6 +90,12 @@ test_expect_success 'status -v produces text' ' > git reset --soft HEAD@{1} > ' > > +test_expect_failure 'show blob produces text' ' > + git show HEAD:file >actual && > + printf "0\\n1\\n" >expect && > + test_cmp expect actual > +' > + > test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' > echo one >expect && > git log --root --format=%s -G0 >actual && ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 2/7] show: obey --textconv for blobs 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 15:14 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber ` (4 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano Currently, "diff" and "cat-file" for blobs honor "--textconv" options (with the former defaulting to "--textconv" and the latter to "--no-textconv") whereas "show" does not honor this option, even though it takes diff options. Make "show" on blobs behave like "diff", i.e. honor "--textconv" by default and "--no-textconv" when given. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/log.c | 24 +++++++++++++++++++++--- t/t4030-diff-textconv.sh | 8 +++++++- 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 5f3ed77..fe0275e 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(&out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) { + unsigned char sha1c[20]; + struct object_context obj_context; + char *buf; + unsigned long size; + fflush(stdout); - return stream_blob_to_fd(1, sha1, NULL, 0); + if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) + die("Not a valid object name %s", obj_name); + if (!obj_context.path[0] || + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (!buf) + die("git show %s: bad file", obj_name); + + write_or_die(1, buf, size); + return 0; } static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) const char *name = objects[i].name; switch (o->type) { case OBJ_BLOB: - ret = show_blob_object(o->sha1, NULL); + ret = show_blob_object(o->sha1, &rev, name); break; case OBJ_TAG: { struct tag *t = (struct tag *)o; diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index 260ea92..f9d55e1 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' ' git reset --soft HEAD@{1} ' -test_expect_failure 'show blob produces text' ' +test_expect_success 'show blob produces text' ' git show HEAD:file >actual && printf "0\\n1\\n" >expect && test_cmp expect actual ' +test_expect_success 'show --no-textconv blob produces binary' ' + git show --no-textconv HEAD:file >actual && + printf "\\0\\n\\1\\n" >expect && + test_cmp expect actual +' + test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' echo one >expect && git log --root --format=%s -G0 >actual && -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv2 2/7] show: obey --textconv for blobs 2013-04-23 12:11 ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber @ 2013-04-23 15:14 ` Junio C Hamano 2013-04-24 10:09 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-23 15:14 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: >> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs s/obey/honor/; > Currently, "diff" and "cat-file" for blobs honor "--textconv" options > (with the former defaulting to "--textconv" and the latter to > "--no-textconv") whereas "show" does not honor this option, even though > it takes diff options. > > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by > default and "--no-textconv" when given. It is the right thing to do to teach it to react to --[no-]textconv; I am not sure if the default is right, though. > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > builtin/log.c | 24 +++++++++++++++++++++--- > t/t4030-diff-textconv.sh | 8 +++++++- > 2 files changed, 28 insertions(+), 4 deletions(-) > > diff --git a/builtin/log.c b/builtin/log.c > index 5f3ed77..fe0275e 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) > strbuf_release(&out); > } > > -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) > +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) > { > + unsigned char sha1c[20]; > + struct object_context obj_context; > + char *buf; > + unsigned long size; > + > fflush(stdout); > - return stream_blob_to_fd(1, sha1, NULL, 0); > + if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) > + return stream_blob_to_fd(1, sha1, NULL, 0); > + > + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) > + die("Not a valid object name %s", obj_name); > + if (!obj_context.path[0] || > + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) > + return stream_blob_to_fd(1, sha1, NULL, 0); > + > + if (!buf) > + die("git show %s: bad file", obj_name); > + > + write_or_die(1, buf, size); > + return 0; > } > > static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) > @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) > const char *name = objects[i].name; > switch (o->type) { > case OBJ_BLOB: > - ret = show_blob_object(o->sha1, NULL); > + ret = show_blob_object(o->sha1, &rev, name); > break; > case OBJ_TAG: { > struct tag *t = (struct tag *)o; > diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh > index 260ea92..f9d55e1 100755 > --- a/t/t4030-diff-textconv.sh > +++ b/t/t4030-diff-textconv.sh > @@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' ' > git reset --soft HEAD@{1} > ' > > -test_expect_failure 'show blob produces text' ' > +test_expect_success 'show blob produces text' ' > git show HEAD:file >actual && > printf "0\\n1\\n" >expect && > test_cmp expect actual > ' > > +test_expect_success 'show --no-textconv blob produces binary' ' > + git show --no-textconv HEAD:file >actual && > + printf "\\0\\n\\1\\n" >expect && > + test_cmp expect actual > +' > + > test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' > echo one >expect && > git log --root --format=%s -G0 >actual && ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 2/7] show: obey --textconv for blobs 2013-04-23 15:14 ` Junio C Hamano @ 2013-04-24 10:09 ` Michael J Gruber 2013-04-24 17:30 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-24 10:09 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Junio C Hamano venit, vidit, dixit 23.04.2013 17:14: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >>> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs > > s/obey/honor/; I missed that one, thanks. >> Currently, "diff" and "cat-file" for blobs honor "--textconv" options >> (with the former defaulting to "--textconv" and the latter to >> "--no-textconv") whereas "show" does not honor this option, even though >> it takes diff options. >> >> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by >> default and "--no-textconv" when given. > > It is the right thing to do to teach it to react to --[no-]textconv; > I am not sure if the default is right, though. That is the question ;) >> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> >> --- >> builtin/log.c | 24 +++++++++++++++++++++--- >> t/t4030-diff-textconv.sh | 8 +++++++- >> 2 files changed, 28 insertions(+), 4 deletions(-) >> >> diff --git a/builtin/log.c b/builtin/log.c >> index 5f3ed77..fe0275e 100644 >> --- a/builtin/log.c >> +++ b/builtin/log.c >> @@ -436,10 +436,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) >> strbuf_release(&out); >> } >> >> -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) >> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) >> { >> + unsigned char sha1c[20]; >> + struct object_context obj_context; >> + char *buf; >> + unsigned long size; >> + >> fflush(stdout); >> - return stream_blob_to_fd(1, sha1, NULL, 0); >> + if (!DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) >> + return stream_blob_to_fd(1, sha1, NULL, 0); >> + >> + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) >> + die("Not a valid object name %s", obj_name); >> + if (!obj_context.path[0] || >> + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) >> + return stream_blob_to_fd(1, sha1, NULL, 0); >> + >> + if (!buf) >> + die("git show %s: bad file", obj_name); >> + >> + write_or_die(1, buf, size); >> + return 0; >> } >> >> static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) >> @@ -525,7 +543,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) >> const char *name = objects[i].name; >> switch (o->type) { >> case OBJ_BLOB: >> - ret = show_blob_object(o->sha1, NULL); >> + ret = show_blob_object(o->sha1, &rev, name); >> break; >> case OBJ_TAG: { >> struct tag *t = (struct tag *)o; >> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh >> index 260ea92..f9d55e1 100755 >> --- a/t/t4030-diff-textconv.sh >> +++ b/t/t4030-diff-textconv.sh >> @@ -90,12 +90,18 @@ test_expect_success 'status -v produces text' ' >> git reset --soft HEAD@{1} >> ' >> >> -test_expect_failure 'show blob produces text' ' >> +test_expect_success 'show blob produces text' ' >> git show HEAD:file >actual && >> printf "0\\n1\\n" >expect && >> test_cmp expect actual >> ' >> >> +test_expect_success 'show --no-textconv blob produces binary' ' >> + git show --no-textconv HEAD:file >actual && >> + printf "\\0\\n\\1\\n" >expect && >> + test_cmp expect actual >> +' >> + >> test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' >> echo one >expect && >> git log --root --format=%s -G0 >actual && ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 2/7] show: obey --textconv for blobs 2013-04-24 10:09 ` Michael J Gruber @ 2013-04-24 17:30 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-24 17:30 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > Junio C Hamano venit, vidit, dixit 23.04.2013 17:14: >> Michael J Gruber <git@drmicha.warpmail.net> writes: >> >>>> Subject: Re: [PATCHv2 2/7] show: obey --textconv for blobs >> >> s/obey/honor/; > > I missed that one, thanks. > >>> Currently, "diff" and "cat-file" for blobs honor "--textconv" options >>> (with the former defaulting to "--textconv" and the latter to >>> "--no-textconv") whereas "show" does not honor this option, even though >>> it takes diff options. >>> >>> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by >>> default and "--no-textconv" when given. >> >> It is the right thing to do to teach it to react to --[no-]textconv; >> I am not sure if the default is right, though. > > That is the question ;) Then let me make it easier. It is not just "I am not sure if", but "I do not think that". ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 15:15 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber ` (3 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano When a command is supposed to use textconv filters (by default or with "--textconv") and none are configured then the blob is output without conversion; the only exception to this rule is "cat-file --textconv". Make it behave like the rest of textconv aware commands. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/cat-file.c | 18 ++++++++---------- t/t8007-cat-file-textconv.sh | 20 +++++--------------- 2 files changed, 13 insertions(+), 25 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 045cee7..bd62373 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) case 'e': return !has_sha1_file(sha1); + case 'c': + if (!obj_context.path[0]) + die("git cat-file --textconv %s: <object> must be <sha1:path>", + obj_name); + + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) + break; + case 'p': type = sha1_object_info(sha1, NULL); if (type < 0) @@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) /* otherwise just spit out the data */ break; - case 'c': - if (!obj_context.path[0]) - die("git cat-file --textconv %s: <object> must be <sha1:path>", - obj_name); - - if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) - die("git cat-file --textconv: unable to run textconv on %s", - obj_name); - break; - case 0: if (type_from_string(exp_type) == OBJ_BLOB) { unsigned char blob_sha1[20]; diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh index 78a0085..83c6636 100755 --- a/t/t8007-cat-file-textconv.sh +++ b/t/t8007-cat-file-textconv.sh @@ -22,11 +22,11 @@ test_expect_success 'setup ' ' ' cat >expected <<EOF -fatal: git cat-file --textconv: unable to run textconv on :one.bin +bin: test version 2 EOF test_expect_success 'no filter specified' ' - git cat-file --textconv :one.bin 2>result + git cat-file --textconv :one.bin >result && test_cmp expected result ' @@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' ' git config diff.test.cachetextconv false ' -cat >expected <<EOF -bin: test version 2 -EOF - test_expect_success 'cat-file without --textconv' ' git cat-file blob :one.bin >result && test_cmp expected result @@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' ' ' test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' ' + printf "%s" "one.bin" >expected && git cat-file blob :symlink.bin >result && - printf "%s" "one.bin" >expected test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' ' - ! git cat-file --textconv :symlink.bin 2>result && - cat >expected <<\EOF && -fatal: git cat-file --textconv: unable to run textconv on :symlink.bin -EOF + git cat-file --textconv :symlink.bin >result && test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' ' - ! git cat-file --textconv HEAD:symlink.bin 2>result && - cat >expected <<EOF && -fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin -EOF + git cat-file --textconv HEAD:symlink.bin >result && test_cmp expected result ' -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters 2013-04-23 12:11 ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber @ 2013-04-23 15:15 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-23 15:15 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > When a command is supposed to use textconv filters (by default or with > "--textconv") and none are configured then the blob is output without > conversion; the only exception to this rule is "cat-file --textconv". > > Make it behave like the rest of textconv aware commands. > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > builtin/cat-file.c | 18 ++++++++---------- > t/t8007-cat-file-textconv.sh | 20 +++++--------------- > 2 files changed, 13 insertions(+), 25 deletions(-) > > diff --git a/builtin/cat-file.c b/builtin/cat-file.c > index 045cee7..bd62373 100644 > --- a/builtin/cat-file.c > +++ b/builtin/cat-file.c > @@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) > case 'e': > return !has_sha1_file(sha1); > > + case 'c': > + if (!obj_context.path[0]) > + die("git cat-file --textconv %s: <object> must be <sha1:path>", > + obj_name); > + > + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) > + break; > + > case 'p': Yeah, falling back to the 'p' case is a lot more sensible. > type = sha1_object_info(sha1, NULL); > if (type < 0) > @@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) > /* otherwise just spit out the data */ > break; > > - case 'c': > - if (!obj_context.path[0]) > - die("git cat-file --textconv %s: <object> must be <sha1:path>", > - obj_name); > - > - if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) > - die("git cat-file --textconv: unable to run textconv on %s", > - obj_name); > - break; > - > case 0: > if (type_from_string(exp_type) == OBJ_BLOB) { > unsigned char blob_sha1[20]; > diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh > index 78a0085..83c6636 100755 > --- a/t/t8007-cat-file-textconv.sh > +++ b/t/t8007-cat-file-textconv.sh > @@ -22,11 +22,11 @@ test_expect_success 'setup ' ' > ' > > cat >expected <<EOF > -fatal: git cat-file --textconv: unable to run textconv on :one.bin > +bin: test version 2 > EOF > > test_expect_success 'no filter specified' ' > - git cat-file --textconv :one.bin 2>result > + git cat-file --textconv :one.bin >result && > test_cmp expected result > ' > > @@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' ' > git config diff.test.cachetextconv false > ' > > -cat >expected <<EOF > -bin: test version 2 > -EOF > - > test_expect_success 'cat-file without --textconv' ' > git cat-file blob :one.bin >result && > test_cmp expected result > @@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' ' > ' > > test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' ' > + printf "%s" "one.bin" >expected && > git cat-file blob :symlink.bin >result && > - printf "%s" "one.bin" >expected > test_cmp expected result > ' > > > test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' ' > - ! git cat-file --textconv :symlink.bin 2>result && > - cat >expected <<\EOF && > -fatal: git cat-file --textconv: unable to run textconv on :symlink.bin > -EOF > + git cat-file --textconv :symlink.bin >result && > test_cmp expected result > ' > > test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' ' > - ! git cat-file --textconv HEAD:symlink.bin 2>result && > - cat >expected <<EOF && > -fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin > -EOF > + git cat-file --textconv HEAD:symlink.bin >result && > test_cmp expected result > ' ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber ` (2 preceding siblings ...) 2013-04-23 12:11 ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 15:16 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber ` (2 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano Currently, "git grep" does not honor any textconv filters. Demonstrate this in the tests. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 26f8319..126fe4c 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' ' test_cmp expect actual ' +cat >nul_to_q_textconv <<'EOF' +#!/bin/sh +"$PERL_PATH" -pe 'y/\000/Q/' < "$1" +EOF +chmod +x nul_to_q_textconv + +test_expect_success 'setup textconv filters' ' + echo a diff=foo >.gitattributes && + git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv +' + +test_expect_failure 'grep does not honor textconv' ' + echo "a:binaryQfile" >expect && + git grep Qfile >actual && + test_cmp expect actual +' + +test_expect_failure 'grep blob does not honor textconv' ' + echo "HEAD:a:binaryQfile" >expect && + git grep Qfile HEAD:a >actual && + test_cmp expect actual +' + test_done -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv 2013-04-23 12:11 ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber @ 2013-04-23 15:16 ` Junio C Hamano 2013-04-24 10:09 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-23 15:16 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > Currently, "git grep" does not honor any textconv filters. Demonstrate > this in the tests. > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh > index 26f8319..126fe4c 100755 > --- a/t/t7008-grep-binary.sh > +++ b/t/t7008-grep-binary.sh > @@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' ' > test_cmp expect actual > ' > > +cat >nul_to_q_textconv <<'EOF' > +#!/bin/sh > +"$PERL_PATH" -pe 'y/\000/Q/' < "$1" > +EOF > +chmod +x nul_to_q_textconv > + > +test_expect_success 'setup textconv filters' ' > + echo a diff=foo >.gitattributes && > + git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv > +' > + > +test_expect_failure 'grep does not honor textconv' ' > + echo "a:binaryQfile" >expect && > + git grep Qfile >actual && This should pass --textconv to "git grep". > + test_cmp expect actual > +' > + > +test_expect_failure 'grep blob does not honor textconv' ' > + echo "HEAD:a:binaryQfile" >expect && > + git grep Qfile HEAD:a >actual && Likewise. > + test_cmp expect actual > +' > + > test_done ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv 2013-04-23 15:16 ` Junio C Hamano @ 2013-04-24 10:09 ` Michael J Gruber 2013-04-24 17:29 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-24 10:09 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Junio C Hamano venit, vidit, dixit 23.04.2013 17:16: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> Currently, "git grep" does not honor any textconv filters. Demonstrate >> this in the tests. >> >> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> >> --- >> t/t7008-grep-binary.sh | 23 +++++++++++++++++++++++ >> 1 file changed, 23 insertions(+) >> >> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh >> index 26f8319..126fe4c 100755 >> --- a/t/t7008-grep-binary.sh >> +++ b/t/t7008-grep-binary.sh >> @@ -145,4 +145,27 @@ test_expect_success 'grep respects not-binary diff attribute' ' >> test_cmp expect actual >> ' >> >> +cat >nul_to_q_textconv <<'EOF' >> +#!/bin/sh >> +"$PERL_PATH" -pe 'y/\000/Q/' < "$1" >> +EOF >> +chmod +x nul_to_q_textconv >> + >> +test_expect_success 'setup textconv filters' ' >> + echo a diff=foo >.gitattributes && >> + git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv >> +' >> + >> +test_expect_failure 'grep does not honor textconv' ' >> + echo "a:binaryQfile" >expect && >> + git grep Qfile >actual && > > This should pass --textconv to "git grep". But "git grep" does not know that option yet, so the test would fail for the wrong reason. The point ist that I expect "git grep" to apply textconv filters by default, which it does not. (I know I might be the only one with this expectation.) Or do we want to document the absence of that option? >> + test_cmp expect actual >> +' >> + >> +test_expect_failure 'grep blob does not honor textconv' ' >> + echo "HEAD:a:binaryQfile" >expect && >> + git grep Qfile HEAD:a >actual && > > Likewise. > >> + test_cmp expect actual >> +' >> + >> test_done ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv 2013-04-24 10:09 ` Michael J Gruber @ 2013-04-24 17:29 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-04-24 17:29 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: >>> +test_expect_failure 'grep does not honor textconv' ' >>> + echo "a:binaryQfile" >expect && >>> + git grep Qfile >actual && >> >> This should pass --textconv to "git grep". > > But "git grep" does not know that option yet, so the test would fail for > the wrong reason. > > The point ist that I expect "git grep" to apply textconv filters by > default, which it does not. (I know I might be the only one with this > expectation.) > > Or do we want to document the absence of that option? First, whether you write expect_failure or expec_success, please label the test to say what is expected to happen in the ideal world. The test in question says "grep does not honor textconv", but if you want it to honor textconv in the ideal world, it should be "grep honors textconv (when it should)". Now, from the point of view of testing "git grep honors textconv" missing support at the command line parser level and a buggy implementation of the command line parser that accepts but does not trigger the feature are the same thing. The command would not honor textconv either way. Marking the above as "failure" without explicitly asking for the feature with "--textconv" means we want it to use textconv by default, but that is *not* what the test title says is testing. In your patch, what the body of the text is really expecting is "grep uses textconv by default". If that is what it tests, then passing --textconv from the command line as I suggested would be wrong, but I was going by the title of the patch. ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv2 5/7] grep: allow to use textconv filters 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber ` (3 preceding siblings ...) 2013-04-23 12:11 ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano From: Jeff King <peff@peff.net> Recently and not so recently, we made sure that log/grep type operations use textconv filters when a userfacing diff would do the same: ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28) b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23) "git grep" currently does not use textconv filters at all, that is neither for displaying the match and context nor for the actual grepping. Introduce an option "--textconv" which makes git grep use any configured textconv filters for grepping and output purposes. It is off by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- Documentation/git-grep.txt | 9 +++- builtin/grep.c | 2 + grep.c | 100 ++++++++++++++++++++++++++++++++++++++------- grep.h | 1 + t/t7008-grep-binary.sh | 20 +++++++++ 5 files changed, 117 insertions(+), 15 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 50d46e1..a5c5a27 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -9,7 +9,7 @@ git-grep - Print lines matching a pattern SYNOPSIS -------- [verse] -'git grep' [-a | --text] [-I] [-i | --ignore-case] [-w | --word-regexp] +'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp] [-v | --invert-match] [-h|-H] [--full-name] [-E | --extended-regexp] [-G | --basic-regexp] [-P | --perl-regexp] @@ -80,6 +80,13 @@ OPTIONS --text:: Process binary files as if they were text. +--textconv:: + Honor textconv filter settings. + +--no-textconv:: + Do not honor textconv filter settings. + This is the default. + -i:: --ignore-case:: Ignore case differences between the patterns and the diff --git a/builtin/grep.c b/builtin/grep.c index 159e65d..00ee57d 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) OPT_SET_INT('I', NULL, &opt.binary, N_("don't match patterns in binary files"), GREP_BINARY_NOMATCH), + OPT_BOOL(0, "textconv", &opt.allow_textconv, + N_("process binary files with textconv filters")), { OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"), N_("descend at most <depth> levels"), PARSE_OPT_NONEG, NULL, 1 }, diff --git a/grep.c b/grep.c index bb548ca..c668034 100644 --- a/grep.c +++ b/grep.c @@ -2,6 +2,8 @@ #include "grep.h" #include "userdiff.h" #include "xdiff-interface.h" +#include "diff.h" +#include "diffcore.h" static int grep_source_load(struct grep_source *gs); static int grep_source_is_binary(struct grep_source *gs); @@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size) fwrite(buf, size, 1, stdout); } +static int fill_textconv_grep(struct userdiff_driver *driver, + struct grep_source *gs) +{ + struct diff_filespec *df; + char *buf; + size_t size; + + if (!driver || !driver->textconv) + return grep_source_load(gs); + + /* + * The textconv interface is intimately tied to diff_filespecs, so we + * have to pretend to be one. If we could unify the grep_source + * and diff_filespec structs, this mess could just go away. + */ + df = alloc_filespec(gs->path); + switch (gs->type) { + case GREP_SOURCE_SHA1: + fill_filespec(df, gs->identifier, 1, 0100644); + break; + case GREP_SOURCE_FILE: + fill_filespec(df, null_sha1, 0, 0100644); + break; + default: + die("BUG: attempt to textconv something without a path?"); + } + + /* + * fill_textconv is not remotely thread-safe; it may load objects + * behind the scenes, and it modifies the global diff tempfile + * structure. + */ + grep_read_lock(); + size = fill_textconv(driver, df, &buf); + grep_read_unlock(); + free_filespec(df); + + /* + * The normal fill_textconv usage by the diff machinery would just keep + * the textconv'd buf separate from the diff_filespec. But much of the + * grep code passes around a grep_source and assumes that its "buf" + * pointer is the beginning of the thing we are searching. So let's + * install our textconv'd version into the grep_source, taking care not + * to leak any existing buffer. + */ + grep_source_clear_data(gs); + gs->buf = buf; + gs->size = size; + + return 0; +} + static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits) { char *bol; @@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle unsigned count = 0; int try_lookahead = 0; int show_function = 0; + struct userdiff_driver *textconv = NULL; enum grep_context ctx = GREP_CONTEXT_HEAD; xdemitconf_t xecfg; @@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle } opt->last_shown = 0; - switch (opt->binary) { - case GREP_BINARY_DEFAULT: - if (grep_source_is_binary(gs)) - binary_match_only = 1; - break; - case GREP_BINARY_NOMATCH: - if (grep_source_is_binary(gs)) - return 0; /* Assume unmatch */ - break; - case GREP_BINARY_TEXT: - break; - default: - die("bug: unknown binary handling mode"); + if (opt->allow_textconv) { + grep_source_load_driver(gs); + /* + * We might set up the shared textconv cache data here, which + * is not thread-safe. + */ + grep_attr_lock(); + textconv = userdiff_get_textconv(gs->driver); + grep_attr_unlock(); + } + + /* + * We know the result of a textconv is text, so we only have to care + * about binary handling if we are not using it. + */ + if (!textconv) { + switch (opt->binary) { + case GREP_BINARY_DEFAULT: + if (grep_source_is_binary(gs)) + binary_match_only = 1; + break; + case GREP_BINARY_NOMATCH: + if (grep_source_is_binary(gs)) + return 0; /* Assume unmatch */ + break; + case GREP_BINARY_TEXT: + break; + default: + die("bug: unknown binary handling mode"); + } } memset(&xecfg, 0, sizeof(xecfg)); @@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle try_lookahead = should_lookahead(opt); - if (grep_source_load(gs) < 0) + if (fill_textconv_grep(textconv, gs) < 0) return 0; bol = gs->buf; diff --git a/grep.h b/grep.h index e4a1df5..eaaced1 100644 --- a/grep.h +++ b/grep.h @@ -107,6 +107,7 @@ struct grep_opt { #define GREP_BINARY_NOMATCH 1 #define GREP_BINARY_TEXT 2 int binary; + int allow_textconv; int extended; int use_reflog_filter; int pcre; diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 126fe4c..1eae6a4 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -162,10 +162,30 @@ test_expect_failure 'grep does not honor textconv' ' test_cmp expect actual ' +test_expect_success 'grep --textconv does honor textconv' ' + echo "a:binaryQfile" >expect && + git grep --textconv Qfile >actual && + test_cmp expect actual +' + +test_expect_success 'grep --no-textconv does not honor textconv' ' + test_must_fail git grep --no-textconv Qfile +' + test_expect_failure 'grep blob does not honor textconv' ' echo "HEAD:a:binaryQfile" >expect && git grep Qfile HEAD:a >actual && test_cmp expect actual ' +test_expect_failure 'grep --textconv blob does not honor textconv' ' + echo "HEAD:a:binaryQfile" >expect && + git grep --textconv Qfile HEAD:a >actual && + test_cmp expect actual +' + +test_expect_success 'grep --no-textconv blob does not honor textconv' ' + test_must_fail git grep --no-textconv Qfile HEAD:a +' + test_done -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv2 6/7] grep: honor --textconv for the case rev:path 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber ` (4 preceding siblings ...) 2013-04-23 12:11 ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano Make "grep" honor the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/grep.c | 11 ++++++----- object.c | 26 ++++++++++++++++++++------ object.h | 2 ++ t/t7008-grep-binary.sh | 2 +- 4 files changed, 29 insertions(+), 12 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 00ee57d..bb7f970 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, } static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, - struct object *obj, const char *name) + struct object *obj, const char *name, struct object_context *oc) { if (obj->type == OBJ_BLOB) - return grep_sha1(opt, obj->sha1, name, 0, NULL); + return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL); if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) { struct tree_desc tree; void *data; @@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; real_obj = deref_tag(list->objects[i].item, NULL, 0); - if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) { + if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) { hit = 1; if (opt->status_only) break; @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix) for (i = 0; i < argc; i++) { const char *arg = argv[i]; unsigned char sha1[20]; + struct object_context oc; /* Is it a rev? */ - if (!get_sha1(arg, sha1)) { + if (!get_sha1_with_context(arg, 0, sha1, &oc)) { struct object *object = parse_object_or_die(sha1, arg); if (!seen_dashdash) verify_non_filename(prefix, arg); - add_object_array(object, arg, &list); + add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context))); continue; } if (!strcmp(arg, "--")) { diff --git a/object.c b/object.c index 20703f5..c8ffc9e 100644 --- a/object.c +++ b/object.c @@ -255,12 +255,7 @@ int object_list_contains(struct object_list *list, struct object *obj) return 0; } -void add_object_array(struct object *obj, const char *name, struct object_array *array) -{ - add_object_array_with_mode(obj, name, array, S_IFINVALID); -} - -void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context) { unsigned nr = array->nr; unsigned alloc = array->alloc; @@ -275,9 +270,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj objects[nr].item = obj; objects[nr].name = name; objects[nr].mode = mode; + objects[nr].context = context; array->nr = ++nr; } +void add_object_array(struct object *obj, const char *name, struct object_array *array) +{ + add_object_array_with_mode(obj, name, array, S_IFINVALID); +} + +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +{ + add_object_array_with_mode_context(obj, name, array, mode, NULL); +} + +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context) +{ + if (context) + add_object_array_with_mode_context(obj, name, array, context->mode, context); + else + add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); +} + void object_array_remove_duplicates(struct object_array *array) { unsigned int ref, src, dst; diff --git a/object.h b/object.h index 97d384b..695847d 100644 --- a/object.h +++ b/object.h @@ -13,6 +13,7 @@ struct object_array { struct object *item; const char *name; unsigned mode; + struct object_context *context; } *objects; }; @@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj); /* Object array handling .. */ void add_object_array(struct object *obj, const char *name, struct object_array *array); void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode); +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context); void object_array_remove_duplicates(struct object_array *); void clear_object_flags(unsigned flags); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 1eae6a4..10b2c8b 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -178,7 +178,7 @@ test_expect_failure 'grep blob does not honor textconv' ' test_cmp expect actual ' -test_expect_failure 'grep --textconv blob does not honor textconv' ' +test_expect_success 'grep --textconv blob does honor textconv' ' echo "HEAD:a:binaryQfile" >expect && git grep --textconv Qfile HEAD:a >actual && test_cmp expect actual -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv2 7/7] git grep: honor textconv by default 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber ` (5 preceding siblings ...) 2013-04-23 12:11 ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber @ 2013-04-23 12:11 ` Michael J Gruber 2013-04-23 15:20 ` Junio C Hamano 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-23 12:11 UTC (permalink / raw) To: git; +Cc: Matthieu.Moy, jeremy.rosen, Jeff King, Junio C Hamano Currently, "git grep" does not honor textconv settings by default. Make it honor them by default just like "git log --grep" does. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- Documentation/git-grep.txt | 2 +- grep.c | 2 ++ t/t7008-grep-binary.sh | 4 ++-- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index a5c5a27..f54ac0c 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -82,10 +82,10 @@ OPTIONS --textconv:: Honor textconv filter settings. + This is the default. --no-textconv:: Do not honor textconv filter settings. - This is the default. -i:: --ignore-case:: diff --git a/grep.c b/grep.c index c668034..161d3f0 100644 --- a/grep.c +++ b/grep.c @@ -31,6 +31,7 @@ void init_grep_defaults(void) opt->max_depth = -1; opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED; opt->extended_regexp_option = 0; + opt->allow_textconv = 1; strcpy(opt->color_context, ""); strcpy(opt->color_filename, ""); strcpy(opt->color_function, ""); @@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) opt->pathname = def->pathname; opt->regflags = def->regflags; opt->relative = def->relative; + opt->allow_textconv = def->allow_textconv; strcpy(opt->color_context, def->color_context); strcpy(opt->color_filename, def->color_filename); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 10b2c8b..2fc9d9c 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' ' git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv ' -test_expect_failure 'grep does not honor textconv' ' +test_expect_success 'grep does honor textconv' ' echo "a:binaryQfile" >expect && git grep Qfile >actual && test_cmp expect actual @@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' ' test_must_fail git grep --no-textconv Qfile ' -test_expect_failure 'grep blob does not honor textconv' ' +test_expect_success 'grep blob does honor textconv' ' echo "HEAD:a:binaryQfile" >expect && git grep Qfile HEAD:a >actual && test_cmp expect actual -- 1.8.2.1.799.g1ac2534 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-23 12:11 ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber @ 2013-04-23 15:20 ` Junio C Hamano 2013-04-24 10:05 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-23 15:20 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > Currently, "git grep" does not honor textconv settings by default. > Make it honor them by default just like "git log --grep" does. "git log --grep" looks for strings in the log message which never goes through textconv filters. Puzzled.... If you meant -S/-G, it justifies use of textconv because we are generating diff and the user defines textconv to get a reasonable output for otherwise undiffable contents. I do not know if it is sensible to apply textconv by default for "grep" (or for that matter "git show" that gives blob contents). > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> > --- > Documentation/git-grep.txt | 2 +- > grep.c | 2 ++ > t/t7008-grep-binary.sh | 4 ++-- > 3 files changed, 5 insertions(+), 3 deletions(-) > > diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt > index a5c5a27..f54ac0c 100644 > --- a/Documentation/git-grep.txt > +++ b/Documentation/git-grep.txt > @@ -82,10 +82,10 @@ OPTIONS > > --textconv:: > Honor textconv filter settings. > + This is the default. > > --no-textconv:: > Do not honor textconv filter settings. > - This is the default. > > -i:: > --ignore-case:: > diff --git a/grep.c b/grep.c > index c668034..161d3f0 100644 > --- a/grep.c > +++ b/grep.c > @@ -31,6 +31,7 @@ void init_grep_defaults(void) > opt->max_depth = -1; > opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED; > opt->extended_regexp_option = 0; > + opt->allow_textconv = 1; > strcpy(opt->color_context, ""); > strcpy(opt->color_filename, ""); > strcpy(opt->color_function, ""); > @@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) > opt->pathname = def->pathname; > opt->regflags = def->regflags; > opt->relative = def->relative; > + opt->allow_textconv = def->allow_textconv; > > strcpy(opt->color_context, def->color_context); > strcpy(opt->color_filename, def->color_filename); > diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh > index 10b2c8b..2fc9d9c 100755 > --- a/t/t7008-grep-binary.sh > +++ b/t/t7008-grep-binary.sh > @@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' ' > git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv > ' > > -test_expect_failure 'grep does not honor textconv' ' > +test_expect_success 'grep does honor textconv' ' > echo "a:binaryQfile" >expect && > git grep Qfile >actual && > test_cmp expect actual > @@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' ' > test_must_fail git grep --no-textconv Qfile > ' > > -test_expect_failure 'grep blob does not honor textconv' ' > +test_expect_success 'grep blob does honor textconv' ' > echo "HEAD:a:binaryQfile" >expect && > git grep Qfile HEAD:a >actual && > test_cmp expect actual ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-23 15:20 ` Junio C Hamano @ 2013-04-24 10:05 ` Michael J Gruber 2013-04-24 17:35 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-24 10:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Junio C Hamano venit, vidit, dixit 23.04.2013 17:20: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> Currently, "git grep" does not honor textconv settings by default. >> Make it honor them by default just like "git log --grep" does. > > "git log --grep" looks for strings in the log message which never > goes through textconv filters. > > Puzzled.... > > If you meant -S/-G, it justifies use of textconv because we are > generating diff and the user defines textconv to get a reasonable > output for otherwise undiffable contents. Sorry, yes, I meant "log grep diff", aka "log -S/-G". > I do not know if it is sensible to apply textconv by default for > "grep" (or for that matter "git show" that gives blob contents). Well, that is the discussion that we were having, with no real end result, which is why I haven't implemented this differently yet. My point is that we apply textconv on "log diff greps" already, so why should't we on content greps? The question is really whether we should treat "content" similar to "diff", that's question both when comparing "git log -S" to "git grep" and "git show <commit>" to "git show <blob>". My choice is clear, but others seem torn. For "git grep", implementing a "no-textconv" default is simple, but for "git show <blob>" this appears to be cumbersome to me. >> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> >> --- >> Documentation/git-grep.txt | 2 +- >> grep.c | 2 ++ >> t/t7008-grep-binary.sh | 4 ++-- >> 3 files changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt >> index a5c5a27..f54ac0c 100644 >> --- a/Documentation/git-grep.txt >> +++ b/Documentation/git-grep.txt >> @@ -82,10 +82,10 @@ OPTIONS >> >> --textconv:: >> Honor textconv filter settings. >> + This is the default. >> >> --no-textconv:: >> Do not honor textconv filter settings. >> - This is the default. >> >> -i:: >> --ignore-case:: >> diff --git a/grep.c b/grep.c >> index c668034..161d3f0 100644 >> --- a/grep.c >> +++ b/grep.c >> @@ -31,6 +31,7 @@ void init_grep_defaults(void) >> opt->max_depth = -1; >> opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED; >> opt->extended_regexp_option = 0; >> + opt->allow_textconv = 1; >> strcpy(opt->color_context, ""); >> strcpy(opt->color_filename, ""); >> strcpy(opt->color_function, ""); >> @@ -134,6 +135,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) >> opt->pathname = def->pathname; >> opt->regflags = def->regflags; >> opt->relative = def->relative; >> + opt->allow_textconv = def->allow_textconv; >> >> strcpy(opt->color_context, def->color_context); >> strcpy(opt->color_filename, def->color_filename); >> diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh >> index 10b2c8b..2fc9d9c 100755 >> --- a/t/t7008-grep-binary.sh >> +++ b/t/t7008-grep-binary.sh >> @@ -156,7 +156,7 @@ test_expect_success 'setup textconv filters' ' >> git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv >> ' >> >> -test_expect_failure 'grep does not honor textconv' ' >> +test_expect_success 'grep does honor textconv' ' >> echo "a:binaryQfile" >expect && >> git grep Qfile >actual && >> test_cmp expect actual >> @@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' ' >> test_must_fail git grep --no-textconv Qfile >> ' >> >> -test_expect_failure 'grep blob does not honor textconv' ' >> +test_expect_success 'grep blob does honor textconv' ' >> echo "HEAD:a:binaryQfile" >expect && >> git grep Qfile HEAD:a >actual && >> test_cmp expect actual ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-24 10:05 ` Michael J Gruber @ 2013-04-24 17:35 ` Junio C Hamano 2013-04-24 17:57 ` Matthieu Moy 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-24 17:35 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Matthieu.Moy, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > My point is that we apply textconv on "log diff greps" already, so why > should't we on content greps? I think you are going in circles. If you start from "textconv is about mangling blob contents", then it would look natural to you that "show <blob>", "diff A B", and "grep <pattern> <blob>" would all first mangle the blob contents using textconv and then work on them. But diff.<driver>.textconv is to mangle blob contents in preparation for comparing with another. That is why you explicitly ask "cat-file --textconv" to use the same mangling even when you are not comparing it with anything else. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-24 17:35 ` Junio C Hamano @ 2013-04-24 17:57 ` Matthieu Moy 2013-04-24 18:55 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Matthieu Moy @ 2013-04-24 17:57 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git, jeremy.rosen, Jeff King Junio C Hamano <gitster@pobox.com> writes: > But diff.<driver>.textconv is to mangle blob contents in preparation > for comparing with another. and also in preparation for "blame". In both cases (diff and blame), we're preparing to show the file content to the user, and showing the binary makes no sense. Grepping through the binary, on the other hand, can very well make sense, like: $ git grep foo file.txt: some instance of foo binary file bar.bin matches One reason not to run the filter is performance: "git grep" is fast, and it's cool. My textconv filters are usually slow, and it's not a big problem because the diff machinery will only invoke the textconv filter when the files are modified (i.e. hopefully not often for tracked binary files). OTOH, "git grep" would need to run the textconv filters for each binary files being searched for. I tend to agree with Junio that it makes sense to keep it disabled by default. Perhaps a grep.textconv config option? -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-24 17:57 ` Matthieu Moy @ 2013-04-24 18:55 ` Junio C Hamano 2013-04-26 11:59 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-24 18:55 UTC (permalink / raw) To: Matthieu Moy; +Cc: Michael J Gruber, git, jeremy.rosen, Jeff King Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes: > Grepping through the binary, on the other hand, can very well make > sense, like: > > $ git grep foo > file.txt: some instance of foo > binary file bar.bin matches Yes, I am moderately negative on making it the default, mostly because it goes against established expectations, but I did not mean to say that an ability to pass blob contents through textconv before running grep should not exist. It would be a good option to have. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-24 18:55 ` Junio C Hamano @ 2013-04-26 11:59 ` Michael J Gruber 2013-04-26 13:23 ` Matthieu Moy 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-26 11:59 UTC (permalink / raw) To: Junio C Hamano; +Cc: Matthieu Moy, git, jeremy.rosen, Jeff King Junio C Hamano venit, vidit, dixit 24.04.2013 20:55: > Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes: > >> Grepping through the binary, on the other hand, can very well make >> sense, like: >> >> $ git grep foo >> file.txt: some instance of foo >> binary file bar.bin matches BTW, textconv does not have to be slow - just use textconv-cache. > Yes, > > I am moderately negative on making it the default, mostly because it > goes against established expectations, but I did not mean to say > that an ability to pass blob contents through textconv before > running grep should not exist. It would be a good option to have. I'm still looking for a way to at least treat "git grep" and "git show blob" the same way. I understand that I cannot convince you to change the default here. The two options that I see are: - Implement the --textconv option but leave the default as is. I did that for "git grep" already (just drop 7/7) but it seems to be cumbersome for "git show blob". I have to recheck. - Implement a new attribute "show" analogous to "diff" which applies to the blob case ("git grep" is a blob case, and so is "git show blob") which can specify a "show" driver, which is like a "diff" driver but understands textconv and cachetextconv options only. Here, the default would be "--textconv" in any case, but unless you specify a "show" attribute and driver there is no change in current behavior. The second case is a bit like clean/smudge, so, alternatively, one could add a textconv and cachetextconv option to "filter" rather than introducing "show". I'm not sure how much the textconv machinery needs to change, though. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-26 11:59 ` Michael J Gruber @ 2013-04-26 13:23 ` Matthieu Moy 2013-04-29 9:04 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Matthieu Moy @ 2013-04-26 13:23 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: > BTW, textconv does not have to be slow - just use textconv-cache. Right, thanks for reminding me about this, I had forgotten its existance ;-). > I'm still looking for a way to at least treat "git grep" and "git show > blob" the same way. I agree they should be treated similarly. > - Implement the --textconv option but leave the default as is. I did > that for "git grep" already (just drop 7/7) That seems sensible. > but it seems to be cumbersome for "git show blob". I have to recheck. It should be possible to have a tri-state for the --[no-]textconv option: unset, set to true or set to false. But the code sharing between log, show and diff might make that non-trivial. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-26 13:23 ` Matthieu Moy @ 2013-04-29 9:04 ` Michael J Gruber 2013-04-29 15:04 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-04-29 9:04 UTC (permalink / raw) To: Matthieu Moy; +Cc: Junio C Hamano, git, jeremy.rosen, Jeff King Matthieu Moy venit, vidit, dixit 26.04.2013 15:23: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> BTW, textconv does not have to be slow - just use textconv-cache. > > Right, thanks for reminding me about this, I had forgotten its existance ;-). > >> I'm still looking for a way to at least treat "git grep" and "git show >> blob" the same way. > > I agree they should be treated similarly. > >> - Implement the --textconv option but leave the default as is. I did >> that for "git grep" already (just drop 7/7) > > That seems sensible. > >> but it seems to be cumbersome for "git show blob". I have to recheck. > > It should be possible to have a tri-state for the --[no-]textconv > option: unset, set to true or set to false. But the code sharing between > log, show and diff might make that non-trivial. Right now it's a diffopt bit... Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv2 7/7] git grep: honor textconv by default 2013-04-29 9:04 ` Michael J Gruber @ 2013-04-29 15:04 ` Junio C Hamano 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-04-29 15:04 UTC (permalink / raw) To: Michael J Gruber; +Cc: Matthieu Moy, git, jeremy.rosen, Jeff King Michael J Gruber <git@drmicha.warpmail.net> writes: >> It should be possible to have a tri-state for the --[no-]textconv >> option: unset, set to true or set to false. But the code sharing between >> log, show and diff might make that non-trivial. > > Right now it's a diffopt bit... I wonder if you can do something along the lines of the attached patch. The following discussion assumes that your default wants textconv for generating patches, and no textconv for showing blobs, which is the case your "it is a bit" becomes an issue. The basic structure is that: * There is an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR; * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain to call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to lets you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Your "git show" could start out with ALLOW_TEXTCONV set, but notice explicit requests to --[no-]textconv from the command line in your set_default() callback. And then when it deals with a blob, check if the user touched ALLOW_TEXTCONV and appropriately act on that knowledge. There would be three cases in your set_default callback: * flags has ALLOW_TEXTCONV set, and the bit was touched: the user explicitly said --textconv because she wants blobs to be mangled; * flags has ALLOW_TEXTCONV set, and the bit was not touched: the user did not say --textconv; do not mangle blobs; * flags has ALLOW_TEXTCONV unset; the user did not say --textconv, or explicitly said --no-textconv; do not mangle blobs. The set_default callback can also be used to adjust defaults for fields that are not handled by the DIFF_OPT_SET/CLR/TST, by the way. You can remember the address of the default value you fed to a string field before entering the callchain to diff_opt_parse(), and in your set_default callback see if the value is still pointing at the same piece of memory (in which case the user did not touch it). builtin/log.c | 1 + diff.c | 3 +++ diff.h | 7 +++++-- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 6e56a50..c62ecd1 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -91,6 +91,7 @@ static void cmd_log_init_defaults(struct rev_info *rev) if (default_date_mode) rev->date_mode = parse_date_format(default_date_mode); + rev->diffopt.touched_flags = 0; } static void cmd_log_init_finish(int argc, const char **argv, const char *prefix, diff --git a/diff.c b/diff.c index f0b3e7c..7c24872 100644 --- a/diff.c +++ b/diff.c @@ -3213,6 +3213,9 @@ void diff_setup_done(struct diff_options *options) { int count = 0; + if (options->set_default) + options->set_default(options); + if (options->output_format & DIFF_FORMAT_NAME) count++; if (options->output_format & DIFF_FORMAT_NAME_STATUS) diff --git a/diff.h b/diff.h index 78b4091..5c2f878 100644 --- a/diff.h +++ b/diff.h @@ -87,8 +87,8 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data) #define DIFF_OPT_PICKAXE_IGNORE_CASE (1 << 30) #define DIFF_OPT_TST(opts, flag) ((opts)->flags & DIFF_OPT_##flag) -#define DIFF_OPT_SET(opts, flag) ((opts)->flags |= DIFF_OPT_##flag) -#define DIFF_OPT_CLR(opts, flag) ((opts)->flags &= ~DIFF_OPT_##flag) +#define DIFF_OPT_SET(opts, flag) (((opts)->flags |= DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag)) +#define DIFF_OPT_CLR(opts, flag) (((opts)->flags &= ~DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag)) #define DIFF_XDL_TST(opts, flag) ((opts)->xdl_opts & XDF_##flag) #define DIFF_XDL_SET(opts, flag) ((opts)->xdl_opts |= XDF_##flag) #define DIFF_XDL_CLR(opts, flag) ((opts)->xdl_opts &= ~XDF_##flag) @@ -109,6 +109,7 @@ struct diff_options { const char *single_follow; const char *a_prefix, *b_prefix; unsigned flags; + unsigned touched_flags; int use_color; int context; int interhunkcontext; @@ -145,6 +146,8 @@ struct diff_options { /* to support internal diff recursion by --follow hack*/ int found_follow; + void (*set_default)(struct diff_options *); + FILE *file; int close_file; ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv3 0/7] textconv with grep and show @ 2013-05-10 15:08 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber ` (6 more replies) 0 siblings, 7 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:08 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy This is the "Git Merge edition" of the textconv series. Great conference, the series struggles to match that. v3 keeps all defaults as they are (no textconv for blobs by default) and incorporates Junio's touched_flags patch. I do not need the callback but left it in. As for beeing able to choose textconv as a default for blobs, I'm wondering how fine grained that needs to be: one for all, differentiate between "show blob" and "grep", or even driver specific (diff.driver.blobtextconv boolean). In the latter case, it's not clear to me how "--textconv" with a "false" config should behave. Jeff King (1): grep: allow to use textconv filters Junio C Hamano (1): diff_opt: track whether flags have been set explicitly Michael J Gruber (5): t4030: demonstrate behavior of show with textconv show: honor --textconv for blobs cat-file: do not die on --textconv without textconv filters t7008: demonstrate behavior of grep with textconv grep: honor --textconv for the case rev:path Documentation/git-grep.txt | 9 +++- Documentation/technical/api-diff.txt | 10 +++- builtin/cat-file.c | 18 +++---- builtin/grep.c | 13 +++-- builtin/log.c | 26 +++++++-- diff.c | 3 ++ diff.h | 8 ++- grep.c | 100 ++++++++++++++++++++++++++++++----- grep.h | 1 + object.c | 26 ++++++--- object.h | 2 + t/t4030-diff-textconv.sh | 24 +++++++++ t/t7008-grep-binary.sh | 31 +++++++++++ t/t8007-cat-file-textconv.sh | 20 ++----- 14 files changed, 234 insertions(+), 57 deletions(-) -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber ` (5 subsequent siblings) 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy "git show <commit>" honors the --textconv option while "git show <blob>" does not. Demonstrate this in the test. Since the current behavior is supposed to stay as is, we expect the default for "git show <blob>" to remain --no-textconv. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t4030-diff-textconv.sh | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index 53ec330..3950fc9 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -58,6 +58,12 @@ test_expect_success 'diff produces text' ' test_cmp expect.text actual ' +test_expect_success 'show commit produces text' ' + git show HEAD >diff && + find_diff <diff >actual && + test_cmp expect.text actual +' + test_expect_success 'diff-tree produces binary' ' git diff-tree -p HEAD^ HEAD >diff && find_diff <diff >actual && @@ -84,6 +90,24 @@ test_expect_success 'status -v produces text' ' git reset --soft HEAD@{1} ' +test_expect_success 'show blob produces binary' ' + git show HEAD:file >actual && + printf "\\0\\n\\01\\n" >expect && + test_cmp expect actual +' + +test_expect_failure 'show --textconv blob produces text' ' + git show --textconv HEAD:file >actual && + printf "0\\n1\\n" >expect && + test_cmp expect actual +' + +test_success 'show --no-textconv blob produces binary' ' + git show --textconv HEAD:file >actual && + printf "\\0\\n\\01\\n" >expect && + test_cmp expect actual +' + test_expect_success 'grep-diff (-G) operates on textconv data (add)' ' echo one >expect && git log --root --format=%s -G0 >actual && -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 15:31 ` Eric Sunshine 2013-05-10 15:10 ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber ` (4 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy From: Junio C Hamano <gitster@pobox.com> The diff_opt infrastructure sets flags based on defaults and command line options. Currently, it is impossible to detect whether a flag has been set as a default or on explicit request. Amend the structure so that this detection is possible: * There is an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR; * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to lets you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- Documentation/technical/api-diff.txt | 10 +++++++++- builtin/log.c | 1 + diff.c | 3 +++ diff.h | 8 ++++++-- 4 files changed, 19 insertions(+), 3 deletions(-) diff --git a/Documentation/technical/api-diff.txt b/Documentation/technical/api-diff.txt index 2d2ebc0..8b001de 100644 --- a/Documentation/technical/api-diff.txt +++ b/Documentation/technical/api-diff.txt @@ -28,7 +28,8 @@ Calling sequence * Call `diff_setup_done()`; this inspects the options set up so far for internal consistency and make necessary tweaking to it (e.g. if - textual patch output was asked, recursive behaviour is turned on). + textual patch output was asked, recursive behaviour is turned on); + the callback set_default in diff_options can be used to tweak this more. * As you find different pairs of files, call `diff_change()` to feed modified files, `diff_addremove()` to feed created or deleted files, @@ -115,6 +116,13 @@ Notable members are: operation, but some do not have anything to do with the diffcore library. +`touched_flags`:: + Records whether a flag has been changed due to user request + (rather than just set/unset by default). + +`set_default`:: + Callback which allows tweaking the options in diff_setup_done(). + BINARY, TEXT;; Affects the way how a file that is seemingly binary is treated. diff --git a/builtin/log.c b/builtin/log.c index 9e21232..f19d779 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -111,6 +111,7 @@ static void cmd_log_init_defaults(struct rev_info *rev) if (default_date_mode) rev->date_mode = parse_date_format(default_date_mode); + rev->diffopt.touched_flags = 0; } static void cmd_log_init_finish(int argc, const char **argv, const char *prefix, diff --git a/diff.c b/diff.c index f0b3e7c..7c24872 100644 --- a/diff.c +++ b/diff.c @@ -3213,6 +3213,9 @@ void diff_setup_done(struct diff_options *options) { int count = 0; + if (options->set_default) + options->set_default(options); + if (options->output_format & DIFF_FORMAT_NAME) count++; if (options->output_format & DIFF_FORMAT_NAME_STATUS) diff --git a/diff.h b/diff.h index 78b4091..e995ae1 100644 --- a/diff.h +++ b/diff.h @@ -87,8 +87,9 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data) #define DIFF_OPT_PICKAXE_IGNORE_CASE (1 << 30) #define DIFF_OPT_TST(opts, flag) ((opts)->flags & DIFF_OPT_##flag) -#define DIFF_OPT_SET(opts, flag) ((opts)->flags |= DIFF_OPT_##flag) -#define DIFF_OPT_CLR(opts, flag) ((opts)->flags &= ~DIFF_OPT_##flag) +#define DIFF_OPT_TOUCHED(opts, flag) ((opts)->touched_flags & DIFF_OPT_##flag) +#define DIFF_OPT_SET(opts, flag) (((opts)->flags |= DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag)) +#define DIFF_OPT_CLR(opts, flag) (((opts)->flags &= ~DIFF_OPT_##flag),((opts)->touched_flags |= DIFF_OPT_##flag)) #define DIFF_XDL_TST(opts, flag) ((opts)->xdl_opts & XDF_##flag) #define DIFF_XDL_SET(opts, flag) ((opts)->xdl_opts |= XDF_##flag) #define DIFF_XDL_CLR(opts, flag) ((opts)->xdl_opts &= ~XDF_##flag) @@ -109,6 +110,7 @@ struct diff_options { const char *single_follow; const char *a_prefix, *b_prefix; unsigned flags; + unsigned touched_flags; int use_color; int context; int interhunkcontext; @@ -145,6 +147,8 @@ struct diff_options { /* to support internal diff recursion by --follow hack*/ int found_follow; + void (*set_default)(struct diff_options *); + FILE *file; int close_file; -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly 2013-05-10 15:10 ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber @ 2013-05-10 15:31 ` Eric Sunshine 0 siblings, 0 replies; 78+ messages in thread From: Eric Sunshine @ 2013-05-10 15:31 UTC (permalink / raw) To: Michael J Gruber; +Cc: Git List, Jeff King, Junio C Hamano, Matthieu Moy On Fri, May 10, 2013 at 11:10 AM, Michael J Gruber <git@drmicha.warpmail.net> wrote: > From: Junio C Hamano <gitster@pobox.com> > > The diff_opt infrastructure sets flags based on defaults and command > line options. Currently, it is impossible to detect whether a flag has > been set as a default or on explicit request. > > Amend the structure so that this detection is possible: > > * There is an extra "opt->touched_flags" that keeps track of all > the fields that have been touched by DIFF_OPT_SET and > DIFF_OPT_CLR; > > * You may continue setting the default values to the flags, like > commands in the "log" family do in cmd_log_init_defaults(), but > after you finished setting the defaults, you clear the > touched_flags field; > > * And then you let the usual callchain call diff_opt_parse(), > allowing the opt->flags be set or unset, while keeping track of > which bits the user touched; > > * There is an optional callback "opt->set_default" that is called > at the very beginning to lets you inspect touched_flags and s/lets/let/ > update opt->flags appropriately, before the remainder of the > diffcore machinery is set up, taking the opt->flags value into > account. > > Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 17:02 ` Junio C Hamano 2013-05-10 15:10 ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber ` (3 subsequent siblings) 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy Currently, "diff" and "cat-file" for blobs honor "--textconv" options (with the former defaulting to "--textconv" and the latter to "--no-textconv") whereas "show" does not honor this option, even though it takes diff options. Make "show" on blobs behave like "diff", i.e. honor "--textconv" by default and "--no-textconv" when given. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/log.c | 25 ++++++++++++++++++++++--- t/t4030-diff-textconv.sh | 6 +++--- 2 files changed, 25 insertions(+), 6 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index f19d779..dd3f108 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -437,10 +437,29 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(&out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) { + unsigned char sha1c[20]; + struct object_context obj_context; + char *buf; + unsigned long size; + fflush(stdout); - return stream_blob_to_fd(1, sha1, NULL, 0); + if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) || + !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) + die("Not a valid object name %s", obj_name); + if (!obj_context.path[0] || + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (!buf) + die("git show %s: bad file", obj_name); + + write_or_die(1, buf, size); + return 0; } static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) @@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) const char *name = objects[i].name; switch (o->type) { case OBJ_BLOB: - ret = show_blob_object(o->sha1, NULL); + ret = show_blob_object(o->sha1, &rev, name); break; case OBJ_TAG: { struct tag *t = (struct tag *)o; diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh index 3950fc9..0ebb028 100755 --- a/t/t4030-diff-textconv.sh +++ b/t/t4030-diff-textconv.sh @@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' ' test_cmp expect actual ' -test_expect_failure 'show --textconv blob produces text' ' +test_expect_success 'show --textconv blob produces text' ' git show --textconv HEAD:file >actual && printf "0\\n1\\n" >expect && test_cmp expect actual ' -test_success 'show --no-textconv blob produces binary' ' - git show --textconv HEAD:file >actual && +test_expect_success 'show --no-textconv blob produces binary' ' + git show --no-textconv HEAD:file >actual && printf "\\0\\n\\01\\n" >expect && test_cmp expect actual ' -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 15:10 ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber @ 2013-05-10 17:02 ` Junio C Hamano 2013-05-10 17:34 ` Jeff King 2013-05-11 8:54 ` Michael J Gruber 0 siblings, 2 replies; 78+ messages in thread From: Junio C Hamano @ 2013-05-10 17:02 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy Michael J Gruber <git@drmicha.warpmail.net> writes: > Currently, "diff" and "cat-file" for blobs honor "--textconv" options > (with the former defaulting to "--textconv" and the latter to > "--no-textconv") whereas "show" does not honor this option, even though > it takes diff options. > > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by > default and "--no-textconv" when given. Hmm... > +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) > { > + unsigned char sha1c[20]; > + struct object_context obj_context; > + char *buf; > + unsigned long size; > + > fflush(stdout); > - return stream_blob_to_fd(1, sha1, NULL, 0); > + if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) || > + !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) > + return stream_blob_to_fd(1, sha1, NULL, 0); It is surprising that the necessary change is only this, but I think it is correct ;-). We ignore textconv when the command line did not mention --[no-]textconv, or the command line said --no-textconv explicitly. This (especially the first condition) may deserve an in-code comment for anybody who wonders where this default behaviour is implemented. So "show" on blobs does show the raw contents by default, but the user can explicitly ask to enable textconv with --[no-]textconv. Is the second paragraph in the log message still valid? > + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) > + die("Not a valid object name %s", obj_name); This looks somewhat unfortunate. We already have sha1[]; actually we not just know sha1[] but have the struct object for it. How did we obtain it before we got here? Will we always have a valid name in rev.pending.objects->name? Will that name convert back to the same sha1 we got in sha1[]? I think the answers are "Yes (it is a command line argument), Yes (that is what setup_revisions() got by feeding the name to give us sha1[])". I wonder if enriching rev_info->pending with the context information might be a clean solution to avoid this redundant but unavoidable conversion, but that is a separate and future topic, I think. > + if (!obj_context.path[0] || > + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) > + return stream_blob_to_fd(1, sha1, NULL, 0); > + > + if (!buf) > + die("git show %s: bad file", obj_name); > + > + write_or_die(1, buf, size); > + return 0; > } > > static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) > @@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) > const char *name = objects[i].name; > switch (o->type) { > case OBJ_BLOB: > - ret = show_blob_object(o->sha1, NULL); > + ret = show_blob_object(o->sha1, &rev, name); > break; > case OBJ_TAG: { > struct tag *t = (struct tag *)o; > diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh > index 3950fc9..0ebb028 100755 > --- a/t/t4030-diff-textconv.sh > +++ b/t/t4030-diff-textconv.sh > @@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' ' > test_cmp expect actual > ' > > -test_expect_failure 'show --textconv blob produces text' ' > +test_expect_success 'show --textconv blob produces text' ' > git show --textconv HEAD:file >actual && > printf "0\\n1\\n" >expect && > test_cmp expect actual > ' > > -test_success 'show --no-textconv blob produces binary' ' > - git show --textconv HEAD:file >actual && > +test_expect_success 'show --no-textconv blob produces binary' ' > + git show --no-textconv HEAD:file >actual && > printf "\\0\\n\\01\\n" >expect && > test_cmp expect actual > ' ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 17:02 ` Junio C Hamano @ 2013-05-10 17:34 ` Jeff King 2013-05-10 18:04 ` Junio C Hamano 2013-05-11 8:54 ` Michael J Gruber 1 sibling, 1 reply; 78+ messages in thread From: Jeff King @ 2013-05-10 17:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy On Fri, May 10, 2013 at 10:02:51AM -0700, Junio C Hamano wrote: > > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by > > default and "--no-textconv" when given. > [...] > So "show" on blobs does show the raw contents by default, but the > user can explicitly ask to enable textconv with --[no-]textconv. Is > the second paragraph in the log message still valid? Yes, I had the same thought... > > + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) > > + die("Not a valid object name %s", obj_name); > > This looks somewhat unfortunate. > [...] > I wonder if enriching rev_info->pending with the context information > might be a clean solution to avoid this redundant but unavoidable > conversion, but that is a separate and future topic, I think. It would be, and indeed, that is similar to what the final patch does. The problem is that it requires an extra allocation (we do not want to unconditionally put the object_context into the object_array because it is too big, so we add only a pointer). So having rev_info->pending store that information would mean that callers would have to know to free it when freeing the pending array. We would have to either teach each existing caller to do so, or perhaps enable the behavior only when a certain flag is set (e.g., rev->keep_object_context or something). -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 17:34 ` Jeff King @ 2013-05-10 18:04 ` Junio C Hamano 2013-05-11 0:25 ` Jeff King 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-05-10 18:04 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git, Matthieu Moy Jeff King <peff@peff.net> writes: > On Fri, May 10, 2013 at 10:02:51AM -0700, Junio C Hamano wrote: > >> > Make "show" on blobs behave like "diff", i.e. honor "--textconv" by >> > default and "--no-textconv" when given. >> [...] >> So "show" on blobs does show the raw contents by default, but the >> user can explicitly ask to enable textconv with --[no-]textconv. Is >> the second paragraph in the log message still valid? > > Yes, I had the same thought... I'd rewrite the paragraph to something like: Make "show" on blobs honor "--textconv" when it is asked. The default is not to apply textconv, which is in line with what "cat-file" does. >> > + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) >> > + die("Not a valid object name %s", obj_name); >> >> This looks somewhat unfortunate. >> [...] >> I wonder if enriching rev_info->pending with the context information >> might be a clean solution to avoid this redundant but unavoidable >> conversion, but that is a separate and future topic, I think. > > It would be, and indeed, that is similar to what the final patch does. OK, I wasn't paying attention ;-) > The problem is that it requires an extra allocation (we do not want to > unconditionally put the object_context into the object_array because it > is too big, so we add only a pointer). So having rev_info->pending store > that information would mean that callers would have to know to free it > when freeing the pending array. We would have to either teach each > existing caller to do so, or perhaps enable the behavior only when a > certain flag is set (e.g., rev->keep_object_context or something). One thing to notice is that those accessing rev->pending before calling prepare_revision_walk(), as opposed to those receiving objects in rev->commits via get_revision(), are the only ones that care about the context and wants to act differently depending on where these came from and how they were specified. That suggests at least two possibilities to me: - Perhaps we can place the context in rev->pending and clear them when prepare_revision_walk() moves them to rev->commits, without introducing rev->keep_object_context? - Perhaps instead of extending object-array, we can move this kind of information to rev_cmdline and enrich that structure? ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 18:04 ` Junio C Hamano @ 2013-05-11 0:25 ` Jeff King 2013-05-11 19:54 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-05-11 0:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy On Fri, May 10, 2013 at 11:04:01AM -0700, Junio C Hamano wrote: > One thing to notice is that those accessing rev->pending before > calling prepare_revision_walk(), as opposed to those receiving > objects in rev->commits via get_revision(), are the only ones that > care about the context and wants to act differently depending on > where these came from and how they were specified. > > That suggests at least two possibilities to me: > > - Perhaps we can place the context in rev->pending and clear them > when prepare_revision_walk() moves them to rev->commits, without > introducing rev->keep_object_context? > > - Perhaps instead of extending object-array, we can move this kind > of information to rev_cmdline and enrich that structure? Without looking too closely to see whether it is feasible, I would think the latter would end up being much more elegant, since I think it already deals with some allocation issues already. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-11 0:25 ` Jeff King @ 2013-05-11 19:54 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-05-11 19:54 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git, Matthieu Moy Jeff King <peff@peff.net> writes: > On Fri, May 10, 2013 at 11:04:01AM -0700, Junio C Hamano wrote: > >> One thing to notice is that those accessing rev->pending before >> calling prepare_revision_walk(), as opposed to those receiving >> objects in rev->commits via get_revision(), are the only ones that >> care about the context and wants to act differently depending on >> where these came from and how they were specified. >> >> That suggests at least two possibilities to me: >> >> - Perhaps we can place the context in rev->pending and clear them >> when prepare_revision_walk() moves them to rev->commits, without >> introducing rev->keep_object_context? >> >> - Perhaps instead of extending object-array, we can move this kind >> of information to rev_cmdline and enrich that structure? > > Without looking too closely to see whether it is feasible, I would think > the latter would end up being much more elegant, since I think it > already deals with some allocation issues already. Yeah. I am fairly reluctant to apply a change that makes entries in object-array larger. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-10 17:02 ` Junio C Hamano 2013-05-10 17:34 ` Jeff King @ 2013-05-11 8:54 ` Michael J Gruber 2013-05-11 10:00 ` Michael J Gruber 2013-05-11 17:36 ` Junio C Hamano 1 sibling, 2 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-11 8:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King, Matthieu Moy Junio C Hamano venit, vidit, dixit 10.05.2013 19:02: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> Currently, "diff" and "cat-file" for blobs honor "--textconv" options >> (with the former defaulting to "--textconv" and the latter to >> "--no-textconv") whereas "show" does not honor this option, even though >> it takes diff options. >> >> Make "show" on blobs behave like "diff", i.e. honor "--textconv" by >> default and "--no-textconv" when given. > > Hmm... Sorry, I overlooked that ;) >> +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) >> { >> + unsigned char sha1c[20]; >> + struct object_context obj_context; >> + char *buf; >> + unsigned long size; >> + >> fflush(stdout); >> - return stream_blob_to_fd(1, sha1, NULL, 0); >> + if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) || >> + !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) >> + return stream_blob_to_fd(1, sha1, NULL, 0); > > It is surprising that the necessary change is only this, but I think > it is correct ;-). We ignore textconv when the command line did not > mention --[no-]textconv, or the command line said --no-textconv > explicitly. > > This (especially the first condition) may deserve an in-code comment > for anybody who wonders where this default behaviour is implemented. It's not as if we would document behavior by in-code comments in general, do we? The usual answer is "git log -S" or "git blame". > So "show" on blobs does show the raw contents by default, but the > user can explicitly ask to enable textconv with --[no-]textconv. Is > the second paragraph in the log message still valid? > >> + if (get_sha1_with_context(obj_name, 0, sha1c, &obj_context)) >> + die("Not a valid object name %s", obj_name); > > This looks somewhat unfortunate. > > We already have sha1[]; actually we not just know sha1[] but have > the struct object for it. How did we obtain it before we got here? > > Will we always have a valid name in rev.pending.objects->name? Will > that name convert back to the same sha1 we got in sha1[]? > > I think the answers are "Yes (it is a command line argument), Yes > (that is what setup_revisions() got by feeding the name to give us > sha1[])". > > I wonder if enriching rev_info->pending with the context information > might be a clean solution to avoid this redundant but unavoidable > conversion, but that is a separate and future topic, I think. Yes, I think both Jeff and I have thought it and came to the same conclusion - "later" ;) >> + if (!obj_context.path[0] || >> + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, &buf, &size)) >> + return stream_blob_to_fd(1, sha1, NULL, 0); >> + >> + if (!buf) >> + die("git show %s: bad file", obj_name); >> + >> + write_or_die(1, buf, size); >> + return 0; >> } >> >> static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) >> @@ -526,7 +545,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) >> const char *name = objects[i].name; >> switch (o->type) { >> case OBJ_BLOB: >> - ret = show_blob_object(o->sha1, NULL); >> + ret = show_blob_object(o->sha1, &rev, name); >> break; >> case OBJ_TAG: { >> struct tag *t = (struct tag *)o; >> diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh >> index 3950fc9..0ebb028 100755 >> --- a/t/t4030-diff-textconv.sh >> +++ b/t/t4030-diff-textconv.sh >> @@ -96,14 +96,14 @@ test_expect_success 'show blob produces binary' ' >> test_cmp expect actual >> ' >> >> -test_expect_failure 'show --textconv blob produces text' ' >> +test_expect_success 'show --textconv blob produces text' ' >> git show --textconv HEAD:file >actual && >> printf "0\\n1\\n" >expect && >> test_cmp expect actual >> ' >> >> -test_success 'show --no-textconv blob produces binary' ' >> - git show --textconv HEAD:file >actual && >> +test_expect_success 'show --no-textconv blob produces binary' ' >> + git show --no-textconv HEAD:file >actual && >> printf "\\0\\n\\01\\n" >expect && >> test_cmp expect actual >> ' ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-11 8:54 ` Michael J Gruber @ 2013-05-11 10:00 ` Michael J Gruber 2013-05-13 5:01 ` Junio C Hamano 2013-05-11 17:36 ` Junio C Hamano 1 sibling, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-05-11 10:00 UTC (permalink / raw) Cc: Junio C Hamano, git, Jeff King, Matthieu Moy Adding to that: Somehow I still feel I should introduce a new attribute "show" (or a better name) similar to "diff" so that you can specifiy a diff driver to use for showing a blob (or grepping it), which may or may not be the same you use for "diff". This would be a much more fine-grained and systematic way of setting a default for "--textconv" for blobs. Of course, some driver attributes would just not matter for coverting blobs, but that doesn't hurt. I'm just wondering whether it's worth the effort and whether I should distinguish between "show" and grep". So, the sructure would be: "--textconv" is on by default for diff, show, grep. diff looks for a textconv driver using the "diff" attribute. show/grep look for a textconv driver using the "show" attribute. That way, turning on "--textconv" by default does not affect anyone unless a user specifies the new attribute! Also, all commands would behave "the same way" if you have both a diff and a show attribute set on the same files.. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-11 10:00 ` Michael J Gruber @ 2013-05-13 5:01 ` Junio C Hamano 2013-05-13 11:55 ` Jeff King 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-05-13 5:01 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy Michael J Gruber <git@drmicha.warpmail.net> writes: > Adding to that: > > Somehow I still feel I should introduce a new attribute "show" (or a > better name) similar to "diff" so that you can specifiy a diff driver to > use for showing a blob (or grepping it), which may or may not be the > same you use for "diff". This would be a much more fine-grained and > systematic way of setting a default for "--textconv" for blobs. > > Of course, some driver attributes would just not matter for coverting > blobs, but that doesn't hurt. > > I'm just wondering whether it's worth the effort and whether I should > distinguish between "show" and grep". Haven't thought things through, but my gut feeling is that it is on the other side of the line. We could of course add more features and over-engineered mechanisms, and the implementation may end up to be even modular and clean, but I cannot answer "Yes" with a confidence to the question "Does such a fine grained control help the users?" and cannot answer "If so in what way?" myself. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-13 5:01 ` Junio C Hamano @ 2013-05-13 11:55 ` Jeff King 2013-05-13 14:57 ` Michael J Gruber 0 siblings, 1 reply; 78+ messages in thread From: Jeff King @ 2013-05-13 11:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git, Matthieu Moy On Sun, May 12, 2013 at 10:01:38PM -0700, Junio C Hamano wrote: > Michael J Gruber <git@drmicha.warpmail.net> writes: > > > Adding to that: > > > > Somehow I still feel I should introduce a new attribute "show" (or a > > better name) similar to "diff" so that you can specifiy a diff driver to > > use for showing a blob (or grepping it), which may or may not be the > > same you use for "diff". This would be a much more fine-grained and > > systematic way of setting a default for "--textconv" for blobs. > > > > Of course, some driver attributes would just not matter for coverting > > blobs, but that doesn't hurt. > > > > I'm just wondering whether it's worth the effort and whether I should > > distinguish between "show" and grep". > > Haven't thought things through, but my gut feeling is that it is on > the other side of the line. We could of course add more features and > over-engineered mechanisms, and the implementation may end up to be > even modular and clean, but I cannot answer "Yes" with a confidence > to the question "Does such a fine grained control help the users?" > and cannot answer "If so in what way?" myself. Yeah, I think the _most_ flexible thing is going to look something like: $ cat .gitattributes *.pdf diff=pdf show=pdf $ cat ~/.gitconfig [diff "pdf"] textconv = ... [show "pdf"] textconv = ... But that obviously sucks, because in the common case that you want to use the same command, you are repeating yourself in the config. You could assume that the "show" attribute points us at a "diff" block. And that makes sense for textconv, but what does it mean if you have "show=foo" and "diff.foo.command" set? If the _only_ thing you would want to do with such a "show" mechanism is to display converted contents on show/grep, then we could lose the flexibility and say that "show" is a single-bit: do we respect diff textconv for show/grep in this case, or not? And that leaves only the question of where to put it: is it a gitattribute, or does it go in the config? I don't think that it is a property of the file itself. That is, you do not say "foo files are inherently uninteresting to git-show, and therefore we always convert them, whereas bar files do not have that property'. You say "in my workflows, I expect to see converted results from grep/show". And the latter points to using config, like either "diff.*.showConverted" (to allow per-type setting), or even "grep.useTextconv" and "show.textConv" (to allow setting it per-user for all types). And of course for any workflow-oriented config, you will sometimes want to override it for a particular operation. But that is why we have a command-line escape hatch, and that part is already implemented. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-13 11:55 ` Jeff King @ 2013-05-13 14:57 ` Michael J Gruber 2013-05-13 15:41 ` Junio C Hamano 2013-05-16 3:31 ` Jeff King 0 siblings, 2 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-13 14:57 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git, Matthieu Moy Jeff King venit, vidit, dixit 13.05.2013 13:55: > On Sun, May 12, 2013 at 10:01:38PM -0700, Junio C Hamano wrote: > >> Michael J Gruber <git@drmicha.warpmail.net> writes: >> >>> Adding to that: >>> >>> Somehow I still feel I should introduce a new attribute "show" (or a >>> better name) similar to "diff" so that you can specifiy a diff driver to >>> use for showing a blob (or grepping it), which may or may not be the >>> same you use for "diff". This would be a much more fine-grained and >>> systematic way of setting a default for "--textconv" for blobs. >>> >>> Of course, some driver attributes would just not matter for coverting >>> blobs, but that doesn't hurt. >>> >>> I'm just wondering whether it's worth the effort and whether I should >>> distinguish between "show" and grep". >> >> Haven't thought things through, but my gut feeling is that it is on >> the other side of the line. We could of course add more features and >> over-engineered mechanisms, and the implementation may end up to be >> even modular and clean, but I cannot answer "Yes" with a confidence >> to the question "Does such a fine grained control help the users?" >> and cannot answer "If so in what way?" myself. > > Yeah, I think the _most_ flexible thing is going to look something like: > > $ cat .gitattributes > *.pdf diff=pdf show=pdf > > $ cat ~/.gitconfig > [diff "pdf"] > textconv = ... > [show "pdf"] > textconv = ... > > But that obviously sucks, because in the common case that you want to > use the same command, you are repeating yourself in the config. You > could assume that the "show" attribute points us at a "diff" block. And > that makes sense for textconv, but what does it mean if you have > "show=foo" and "diff.foo.command" set? I don't propose "show drivers". In your example above, you would point to the same diff driver. If you use a diff driver just with the "show" attribute then only its textconv config will be relevant. But you do have the possibility to use different drivers for diff and show. For example, for showing a file some sort of automatic pagination or line numbering can be helpful whereas it would hurt the diff case. > If the _only_ thing you would want to do with such a "show" mechanism is > to display converted contents on show/grep, then we could lose the > flexibility and say that "show" is a single-bit: do we respect diff > textconv for show/grep in this case, or not? And that leaves only the > question of where to put it: is it a gitattribute, or does it go in the > config? > > I don't think that it is a property of the file itself. That is, you do > not say "foo files are inherently uninteresting to git-show, and > therefore we always convert them, whereas bar files do not have that > property'. You say "in my workflows, I expect to see converted results > from grep/show". And the latter points to using config, like either > "diff.*.showConverted" (to allow per-type setting), or even > "grep.useTextconv" and "show.textConv" (to allow setting it per-user for > all types). I strongly disagree here. I have textconv filters for pdf, gpg, odf, xls, doc, xoj... I know, ugly. At least some of them would benefit from different filteres or different settings. The way I propose it, a user would just have to add "show=foo" to the "diff=foo" lines without having to ad an extra filter, but with the flexibility to do so. > And of course for any workflow-oriented config, you will sometimes want > to override it for a particular operation. But that is why we have a > command-line escape hatch, and that part is already implemented. One may ask what a purely ui output oriented setting like "show" has to do in .gitattributes, of course, but that applies to "diff" as well. Separating the two (one in attributes, one in config) looks artificial to me. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-13 14:57 ` Michael J Gruber @ 2013-05-13 15:41 ` Junio C Hamano 2013-05-16 3:31 ` Jeff King 1 sibling, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-05-13 15:41 UTC (permalink / raw) To: Michael J Gruber; +Cc: Jeff King, git, Matthieu Moy Michael J Gruber <git@drmicha.warpmail.net> writes: > But you do have the possibility to use different drivers for diff and > show. For example, for showing a file some sort of automatic pagination > or line numbering can be helpful whereas it would hurt the diff case. I do not find the example convincing (yet); it looks more like you are grasping for straws. You would certainly do not want "line numbering" in grep. My gut feeling is that normal users would expect to have a single "text version" and pass that to "pr" (if they want pagination) or "cat -n" (if they want line numbering), regardless of where it comes from, be it "git show --textconv" or some other program output, but you seem to want to have different "text version"s for different purposes out of a single binary file.... > I strongly disagree here. I have textconv filters for pdf, gpg, odf, > xls, doc, xoj... I know, ugly. At least some of them would benefit from > different filteres or different settings. .... and an example to show why it is useful would help here. I do not feel that I have seen anything to substantiate "at least some of them would benefit" yet. Would it follow that "grep" and "cat-file" should be controlled by yet two other knobs so that optionally the user can use different "text version"s meant for them? > The way I propose it, a user would just have to add "show=foo" to the > "diff=foo" lines without having to ad an extra filter, but with the > flexibility to do so. > >> And of course for any workflow-oriented config, you will sometimes want >> to override it for a particular operation. But that is why we have a >> command-line escape hatch, and that part is already implemented. > > One may ask what a purely ui output oriented setting like "show" has to > do in .gitattributes, of course, but that applies to "diff" as well. > Separating the two (one in attributes, one in config) looks artificial > to me. I am not sure what you mean by "artificial", but the separation of the roles between attribute and config is not artificial at all. It is very much deliberate and done for a good reason. The attribute specifies what the type of the file is project wide and is meant to go in in-tree .gitattrbute file, shared among people on different platforms. It says things like "These files are PDF". The config specifies what should happen to the type of a file on a particular platform each user uses to work in the copy of the project, i.e. repository. It says things like "Pass PDF files through /opt/bin/pdf2txt", which obviously cannot be shared across platforms. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-13 14:57 ` Michael J Gruber 2013-05-13 15:41 ` Junio C Hamano @ 2013-05-16 3:31 ` Jeff King 1 sibling, 0 replies; 78+ messages in thread From: Jeff King @ 2013-05-16 3:31 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git, Matthieu Moy On Mon, May 13, 2013 at 04:57:55PM +0200, Michael J Gruber wrote: > > I don't think that it is a property of the file itself. That is, you do > > not say "foo files are inherently uninteresting to git-show, and > > therefore we always convert them, whereas bar files do not have that > > property'. You say "in my workflows, I expect to see converted results > > from grep/show". And the latter points to using config, like either > > "diff.*.showConverted" (to allow per-type setting), or even > > "grep.useTextconv" and "show.textConv" (to allow setting it per-user for > > all types). > > I strongly disagree here. I have textconv filters for pdf, gpg, odf, > xls, doc, xoj... I know, ugly. At least some of them would benefit from > different filteres or different settings. OK. I was speaking mostly from intuition, and I suspect you have more real-world experience here. So I am willing to admit that my "you do not say..." above was a strawman. :) > The way I propose it, a user would just have to add "show=foo" to the > "diff=foo" lines without having to ad an extra filter, but with the > flexibility to do so. Yes, I think that would work OK. The only problem is that it is a bit weird to pointing "show=foo" to "diff.foo.*", especially when most of the driver options are ignored. But if we can accept that wrinkle in the UI, I think it would otherwise do what users want. > One may ask what a purely ui output oriented setting like "show" has to > do in .gitattributes, of course, but that applies to "diff" as well. > Separating the two (one in attributes, one in config) looks artificial > to me. I think the point is that the attribute says "a property of this path is that it has type X". And then the config says "when you see type X, do this thing with it". So arguably "diff=X" is wrong in the first place. It should be "type=X", and we should have "diff.X", "merge.X", etc in the config. And diff.*.textconv is potentially misplaced; it is not really about diffing at all, but rather about creating a human-readable presentation for the file. I don't think it is so bad that it is worth the pain of fixing it now, though. It is a historical weirdness that "diff=X" means "present the path according to the rules in X", but we can live with that. But if we think of it that way, then automatically respecting textconv for "git show" is a sensible thing to do. Hmph. Now I may have convinced myself that flipping the default is the right thing. :) So if it is not clear, I am pretty on the fence about how the defaults should be handled, or what would surprise users the least. Either way, though, it would probably make sense to have a configurable option. And with the reasoning above for the split between attributes/config, it would make sense to me for that option to be a boolean "diff.X.showtextconv". Which seems totally odd and broken (we are not doing a diff at all!), but that is where the textconv config lives, for historical reasons. -Peff ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-11 8:54 ` Michael J Gruber 2013-05-11 10:00 ` Michael J Gruber @ 2013-05-11 17:36 ` Junio C Hamano 2013-05-12 12:13 ` Michael J Gruber 1 sibling, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-05-11 17:36 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy Michael J Gruber <git@drmicha.warpmail.net> writes: >>> + if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) || >>> + !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) >>> + return stream_blob_to_fd(1, sha1, NULL, 0); >> >> It is surprising that the necessary change is only this, but I think >> it is correct ;-). We ignore textconv when the command line did not >> mention --[no-]textconv, or the command line said --no-textconv >> explicitly. >> >> This (especially the first condition) may deserve an in-code comment >> for anybody who wonders where this default behaviour is implemented. > > It's not as if we would document behavior by in-code comments in > general, do we? The usual answer is "git log -S" or "git blame". The comment and the future reader I had in mind was more like Default to --no-textconv, even though cmd_log_init_defaults() sets the bit, when the user did not explicitly ask for it. sought by somebody who wonders _where_ in the code we ignore ALLOW_TEXTCONV that is set in cmd_log_init_defaults(). That is not something you can find with "log -S" or "blame", is it? ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 3/7] show: honor --textconv for blobs 2013-05-11 17:36 ` Junio C Hamano @ 2013-05-12 12:13 ` Michael J Gruber 0 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-12 12:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King, Matthieu Moy Junio C Hamano venit, vidit, dixit 11.05.2013 19:36: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >>>> + if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) || >>>> + !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV)) >>>> + return stream_blob_to_fd(1, sha1, NULL, 0); >>> >>> It is surprising that the necessary change is only this, but I think >>> it is correct ;-). We ignore textconv when the command line did not >>> mention --[no-]textconv, or the command line said --no-textconv >>> explicitly. >>> >>> This (especially the first condition) may deserve an in-code comment >>> for anybody who wonders where this default behaviour is implemented. >> >> It's not as if we would document behavior by in-code comments in >> general, do we? The usual answer is "git log -S" or "git blame". > > The comment and the future reader I had in mind was more like > > Default to --no-textconv, even though cmd_log_init_defaults() > sets the bit, when the user did not explicitly ask for it. > > sought by somebody who wonders _where_ in the code we ignore > ALLOW_TEXTCONV that is set in cmd_log_init_defaults(). > > That is not something you can find with "log -S" or "blame", is it? > I'll refactor and restructure anyways. That will also get this whole default discussion out of the way: I'll try out the "show attribute" route as indicated. I'm not sure what to do about the object_array/context discussion, though. Michael ^ permalink raw reply [flat|nested] 78+ messages in thread
* [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber ` (2 preceding siblings ...) 2013-05-10 15:10 ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber ` (2 subsequent siblings) 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy When a command is supposed to use textconv filters (by default or with "--textconv") and none are configured then the blob is output without conversion; the only exception to this rule is "cat-file --textconv". Make it behave like the rest of textconv aware commands. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/cat-file.c | 18 ++++++++---------- t/t8007-cat-file-textconv.sh | 20 +++++--------------- 2 files changed, 13 insertions(+), 25 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 045cee7..bd62373 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -48,6 +48,14 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) case 'e': return !has_sha1_file(sha1); + case 'c': + if (!obj_context.path[0]) + die("git cat-file --textconv %s: <object> must be <sha1:path>", + obj_name); + + if (textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) + break; + case 'p': type = sha1_object_info(sha1, NULL); if (type < 0) @@ -70,16 +78,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) /* otherwise just spit out the data */ break; - case 'c': - if (!obj_context.path[0]) - die("git cat-file --textconv %s: <object> must be <sha1:path>", - obj_name); - - if (!textconv_object(obj_context.path, obj_context.mode, sha1, 1, &buf, &size)) - die("git cat-file --textconv: unable to run textconv on %s", - obj_name); - break; - case 0: if (type_from_string(exp_type) == OBJ_BLOB) { unsigned char blob_sha1[20]; diff --git a/t/t8007-cat-file-textconv.sh b/t/t8007-cat-file-textconv.sh index 78a0085..83c6636 100755 --- a/t/t8007-cat-file-textconv.sh +++ b/t/t8007-cat-file-textconv.sh @@ -22,11 +22,11 @@ test_expect_success 'setup ' ' ' cat >expected <<EOF -fatal: git cat-file --textconv: unable to run textconv on :one.bin +bin: test version 2 EOF test_expect_success 'no filter specified' ' - git cat-file --textconv :one.bin 2>result + git cat-file --textconv :one.bin >result && test_cmp expected result ' @@ -36,10 +36,6 @@ test_expect_success 'setup textconv filters' ' git config diff.test.cachetextconv false ' -cat >expected <<EOF -bin: test version 2 -EOF - test_expect_success 'cat-file without --textconv' ' git cat-file blob :one.bin >result && test_cmp expected result @@ -73,25 +69,19 @@ test_expect_success 'cat-file --textconv on previous commit' ' ' test_expect_success SYMLINKS 'cat-file without --textconv (symlink)' ' + printf "%s" "one.bin" >expected && git cat-file blob :symlink.bin >result && - printf "%s" "one.bin" >expected test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on index (symlink)' ' - ! git cat-file --textconv :symlink.bin 2>result && - cat >expected <<\EOF && -fatal: git cat-file --textconv: unable to run textconv on :symlink.bin -EOF + git cat-file --textconv :symlink.bin >result && test_cmp expected result ' test_expect_success SYMLINKS 'cat-file --textconv on HEAD (symlink)' ' - ! git cat-file --textconv HEAD:symlink.bin 2>result && - cat >expected <<EOF && -fatal: git cat-file --textconv: unable to run textconv on HEAD:symlink.bin -EOF + git cat-file --textconv HEAD:symlink.bin >result && test_cmp expected result ' -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber ` (3 preceding siblings ...) 2013-05-10 15:10 ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy Currently, "git grep" does not honor any textconv filters, with nor without --textconv. Demonstrate this in the tests. The default is expected to remain unchanged. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- t/t7008-grep-binary.sh | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 26f8319..1c0946f 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -145,4 +145,35 @@ test_expect_success 'grep respects not-binary diff attribute' ' test_cmp expect actual ' +cat >nul_to_q_textconv <<'EOF' +#!/bin/sh +"$PERL_PATH" -pe 'y/\000/Q/' < "$1" +EOF +chmod +x nul_to_q_textconv + +test_expect_success 'setup textconv filters' ' + echo a diff=foo >.gitattributes && + git config diff.foo.textconv "\"$(pwd)\""/nul_to_q_textconv +' + +test_expect_success 'grep does not honor textconv' ' + test_must_fail git grep Qfile +' + +test_expect_failure 'grep --textconv honors textconv' ' + echo "a:binaryQfile" >expect && + git grep --textconv Qfile >actual && + test_cmp expect actual +' + +test_expect_success 'grep --no-textconv does not honor textconv' ' + test_must_fail git grep --no-textconv Qfile +' + +test_expect_failure 'grep --textconv blob honors textconv' ' + echo "HEAD:a:binaryQfile" >expect && + git grep --textconv Qfile HEAD:a >actual && + test_cmp expect actual +' + test_done -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv3 6/7] grep: allow to use textconv filters 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber ` (4 preceding siblings ...) 2013-05-10 15:10 ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber 6 siblings, 0 replies; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy From: Jeff King <peff@peff.net> Recently and not so recently, we made sure that log/grep type operations use textconv filters when a userfacing diff would do the same: ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28) b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) 0508fe5 (combine-diff: respect textconv attributes, 2011-05-23) "git grep" currently does not use textconv filters at all, that is neither for displaying the match and context nor for the actual grepping, even when requested by --textconv. Introduce an option "--textconv" which makes git grep use any configured textconv filters for grepping and output purposes. It is off by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- Documentation/git-grep.txt | 9 +++- builtin/grep.c | 2 + grep.c | 100 ++++++++++++++++++++++++++++++++++++++------- grep.h | 1 + t/t7008-grep-binary.sh | 6 ++- 5 files changed, 102 insertions(+), 16 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 50d46e1..a5c5a27 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -9,7 +9,7 @@ git-grep - Print lines matching a pattern SYNOPSIS -------- [verse] -'git grep' [-a | --text] [-I] [-i | --ignore-case] [-w | --word-regexp] +'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp] [-v | --invert-match] [-h|-H] [--full-name] [-E | --extended-regexp] [-G | --basic-regexp] [-P | --perl-regexp] @@ -80,6 +80,13 @@ OPTIONS --text:: Process binary files as if they were text. +--textconv:: + Honor textconv filter settings. + +--no-textconv:: + Do not honor textconv filter settings. + This is the default. + -i:: --ignore-case:: Ignore case differences between the patterns and the diff --git a/builtin/grep.c b/builtin/grep.c index 159e65d..00ee57d 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -659,6 +659,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) OPT_SET_INT('I', NULL, &opt.binary, N_("don't match patterns in binary files"), GREP_BINARY_NOMATCH), + OPT_BOOL(0, "textconv", &opt.allow_textconv, + N_("process binary files with textconv filters")), { OPTION_INTEGER, 0, "max-depth", &opt.max_depth, N_("depth"), N_("descend at most <depth> levels"), PARSE_OPT_NONEG, NULL, 1 }, diff --git a/grep.c b/grep.c index bb548ca..c668034 100644 --- a/grep.c +++ b/grep.c @@ -2,6 +2,8 @@ #include "grep.h" #include "userdiff.h" #include "xdiff-interface.h" +#include "diff.h" +#include "diffcore.h" static int grep_source_load(struct grep_source *gs); static int grep_source_is_binary(struct grep_source *gs); @@ -1322,6 +1324,58 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size) fwrite(buf, size, 1, stdout); } +static int fill_textconv_grep(struct userdiff_driver *driver, + struct grep_source *gs) +{ + struct diff_filespec *df; + char *buf; + size_t size; + + if (!driver || !driver->textconv) + return grep_source_load(gs); + + /* + * The textconv interface is intimately tied to diff_filespecs, so we + * have to pretend to be one. If we could unify the grep_source + * and diff_filespec structs, this mess could just go away. + */ + df = alloc_filespec(gs->path); + switch (gs->type) { + case GREP_SOURCE_SHA1: + fill_filespec(df, gs->identifier, 1, 0100644); + break; + case GREP_SOURCE_FILE: + fill_filespec(df, null_sha1, 0, 0100644); + break; + default: + die("BUG: attempt to textconv something without a path?"); + } + + /* + * fill_textconv is not remotely thread-safe; it may load objects + * behind the scenes, and it modifies the global diff tempfile + * structure. + */ + grep_read_lock(); + size = fill_textconv(driver, df, &buf); + grep_read_unlock(); + free_filespec(df); + + /* + * The normal fill_textconv usage by the diff machinery would just keep + * the textconv'd buf separate from the diff_filespec. But much of the + * grep code passes around a grep_source and assumes that its "buf" + * pointer is the beginning of the thing we are searching. So let's + * install our textconv'd version into the grep_source, taking care not + * to leak any existing buffer. + */ + grep_source_clear_data(gs); + gs->buf = buf; + gs->size = size; + + return 0; +} + static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int collect_hits) { char *bol; @@ -1332,6 +1386,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle unsigned count = 0; int try_lookahead = 0; int show_function = 0; + struct userdiff_driver *textconv = NULL; enum grep_context ctx = GREP_CONTEXT_HEAD; xdemitconf_t xecfg; @@ -1353,19 +1408,36 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle } opt->last_shown = 0; - switch (opt->binary) { - case GREP_BINARY_DEFAULT: - if (grep_source_is_binary(gs)) - binary_match_only = 1; - break; - case GREP_BINARY_NOMATCH: - if (grep_source_is_binary(gs)) - return 0; /* Assume unmatch */ - break; - case GREP_BINARY_TEXT: - break; - default: - die("bug: unknown binary handling mode"); + if (opt->allow_textconv) { + grep_source_load_driver(gs); + /* + * We might set up the shared textconv cache data here, which + * is not thread-safe. + */ + grep_attr_lock(); + textconv = userdiff_get_textconv(gs->driver); + grep_attr_unlock(); + } + + /* + * We know the result of a textconv is text, so we only have to care + * about binary handling if we are not using it. + */ + if (!textconv) { + switch (opt->binary) { + case GREP_BINARY_DEFAULT: + if (grep_source_is_binary(gs)) + binary_match_only = 1; + break; + case GREP_BINARY_NOMATCH: + if (grep_source_is_binary(gs)) + return 0; /* Assume unmatch */ + break; + case GREP_BINARY_TEXT: + break; + default: + die("bug: unknown binary handling mode"); + } } memset(&xecfg, 0, sizeof(xecfg)); @@ -1373,7 +1445,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle try_lookahead = should_lookahead(opt); - if (grep_source_load(gs) < 0) + if (fill_textconv_grep(textconv, gs) < 0) return 0; bol = gs->buf; diff --git a/grep.h b/grep.h index e4a1df5..eaaced1 100644 --- a/grep.h +++ b/grep.h @@ -107,6 +107,7 @@ struct grep_opt { #define GREP_BINARY_NOMATCH 1 #define GREP_BINARY_TEXT 2 int binary; + int allow_textconv; int extended; int use_reflog_filter; int pcre; diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index 1c0946f..a91260a 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -160,7 +160,7 @@ test_expect_success 'grep does not honor textconv' ' test_must_fail git grep Qfile ' -test_expect_failure 'grep --textconv honors textconv' ' +test_expect_success 'grep --textconv honors textconv' ' echo "a:binaryQfile" >expect && git grep --textconv Qfile >actual && test_cmp expect actual @@ -176,4 +176,8 @@ test_expect_failure 'grep --textconv blob honors textconv' ' test_cmp expect actual ' +test_expect_success 'grep --no-textconv blob does not honor textconv' ' + test_must_fail git grep --no-textconv Qfile HEAD:a +' + test_done -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* [PATCHv3 7/7] grep: honor --textconv for the case rev:path 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber ` (5 preceding siblings ...) 2013-05-10 15:10 ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber @ 2013-05-10 15:10 ` Michael J Gruber 2013-05-10 18:11 ` Junio C Hamano 6 siblings, 1 reply; 78+ messages in thread From: Michael J Gruber @ 2013-05-10 15:10 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Matthieu Moy Make "grep" honor the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> --- builtin/grep.c | 11 ++++++----- object.c | 26 ++++++++++++++++++++------ object.h | 2 ++ t/t7008-grep-binary.sh | 6 +----- 4 files changed, 29 insertions(+), 16 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 00ee57d..bb7f970 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -458,10 +458,10 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, } static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, - struct object *obj, const char *name) + struct object *obj, const char *name, struct object_context *oc) { if (obj->type == OBJ_BLOB) - return grep_sha1(opt, obj->sha1, name, 0, NULL); + return grep_sha1(opt, obj->sha1, name, 0, oc ? oc->path : NULL); if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) { struct tree_desc tree; void *data; @@ -503,7 +503,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; real_obj = deref_tag(list->objects[i].item, NULL, 0); - if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) { + if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].context)) { hit = 1; if (opt->status_only) break; @@ -820,12 +820,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix) for (i = 0; i < argc; i++) { const char *arg = argv[i]; unsigned char sha1[20]; + struct object_context oc; /* Is it a rev? */ - if (!get_sha1(arg, sha1)) { + if (!get_sha1_with_context(arg, 0, sha1, &oc)) { struct object *object = parse_object_or_die(sha1, arg); if (!seen_dashdash) verify_non_filename(prefix, arg); - add_object_array(object, arg, &list); + add_object_array_with_context(object, arg, &list, xmemdupz(&oc, sizeof(struct object_context))); continue; } if (!strcmp(arg, "--")) { diff --git a/object.c b/object.c index 88d0bec..264b6df 100644 --- a/object.c +++ b/object.c @@ -265,12 +265,7 @@ int object_list_contains(struct object_list *list, struct object *obj) return 0; } -void add_object_array(struct object *obj, const char *name, struct object_array *array) -{ - add_object_array_with_mode(obj, name, array, S_IFINVALID); -} - -void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +static void add_object_array_with_mode_context(struct object *obj, const char *name, struct object_array *array, unsigned mode, struct object_context *context) { unsigned nr = array->nr; unsigned alloc = array->alloc; @@ -285,9 +280,28 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj objects[nr].item = obj; objects[nr].name = name; objects[nr].mode = mode; + objects[nr].context = context; array->nr = ++nr; } +void add_object_array(struct object *obj, const char *name, struct object_array *array) +{ + add_object_array_with_mode(obj, name, array, S_IFINVALID); +} + +void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) +{ + add_object_array_with_mode_context(obj, name, array, mode, NULL); +} + +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context) +{ + if (context) + add_object_array_with_mode_context(obj, name, array, context->mode, context); + else + add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context); +} + void object_array_remove_duplicates(struct object_array *array) { unsigned int ref, src, dst; diff --git a/object.h b/object.h index 97d384b..695847d 100644 --- a/object.h +++ b/object.h @@ -13,6 +13,7 @@ struct object_array { struct object *item; const char *name; unsigned mode; + struct object_context *context; } *objects; }; @@ -85,6 +86,7 @@ int object_list_contains(struct object_list *list, struct object *obj); /* Object array handling .. */ void add_object_array(struct object *obj, const char *name, struct object_array *array); void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode); +void add_object_array_with_context(struct object *obj, const char *name, struct object_array *array, struct object_context *context); void object_array_remove_duplicates(struct object_array *); void clear_object_flags(unsigned flags); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index a91260a..b146406 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -170,14 +170,10 @@ test_expect_success 'grep --no-textconv does not honor textconv' ' test_must_fail git grep --no-textconv Qfile ' -test_expect_failure 'grep --textconv blob honors textconv' ' +test_expect_success 'grep --textconv blob honors textconv' ' echo "HEAD:a:binaryQfile" >expect && git grep --textconv Qfile HEAD:a >actual && test_cmp expect actual ' -test_expect_success 'grep --no-textconv blob does not honor textconv' ' - test_must_fail git grep --no-textconv Qfile HEAD:a -' - test_done -- 1.8.3.rc1.406.gf4dce7e ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [PATCHv3 7/7] grep: honor --textconv for the case rev:path 2013-05-10 15:10 ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber @ 2013-05-10 18:11 ` Junio C Hamano 2013-05-10 18:31 ` Junio C Hamano 0 siblings, 1 reply; 78+ messages in thread From: Junio C Hamano @ 2013-05-10 18:11 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy Michael J Gruber <git@drmicha.warpmail.net> writes: > diff --git a/object.h b/object.h > index 97d384b..695847d 100644 > --- a/object.h > +++ b/object.h > @@ -13,6 +13,7 @@ struct object_array { > struct object *item; > const char *name; > unsigned mode; > + struct object_context *context; > } *objects; > }; fsck has to hold this for each and every objects in the repository it has found but hasn't inspected (i.e. pending), doesn't it? Do we really want to add 8 bytes for each of them? ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [PATCHv3 7/7] grep: honor --textconv for the case rev:path 2013-05-10 18:11 ` Junio C Hamano @ 2013-05-10 18:31 ` Junio C Hamano 0 siblings, 0 replies; 78+ messages in thread From: Junio C Hamano @ 2013-05-10 18:31 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Jeff King, Matthieu Moy Junio C Hamano <gitster@pobox.com> writes: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> diff --git a/object.h b/object.h >> index 97d384b..695847d 100644 >> --- a/object.h >> +++ b/object.h >> @@ -13,6 +13,7 @@ struct object_array { >> struct object *item; >> const char *name; >> unsigned mode; >> + struct object_context *context; >> } *objects; >> }; > > fsck has to hold this for each and every objects in the repository > it has found but hasn't inspected (i.e. pending), doesn't it? Do we > really want to add 8 bytes for each of them? Perhaps fsck does not even want "name" and "mode" for that matter. I wonder what improvement, if any, we would see with a change like this patch in a large-ish repository. builtin/fsck.c | 35 ++++++++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index bb9a2cd..c1de2a9 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -73,7 +73,32 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...) return (type == FSCK_WARN) ? 0 : 1; } -static struct object_array pending; +static struct pending_object { + unsigned int nr; + unsigned int alloc; + struct object **objects; +} pending; + +static int max_pending; + +static void add_pending(struct object *object) +{ + unsigned nr = pending.nr; + unsigned alloc = pending.alloc; + struct object **objects = pending.objects; + + if (nr >= alloc) { + alloc = (alloc + 32) * 2; + objects = xrealloc(objects, alloc * sizeof(*objects)); + pending.alloc = alloc; + pending.objects = objects; + } + objects[nr] = object; + pending.nr = ++nr; + + if (max_pending < nr) + max_pending = nr; +} static int mark_object(struct object *obj, int type, void *data) { @@ -112,7 +137,7 @@ static int mark_object(struct object *obj, int type, void *data) return 1; } - add_object_array(obj, (void *) parent, &pending); + add_pending(obj); return 0; } @@ -148,15 +173,15 @@ static int traverse_reachable(void) if (show_progress) progress = start_progress_delay("Checking connectivity", 0, 0, 2); while (pending.nr) { - struct object_array_entry *entry; - struct object *obj; + struct object **entry, *obj; entry = pending.objects + --pending.nr; - obj = entry->item; + obj = *entry; result |= traverse_one_object(obj); display_progress(progress, ++nr); } stop_progress(&progress); + fprintf(stderr, "max# pending objects = %d\n", max_pending); return !!result; } ^ permalink raw reply related [flat|nested] 78+ messages in thread
end of thread, other threads:[~2013-05-16 3:31 UTC | newest] Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-04-19 16:44 [PATCH 0/6] grep with textconv Michael J Gruber 2013-04-19 16:44 ` [PATCH 1/6] t4030: demonstrate behavior of show " Michael J Gruber 2013-04-20 4:04 ` Jeff King 2013-04-20 13:35 ` Michael J Gruber 2013-04-19 16:44 ` [PATCH 2/6] show: obey --textconv for blobs Michael J Gruber 2013-04-20 4:06 ` Jeff King 2013-04-20 13:38 ` Michael J Gruber 2013-04-21 3:37 ` Jeff King 2013-04-22 10:29 ` Michael J Gruber 2013-04-22 15:25 ` Junio C Hamano 2013-04-22 15:29 ` Jeff King 2013-04-22 15:37 ` Jeremy Rosen 2013-04-22 15:54 ` Matthieu Moy 2013-04-23 8:58 ` Michael J Gruber 2013-04-19 16:44 ` [PATCH 3/6] cat-file: do not die on --textconv without textconv filters Michael J Gruber 2013-04-19 18:15 ` Junio C Hamano 2013-04-20 4:17 ` Jeff King 2013-04-20 14:27 ` Michael J Gruber 2013-04-19 16:44 ` [PATCH 4/6] t7008: demonstrate behavior of grep with textconv Michael J Gruber 2013-04-19 16:44 ` [PATCH 5/6] grep: allow to use textconv filters Michael J Gruber 2013-04-20 4:31 ` Jeff King 2013-04-19 16:44 ` [PATCH 6/6] grep: obey --textconv for the case rev:path Michael J Gruber 2013-04-20 4:24 ` Jeff King 2013-04-20 14:42 ` Michael J Gruber 2013-04-21 3:41 ` Jeff King 2013-04-19 18:24 ` [PATCH 0/6] grep with textconv Junio C Hamano 2013-04-20 4:26 ` Jeff King 2013-04-20 13:32 ` Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 0/7] " Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 1/7] t4030: demonstrate behavior of show " Michael J Gruber 2013-04-23 15:11 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 2/7] show: obey --textconv for blobs Michael J Gruber 2013-04-23 15:14 ` Junio C Hamano 2013-04-24 10:09 ` Michael J Gruber 2013-04-24 17:30 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 3/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber 2013-04-23 15:15 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 4/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber 2013-04-23 15:16 ` Junio C Hamano 2013-04-24 10:09 ` Michael J Gruber 2013-04-24 17:29 ` Junio C Hamano 2013-04-23 12:11 ` [PATCHv2 5/7] grep: allow to use textconv filters Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 6/7] grep: honor --textconv for the case rev:path Michael J Gruber 2013-04-23 12:11 ` [PATCHv2 7/7] git grep: honor textconv by default Michael J Gruber 2013-04-23 15:20 ` Junio C Hamano 2013-04-24 10:05 ` Michael J Gruber 2013-04-24 17:35 ` Junio C Hamano 2013-04-24 17:57 ` Matthieu Moy 2013-04-24 18:55 ` Junio C Hamano 2013-04-26 11:59 ` Michael J Gruber 2013-04-26 13:23 ` Matthieu Moy 2013-04-29 9:04 ` Michael J Gruber 2013-04-29 15:04 ` Junio C Hamano 2013-05-10 15:08 ` [PATCHv3 0/7] textconv with grep and show Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 1/7] t4030: demonstrate behavior of show with textconv Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 2/7] diff_opt: track whether flags have been set explicitly Michael J Gruber 2013-05-10 15:31 ` Eric Sunshine 2013-05-10 15:10 ` [PATCHv3 3/7] show: honor --textconv for blobs Michael J Gruber 2013-05-10 17:02 ` Junio C Hamano 2013-05-10 17:34 ` Jeff King 2013-05-10 18:04 ` Junio C Hamano 2013-05-11 0:25 ` Jeff King 2013-05-11 19:54 ` Junio C Hamano 2013-05-11 8:54 ` Michael J Gruber 2013-05-11 10:00 ` Michael J Gruber 2013-05-13 5:01 ` Junio C Hamano 2013-05-13 11:55 ` Jeff King 2013-05-13 14:57 ` Michael J Gruber 2013-05-13 15:41 ` Junio C Hamano 2013-05-16 3:31 ` Jeff King 2013-05-11 17:36 ` Junio C Hamano 2013-05-12 12:13 ` Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 4/7] cat-file: do not die on --textconv without textconv filters Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 5/7] t7008: demonstrate behavior of grep with textconv Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 6/7] grep: allow to use textconv filters Michael J Gruber 2013-05-10 15:10 ` [PATCHv3 7/7] grep: honor --textconv for the case rev:path Michael J Gruber 2013-05-10 18:11 ` Junio C Hamano 2013-05-10 18:31 ` Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.