All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] attr: map builtin userdiff drivers to well-known extensions
@ 2011-12-16 11:00 Jeff King
  2011-12-16 14:00 ` Johannes Sixt
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Jeff King @ 2011-12-16 11:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Casey

We already provide sane hunk-header patterns for specific
languages.

However, the user has to manually map common extensions to
use them. It's not that hard to do, but it's an extra step
that the user might not even know is an option. Let's be
nice and do it automatically.

It could be a problem in the future if the builtin userdiff
drivers started growing more invasive options, like
automatically claiming to be non-binary (e.g., setting
diff.cpp.binary = false by default), but right now we do not
do that, so it should be safe. To help safeguard against
future changes, we add a new test to t4012 making sure that
we don't consider binary files as text by their extension.

We also have to update t4018, which assumed that without a
.gitattributes file, we would receive the default funcname
pattern for a file matching "*.java". Changing this behavior
is not covering up a regression, but rather the feature
working as intended.

Signed-off-by: Jeff King <peff@peff.net>
---
I forgot to send this out in time for v1.7.8.

Prior discussion here:

  http://thread.gmane.org/gmane.comp.version-control.git/180103

and here:

  http://thread.gmane.org/gmane.comp.version-control.git/181253

The list of extensions is collected from those threads. The tests are
new since the last time I posted (and the t4018 is slightly different
than what you queued in pu).

I punted on the question of case-sensitivity. Brandon mentioned using
fnmatch_icase to handle this, which sounds sane, but I think it is
really a separate topic.

 attr.c                   |   24 ++++++++++++++++++++++++
 t/t4012-diff-binary.sh   |   13 +++++++++++++
 t/t4018-diff-funcname.sh |   10 +++++++++-
 3 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/attr.c b/attr.c
index 76b079f..2ad7cc4 100644
--- a/attr.c
+++ b/attr.c
@@ -306,6 +306,30 @@ static void free_attr_elem(struct attr_stack *e)
 
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -text",
+	"*.html diff=html",
+	"*.htm diff=html",
+	"*.java diff=java",
+	"*.perl diff=perl",
+	"*.pl diff=perl",
+	"*.php diff=php",
+	"*.py diff=python",
+	"*.rb diff=ruby",
+	"*.bib diff=bibtex",
+	"*.tex diff=tex",
+	"*.c diff=cpp",
+	"*.cc diff=cpp",
+	"*.cxx diff=cpp",
+	"*.cpp diff=cpp",
+	"*.h diff=cpp",
+	"*.hpp diff=cpp",
+	"*.cs diff=csharp",
+	"*.[Ff] diff=fortran",
+	"*.[Ff][0-9][0-9] diff=fortran",
+	"*.m diff=objc",
+	"*.mm diff=objc",
+	"*.pas diff=pascal",
+	"*.pp diff=pascal",
+	"*.lpr diff=pascal",
 	NULL,
 };
 
diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
index 2d9f9a0..b2fc807 100755
--- a/t/t4012-diff-binary.sh
+++ b/t/t4012-diff-binary.sh
@@ -90,4 +90,17 @@ test_expect_success 'diff --no-index with binary creation' '
 	test_cmp expected actual
 '
 
+test_expect_success 'binary files are not considered text by file extension' '
+	echo Q | q_to_nul >binary.c &&
+	git add binary.c &&
+	cat >expect <<-\EOF &&
+	diff --git a/binary.c b/binary.c
+	new file mode 100644
+	index 0000000..1f2a4f5
+	Binary files /dev/null and b/binary.c differ
+	EOF
+	git diff --cached binary.c >actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 4bd2a1c..a6227ef 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -124,7 +124,9 @@ do
 done
 
 test_expect_success 'default behaviour' '
-	rm -f .gitattributes &&
+	cat >.gitattributes <<-\EOF &&
+	*.java diff=default
+	EOF
 	test_expect_funcname "public class Beer\$"
 '
 
@@ -187,4 +189,10 @@ test_expect_success 'alternation in pattern' '
 	test_expect_funcname "public static void main("
 '
 
+test_expect_success 'custom diff drivers override built-in extension matches' '
+	test_config diff.foo.funcname "int special" &&
+	echo "*.java diff=foo" >.gitattributes &&
+	test_expect_funcname "int special"
+'
+
 test_done
-- 
1.7.7.4.13.g57bf4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 11:00 [PATCH] attr: map builtin userdiff drivers to well-known extensions Jeff King
@ 2011-12-16 14:00 ` Johannes Sixt
  2011-12-16 17:01   ` Junio C Hamano
  2011-12-16 19:21   ` Jeff King
  2011-12-16 17:51 ` [PATCH] attr: map builtin userdiff drivers to well-known extensions Mark Levedahl
  2011-12-16 19:26 ` Philip Oakley
  2 siblings, 2 replies; 35+ messages in thread
From: Johannes Sixt @ 2011-12-16 14:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey

Am 12/16/2011 12:00, schrieb Jeff King:
>  static const char *builtin_attr[] = {
...
> +	"*.c diff=cpp",
> +	"*.cc diff=cpp",
> +	"*.cxx diff=cpp",
> +	"*.cpp diff=cpp",
> +	"*.h diff=cpp",
> +	"*.hpp diff=cpp",

Please don't do this. It would be a serious regression for C++ coders, and
some C coders as well. The built-in hunk header patterns are severly
broken and don't work well with C++ code. I know for sure that the
following are not recognized:

- template declarations, e.g. template<class T> func(T x);
- constructor definitionss, e.g. MyClass::MyClass()
- functions that return references, e.g. const string& func()
- function definitions along the GNU coding style, e.g.

     void
     the_func ()

I am currently using this pattern (but I'm sure it can be optimized) with
an appropriate xcpp attribute:

[diff "xcpp"]
        xfuncname = "!^[
\\t]*[a-zA-Z_][a-zA-Z_0-9]*[^()]*:[[:space:]]*$\n^[a-zA-Z_][a-zA-Z_0-9]*.*"

(modulo MUA line wrapping).

-- Hannes

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 14:00 ` Johannes Sixt
@ 2011-12-16 17:01   ` Junio C Hamano
  2011-12-16 19:21   ` Jeff King
  1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2011-12-16 17:01 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jeff King, git, Brandon Casey

Johannes Sixt <j.sixt@viscovery.net> writes:

> Am 12/16/2011 12:00, schrieb Jeff King:
>>  static const char *builtin_attr[] = {
> ...
>> +	"*.c diff=cpp",
>> +	"*.cc diff=cpp",
>> +	"*.cxx diff=cpp",
>> +	"*.cpp diff=cpp",
>> +	"*.h diff=cpp",
>> +	"*.hpp diff=cpp",
>
> Please don't do this. It would be a serious regression for C++ coders, and
> some C coders as well. The built-in hunk header patterns are severly
> broken and don't work well with C++ code.

I do not often do C++, but I tend to agree wrt C.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 11:00 [PATCH] attr: map builtin userdiff drivers to well-known extensions Jeff King
  2011-12-16 14:00 ` Johannes Sixt
@ 2011-12-16 17:51 ` Mark Levedahl
  2011-12-16 19:28   ` Jeff King
  2011-12-16 19:26 ` Philip Oakley
  2 siblings, 1 reply; 35+ messages in thread
From: Mark Levedahl @ 2011-12-16 17:51 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey

On 12/16/2011 06:00 AM, Jeff King wrote:

> +	"*.m diff=objc",

Please don't do this: Matlab files also use .m as a suffix, and there is 
little to no compatibility between objective c and Matlab syntax.

Mark

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 14:00 ` Johannes Sixt
  2011-12-16 17:01   ` Junio C Hamano
@ 2011-12-16 19:21   ` Jeff King
  2011-12-16 19:30     ` Jeff King
                       ` (2 more replies)
  1 sibling, 3 replies; 35+ messages in thread
From: Jeff King @ 2011-12-16 19:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey

On Fri, Dec 16, 2011 at 03:00:51PM +0100, Johannes Sixt wrote:

> Am 12/16/2011 12:00, schrieb Jeff King:
> >  static const char *builtin_attr[] = {
> ...
> > +	"*.c diff=cpp",
> > +	"*.cc diff=cpp",
> > +	"*.cxx diff=cpp",
> > +	"*.cpp diff=cpp",
> > +	"*.h diff=cpp",
> > +	"*.hpp diff=cpp",
> 
> Please don't do this. It would be a serious regression for C++ coders, and
> some C coders as well. The built-in hunk header patterns are severly
> broken and don't work well with C++ code. I know for sure that the
> following are not recognized:
> 
> - template declarations, e.g. template<class T> func(T x);
> - constructor definitionss, e.g. MyClass::MyClass()
> - functions that return references, e.g. const string& func()
> - function definitions along the GNU coding style, e.g.
> 
>      void
>      the_func ()

Hmm. I think it's a legitimate criticism to say "hunk-header detection
is a broken feature because our heuristics aren't good enough, and we
shouldn't start using it by default because people will complain because
it sucks too much".

At the same time, I think we have seen people complaining that the
regular dumb funcname detection is not good enough[1], and that using
language-specific funcnames, while not 100% perfect, produces better
results on the whole.

So I think rather than saying "this doesn't always work", it's important
to ask "on the whole, does this tend to produce better results than
without, and when we are wrong, how bad is it?"

I'm not clear from what you wrote on whether you were saying it is
simply sub-optimal, or whether on balance it is way worse than the
default funcname matching.

And if it is bad on balance, is the right solution to avoid exposing
people to it, or is it to make our patterns better? I.e., is it fixable,
or is it simply too hard a problem to get right in the general case, and
we shouldn't turn it on by default?

> I am currently using this pattern (but I'm sure it can be optimized) with
> an appropriate xcpp attribute:
> 
> [diff "xcpp"]
>         xfuncname = "!^[
> \\t]*[a-zA-Z_][a-zA-Z_0-9]*[^()]*:[[:space:]]*$\n^[a-zA-Z_][a-zA-Z_0-9]*.*"

So, I'm confused. If you are using this, surely you have "*.c diff=xcpp"
in your attributes file, and my patch has no effect for you, as it is
lower precedence than user-supplied gitattributes? Also, if you called
it diff.cpp.xfuncname, then wouldn't my patch still be useful, as your
complaint is not "my *.c files are not actually C language" but "the C
language driver sucks" (but you be remedying that by providing your own
config).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 11:00 [PATCH] attr: map builtin userdiff drivers to well-known extensions Jeff King
  2011-12-16 14:00 ` Johannes Sixt
  2011-12-16 17:51 ` [PATCH] attr: map builtin userdiff drivers to well-known extensions Mark Levedahl
@ 2011-12-16 19:26 ` Philip Oakley
  2011-12-16 19:32   ` Jeff King
  2011-12-16 19:38   ` Junio C Hamano
  2 siblings, 2 replies; 35+ messages in thread
From: Philip Oakley @ 2011-12-16 19:26 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: git, Brandon Casey

From: "Jeff King" <peff@peff.net>
> We already provide sane hunk-header patterns for specific
> languages.
>
> However, the user has to manually map common extensions to
> use them. It's not that hard to do, but it's an extra step
> that the user might not even know is an option. Let's be
> nice and do it automatically.
>
> It could be a problem in the future if the builtin userdiff
> drivers started growing more invasive options, like
> automatically claiming to be non-binary (e.g., setting
> diff.cpp.binary = false by default), but right now we do not
> do that, so it should be safe. To help safeguard against
> future changes, we add a new test to t4012 making sure that
> we don't consider binary files as text by their extension.
>
> We also have to update t4018, which assumed that without a
> .gitattributes file, we would receive the default funcname
> pattern for a file matching "*.java". Changing this behavior
> is not covering up a regression, but rather the feature
> working as intended.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I forgot to send this out in time for v1.7.8.
>
> Prior discussion here:
>
>  http://thread.gmane.org/gmane.comp.version-control.git/180103
>
> and here:
>
>  http://thread.gmane.org/gmane.comp.version-control.git/181253
>
> The list of extensions is collected from those threads. The tests are
> new since the last time I posted (and the t4018 is slightly different
> than what you queued in pu).
>
> I punted on the question of case-sensitivity. Brandon mentioned using
> fnmatch_icase to handle this, which sounds sane, but I think it is
> really a separate topic.
>
> attr.c                   |   24 ++++++++++++++++++++++++
> t/t4012-diff-binary.sh   |   13 +++++++++++++
> t/t4018-diff-funcname.sh |   10 +++++++++-
> 3 files changed, 46 insertions(+), 1 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index 76b079f..2ad7cc4 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -306,6 +306,30 @@ static void free_attr_elem(struct attr_stack *e)
>
> static const char *builtin_attr[] = {
>  "[attr]binary -diff -text",
> + "*.html diff=html",
> + "*.htm diff=html",
> + "*.java diff=java",
> + "*.perl diff=perl",
> + "*.pl diff=perl",
> + "*.php diff=php",
> + "*.py diff=python",
> + "*.rb diff=ruby",
> + "*.bib diff=bibtex",
> + "*.tex diff=tex",
> + "*.c diff=cpp",
> + "*.cc diff=cpp",
> + "*.cxx diff=cpp",
> + "*.cpp diff=cpp",
> + "*.h diff=cpp",
> + "*.hpp diff=cpp",
> + "*.cs diff=csharp",
> + "*.[Ff] diff=fortran",
> + "*.[Ff][0-9][0-9] diff=fortran",
> + "*.m diff=objc",

There is a conflict here with the Matlab community which also uses "*.m" 
files for its scripts and code.
They fit the "It's not that hard to do, but it's an extra step that the user 
might not even know is an option." rationale.

If the objc.m is used as a default it must be overidable easily, and listed 
in the appropriate documentation to mitigate the "might not even know" risk.
Philip

> + "*.mm diff=objc",
> + "*.pas diff=pascal",
> + "*.pp diff=pascal",
> + "*.lpr diff=pascal",
>  NULL,
> };
>
> diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
> index 2d9f9a0..b2fc807 100755
> --- a/t/t4012-diff-binary.sh
> +++ b/t/t4012-diff-binary.sh
> @@ -90,4 +90,17 @@ test_expect_success 'diff --no-index with binary 
> creation' '
>  test_cmp expected actual
> '
>
> +test_expect_success 'binary files are not considered text by file 
> extension' '
> + echo Q | q_to_nul >binary.c &&
> + git add binary.c &&
> + cat >expect <<-\EOF &&
> + diff --git a/binary.c b/binary.c
> + new file mode 100644
> + index 0000000..1f2a4f5
> + Binary files /dev/null and b/binary.c differ
> + EOF
> + git diff --cached binary.c >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
> diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
> index 4bd2a1c..a6227ef 100755
> --- a/t/t4018-diff-funcname.sh
> +++ b/t/t4018-diff-funcname.sh
> @@ -124,7 +124,9 @@ do
> done
>
> test_expect_success 'default behaviour' '
> - rm -f .gitattributes &&
> + cat >.gitattributes <<-\EOF &&
> + *.java diff=default
> + EOF
>  test_expect_funcname "public class Beer\$"
> '
>
> @@ -187,4 +189,10 @@ test_expect_success 'alternation in pattern' '
>  test_expect_funcname "public static void main("
> '
>
> +test_expect_success 'custom diff drivers override built-in extension 
> matches' '
> + test_config diff.foo.funcname "int special" &&
> + echo "*.java diff=foo" >.gitattributes &&
> + test_expect_funcname "int special"
> +'
> +
> test_done
> -- 
> 1.7.7.4.13.g57bf4
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> -----
> No virus found in this message.
> Checked by AVG - www.avg.com
> Version: 2012.0.1890 / Virus Database: 2108/4682 - Release Date: 12/15/11
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 17:51 ` [PATCH] attr: map builtin userdiff drivers to well-known extensions Mark Levedahl
@ 2011-12-16 19:28   ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2011-12-16 19:28 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Junio C Hamano, git, Brandon Casey

On Fri, Dec 16, 2011 at 12:51:19PM -0500, Mark Levedahl wrote:

> On 12/16/2011 06:00 AM, Jeff King wrote:
> 
> >+	"*.m diff=objc",
> 
> Please don't do this: Matlab files also use .m as a suffix, and there
> is little to no compatibility between objective c and Matlab syntax.

Thanks for the feedback. Unlike JSixt's objection, I think this one is
at the heart of the patch: using file extensions to map to file types is
just a heuristic, and that heuristic can be spectacularly wrong.

And that's why we took the conservative approach until now, and simply
left it up to projects to define their own attributes mapping files to
types (even though we provided funcname patterns for some types).

So I think it is really worth weighing the convenience of "user does not
have to bother configuring attributes for each project" versus "we might
get it wrong".

Fortunately, the "might get it wrong" side is pretty easily mitigated by
making .gitattributes file (i.e., the same thing they would have to do
without this mapping heuristic). So the question is not "did we get it
wrong", but "how much worse is the objc funcname pattern versus the
default one for matlab files". I'd be interested to hear results from
Matlab people.

And of course there's the question of how good or bad each heuristic is.
It sounds like ".c" is more likely to be C than ".m" is to be objc, for
example. So maybe the concept is sound, but "*.m" is too overloaded an
extension to make the default list. I dunno.

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:21   ` Jeff King
@ 2011-12-16 19:30     ` Jeff King
  2011-12-16 19:33     ` Junio C Hamano
  2011-12-16 22:05     ` Johannes Sixt
  2 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2011-12-16 19:30 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey

On Fri, Dec 16, 2011 at 02:21:04PM -0500, Jeff King wrote:

> At the same time, I think we have seen people complaining that the
> regular dumb funcname detection is not good enough[1], and that using
> language-specific funcnames, while not 100% perfect, produces better
> results on the whole.

I forgot to include my footnote, but it was:

[1] We've seen requests on the list, and we also receive similar
    requests at GitHub for web-based diffs to use better funcnames. We
    just enabled the mapping ourselves a week or two ago via a system
    /etc/gitattributes file.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:26 ` Philip Oakley
@ 2011-12-16 19:32   ` Jeff King
  2011-12-22  0:05     ` Philip Oakley
  2011-12-16 19:38   ` Junio C Hamano
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2011-12-16 19:32 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, git, Brandon Casey

On Fri, Dec 16, 2011 at 07:26:00PM -0000, Philip Oakley wrote:

> >+ "*.m diff=objc",
> 
> There is a conflict here with the Matlab community which also uses
> "*.m" files for its scripts and code.
> They fit the "It's not that hard to do, but it's an extra step that
> the user might not even know is an option." rationale.
> 
> If the objc.m is used as a default it must be overidable easily, and
> listed in the appropriate documentation to mitigate the "might not
> even know" risk.

It is easily overridable; just put your own "*.m" (or anything that
matches your files) entry into your gitattributes file. I'm more
concerned that people will start getting worse results than with the
default, and not know how to fix it.

If you have some Matlab files, would you mind doing diffs with the
default driver and with the objc driver, and comment on how good or bad
the results are?

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:21   ` Jeff King
  2011-12-16 19:30     ` Jeff King
@ 2011-12-16 19:33     ` Junio C Hamano
  2011-12-17  1:17       ` Jeff King
  2011-12-16 22:05     ` Johannes Sixt
  2 siblings, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2011-12-16 19:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, git, Brandon Casey

Jeff King <peff@peff.net> writes:

> I'm not clear from what you wrote on whether you were saying it is
> simply sub-optimal, or whether on balance it is way worse than the
> default funcname matching.

I think we recently saw that the optional built-in one for C did not even
understand a function that returns a pointer, and nobody complained about
it for a long time, and what was even more funny was that a patch to fix
it got a comment from somebody who wasn't using the optional built-in one
saying "The default works fine; what problem are you fixing?". That is
clearly one example where the default one is still better than the pattern
based one, iow, the pattern based one is way premature to be turned on by
default.

> And if it is bad on balance, is the right solution to avoid exposing
> people to it, or is it to make our patterns better?

Can't we do both, by avoid exposing normal users to broken one while
people who want to improve the pattern based one work on unbreak it?

> I.e., is it fixable,
> or is it simply too hard a problem to get right in the general case, and
> we shouldn't turn it on by default?

I do not think that is the "either-or" question. My impression has been
that even if it is fixable, it is too broken and produces worse result
than the simple default in its current form.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:26 ` Philip Oakley
  2011-12-16 19:32   ` Jeff King
@ 2011-12-16 19:38   ` Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2011-12-16 19:38 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Jeff King, git, Brandon Casey

"Philip Oakley" <philipoakley@iee.org> writes:

Administrivia: do not quote the whole message if you are only responding
to one small detail; cull the part you are not responding to. Thanks.

>> + "*.m diff=objc",
>
> There is a conflict here with the Matlab community which also uses "*.m"
> files for its scripts and code.  They fit the "It's not that hard to do,
> but it's an extra step that the user might not even know is an option."
> rationale.
> ...
>
> If the objc.m is used as a default it must be overidable easily, and
> listed in the appropriate documentation to mitigate the "might not
> even know" risk.

Even if we did so, I suspect that projects around Matlab would have to say
"Give attributes *.m diff=matlab" in their README files anyway, so if we
accept that, it wouldn't be too much additional trouble for the same
README files to also say "Specify diff.matlab.xfuncname to this pattern"
as well.  So in a sense, it might help reducing the confusion if we 
dropped the built-in Matlab pattern rule we recently added.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:21   ` Jeff King
  2011-12-16 19:30     ` Jeff King
  2011-12-16 19:33     ` Junio C Hamano
@ 2011-12-16 22:05     ` Johannes Sixt
  2011-12-17  1:21       ` Jeff King
  2 siblings, 1 reply; 35+ messages in thread
From: Johannes Sixt @ 2011-12-16 22:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey

Am 16.12.2011 20:21, schrieb Jeff King:
> I'm not clear from what you wrote on whether you were saying it is
> simply sub-optimal, or whether on balance it is way worse than the
> default funcname matching.

I'm saying the latter. Okay, we're talking "only" about hunk headers.
But when you are reviewing patches, they are *extremely* useful and a
time-saver; when they are wrong or not present, they are exactly the
opposite.

> So, I'm confused. If you are using this, surely you have "*.c diff=xcpp"
> in your attributes file, and my patch has no effect for you,

Sure I have. What I didn't say (sorry for that!), but wanted to hint at
is that this is to experiment with a pattern in order to ultimately
improve the built-in pattern. The topic came up just the other day, and
I took Thomas Rast's suggestion to experiment with a simplified pattern:

http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439

But as is, the built-in pattern misses way too many anchor points in C++
code.

-- Hannes

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:33     ` Junio C Hamano
@ 2011-12-17  1:17       ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2011-12-17  1:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, Brandon Casey

On Fri, Dec 16, 2011 at 11:33:30AM -0800, Junio C Hamano wrote:

> I think we recently saw that the optional built-in one for C did not even
> understand a function that returns a pointer, and nobody complained about
> it for a long time,

Yeah. That implies to me that either people don't really care about
this feature, or that they are not actually using it because it requires
special configuration (we are not even using it in git.git, for
example).

> > And if it is bad on balance, is the right solution to avoid exposing
> > people to it, or is it to make our patterns better?
> 
> Can't we do both, by avoid exposing normal users to broken one while
> people who want to improve the pattern based one work on unbreak it?

Sure, we can do both. But if nobody is eating the dogfood, it will never
grow to taste better. Maybe we should start by using diff=c in git
itself?

> > I.e., is it fixable,
> > or is it simply too hard a problem to get right in the general case, and
> > we shouldn't turn it on by default?
> 
> I do not think that is the "either-or" question. My impression has been
> that even if it is fixable, it is too broken and produces worse result
> than the simple default in its current form.

What I meant by the either-or was: if it is fixable, then we should fix
it and consider turning it on as a default. If it's too hard to get
right, then we probably never want it on by default, and people who do
like it can turn it on (presumably because it works on their code
style).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 22:05     ` Johannes Sixt
@ 2011-12-17  1:21       ` Jeff King
  2011-12-17  3:38         ` Jonathan Nieder
  2011-12-19 20:52         ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Brandon Casey
  0 siblings, 2 replies; 35+ messages in thread
From: Jeff King @ 2011-12-17  1:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey

On Fri, Dec 16, 2011 at 11:05:27PM +0100, Johannes Sixt wrote:

> Am 16.12.2011 20:21, schrieb Jeff King:
> > I'm not clear from what you wrote on whether you were saying it is
> > simply sub-optimal, or whether on balance it is way worse than the
> > default funcname matching.
> 
> I'm saying the latter. Okay, we're talking "only" about hunk headers.
> But when you are reviewing patches, they are *extremely* useful and a
> time-saver; when they are wrong or not present, they are exactly the
> opposite.

Right. I don't think it is worth arguing "well, it's only funcname
headers". Because that same argument applies to both the advantages
(i.e., hopefully with the patch we are generating better funcname
headers) and disadvantage (i.e., it seems that we might be generating
worse funcname headers).

So it is really a question of "how good" or "how bad" for each style.

> Sure I have. What I didn't say (sorry for that!), but wanted to hint at
> is that this is to experiment with a pattern in order to ultimately
> improve the built-in pattern. The topic came up just the other day, and
> I took Thomas Rast's suggestion to experiment with a simplified pattern:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439
> 
> But as is, the built-in pattern misses way too many anchor points in C++
> code.

Yeah, I can certainly agree that the patterns could be better.

Maybe we should just table the extension mapping for now, then, and see
if the patterns improve? Or maybe just drop the C ones (and probably the
objc one based on other parts of the thread) and do the rest?

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-17  1:21       ` Jeff King
@ 2011-12-17  3:38         ` Jonathan Nieder
  2011-12-19 15:49           ` [PATCHv2 1/2] " Jeff King
  2011-12-19 15:57           ` [PATCHv2 2/2] attr: drop C/C++ default extension mapping Jeff King
  2011-12-19 20:52         ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Brandon Casey
  1 sibling, 2 replies; 35+ messages in thread
From: Jonathan Nieder @ 2011-12-17  3:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, Junio C Hamano, git, Brandon Casey

Jeff King wrote:

>                          Or maybe just drop the C ones (and probably the
> objc one based on other parts of the thread) and do the rest?

Yes, that.

This way, someone wondering why the C ones are not used by default can
easily look at your patch, see that they are utterly broken, and help
us fix it.  That's how progress is made.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCHv2 1/2] attr: map builtin userdiff drivers to well-known extensions
  2011-12-17  3:38         ` Jonathan Nieder
@ 2011-12-19 15:49           ` Jeff King
  2011-12-19 18:07             ` Jonathan Nieder
  2011-12-22  1:47             ` Ævar Arnfjörð Bjarmason
  2011-12-19 15:57           ` [PATCHv2 2/2] attr: drop C/C++ default extension mapping Jeff King
  1 sibling, 2 replies; 35+ messages in thread
From: Jeff King @ 2011-12-19 15:49 UTC (permalink / raw)
  To: git
  Cc: Johannes Sixt, Junio C Hamano, Brandon Casey, Jonathan Nieder,
	Philip Oakley

We already provide sane hunk-header patterns for specific
languages.

However, the user has to manually map common extensions to
use them. It's not that hard to do, but it's an extra step
that the user might not even know is an option. Let's be
nice and do it automatically.

It could be a problem in the future if the builtin userdiff
drivers started growing more invasive options, like
automatically claiming to be non-binary (i.e., setting
diff.cpp.binary = false by default), but right now we do not
do that, so it should be safe. To help safeguard against
future changes, we add a new test to t4012 making sure that
we don't consider binary files as text by their extension.

We also have to update t4018, which assumed that without a
.gitattributes file, we would receive the default funcname
pattern for a file matching "*.java". This is not covering
up a regression, but rather the feature working as intended.

Signed-off-by: Jeff King <peff@peff.net>
---
This drops the objc mappings from v1. I still have no data on how much
worse the objc funcname performs on Matlab files, but I'd rather be
conservative until an objc person wants to show up and argue about it.

The C mappings are still here, but see the next patch.

 attr.c                   |   22 ++++++++++++++++++++++
 t/t4012-diff-binary.sh   |   13 +++++++++++++
 t/t4018-diff-funcname.sh |   10 +++++++++-
 3 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/attr.c b/attr.c
index 76b079f..10713f3 100644
--- a/attr.c
+++ b/attr.c
@@ -306,6 +306,28 @@ static void free_attr_elem(struct attr_stack *e)
 
 static const char *builtin_attr[] = {
 	"[attr]binary -diff -text",
+	"*.html diff=html",
+	"*.htm diff=html",
+	"*.java diff=java",
+	"*.perl diff=perl",
+	"*.pl diff=perl",
+	"*.php diff=php",
+	"*.py diff=python",
+	"*.rb diff=ruby",
+	"*.bib diff=bibtex",
+	"*.tex diff=tex",
+	"*.c diff=cpp",
+	"*.cc diff=cpp",
+	"*.cxx diff=cpp",
+	"*.cpp diff=cpp",
+	"*.h diff=cpp",
+	"*.hpp diff=cpp",
+	"*.cs diff=csharp",
+	"*.[Ff] diff=fortran",
+	"*.[Ff][0-9][0-9] diff=fortran",
+	"*.pas diff=pascal",
+	"*.pp diff=pascal",
+	"*.lpr diff=pascal",
 	NULL,
 };
 
diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
index 2d9f9a0..b2fc807 100755
--- a/t/t4012-diff-binary.sh
+++ b/t/t4012-diff-binary.sh
@@ -90,4 +90,17 @@ test_expect_success 'diff --no-index with binary creation' '
 	test_cmp expected actual
 '
 
+test_expect_success 'binary files are not considered text by file extension' '
+	echo Q | q_to_nul >binary.c &&
+	git add binary.c &&
+	cat >expect <<-\EOF &&
+	diff --git a/binary.c b/binary.c
+	new file mode 100644
+	index 0000000..1f2a4f5
+	Binary files /dev/null and b/binary.c differ
+	EOF
+	git diff --cached binary.c >actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 4bd2a1c..a6227ef 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -124,7 +124,9 @@ do
 done
 
 test_expect_success 'default behaviour' '
-	rm -f .gitattributes &&
+	cat >.gitattributes <<-\EOF &&
+	*.java diff=default
+	EOF
 	test_expect_funcname "public class Beer\$"
 '
 
@@ -187,4 +189,10 @@ test_expect_success 'alternation in pattern' '
 	test_expect_funcname "public static void main("
 '
 
+test_expect_success 'custom diff drivers override built-in extension matches' '
+	test_config diff.foo.funcname "int special" &&
+	echo "*.java diff=foo" >.gitattributes &&
+	test_expect_funcname "int special"
+'
+
 test_done
-- 
1.7.8.rc2.38.gd9625

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCHv2 2/2] attr: drop C/C++ default extension mapping
  2011-12-17  3:38         ` Jonathan Nieder
  2011-12-19 15:49           ` [PATCHv2 1/2] " Jeff King
@ 2011-12-19 15:57           ` Jeff King
  2011-12-19 18:10             ` Jonathan Nieder
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2011-12-19 15:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Sixt, Junio C Hamano, Brandon Casey, Jonathan Nieder,
	Philip Oakley

The point of this mapping is largely to get funcname
support. However, there's been some indication that our C
funcname pattern produces worse results than the default
pattern, so let's leave it unmapped for now.

If and when it improves, this commit can be reverted.

Signed-off-by: Jeff King <peff@peff.net>
---
Obviously this could just be squashed into the first patch. But I think
I'd rather leave a more explicit note in the history.

When writing the justification for this commit message, though, I did
notice that my reasoning is slightly flawed. The complaint is that the C
funcname pattern sucks, and therefore a user who hasn't configured
anything has a worse experience with patch 1. But enabling that sucky
experience is a two-step process:

  1. map *.c, etc to the diff driver "cpp"

  2. diff driver "cpp" has a funcname (which is reportedly bad)

Since this series is about tweaking extension mapping, the natural thing
to do is not enable (1).

But when you think about it, if our funcname pattern is bad, shouldn't
preventing (2) be the right thing? That is, if our funcname pattern is
really worse than the default language-agnostic match, wouldn't we be
doing everybody a service to simply remove the builtin
diff.cpp.xfuncname pattern?

Then you're not only not causing a regression for users who haven't
configured anything; you're actively helping people who have set
"diff=cpp" themselves.

Of course you're causing a regression to people who _like_ the current
diff.cpp.xfuncname. But if they are so widespread, then why is there so
much opposition to turning it on by default? My theory is that people
aren't actually using the builtin diff.cpp.xfuncname.

 attr.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/attr.c b/attr.c
index 10713f3..2f33128 100644
--- a/attr.c
+++ b/attr.c
@@ -316,12 +316,6 @@ static void free_attr_elem(struct attr_stack *e)
 	"*.rb diff=ruby",
 	"*.bib diff=bibtex",
 	"*.tex diff=tex",
-	"*.c diff=cpp",
-	"*.cc diff=cpp",
-	"*.cxx diff=cpp",
-	"*.cpp diff=cpp",
-	"*.h diff=cpp",
-	"*.hpp diff=cpp",
 	"*.cs diff=csharp",
 	"*.[Ff] diff=fortran",
 	"*.[Ff][0-9][0-9] diff=fortran",
-- 
1.7.8.rc2.38.gd9625

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCHv2 1/2] attr: map builtin userdiff drivers to well-known extensions
  2011-12-19 15:49           ` [PATCHv2 1/2] " Jeff King
@ 2011-12-19 18:07             ` Jonathan Nieder
  2011-12-19 18:55               ` Jeff King
  2011-12-22  1:47             ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 35+ messages in thread
From: Jonathan Nieder @ 2011-12-19 18:07 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Johannes Sixt, Junio C Hamano, Brandon Casey, Philip Oakley

Hi,

Two quick thoughts:

Jeff King wrote:

> The C mappings are still here, but see the next patch.

This is adding a regression in order to remove it.  I guess it's
harmless, but I don't see the point.

[...]
> --- a/t/t4012-diff-binary.sh
> +++ b/t/t4012-diff-binary.sh
> @@ -90,4 +90,17 @@ test_expect_success 'diff --no-index with binary creation' '
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'binary files are not considered text by file extension' '
> +	echo Q | q_to_nul >binary.c &&
> +	git add binary.c &&
> +	cat >expect <<-\EOF &&
> +	diff --git a/binary.c b/binary.c
> +	new file mode 100644
> +	index 0000000..1f2a4f5
> +	Binary files /dev/null and b/binary.c differ
> +	EOF
> +	git diff --cached binary.c >actual &&
> +	test_cmp expect actual

Re the idea of this test: very good idea.

Re the mechanics: I would have been happier to see

	echo Q | q_to_nul >binary.c &&
	git add binary.c &&
	git diff --cached binary.c >diff &&
	grep Binary files diff

since that would avoid hard-coding some assumptions:

 - the blob name of binary.c
 - that [diff] mnemonicprefix defaults to false (I'd like to see the
   default change to true)
 - that [core] abbrev defaults to 7 (it probably won't change, but
   it's a distracting detail, and if we were starting over 8 might be
   a better default)

A bonus comment: :)

[...]
> --- a/t/t4018-diff-funcname.sh
> +++ b/t/t4018-diff-funcname.sh
> @@ -124,7 +124,9 @@ do
>  done
>  
>  test_expect_success 'default behaviour' '
> -	rm -f .gitattributes &&
> +	cat >.gitattributes <<-\EOF &&
> +	*.java diff=default
> +	EOF
>  	test_expect_funcname "public class Beer\$"
>  '

	echo "*.java diff=default" >.gitattributes

would do the same with two lines fewer. :)

Thanks for working on this.  I owe you a beer.

Jonathan

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCHv2 2/2] attr: drop C/C++ default extension mapping
  2011-12-19 15:57           ` [PATCHv2 2/2] attr: drop C/C++ default extension mapping Jeff King
@ 2011-12-19 18:10             ` Jonathan Nieder
  2011-12-19 20:51               ` Thomas Rast
  0 siblings, 1 reply; 35+ messages in thread
From: Jonathan Nieder @ 2011-12-19 18:10 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Johannes Sixt, Junio C Hamano, Brandon Casey, Philip Oakley

Jeff King wrote:

> But when you think about it, if our funcname pattern is bad, shouldn't
> preventing (2) be the right thing? That is, if our funcname pattern is
> really worse than the default language-agnostic match, wouldn't we be
> doing everybody a service to simply remove the builtin
> diff.cpp.xfuncname pattern?

I don't see why.  Anyone who has set "diff=cpp" either likes suffering
(maybe they are hoping to improve the pattern) or is working with a
codebase for which the current pattern works better than the default
behavior (maybe their codebase has a lot of goto labels aligned at
column zero).  So removing the funcname pattern can only hurt them.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCHv2 1/2] attr: map builtin userdiff drivers to well-known extensions
  2011-12-19 18:07             ` Jonathan Nieder
@ 2011-12-19 18:55               ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2011-12-19 18:55 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Johannes Sixt, Junio C Hamano, Brandon Casey, Philip Oakley

On Mon, Dec 19, 2011 at 12:07:33PM -0600, Jonathan Nieder wrote:

> > The C mappings are still here, but see the next patch.
> 
> This is adding a regression in order to remove it.  I guess it's
> harmless, but I don't see the point.

It's purely an attempt to help somebody reading "git log" later
understand what happened. Maybe a comment in the commit message is more
appropriate.

> > +test_expect_success 'binary files are not considered text by file extension' '
> > +	echo Q | q_to_nul >binary.c &&
> > +	git add binary.c &&
> > +	cat >expect <<-\EOF &&
> > +	diff --git a/binary.c b/binary.c
> > +	new file mode 100644
> > +	index 0000000..1f2a4f5
> > +	Binary files /dev/null and b/binary.c differ
> > +	EOF
> > +	git diff --cached binary.c >actual &&
> > +	test_cmp expect actual
> 
> Re the idea of this test: very good idea.
> 
> Re the mechanics: I would have been happier to see
> 
> 	echo Q | q_to_nul >binary.c &&
> 	git add binary.c &&
> 	git diff --cached binary.c >diff &&
> 	grep Binary files diff

Yeah, I think that's fine, and I'll squash it in to my local version.

It does miss one problem, though (that is also present in my original):
using "binary.c" is no longer a good name, since the next patch will
revert the "*.c" bits. :)

> > --- a/t/t4018-diff-funcname.sh
> > +++ b/t/t4018-diff-funcname.sh
> > @@ -124,7 +124,9 @@ do
> >  done
> >  
> >  test_expect_success 'default behaviour' '
> > -	rm -f .gitattributes &&
> > +	cat >.gitattributes <<-\EOF &&
> > +	*.java diff=default
> > +	EOF
> >  	test_expect_funcname "public class Beer\$"
> >  '
> 
> 	echo "*.java diff=default" >.gitattributes
> 
> would do the same with two lines fewer. :)

Yup. I was following the style of the test directly below, which sets
both java and perl drivers. But the "default" test that needed updating
only checks the java case.

Will squash.

> Thanks for working on this.  I owe you a beer.

You're welcome. :)

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCHv2 2/2] attr: drop C/C++ default extension mapping
  2011-12-19 18:10             ` Jonathan Nieder
@ 2011-12-19 20:51               ` Thomas Rast
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Rast @ 2011-12-19 20:51 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jeff King, git, Johannes Sixt, Junio C Hamano, Brandon Casey,
	Philip Oakley

Jonathan Nieder <jrnieder@gmail.com> writes:

> Jeff King wrote:
>
>> But when you think about it, if our funcname pattern is bad, shouldn't
>> preventing (2) be the right thing? That is, if our funcname pattern is
>> really worse than the default language-agnostic match, wouldn't we be
>> doing everybody a service to simply remove the builtin
>> diff.cpp.xfuncname pattern?
>
> I don't see why.  Anyone who has set "diff=cpp" either likes suffering
> (maybe they are hoping to improve the pattern) or is working with a
> codebase for which the current pattern works better than the default
> behavior (maybe their codebase has a lot of goto labels aligned at
> column zero).  So removing the funcname pattern can only hurt them.

FWIW, the funcname pattern is not the only feature of the diff
attributes.  I set it mainly to get the built-in --word-diff split
regexes.

I agree with Peff's patches though, until the cpp pattern improves, we
should not turn them on by default.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] t4018: introduce test cases for the internal hunk header patterns
  2011-12-17  1:21       ` Jeff King
  2011-12-17  3:38         ` Jonathan Nieder
@ 2011-12-19 20:52         ` Brandon Casey
  2011-12-19 21:53           ` [PATCH] t4018: add a few more test cases for cpp hunk header matching Brandon Casey
  2011-12-19 22:37           ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Junio C Hamano
  1 sibling, 2 replies; 35+ messages in thread
From: Brandon Casey @ 2011-12-19 20:52 UTC (permalink / raw)
  To: peff; +Cc: git, j6t, gitster, jrnieder, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Recently it has been pointed out that one or more of the internal hunk
header patterns are sub-optimal.  Specifically, the C/C++ "cpp" pattern was
called out.

Let's introduce some infrastructure to make it easy to create test cases
for the hunk header patterns and provide a few cases for the cpp pattern.

   * new test cases can be dropped into the t4018 directory
   * filenames end with the pattern name e.g. .cpp .objc .matlab etc.
   * filenames should be descriptive since it will be used in the test
     suite output
   * broken test cases should be given a filename prefixed with "broken_"
   * test cases must provide a function named "RIGHT_function_hunk_header"
     - this is the function name that should appear on the hunk header line
     - the body of this function should have an assignment like

          answer = 0

       The test suite will modify the above line to produce a difference
       from the original.  Additionally, this should be far enough within
       the body of the function so that the function name is not part of
       the lines of context.

Example test case:

   int WRONG_function_hunk_header (void)
   {
           return 0;
   }

   int RIGHT_function_hunk_header (void)
   {
           /*
            * Filler
            * Filler
            * Filler
            */

           int answer = 0;
           return 0;
   }

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---


On Fri, Dec 16, 2011 at 7:21 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 16, 2011 at 11:05:27PM +0100, Johannes Sixt wrote:
>
<snip>
>> ... in order to ultimately
>> improve the built-in pattern. The topic came up just the other day, and
>> I took Thomas Rast's suggestion to experiment with a simplified pattern:
>>
>> http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439
>>
>> But as is, the built-in pattern misses way too many anchor points in C++
>> code.
>
> Yeah, I can certainly agree that the patterns could be better.
>
> Maybe we should just table the extension mapping for now, then, and see
> if the patterns improve? Or maybe just drop the C ones (and probably the
> objc one based on other parts of the thread) and do the rest?

/methinks t4018 needs to be greatly expanded.  There is no way to tell what
is currently broken by the current pattern, or what is newly broken by a
new pattern.

How about this for a start?

-Brandon


 t/t4018-diff-funcname.sh             |   18 ++++++++++++
 t/t4018/broken_class_constructor.cpp |   34 +++++++++++++++++++++++
 t/t4018/broken_class_destructor.cpp  |   34 +++++++++++++++++++++++
 t/t4018/broken_gnu_style.cpp         |   35 +++++++++++++++++++++++
 t/t4018/broken_reference.cpp         |   34 +++++++++++++++++++++++
 t/t4018/broken_template.cpp          |   34 +++++++++++++++++++++++
 t/t4018/class_method.cpp             |   34 +++++++++++++++++++++++
 t/t4018/simple.cpp                   |   50 ++++++++++++++++++++++++++++++++++
 t/t4018/static.cpp                   |   34 +++++++++++++++++++++++
 9 files changed, 307 insertions(+), 0 deletions(-)
 create mode 100644 t/t4018/broken_class_constructor.cpp
 create mode 100644 t/t4018/broken_class_destructor.cpp
 create mode 100644 t/t4018/broken_gnu_style.cpp
 create mode 100644 t/t4018/broken_reference.cpp
 create mode 100644 t/t4018/broken_template.cpp
 create mode 100644 t/t4018/class_method.cpp
 create mode 100644 t/t4018/simple.cpp
 create mode 100644 t/t4018/static.cpp

diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 4bd2a1c..5a1f7b7 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -121,6 +121,24 @@ do
 		! grep fatal msg &&
 		! grep error msg
 	'
+
+	for f in "$TEST_DIRECTORY"/t4018/*.$p; do
+		test -f "$f" || continue
+		name=`basename "$f" .$p`
+		test_expect_which=test_expect_success
+		if test "$name" != "${name#broken_}"; then
+			name=${name#broken_}
+			test_expect_which=test_expect_failure
+		fi
+		$test_expect_which \
+			"builtin $p pattern works correctly for $name case" "
+			echo \"*.$p diff=$p\" >.gitattributes &&
+			sed -e 's/answer = 0/answer = 42/' < \"$f\" > \"$name.$p\" &&
+			test_expect_code 1 git diff --no-index \"$f\" \
+				\"$name.$p\" >actual &&
+			egrep '^@@ .* @@ .*RIGHT_function_hunk_header' actual
+		"
+	done
 done
 
 test_expect_success 'default behaviour' '
diff --git a/t/t4018/broken_class_constructor.cpp b/t/t4018/broken_class_constructor.cpp
new file mode 100644
index 0000000..1e4cb9c
--- /dev/null
+++ b/t/t4018/broken_class_constructor.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+RIGHT_function_hunk_header::RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_class_destructor.cpp b/t/t4018/broken_class_destructor.cpp
new file mode 100644
index 0000000..84d9506
--- /dev/null
+++ b/t/t4018/broken_class_destructor.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+RIGHT_function_hunk_header::~RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_gnu_style.cpp b/t/t4018/broken_gnu_style.cpp
new file mode 100644
index 0000000..ffba9b9
--- /dev/null
+++ b/t/t4018/broken_gnu_style.cpp
@@ -0,0 +1,35 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int
+RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_reference.cpp b/t/t4018/broken_reference.cpp
new file mode 100644
index 0000000..ea90b82
--- /dev/null
+++ b/t/t4018/broken_reference.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int& RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_template.cpp b/t/t4018/broken_template.cpp
new file mode 100644
index 0000000..95a6dd5
--- /dev/null
+++ b/t/t4018/broken_template.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+template <class T> int RIGHT_function_hunk_header (T unused)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/class_method.cpp b/t/t4018/class_method.cpp
new file mode 100644
index 0000000..2e51853
--- /dev/null
+++ b/t/t4018/class_method.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int test_class::RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/simple.cpp b/t/t4018/simple.cpp
new file mode 100644
index 0000000..b8fca06
--- /dev/null
+++ b/t/t4018/simple.cpp
@@ -0,0 +1,50 @@
+/*
+ *  Test file for testing the internal hunk header patterns
+ *
+ *  The "RIGHT" hunk header function, the one that should appear on the
+ *  hunk header line, should be named "RIGHT_function_hunk_header" and
+ *  the body of this function should have an assignment that looks like
+ *
+ *     answer = 0
+ *
+ *  within it, deep enough so the lines of context do not include the
+ *  function name.
+ *
+ *  If the name of this file begins with "broken_", then it will be
+ *  interpreted as a pattern which does not work, but which should.
+ */
+
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/static.cpp b/t/t4018/static.cpp
new file mode 100644
index 0000000..fcc48f2
--- /dev/null
+++ b/t/t4018/static.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+static int RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
-- 
1.7.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH] t4018: add a few more test cases for cpp hunk header matching
  2011-12-19 20:52         ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Brandon Casey
@ 2011-12-19 21:53           ` Brandon Casey
  2011-12-19 22:37           ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Brandon Casey @ 2011-12-19 21:53 UTC (permalink / raw)
  To: peff; +Cc: git, j6t, gitster, jrnieder, trast, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Add one case for matching a function returning a pointer.

Plus add examples of things we explicitly do not match:

   labels
   function declarations
   global variable declarations

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---


This can be squashed into the original patch with the other test cases.
This just introduces a few more cases pointed out by Thomas Rast in the
email Johannes referenced.

   http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439

Also, note that all of the tests pass except for ignore_global.cpp with
Johannes's pattern:

   "!^[ \\t]*[a-zA-Z_][a-zA-Z_0-9]*[^()]*:[[:space:]]*$\n^[a-zA-Z_][a-zA-Z_0-9]*.*"

-Brandon


 t/t4018/ignore_declaration.cpp |   35 +++++++++++++++++++++++++++++++++++
 t/t4018/ignore_global.cpp      |   36 ++++++++++++++++++++++++++++++++++++
 t/t4018/ignore_label.cpp       |   35 +++++++++++++++++++++++++++++++++++
 t/t4018/pointer_return.cpp     |   34 ++++++++++++++++++++++++++++++++++
 4 files changed, 140 insertions(+), 0 deletions(-)
 create mode 100644 t/t4018/ignore_declaration.cpp
 create mode 100644 t/t4018/ignore_global.cpp
 create mode 100644 t/t4018/ignore_label.cpp
 create mode 100644 t/t4018/pointer_return.cpp

diff --git a/t/t4018/ignore_declaration.cpp b/t/t4018/ignore_declaration.cpp
new file mode 100644
index 0000000..615aea0
--- /dev/null
+++ b/t/t4018/ignore_declaration.cpp
@@ -0,0 +1,35 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	void WRONG_function_declaration_within_body (void);
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_global.cpp b/t/t4018/ignore_global.cpp
new file mode 100644
index 0000000..df6b8aa
--- /dev/null
+++ b/t/t4018/ignore_global.cpp
@@ -0,0 +1,36 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_global_variable;
+
+/*
+ * Filler
+ * Filler
+ * Filler
+ * Filler
+ * Filler
+ * Filler
+ */
+
+int answer = 0;
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_label.cpp b/t/t4018/ignore_label.cpp
new file mode 100644
index 0000000..2e3ce10
--- /dev/null
+++ b/t/t4018/ignore_label.cpp
@@ -0,0 +1,35 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+WRONG_should_not_match_label:
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/pointer_return.cpp b/t/t4018/pointer_return.cpp
new file mode 100644
index 0000000..fd85545
--- /dev/null
+++ b/t/t4018/pointer_return.cpp
@@ -0,0 +1,34 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+static int *RIGHT_function_hunk_header (void)
+{
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	int answer = 0;
+
+	/*
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 * Filler
+	 */
+
+	return answer;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
-- 
1.7.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] t4018: introduce test cases for the internal hunk header patterns
  2011-12-19 20:52         ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Brandon Casey
  2011-12-19 21:53           ` [PATCH] t4018: add a few more test cases for cpp hunk header matching Brandon Casey
@ 2011-12-19 22:37           ` Junio C Hamano
  2011-12-19 22:57             ` Brandon Casey
  2011-12-20 20:08             ` [PATCH] " Junio C Hamano
  1 sibling, 2 replies; 35+ messages in thread
From: Junio C Hamano @ 2011-12-19 22:37 UTC (permalink / raw)
  To: Brandon Casey; +Cc: peff, git, j6t, jrnieder, Brandon Casey

Brandon Casey <casey@nrlssc.navy.mil> writes:

>    * new test cases can be dropped into the t4018 directory
>    * filenames end with the pattern name e.g. .cpp .objc .matlab etc.
>    * filenames should be descriptive since it will be used in the test
>      suite output
>    * broken test cases should be given a filename prefixed with "broken_"

Cute. I like the general idea.

>    * test cases must provide a function named "RIGHT_function_hunk_header"
>      - this is the function name that should appear on the hunk header line
>      - the body of this function should have an assignment like
>
>           answer = 0
>
>        The test suite will modify the above line to produce a difference
>        from the original.  Additionally, this should be far enough within
>        the body of the function so that the function name is not part of
>        the lines of context.

Although I do not think of any language with a syntax rule where that the
overlong RIGHT_func... token is an illegal symbol offhand, this feels a
bit _too_ specific to the C language. I would prefer something like this
instead:

    * a test case must have one (and only one) line that contains "RIGHT"
      (all uppercase) and that line should become the hunk header for the
      test to succeed.

    * after the line that contains "RIGHT" token, there should be one (and
      only one) line that contains "ChangeMe". The test modifies this
      token to "IWasChanged", compares the original with the modified
      result, and expects the "RIGHT" token above appears on the hunk
      header.

Also I would prefer not to require "enough filler", as we might want to
enhance the logic to consider using a line in the pre-context as the hunk
header in some cases, e.g.

    @@ ... @@ int RIGHT_function_hunk_header(void)

     int RIGHT_function_hunk_header(void)
     {
    -    int ChangeME;
    +    int IWasChanged;
         printf("Hello, world\n");
	 return 0;
     }
    @@ ...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] t4018: introduce test cases for the internal hunk header patterns
  2011-12-19 22:37           ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Junio C Hamano
@ 2011-12-19 22:57             ` Brandon Casey
  2011-12-19 23:17               ` Junio C Hamano
  2011-12-20 20:08             ` [PATCH] " Junio C Hamano
  1 sibling, 1 reply; 35+ messages in thread
From: Brandon Casey @ 2011-12-19 22:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Casey, peff, git, j6t, jrnieder

On Mon, Dec 19, 2011 at 4:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Brandon Casey <casey@nrlssc.navy.mil> writes:
>>    * test cases must provide a function named "RIGHT_function_hunk_header"
>>      - this is the function name that should appear on the hunk header line
>>      - the body of this function should have an assignment like
>>
>>           answer = 0
>>
>>        The test suite will modify the above line to produce a difference
>>        from the original.  Additionally, this should be far enough within
>>        the body of the function so that the function name is not part of
>>        the lines of context.
>
> Although I do not think of any language with a syntax rule where that the
> overlong RIGHT_func... token is an illegal symbol offhand, this feels a
> bit _too_ specific to the C language.

Good point.  "RIGHT_function_hunk_header" doesn't really have much
meaning for non-programming language patterns like tex and html.

> I would prefer something like this
> instead:
>
>    * a test case must have one (and only one) line that contains "RIGHT"
>      (all uppercase) and that line should become the hunk header for the
>      test to succeed.
>
>    * after the line that contains "RIGHT" token, there should be one (and
>      only one) line that contains "ChangeMe". The test modifies this
>      token to "IWasChanged", compares the original with the modified
>      result, and expects the "RIGHT" token above appears on the hunk
>      header.

Both good improvements.

> Also I would prefer not to require "enough filler", as we might want to
> enhance the logic to consider using a line in the pre-context as the hunk
> header in some cases, e.g.
>
>    @@ ... @@ int RIGHT_function_hunk_header(void)
>
>     int RIGHT_function_hunk_header(void)
>     {
>    -    int ChangeME;
>    +    int IWasChanged;
>         printf("Hello, world\n");
>         return 0;
>     }
>    @@ ...

Yeah, I didn't mean to imply that "enough filler" was a requirement of
the test, just trying to point out the requirement imposed by the hunk
header logic.  I'll remove this statement.

Will reroll.

-Brandon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] t4018: introduce test cases for the internal hunk header patterns
  2011-12-19 22:57             ` Brandon Casey
@ 2011-12-19 23:17               ` Junio C Hamano
  2011-12-20  2:42                 ` [PATCH v2] " Brandon Casey
  0 siblings, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2011-12-19 23:17 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Brandon Casey, peff, git, j6t, jrnieder

Brandon Casey <drafnel@gmail.com> writes:

>>    * after the line that contains "RIGHT" token, there should be one (and
>>      only one) line that contains "ChangeMe". The test modifies this
>>      token to "IWasChanged", compares the original with the modified
>>      result, and expects the "RIGHT" token above appears on the hunk
>>      header.
>
> Both good improvements.

Let's not call "ChangeMe" a *token*, as I think the easiest example would
look like the following and it would not use it as such.  Rephrasing it as
a "string" would be better.

    @@ ... @@ int RIGHT_function_hunk_header(void)

     int RIGHT_function_hunk_header(void)
     {
    -    const char *msg = "ChangeME";
    +    const char *msg = "IWasChanged";
         printf("Hello, world, %s\n", msg);
	 return 0;
     }
    @@ ...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2] t4018: introduce test cases for the internal hunk header patterns
  2011-12-19 23:17               ` Junio C Hamano
@ 2011-12-20  2:42                 ` Brandon Casey
  2011-12-20  8:25                   ` Jakub Narebski
                                     ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Brandon Casey @ 2011-12-20  2:42 UTC (permalink / raw)
  To: gitster; +Cc: git, peff, j6t, jrnieder, trast, Brandon Casey

Recently it has been pointed out that one (or more?) of the internal hunk
header patterns is sub-optimal.  Specifically, the C/C++ "cpp" pattern was
called out.

Let's introduce some infrastructure to make it easy to create test cases
for the hunk header patterns and provide a few cases for the cpp pattern.

   * new test cases can be dropped into the t4018 directory
   * filenames end with the pattern name e.g. .cpp .objc .matlab etc.
   * filenames should be descriptive since it will be used in the test
     suite output
   * broken test cases should be given a filename prefixed with "broken_"
   * a test case must have one (and only one) line that contains "RIGHT"
     (all uppercase) and that line should become the hunk header for the
     test to succeed
   * after the line that contains "RIGHT" token, there should be one (and
     only one) line that contains the string "ChangeMe". The test modifies
     this string to "IWasChanged", compares the original with the modified
     result, and expects the "RIGHT" token above to appear on the hunk
     header

Example test case:

   int WRONG_function_hunk_header (void)
   {
           return 0;
   }

   int RIGHT_function_hunk_header (void)
   {
           const char *msg = "ChangeMe";
           printf("Hello, world, %s\n", msg);
           return 0;
   }

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---


Updated based on Junio's comments and squashed the additional tests I
sent.  Plus I added -U1 to the git diff line so that the filler lines are
no longer necessary.

-Brandon


 t/t4018-diff-funcname.sh             |   18 ++++++++++++++++++
 t/t4018/broken_class_constructor.cpp |   16 ++++++++++++++++
 t/t4018/broken_class_destructor.cpp  |   16 ++++++++++++++++
 t/t4018/broken_gnu_style.cpp         |   17 +++++++++++++++++
 t/t4018/broken_reference.cpp         |   16 ++++++++++++++++
 t/t4018/broken_template.cpp          |   16 ++++++++++++++++
 t/t4018/class_method.cpp             |   16 ++++++++++++++++
 t/t4018/ignore_declaration.cpp       |   17 +++++++++++++++++
 t/t4018/ignore_global.cpp            |   19 +++++++++++++++++++
 t/t4018/ignore_label.cpp             |   17 +++++++++++++++++
 t/t4018/pointer_return.cpp           |   16 ++++++++++++++++
 t/t4018/simple.cpp                   |   32 ++++++++++++++++++++++++++++++++
 t/t4018/static.cpp                   |   16 ++++++++++++++++
 13 files changed, 232 insertions(+), 0 deletions(-)
 create mode 100644 t/t4018/broken_class_constructor.cpp
 create mode 100644 t/t4018/broken_class_destructor.cpp
 create mode 100644 t/t4018/broken_gnu_style.cpp
 create mode 100644 t/t4018/broken_reference.cpp
 create mode 100644 t/t4018/broken_template.cpp
 create mode 100644 t/t4018/class_method.cpp
 create mode 100644 t/t4018/ignore_declaration.cpp
 create mode 100644 t/t4018/ignore_global.cpp
 create mode 100644 t/t4018/ignore_label.cpp
 create mode 100644 t/t4018/pointer_return.cpp
 create mode 100644 t/t4018/simple.cpp
 create mode 100644 t/t4018/static.cpp

diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 4bd2a1c..a3c4577 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -121,6 +121,24 @@ do
 		! grep fatal msg &&
 		! grep error msg
 	'
+
+	for f in "$TEST_DIRECTORY"/t4018/*.$p; do
+		test -f "$f" || continue
+		name=`basename "$f" .$p`
+		test_expect_which=test_expect_success
+		if test "$name" != "${name#broken_}"; then
+			name=${name#broken_}
+			test_expect_which=test_expect_failure
+		fi
+		$test_expect_which \
+			"builtin $p pattern works correctly for $name case" "
+			echo \"*.$p diff=$p\" >.gitattributes &&
+			sed -e 's/ChangeMe/IWasChanged/' < \"$f\" > \"$name.$p\" &&
+			test_expect_code 1 git diff -U1 --no-index \"$f\" \
+				\"$name.$p\" >actual &&
+			egrep '^@@ .* @@ .*RIGHT' actual
+		"
+	done
 done
 
 test_expect_success 'default behaviour' '
diff --git a/t/t4018/broken_class_constructor.cpp b/t/t4018/broken_class_constructor.cpp
new file mode 100644
index 0000000..774c7f9
--- /dev/null
+++ b/t/t4018/broken_class_constructor.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+RIGHT_function_hunk_header::RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_class_destructor.cpp b/t/t4018/broken_class_destructor.cpp
new file mode 100644
index 0000000..3045fd1
--- /dev/null
+++ b/t/t4018/broken_class_destructor.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+RIGHT_function_hunk_header::~RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_gnu_style.cpp b/t/t4018/broken_gnu_style.cpp
new file mode 100644
index 0000000..58e574a
--- /dev/null
+++ b/t/t4018/broken_gnu_style.cpp
@@ -0,0 +1,17 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int
+RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_reference.cpp b/t/t4018/broken_reference.cpp
new file mode 100644
index 0000000..4d4549f
--- /dev/null
+++ b/t/t4018/broken_reference.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int& RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/broken_template.cpp b/t/t4018/broken_template.cpp
new file mode 100644
index 0000000..5c62e73
--- /dev/null
+++ b/t/t4018/broken_template.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+template <class T> int RIGHT_function_hunk_header (T unused)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/class_method.cpp b/t/t4018/class_method.cpp
new file mode 100644
index 0000000..fe53620
--- /dev/null
+++ b/t/t4018/class_method.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int test_class::RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_declaration.cpp b/t/t4018/ignore_declaration.cpp
new file mode 100644
index 0000000..ce7a0f6
--- /dev/null
+++ b/t/t4018/ignore_declaration.cpp
@@ -0,0 +1,17 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	void WRONG_function_declaration_within_body (void);
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_global.cpp b/t/t4018/ignore_global.cpp
new file mode 100644
index 0000000..95e23bc
--- /dev/null
+++ b/t/t4018/ignore_global.cpp
@@ -0,0 +1,19 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	printf("Hello, world\n");
+	return 0;
+}
+
+int WRONG_global_variable;
+
+int ChangeMe;
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_label.cpp b/t/t4018/ignore_label.cpp
new file mode 100644
index 0000000..a8f407d
--- /dev/null
+++ b/t/t4018/ignore_label.cpp
@@ -0,0 +1,17 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+WRONG_should_not_match_label:
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/pointer_return.cpp b/t/t4018/pointer_return.cpp
new file mode 100644
index 0000000..ea30d2d
--- /dev/null
+++ b/t/t4018/pointer_return.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+static int *RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/simple.cpp b/t/t4018/simple.cpp
new file mode 100644
index 0000000..c96ad87
--- /dev/null
+++ b/t/t4018/simple.cpp
@@ -0,0 +1,32 @@
+/*
+ *  Test file for testing the internal hunk header patterns
+ *
+ *  The "RIGHT" hunk header function, the one that should appear on the
+ *  hunk header line, should be named "RIGHT_function_hunk_header" and
+ *  the body of this function should have an assignment that looks like
+ *
+ *     answer = 0
+ *
+ *  within it, deep enough so the lines of context do not include the
+ *  function name.
+ *
+ *  If the name of this file begins with "broken_", then it will be
+ *  interpreted as a pattern which does not work, but which should.
+ */
+
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+int RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/static.cpp b/t/t4018/static.cpp
new file mode 100644
index 0000000..f6ee0f3
--- /dev/null
+++ b/t/t4018/static.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+static int RIGHT_function_hunk_header (void)
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
-- 
1.7.8

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] t4018: introduce test cases for the internal hunk header patterns
  2011-12-20  2:42                 ` [PATCH v2] " Brandon Casey
@ 2011-12-20  8:25                   ` Jakub Narebski
  2011-12-20 15:58                     ` Brandon Casey
  2011-12-20  9:13                   ` Thomas Rast
  2011-12-20 19:52                   ` Johannes Sixt
  2 siblings, 1 reply; 35+ messages in thread
From: Jakub Narebski @ 2011-12-20  8:25 UTC (permalink / raw)
  To: Brandon Casey; +Cc: gitster, git, peff, j6t, jrnieder, trast

Brandon Casey <drafnel@gmail.com> writes:

> diff --git a/t/t4018/ignore_global.cpp b/t/t4018/ignore_global.cpp
> new file mode 100644
> index 0000000..95e23bc
> --- /dev/null
> +++ b/t/t4018/ignore_global.cpp
> @@ -0,0 +1,19 @@
> +int WRONG_function_hunk_header_preceding_the_right_one (void)
> +{
> +	return 0;
> +}
> +
> +int RIGHT_function_hunk_header (void)
> +{
> +	printf("Hello, world\n");
> +	return 0;
> +}
> +
> +int WRONG_global_variable;
> +
> +int ChangeMe;
> +
> +int WRONG_function_hunk_header_following_the_right_one (void)
> +{
> +	return 0;
> +}

Shouldn't ChangeMe be inside function?

> diff --git a/t/t4018/simple.cpp b/t/t4018/simple.cpp
> new file mode 100644
> index 0000000..c96ad87
> --- /dev/null
> +++ b/t/t4018/simple.cpp
> @@ -0,0 +1,32 @@
> +/*
> + *  Test file for testing the internal hunk header patterns
> + *
> + *  The "RIGHT" hunk header function, the one that should appear on the
> + *  hunk header line, should be named "RIGHT_function_hunk_header" and
> + *  the body of this function should have an assignment that looks like
> + *
> + *     answer = 0

Shouldn't it be ChangeMe?

> + *
> + *  within it, deep enough so the lines of context do not include the
> + *  function name.
> + *
> + *  If the name of this file begins with "broken_", then it will be
> + *  interpreted as a pattern which does not work, but which should.
> + */

-- 
Jakub Narêbski

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] t4018: introduce test cases for the internal hunk header patterns
  2011-12-20  2:42                 ` [PATCH v2] " Brandon Casey
  2011-12-20  8:25                   ` Jakub Narebski
@ 2011-12-20  9:13                   ` Thomas Rast
  2011-12-20 19:52                   ` Johannes Sixt
  2 siblings, 0 replies; 35+ messages in thread
From: Thomas Rast @ 2011-12-20  9:13 UTC (permalink / raw)
  To: Brandon Casey; +Cc: gitster, git, peff, j6t, jrnieder

Brandon Casey <drafnel@gmail.com> writes:

> Let's introduce some infrastructure to make it easy to create test cases
> for the hunk header patterns and provide a few cases for the cpp pattern.
[...]
>    int WRONG_function_hunk_header (void)
[...]
>    int RIGHT_function_hunk_header (void)
>    {
>            const char *msg = "ChangeMe";

Excellent idea!

> +template <class T> int RIGHT_function_hunk_header (T unused)
> +{
> +	const char *msg = "ChangeMe";
> +	printf("Hello, world, %s\n", msg);
> +	return 0;
> +}

I'd still like to have an extremely contrived overuse of templated
classes, like so:

---- 8< ----
int WRONG_function_hunk_header_preceding_the_right_one (void)
{
	return 0;
}

foo::RIGHT<int*&,1>::operator<<(int bar)
{
	const char *msg = "ChangeMe";
	printf("Hello, world, %s\n", msg);
	return 0;
}

int WRONG_function_hunk_header_following_the_right_one (void)
{
	return 0;
}
---- >8 ----

That will guard us against updating the C++ pattern to something better
but still slightly too simple.

Other than that and Jakub's comments,

Acked-by: Thomas Rast <trast@student.ethz.ch>

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] t4018: introduce test cases for the internal hunk header patterns
  2011-12-20  8:25                   ` Jakub Narebski
@ 2011-12-20 15:58                     ` Brandon Casey
  0 siblings, 0 replies; 35+ messages in thread
From: Brandon Casey @ 2011-12-20 15:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: gitster, git, peff, j6t, jrnieder, trast

On Tue, Dec 20, 2011 at 2:25 AM, Jakub Narebski <jnareb@gmail.com> wrote:
> Brandon Casey <drafnel@gmail.com> writes:
>
>> diff --git a/t/t4018/ignore_global.cpp b/t/t4018/ignore_global.cpp
>> new file mode 100644
>> index 0000000..95e23bc
>> --- /dev/null
>> +++ b/t/t4018/ignore_global.cpp
>> @@ -0,0 +1,19 @@
>> +int WRONG_function_hunk_header_preceding_the_right_one (void)
>> +{
>> +     return 0;
>> +}
>> +
>> +int RIGHT_function_hunk_header (void)
>> +{
>> +     printf("Hello, world\n");
>> +     return 0;
>> +}
>> +
>> +int WRONG_global_variable;
>> +
>> +int ChangeMe;
>> +
>> +int WRONG_function_hunk_header_following_the_right_one (void)
>> +{
>> +     return 0;
>> +}
>
> Shouldn't ChangeMe be inside function?

No, this one is placed here after the WRONG_global_variable to make
sure that a global variable is not chosen for the hunk header.  It
should chose the most recently encountered function name for the hunk
header, which is RIGHT_function_hunk_header.

>> diff --git a/t/t4018/simple.cpp b/t/t4018/simple.cpp
>> new file mode 100644
>> index 0000000..c96ad87
>> --- /dev/null
>> +++ b/t/t4018/simple.cpp
>> @@ -0,0 +1,32 @@
>> +/*
>> + *  Test file for testing the internal hunk header patterns
>> + *
>> + *  The "RIGHT" hunk header function, the one that should appear on the
>> + *  hunk header line, should be named "RIGHT_function_hunk_header" and
>> + *  the body of this function should have an assignment that looks like
>> + *
>> + *     answer = 0
>
> Shouldn't it be ChangeMe?

Whoops, I forgot about that text.  Thanks for noticing.  Yes this is
incorrect now.  I think I'll break that out into a README file.

v3 forthcoming.

-Brandon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] t4018: introduce test cases for the internal hunk header patterns
  2011-12-20  2:42                 ` [PATCH v2] " Brandon Casey
  2011-12-20  8:25                   ` Jakub Narebski
  2011-12-20  9:13                   ` Thomas Rast
@ 2011-12-20 19:52                   ` Johannes Sixt
  2 siblings, 0 replies; 35+ messages in thread
From: Johannes Sixt @ 2011-12-20 19:52 UTC (permalink / raw)
  To: Brandon Casey; +Cc: gitster, git, peff, jrnieder, trast

Am 20.12.2011 03:42, schrieb Brandon Casey:
> +int WRONG_global_variable;

Personally, I'm not a fan of this one. IMHO, global variable
definitions need not be ignored. (But I can live with it.)

But here are some more tests that I care about and that you can squash in.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 t/t4018/broken_class_constructor.cpp               |    3 +-
 t/t4018/broken_return_type_in_global_namespace.cpp |   16 ++++++++++++++
 t/t4018/class_definition.cpp                       |   19 +++++++++++++++++
 t/t4018/derived_class_definition.cpp               |   20 ++++++++++++++++++
 t/t4018/ignore_access_specifier.cpp                |   22 ++++++++++++++++++++
 5 files changed, 79 insertions(+), 1 deletions(-)
 create mode 100644 t/t4018/broken_return_type_in_global_namespace.cpp
 create mode 100644 t/t4018/class_definition.cpp
 create mode 100644 t/t4018/derived_class_definition.cpp
 create mode 100644 t/t4018/ignore_access_specifier.cpp

diff --git a/t/t4018/broken_class_constructor.cpp b/t/t4018/broken_class_constructor.cpp
index 774c7f9..28c1bc8 100644
--- a/t/t4018/broken_class_constructor.cpp
+++ b/t/t4018/broken_class_constructor.cpp
@@ -3,7 +3,8 @@ int WRONG_function_hunk_header_preceding_the_right_one (void)
 	return 0;
 }
 
-RIGHT_function_hunk_header::RIGHT_function_hunk_header (void)
+RIGHT_function_hunk_header::RIGHT_function_hunk_header (int x) :
+	WRONG_member_initializer(x)
 {
 	const char *msg = "ChangeMe";
 	printf("Hello, world, %s\n", msg);
diff --git a/t/t4018/broken_return_type_in_global_namespace.cpp b/t/t4018/broken_return_type_in_global_namespace.cpp
new file mode 100644
index 0000000..da9931b
--- /dev/null
+++ b/t/t4018/broken_return_type_in_global_namespace.cpp
@@ -0,0 +1,16 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+::TypeInGlobalNamespace RIGHT_function_hunk_header()
+{
+	const char *msg = "ChangeMe";
+	printf("Hello, world, %s\n", msg);
+	return 0;
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/class_definition.cpp b/t/t4018/class_definition.cpp
new file mode 100644
index 0000000..fc3df34
--- /dev/null
+++ b/t/t4018/class_definition.cpp
@@ -0,0 +1,19 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+class RIGHT_class_name
+{
+	void WRONG_function_name_following_the_right_one()
+	{
+		const char *msg = "ChangeMe";
+		printf("Hello, world, %s\n", msg);
+		return 0;
+	}
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/derived_class_definition.cpp b/t/t4018/derived_class_definition.cpp
new file mode 100644
index 0000000..b8501a5
--- /dev/null
+++ b/t/t4018/derived_class_definition.cpp
@@ -0,0 +1,20 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+class RIGHT_class_name :
+	public WRONG_class_name_following_the_right_one
+{
+	void WRONG_function_name_following_the_right_one()
+	{
+		const char *msg = "ChangeMe";
+		printf("Hello, world, %s\n", msg);
+		return 0;
+	}
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
diff --git a/t/t4018/ignore_access_specifier.cpp b/t/t4018/ignore_access_specifier.cpp
new file mode 100644
index 0000000..d865040
--- /dev/null
+++ b/t/t4018/ignore_access_specifier.cpp
@@ -0,0 +1,22 @@
+int WRONG_function_hunk_header_preceding_the_right_one (void)
+{
+	return 0;
+}
+
+class RIGHT_class_name
+{
+public:
+protected:
+private:
+	void WRONG_function_name_following_the_right_one()
+	{
+		const char *msg = "ChangeMe";
+		printf("Hello, world, %s\n", msg);
+		return 0;
+	}
+}
+
+int WRONG_function_hunk_header_following_the_right_one (void)
+{
+	return 0;
+}
-- 
1.7.8.216.g2e426

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] t4018: introduce test cases for the internal hunk header patterns
  2011-12-19 22:37           ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Junio C Hamano
  2011-12-19 22:57             ` Brandon Casey
@ 2011-12-20 20:08             ` Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2011-12-20 20:08 UTC (permalink / raw)
  To: Brandon Casey; +Cc: peff, git, j6t, jrnieder, Brandon Casey

Junio C Hamano <gitster@pobox.com> writes:

> Brandon Casey <casey@nrlssc.navy.mil> writes:
>
>>    * new test cases can be dropped into the t4018 directory
>>    * filenames end with the pattern name e.g. .cpp .objc .matlab etc.
>>    * filenames should be descriptive since it will be used in the test
>>      suite output
>>    * broken test cases should be given a filename prefixed with "broken_"
>
> Cute. I like the general idea.

Actually, I do not like this "broken_" filename prefix part. Even though I
can imagine "git show -M" would do a reasonable job, marking that a fix to
a pattern makes a test that used not to pass succeed by renaming a file
goes against the convention of changing one line to turn the "failure" to
"success" in test_expect_failure used in normal tests.

Can we use stable filename that says what is being tested instead?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-16 19:32   ` Jeff King
@ 2011-12-22  0:05     ` Philip Oakley
  2011-12-23  5:47       ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Philip Oakley @ 2011-12-22  0:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey

From: "Jeff King" <peff@peff.net> Sent: Friday, December 16, 2011 7:32 PM
> On Fri, Dec 16, 2011 at 07:26:00PM -0000, Philip Oakley wrote:
>
>> >+ "*.m diff=objc",
>>
>> There is a conflict here with the Matlab community which also uses
>> "*.m" files for its scripts and code.
>> They fit the "It's not that hard to do, but it's an extra step that
>> the user might not even know is an option." rationale.
>>
>> If the objc.m is used as a default it must be overidable easily, and
>> listed in the appropriate documentation to mitigate the "might not
>> even know" risk.
>
> It is easily overridable; just put your own "*.m" (or anything that
> matches your files) entry into your gitattributes file. I'm more
> concerned that people will start getting worse results than with the
> default, and not know how to fix it.
>
> If you have some Matlab files, would you mind doing diffs with the
> default driver and with the objc driver, and comment on how good or bad
> the results are?
>
> -Peff
> --
Sorry for the delay.
I started with a fresh install of Msysgit 1.7.8 for my tests, and created a 
test repo from a set of old project zip files, retaining only the *.m files. 
i.e. it is a real hack project. The diff shown was a small tweak & 
investigation step. Below are the three cases of:
1. plain vanilla install (no .gitattributes file)
2. with *.m=matlab in .gitattributes
3. with *.m=objc in .gitattributes

The "*.m=matlab" does give better (proper) hunk headers as it picks out the 
"^%%" comment line which starts a code block . For option 3 (ObjC) they are 
empty (which is poor). The plain vanila (default) hunk headers are so-so.

There is a vast quantity (10,000+) of Matlab examples on the Mathworks 
(vendor) File exchange web site, if anyone is interested, 
http://www.mathworks.com/matlabcentral/fileexchange/?sort=date_desc_updated&term=

Roughly my command sequence was:
$git diff HEAD HEAD~1  -p > test.txt
#rename test.txt to suitable name

$echo "*.m diff=matlab" >>.gitattributes
#repeat

$echo "*.m diff=objc" >>.gitattributes
#repeat

-----------------
1. plain vanilla install (no .gitattributes file)

diff --git a/detect_and_track.m b/detect_and_track.m
index 0f0356d..69d650f 100644
--- a/detect_and_track.m
+++ b/detect_and_track.m
@@ -161,9 +161,8 @@ sz = [960, 1280]; pixshrink = 10; % 10 is a guess!
 parabolic = pararate(sz(2), pixshrink);

 %% Set up the filters.
-% changed sizes to be an extra pixel all round both inner & outer
-hsize = [15 17]; % [11 13] size of (square) filter [vert horiz]
-inhsize = [7 7] ; % [5 5] size of inner filter
+hsize = [11 13]; % size of (square) filter [vert horiz]
+inhsize = [5 5] ; % size of inner filter
 % siz=[hsize hsize];
 sigma2 = [3.5 4.5]; % outer radius
 sigma3 = [1.4 1.4]; % inner radius
@@ -290,7 +289,7 @@ for TframeNum = FrameRange;  %change this value to read 
in a different frame
             % shrink the image
             Image = Shrink2(Image);
         end
-        if any(find(J==[1 2 3 4])) %  5 6 7
+        if any(find(J==[1 2])) %  3 4 5 6 7
             % list the levels you want procesed / rectangles shown
             % i.e. [1 2 3 4 5 6 7]
             [localmean localstdev ] = Point_Filter_cross3(Image,Table);
diff --git a/do_noise_ratios.m b/do_noise_ratios.m
index 663d898..dd73ed0 100644
--- a/do_noise_ratios.m
+++ b/do_noise_ratios.m
@@ -10,7 +10,7 @@

 % NewFrame{J,2}=localstdev; %

-for baselev=1:3;
+for baselev=1:2;
 Noise_Ratio = Stretch2(NewFrame{baselev+1,2}) ./  NewFrame{baselev,2} ;
 Noise_Ratio(Noise_Ratio>5)=5;
 figure(49+baselev); colorbar;


-----------------
2. with *.m=matlab in .gitattributes

diff --git a/detect_and_track.m b/detect_and_track.m
index 0f0356d..69d650f 100644
--- a/detect_and_track.m
+++ b/detect_and_track.m
@@ -161,9 +161,8 @@ %% set up the camera Configurations
 parabolic = pararate(sz(2), pixshrink);

 %% Set up the filters.
-% changed sizes to be an extra pixel all round both inner & outer
-hsize = [15 17]; % [11 13] size of (square) filter [vert horiz]
-inhsize = [7 7] ; % [5 5] size of inner filter
+hsize = [11 13]; % size of (square) filter [vert horiz]
+inhsize = [5 5] ; % size of inner filter
 % siz=[hsize hsize];
 sigma2 = [3.5 4.5]; % outer radius
 sigma3 = [1.4 1.4]; % inner radius
@@ -290,7 +289,7 @@ %% Loop Starts here
             % shrink the image
             Image = Shrink2(Image);
         end
-        if any(find(J==[1 2 3 4])) %  5 6 7
+        if any(find(J==[1 2])) %  3 4 5 6 7
             % list the levels you want procesed / rectangles shown
             % i.e. [1 2 3 4 5 6 7]
             [localmean localstdev ] = Point_Filter_cross3(Image,Table);
diff --git a/do_noise_ratios.m b/do_noise_ratios.m
index 663d898..dd73ed0 100644
--- a/do_noise_ratios.m
+++ b/do_noise_ratios.m
@@ -10,7 +10,7 @@ %% do_noise_ratios

 % NewFrame{J,2}=localstdev; %

-for baselev=1:3;
+for baselev=1:2;
 Noise_Ratio = Stretch2(NewFrame{baselev+1,2}) ./  NewFrame{baselev,2} ;
 Noise_Ratio(Noise_Ratio>5)=5;
 figure(49+baselev); colorbar;


-----------------
3. with *.m=objc in .gitattributes

diff --git a/detect_and_track.m b/detect_and_track.m
index 0f0356d..69d650f 100644
--- a/detect_and_track.m
+++ b/detect_and_track.m
@@ -161,9 +161,8 @@
 parabolic = pararate(sz(2), pixshrink);

 %% Set up the filters.
-% changed sizes to be an extra pixel all round both inner & outer
-hsize = [15 17]; % [11 13] size of (square) filter [vert horiz]
-inhsize = [7 7] ; % [5 5] size of inner filter
+hsize = [11 13]; % size of (square) filter [vert horiz]
+inhsize = [5 5] ; % size of inner filter
 % siz=[hsize hsize];
 sigma2 = [3.5 4.5]; % outer radius
 sigma3 = [1.4 1.4]; % inner radius
@@ -290,7 +289,7 @@
             % shrink the image
             Image = Shrink2(Image);
         end
-        if any(find(J==[1 2 3 4])) %  5 6 7
+        if any(find(J==[1 2])) %  3 4 5 6 7
             % list the levels you want procesed / rectangles shown
             % i.e. [1 2 3 4 5 6 7]
             [localmean localstdev ] = Point_Filter_cross3(Image,Table);
diff --git a/do_noise_ratios.m b/do_noise_ratios.m
index 663d898..dd73ed0 100644
--- a/do_noise_ratios.m
+++ b/do_noise_ratios.m
@@ -10,7 +10,7 @@

 % NewFrame{J,2}=localstdev; %

-for baselev=1:3;
+for baselev=1:2;
 Noise_Ratio = Stretch2(NewFrame{baselev+1,2}) ./  NewFrame{baselev,2} ;
 Noise_Ratio(Noise_Ratio>5)=5;
 figure(49+baselev); colorbar;

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCHv2 1/2] attr: map builtin userdiff drivers to well-known extensions
  2011-12-19 15:49           ` [PATCHv2 1/2] " Jeff King
  2011-12-19 18:07             ` Jonathan Nieder
@ 2011-12-22  1:47             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2011-12-22  1:47 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Johannes Sixt, Junio C Hamano, Brandon Casey,
	Jonathan Nieder, Philip Oakley

On Mon, Dec 19, 2011 at 16:49, Jeff King <peff@peff.net> wrote:

> +       "*.perl diff=perl",
> +       "*.pl diff=perl",

This should also be:

 *.pm (for Perl module files)
 *.PL (for Makefile.PL)

And it's also very common for Perl code to use, for tests:

 *.t

But that likely runs into the namespace clashing issue all over again.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
  2011-12-22  0:05     ` Philip Oakley
@ 2011-12-23  5:47       ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2011-12-23  5:47 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, git, Brandon Casey

On Thu, Dec 22, 2011 at 12:05:39AM -0000, Philip Oakley wrote:

> The "*.m=matlab" does give better (proper) hunk headers as it picks
> out the "^%%" comment line which starts a code block . For option 3
> (ObjC) they are empty (which is poor). The plain vanila (default)
> hunk headers are so-so.

Thanks for all of the detail. I think it comes down to the part I
quoted, though: it is indeed a disservice to matlab people to map "*.m"
to objc. So let's be conservative and not do that (projects can always
add their own gitattributes).

(Even before seeing your answer, I dropped it from the re-roll of my
patch that I sent, but this confirms to me that it was the right thing
to do).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2011-12-23  5:47 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-16 11:00 [PATCH] attr: map builtin userdiff drivers to well-known extensions Jeff King
2011-12-16 14:00 ` Johannes Sixt
2011-12-16 17:01   ` Junio C Hamano
2011-12-16 19:21   ` Jeff King
2011-12-16 19:30     ` Jeff King
2011-12-16 19:33     ` Junio C Hamano
2011-12-17  1:17       ` Jeff King
2011-12-16 22:05     ` Johannes Sixt
2011-12-17  1:21       ` Jeff King
2011-12-17  3:38         ` Jonathan Nieder
2011-12-19 15:49           ` [PATCHv2 1/2] " Jeff King
2011-12-19 18:07             ` Jonathan Nieder
2011-12-19 18:55               ` Jeff King
2011-12-22  1:47             ` Ævar Arnfjörð Bjarmason
2011-12-19 15:57           ` [PATCHv2 2/2] attr: drop C/C++ default extension mapping Jeff King
2011-12-19 18:10             ` Jonathan Nieder
2011-12-19 20:51               ` Thomas Rast
2011-12-19 20:52         ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Brandon Casey
2011-12-19 21:53           ` [PATCH] t4018: add a few more test cases for cpp hunk header matching Brandon Casey
2011-12-19 22:37           ` [PATCH] t4018: introduce test cases for the internal hunk header patterns Junio C Hamano
2011-12-19 22:57             ` Brandon Casey
2011-12-19 23:17               ` Junio C Hamano
2011-12-20  2:42                 ` [PATCH v2] " Brandon Casey
2011-12-20  8:25                   ` Jakub Narebski
2011-12-20 15:58                     ` Brandon Casey
2011-12-20  9:13                   ` Thomas Rast
2011-12-20 19:52                   ` Johannes Sixt
2011-12-20 20:08             ` [PATCH] " Junio C Hamano
2011-12-16 17:51 ` [PATCH] attr: map builtin userdiff drivers to well-known extensions Mark Levedahl
2011-12-16 19:28   ` Jeff King
2011-12-16 19:26 ` Philip Oakley
2011-12-16 19:32   ` Jeff King
2011-12-22  0:05     ` Philip Oakley
2011-12-23  5:47       ` Jeff King
2011-12-16 19:38   ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.