All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] diff: make -M -C mean the same as -C -M
@ 2015-01-23  2:07 Mike Hommey
  2015-01-23 18:41 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Hommey @ 2015-01-23  2:07 UTC (permalink / raw)
  To: gitster; +Cc: git

While -C implies -M, it is quite common to see both on example command lines
here and there. The unintuitive thing is that if -M appears after -C, then
copy detection is turned off because of how the command line arguments are
handled.

Change this so that when both -C and -M appear, whatever their order, copy
detection is on.

Signed-off-by: Mike Hommey <mh@glandium.org>
---

Interestingly, I even found mentions of -C -M in this order for benchmarks,
on this very list (see 6555655.XSJ9EnW4BY@mako).

 diff.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index d1bd534..9081fd8 100644
--- a/diff.c
+++ b/diff.c
@@ -3670,7 +3670,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 		 !strcmp(arg, "--find-renames")) {
 		if ((options->rename_score = diff_scoreopt_parse(arg)) == -1)
 			return error("invalid argument to -M: %s", arg+2);
-		options->detect_rename = DIFF_DETECT_RENAME;
+		if (options->detect_rename != DIFF_DETECT_COPY)
+			options->detect_rename = DIFF_DETECT_RENAME;
 	}
 	else if (!strcmp(arg, "-D") || !strcmp(arg, "--irreversible-delete")) {
 		options->irreversible_delete = 1;
-- 
2.2.2.2.g806f5e2.dirty

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] diff: make -M -C mean the same as -C -M
  2015-01-23  2:07 [PATCH] diff: make -M -C mean the same as -C -M Mike Hommey
@ 2015-01-23 18:41 ` Junio C Hamano
  2015-01-23 22:26   ` Mike Hommey
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2015-01-23 18:41 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git

Mike Hommey <mh@glandium.org> writes:

> While -C implies -M, it is quite common to see both on example command lines
> here and there. The unintuitive thing is that if -M appears after -C, then
> copy detection is turned off because of how the command line arguments are
> handled.

This is deliberate, see below.

> Change this so that when both -C and -M appear, whatever their order, copy
> detection is on.
>
> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
>
> Interestingly, I even found mentions of -C -M in this order for benchmarks,
> on this very list (see 6555655.XSJ9EnW4BY@mako).

Aside from the general warning against taking everything you see on
this list as true, I think you are looking at an orange and talking
about an apple.  $gmane/244581 is about "git blame", and "-M/-C"
over there mean completely different things.

Imagine you have a file a/b/c/d/e/f/g/h/, where each alphabet
represents a line with more meaningful contents than a single-liner,
and slash represents an end of line, and you have changed the
contents of the file to e/f/g/h/a/b/c/d/.

"blame -M" is to tell the command to notice that you moved the first
four lines to the end (i.e. you did not do anything original and
just moved lines).  Because this needs more processing to notice,
the feature is triggered by a "-M" option (you would see something
like "e/f/g/h/ came from the original and then a/b/c/d/ are newly
added" without the option).  "-M" in "blame" is about showing such
movement of lines within the same file [*1*].

On the other hand "-C" in "blame" is about noticing contents that
were copied from other files.

In the context of "git blame", "-C" and "-M" control orthogonal
concepts and it makes sense to use only one but not the other, or
both.

But "-M" given to "git diff" is not about such an orthogonal
concept.

Even though its source candidates are narrower than that of "-C" in
that "-M" chooses only from the set of files that disappeared while
"-C" also chooses from files that remain after the change, they are
both about "which file did the whole thing come from?", and it makes
sense to have progression of "-M" < "-C" < "-C -C".  You can think
of these as a short-hand for --rename-copy-level={0,1,2,3} option
(where the level 0 -- do not do anything -- corresponds to giving no
"-M/-C").  It can be argued both ways: either we take the maximum of
the values given to --rename-copy-level options (which corresponds
to what your patch attempts to do), or the last one wins.  We happen
to have implemented the latter long time ago and that is how
existing users expect things to work.

So, I dunno.  I am saying that the semantics the current code gives
is *not* the only possibly correct one, and I am not saying that
your alternative interpretation is *wrong*, but I am not sure if it
makes sense to switch the semantics this late in the game.


[Footnote]

*1* "diff" cannot do anything magical about such a change.  It can
only say that you removed the first four lines from the original,
and then added new four lines at the end (or it may say you added
new four lines at the beginning and removed four lines at the end).

There is no way for "diff" to say that these new four lines have any
relation to what you removed from the beginning of the file, with
any combination of options.  This is inherent in its output format,
which is limited to express only deletions and additions.

>  diff.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/diff.c b/diff.c
> index d1bd534..9081fd8 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3670,7 +3670,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
>  		 !strcmp(arg, "--find-renames")) {
>  		if ((options->rename_score = diff_scoreopt_parse(arg)) == -1)
>  			return error("invalid argument to -M: %s", arg+2);
> -		options->detect_rename = DIFF_DETECT_RENAME;
> +		if (options->detect_rename != DIFF_DETECT_COPY)
> +			options->detect_rename = DIFF_DETECT_RENAME;
>  	}
>  	else if (!strcmp(arg, "-D") || !strcmp(arg, "--irreversible-delete")) {
>  		options->irreversible_delete = 1;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] diff: make -M -C mean the same as -C -M
  2015-01-23 18:41 ` Junio C Hamano
@ 2015-01-23 22:26   ` Mike Hommey
  2015-01-23 22:34     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Hommey @ 2015-01-23 22:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Jan 23, 2015 at 10:41:10AM -0800, Junio C Hamano wrote:
> Mike Hommey <mh@glandium.org> writes:
> 
> > While -C implies -M, it is quite common to see both on example command lines
> > here and there. The unintuitive thing is that if -M appears after -C, then
> > copy detection is turned off because of how the command line arguments are
> > handled.
> 
> This is deliberate, see below.
> 
> > Change this so that when both -C and -M appear, whatever their order, copy
> > detection is on.
> >
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> > ---
> >
> > Interestingly, I even found mentions of -C -M in this order for benchmarks,
> > on this very list (see 6555655.XSJ9EnW4BY@mako).
> 
> Aside from the general warning against taking everything you see on
> this list as true

The problem is not whether what is on the list is true or not, but that
people will use what they see.

> I think you are looking at an orange and talking
> about an apple.  $gmane/244581 is about "git blame", and "-M/-C"
> over there mean completely different things.

I thought blame was using the diff option parser, and I was wrong.

(snip)
> In the context of "git blame", "-C" and "-M" control orthogonal
> concepts and it makes sense to use only one but not the other, or
> both.

In the context of blame both -C and -M |= a flags set, so one doesn't
override the other. You can place them in any order, the result will be
the same. Incidentally, -C includes the flag -M sets, so -C -M is
actually redundant. What -C and -M can be used for is set different
scores (-C9 -M8).

> But "-M" given to "git diff" is not about such an orthogonal
> concept.
> 
> Even though its source candidates are narrower than that of "-C" in
> that "-M" chooses only from the set of files that disappeared while
> "-C" also chooses from files that remain after the change, they are
> both about "which file did the whole thing come from?", and it makes
> sense to have progression of "-M" < "-C" < "-C -C".  You can think
> of these as a short-hand for --rename-copy-level={0,1,2,3} option
> (where the level 0 -- do not do anything -- corresponds to giving no
> "-M/-C").  It can be argued both ways: either we take the maximum of
> the values given to --rename-copy-level options (which corresponds
> to what your patch attempts to do), or the last one wins.  We happen
> to have implemented the latter long time ago and that is how
> existing users expect things to work.

Are they? I, for one, wasn't. It is easy to understand that --foo=1
--foo=2 is going to take the last given --foo, but do people really
expect the last of -C -M to be used? Reading the code further, I can see
that it's even more confusing than that: -C -C wins over -M, wherever it
happens. So -C -C -M == -C -C ; -C -M == -M ; -M -C == -C.

At the very least, this should be spelled out in the documentation.

Mike

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] diff: make -M -C mean the same as -C -M
  2015-01-23 22:26   ` Mike Hommey
@ 2015-01-23 22:34     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2015-01-23 22:34 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git

Mike Hommey <mh@glandium.org> writes:

>> In the context of "git blame", "-C" and "-M" control orthogonal
>> concepts and it makes sense to use only one but not the other, or
>> both.
>
> In the context of blame both -C and -M |= a flags set, so one doesn't
> override the other. You can place them in any order, the result will be
> the same. Incidentally, -C includes the flag -M sets, so -C -M is
> actually redundant. What -C and -M can be used for is set different
> scores (-C9 -M8).

Yes.  That is what I meant by "orthogonal" and "makese sense to use
only one but not the other, or both".  As they are not related with
each other, it makes sense to mix them freely, unlike "-C/-M" given
to diff.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-01-23 22:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-23  2:07 [PATCH] diff: make -M -C mean the same as -C -M Mike Hommey
2015-01-23 18:41 ` Junio C Hamano
2015-01-23 22:26   ` Mike Hommey
2015-01-23 22:34     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.