All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] diff: Introduce diff.algorithm variable
@ 2012-03-09 13:48 Michal Privoznik
  2012-03-09 18:40 ` Junio C Hamano
  2012-03-13 14:42 ` Tay Ray Chuan
  0 siblings, 2 replies; 6+ messages in thread
From: Michal Privoznik @ 2012-03-09 13:48 UTC (permalink / raw)
  To: git; +Cc: gitster, trast, peff, Lawrence.Holding

Some users have preference over various diff algorithms. However,
now they are forced to use appropriate argument every time they
run git-diff and tools using it. This is impractical. Therefore
create new variable which can set preferred algorithm. Of course,
this can be overridden on command line via --diff-algorithm=*.
Accepted values are myers (default), histogram, minimal, patience.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
This is basically v2 for:

   http://www.spinics.net/lists/git/msg176100.html

As we agreed on list, I've switched from diff.patience to
diff.algorithm and created new argument --diff-algorithm.

Please keep me CC'ed as I am not signed into the list.

 Documentation/diff-config.txt          |   21 +++++++++++++++++++
 Documentation/diff-options.txt         |   23 +++++++++++++++++++++
 contrib/completion/git-completion.bash |   13 +++++++++++
 diff.c                                 |   35 ++++++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+), 0 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 6aa1be0..1047e81 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -1,3 +1,24 @@
+diff.algorithm::
+    Choose a diff algorithm.  The variants are as follows:
++
+--
+histogram::
+    This is the fastest algorithm.
+
+myers::
+    The classical Myers diff algorithm. This is the default.
+
+minimal::
+    Like 'myers', but spend extra time making sure that the diff
+    is the shortest possible for the set of changes performed.
+
+patience::
+    The patience diff algorithm, which first matches unique lines
+    with each other.  This sometimes results in more readable (if
+    longer) patches than the other algorithms.
+--
++
+
 diff.autorefreshindex::
 	When using 'git diff' to compare with work tree
 	files, do not consider stat-only change as changed.
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7d4566f..4e8bc5b 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -55,6 +55,29 @@ endif::git-format-patch[]
 --histogram::
 	Generate a diff using the "histogram diff" algorithm.
 
+--diff-algorithm={histogram|myers|minimal|patience}::
+    Choose a diff algorithm. The defaults are controlled
+    by the `diff.algorithm` configuration variable
+    (see linkgit:git-config[1]). The variants are as follows:
++
+--
+histogram::
+    This is the fastest algorithm.
+
+myers::
+    The classical Myers diff algorithm. This is the default.
+
+minimal::
+    Like 'myers', but spend extra time making sure that the diff
+    is the shortest possible for the set of changes performed.
+
+patience::
+    The patience diff algorithm, which first matches unique lines
+    with each other.  This sometimes results in more readable (if
+    longer) patches than the other algorithms.
+--
++
+
 --stat[=<width>[,<name-width>[,<count>]]]::
 	Generate a diffstat. By default, as much space as necessary
 	will be used for the filename part, and the rest for the graph
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index fba076d..d54f3a3 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1299,6 +1299,7 @@ __git_diff_common_options="--stat --numstat --shortstat --summary
 			--raw
 			--dirstat --dirstat= --dirstat-by-file
 			--dirstat-by-file= --cumulative
+			--diff-algorithm=
 "
 
 _git_diff ()
@@ -1306,6 +1307,10 @@ _git_diff ()
 	__git_has_doubledash && return
 
 	case "$cur" in
+	--diff-algorithm=*)
+		__gitcomp "myers histogram minimal patience"
+		return
+		;;
 	--*)
 		__gitcomp "--cached --staged --pickaxe-all --pickaxe-regex
 			--base --ours --theirs --no-index
@@ -1569,6 +1574,10 @@ _git_log ()
 		__gitcomp "long short" "" "${cur##--decorate=}"
 		return
 		;;
+    --diff-algorithm=*)
+		__gitcomp "myers histogram minimal patience"
+		return
+		;;
 	--*)
 		__gitcomp "
 			$__git_log_common_options
@@ -2376,6 +2385,10 @@ _git_show ()
 			" "" "${cur#*=}"
 		return
 		;;
+	--diff-algorithm=*)
+		__gitcomp "myers histogram minimal patience"
+		return
+		;;
 	--*)
 		__gitcomp "--pretty= --format= --abbrev-commit --oneline
 			$__git_diff_common_options
diff --git a/diff.c b/diff.c
index 377ec1e..f5c965b 100644
--- a/diff.c
+++ b/diff.c
@@ -34,6 +34,7 @@ static int diff_no_prefix;
 static int diff_stat_graph_width;
 static int diff_dirstat_permille_default = 30;
 static struct diff_options default_diff_options;
+static int diff_algorithm = 0;
 
 static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_RESET,
@@ -47,6 +48,20 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
 };
 
+static int diff_algorithm_parse(const char *alg)
+{
+	if (!alg || !strcmp(alg, "myers"))
+		return 0;
+	else if (!strcmp(alg, "patience"))
+		return XDF_PATIENCE_DIFF;
+	else if (!strcmp(alg, "histogram"))
+		return XDF_HISTOGRAM_DIFF;
+	else if (!strcmp(alg, "minimal"))
+		return XDF_NEED_MINIMAL;
+
+	return -1;
+}
+
 static int parse_diff_color_slot(const char *var, int ofs)
 {
 	if (!strcasecmp(var+ofs, "plain"))
@@ -169,6 +184,12 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "diff.ignoresubmodules"))
 		handle_ignore_submodules_arg(&default_diff_options, value);
 
+	if (!strcmp(var, "diff.algorithm")) {
+		if ((diff_algorithm = diff_algorithm_parse(value)) < 0)
+			die("bad diff.algorithm value: %s", value);
+		return 0;
+	}
+
 	if (git_color_config(var, value, cb) < 0)
 		return -1;
 
@@ -3250,6 +3271,9 @@ int diff_setup_done(struct diff_options *options)
 		DIFF_OPT_SET(options, EXIT_WITH_STATUS);
 	}
 
+	/* set diff algorithm */
+	options->xdl_opts |= diff_algorithm;
+
 	return 0;
 }
 
@@ -3528,6 +3552,17 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 		DIFF_XDL_SET(options, PATIENCE_DIFF);
 	else if (!strcmp(arg, "--histogram"))
 		DIFF_XDL_SET(options, HISTOGRAM_DIFF);
+	else if ((argcount = parse_long_opt("diff-algorithm", av, &optarg))) {
+		int alg = diff_algorithm_parse(optarg);
+		if (alg < 0)
+			die("unknown algorithm: %s", optarg);
+		DIFF_XDL_CLR(options, NEED_MINIMAL);
+		DIFF_XDL_CLR(options, HISTOGRAM_DIFF);
+		DIFF_XDL_CLR(options, PATIENCE_DIFF);
+		options->xdl_opts |= alg;
+		return argcount;
+	}
+
 
 	/* flags options */
 	else if (!strcmp(arg, "--binary")) {
-- 
1.7.8.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Introduce diff.algorithm variable
  2012-03-09 13:48 [PATCH] diff: Introduce diff.algorithm variable Michal Privoznik
@ 2012-03-09 18:40 ` Junio C Hamano
  2012-03-12  5:42   ` Junio C Hamano
  2012-03-13 14:42 ` Tay Ray Chuan
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2012-03-09 18:40 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: git, trast, peff, Lawrence.Holding

Michal Privoznik <mprivozn@redhat.com> writes:

> Some users have preference over various diff algorithms. However,
> now they are forced to use appropriate argument every time they
> run git-diff and tools using it. This is impractical. Therefore
> create new variable which can set preferred algorithm. Of course,

We tend to prefer avoiding these subjective and judgemental words
like "forced" and "impractical" when the paragraph conveys the same
reasoning even when they are toned down.

> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
> index 6aa1be0..1047e81 100644
> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -1,3 +1,24 @@
> +diff.algorithm::
> +    Choose a diff algorithm.  The variants are as follows:

In order to avoid confusing users, we may want to mention that it is
deliberate that this configuration does not apply to plumbing
commands.

It may make sense to drop the enumeration of "The variants are...",
and say that it can take the values valid for the "--diff-algorithm"
parameter to the "diff" family of commands.  We do not have to worry
about keeping two enumerations in sync if we did so.

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 7d4566f..4e8bc5b 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -55,6 +55,29 @@ endif::git-format-patch[]
>  --histogram::
>  	Generate a diff using the "histogram diff" algorithm.
>  
> +--diff-algorithm={histogram|myers|minimal|patience}::
> +    Choose a diff algorithm. The defaults are controlled
> +    by the `diff.algorithm` configuration variable

s/$/ for the Porcelain commands/; perhaps?

> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index fba076d..d54f3a3 100755
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1299,6 +1299,7 @@ __git_diff_common_options="--stat --numstat --shortstat --summary
>  			--raw
>  			--dirstat --dirstat= --dirstat-by-file
>  			--dirstat-by-file= --cumulative
> +			--diff-algorithm=
>  "

So this helps me when I do "git diff --diff-al<TAB>" and expands it
to "git diff --diff-algorithm=" and then...

>  _git_diff ()
> @@ -1306,6 +1307,10 @@ _git_diff ()
>  	__git_has_doubledash && return
>  
>  	case "$cur" in
> +	--diff-algorithm=*)
> +		__gitcomp "myers histogram minimal patience"
> +		return
> +		;;

... when typing <TAB> again, this helps me by reminding what choices
are available.  Is that the idea?  Nice.

But I see you repeated this four-element list three times.  Can't we
do it in just one place?

> diff --git a/diff.c b/diff.c
> index 377ec1e..f5c965b 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -34,6 +34,7 @@ static int diff_no_prefix;
>  static int diff_stat_graph_width;
>  static int diff_dirstat_permille_default = 30;
>  static struct diff_options default_diff_options;
> +static int diff_algorithm = 0;

Please let BSS take care of this initialization by omitting " = 0".

> @@ -169,6 +184,12 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
>  	if (!strcmp(var, "diff.ignoresubmodules"))
>  		handle_ignore_submodules_arg(&default_diff_options, value);
>  
> +	if (!strcmp(var, "diff.algorithm")) {
> +		if ((diff_algorithm = diff_algorithm_parse(value)) < 0)
> +			die("bad diff.algorithm value: %s", value);

I am not sure if this is a good idea.  What happens when Git 2.0
adds a new algorithm, the user names it in ~/.gitconfig, and an
older version of Git sees this value?  Do you force the user to
modify ~/.gitconfig to use one of the older ones?

It might be more friendly to just warn and use the default.

> @@ -3250,6 +3271,9 @@ int diff_setup_done(struct diff_options *options)
>  		DIFF_OPT_SET(options, EXIT_WITH_STATUS);
>  	}
>  
> +	/* set diff algorithm */
> +	options->xdl_opts |= diff_algorithm;
> +
>  	return 0;
>  }
>  
> @@ -3528,6 +3552,17 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
>  		DIFF_XDL_SET(options, PATIENCE_DIFF);
>  	else if (!strcmp(arg, "--histogram"))
>  		DIFF_XDL_SET(options, HISTOGRAM_DIFF);
> +	else if ((argcount = parse_long_opt("diff-algorithm", av, &optarg))) {
> +		int alg = diff_algorithm_parse(optarg);
> +		if (alg < 0)
> +			die("unknown algorithm: %s", optarg);
> +		DIFF_XDL_CLR(options, NEED_MINIMAL);
> +		DIFF_XDL_CLR(options, HISTOGRAM_DIFF);
> +		DIFF_XDL_CLR(options, PATIENCE_DIFF);
> +		options->xdl_opts |= alg;
> +		return argcount;
> +	}
> +

The last two hunks gives me a queasy feeling for some reason.  I
think I recently saw a patch to clean up this area to clarify that
PATIENCE/HISTOGRAM are not independent option bits (but I cannot
remember where) and we probably should use the approach taken by
that patch.

Other than that, both the design and the implementation seems to be
done reasonably competently.  Don't we want to add tests for this?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Introduce diff.algorithm variable
  2012-03-09 18:40 ` Junio C Hamano
@ 2012-03-12  5:42   ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2012-03-12  5:42 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: git, trast, peff, Lawrence.Holding

Junio C Hamano <gitster@pobox.com> writes:

> The last two hunks gives me a queasy feeling for some reason.  I
> think I recently saw a patch to clean up this area to clarify that
> PATIENCE/HISTOGRAM are not independent option bits (but I cannot
> remember where) and we probably should use the approach taken by
> that patch.

I did some digging; no wonder I remember it---it is one of the stuff
I wrote and then later discarded.

  http://thread.gmane.org/gmane.comp.version-control.git/191036/focus=191038

I think this "algorithm" thing should build on that patch, even
though the later parts of the series in that thread has turned out
to be unwanted.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Introduce diff.algorithm variable
  2012-03-09 13:48 [PATCH] diff: Introduce diff.algorithm variable Michal Privoznik
  2012-03-09 18:40 ` Junio C Hamano
@ 2012-03-13 14:42 ` Tay Ray Chuan
  2012-03-13 17:46   ` Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Tay Ray Chuan @ 2012-03-13 14:42 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: git, gitster, trast, peff, Lawrence.Holding

On Fri, Mar 9, 2012 at 9:48 PM, Michal Privoznik <mprivozn@redhat.com> wrote:
> Some users have preference over various diff algorithms. However,
> now they are forced to use appropriate argument every time they
> run git-diff and tools using it. This is impractical. Therefore
> create new variable which can set preferred algorithm. Of course,
> this can be overridden on command line via --diff-algorithm=*.
> Accepted values are myers (default), histogram, minimal, patience.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
> This is basically v2 for:
>
>   http://www.spinics.net/lists/git/msg176100.html
>
> As we agreed on list, I've switched from diff.patience to
> diff.algorithm and created new argument --diff-algorithm.
>
> Please keep me CC'ed as I am not signed into the list.
>
>  Documentation/diff-config.txt          |   21 +++++++++++++++++++
>  Documentation/diff-options.txt         |   23 +++++++++++++++++++++
>  contrib/completion/git-completion.bash |   13 +++++++++++
>  diff.c                                 |   35 ++++++++++++++++++++++++++++++++
>  4 files changed, 92 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
> index 6aa1be0..1047e81 100644
> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -1,3 +1,24 @@
> +diff.algorithm::
> +    Choose a diff algorithm.  The variants are as follows:
> ++
> +--
> +histogram::
> +    This is the fastest algorithm.
> +
> +myers::
> +    The classical Myers diff algorithm. This is the default.
> +
> +minimal::
> +    Like 'myers', but spend extra time making sure that the diff
> +    is the shortest possible for the set of changes performed.
> +
> +patience::
> +    The patience diff algorithm, which first matches unique lines
> +    with each other.  This sometimes results in more readable (if
> +    longer) patches than the other algorithms.
> +--
> ++
> +

Considering that --minimal isn't really an algorithm, could this be
instead made to a "default options to always pass to diff" config? I
imagine this would make the codepath involved in this patch much
simpler (just reuse the argc/argv opt parsing machinery).

Additionally, it would be useful for users who want to, say, set the
context to 5 lines always.

-- 
Cheers,
Ray Chuan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Introduce diff.algorithm variable
  2012-03-13 14:42 ` Tay Ray Chuan
@ 2012-03-13 17:46   ` Junio C Hamano
  2012-03-13 20:48     ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2012-03-13 17:46 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: Michal Privoznik, git, trast, peff, Lawrence.Holding

Tay Ray Chuan <rctay89@gmail.com> writes:

> Considering that --minimal isn't really an algorithm,...

I think this discussion thread already settled that "minimal" *is* an
algorithm for the purpose of this topic, considering that the --minimal
option does not apply anything but myers.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Introduce diff.algorithm variable
  2012-03-13 17:46   ` Junio C Hamano
@ 2012-03-13 20:48     ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2012-03-13 20:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Tay Ray Chuan, Michal Privoznik, git, trast, Lawrence.Holding

On Tue, Mar 13, 2012 at 10:46:46AM -0700, Junio C Hamano wrote:

> Tay Ray Chuan <rctay89@gmail.com> writes:
> 
> > Considering that --minimal isn't really an algorithm,...
> 
> I think this discussion thread already settled that "minimal" *is* an
> algorithm for the purpose of this topic, considering that the --minimal
> option does not apply anything but myers.

I don't know that it is settled. I proposed it as a concept, and nobody
else objected. It is true that right _now_ --minimal doesn't do anything
for non-myers algorithms, but in theory it could. IOW, it is a matter of
the user's perspective whether the flag is "use the minimal Myers diff
algorithm" versus "spend extra effort to make the diff more minimal, no
matter which algorithm is being used". If the latter, it is only that we
have not (yet) implemented the extra effort tweaks to the other
algorithms.

So I think we could go either way.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-03-13 20:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-09 13:48 [PATCH] diff: Introduce diff.algorithm variable Michal Privoznik
2012-03-09 18:40 ` Junio C Hamano
2012-03-12  5:42   ` Junio C Hamano
2012-03-13 14:42 ` Tay Ray Chuan
2012-03-13 17:46   ` Junio C Hamano
2012-03-13 20:48     ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.