From: "John Cai via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Eric Sunshine" <sunshine@sunshineco.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>, "Elijah Newren" <newren@gmail.com>,
	"John Cai" <johncai86@gmail.com>
Subject: [PATCH v4 0/2] Teach diff to honor diff algorithms set through git attributes
Date: Mon, 20 Feb 2023 21:04:40 +0000
Message-ID: <pull.1452.v4.git.git.1676927082.gitgitgadget@gmail.com>
In-Reply-To: <pull.1452.v3.git.git.1676665285.gitgitgadget@gmail.com>

When a repository contains different kinds of files, it may be desirable to
use different algorithms based on file type. This is currently not feasible
through the command line or using git configs. However, we can leverage the
fact that gitattributes are path aware.

Teach the diff machinery to check gitattributes when diffing files by using
the existing diff.<driver>.* scheme, and add an "algorithm" type to the
external driver config.
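
As a rough illustration of the scheme described above, the following sketch
maps *.json files to a diff driver and assigns that driver an algorithm. The
driver name "json", the throwaway repository, and the sample path data.json
are assumptions made for the example; only the diff=<driver> attribute and
the diff.<driver>.algorithm key come from this series.

```shell
#!/bin/sh
# Sketch: route *.json files to a hypothetical diff driver named "json"
# and give that driver its own diff algorithm.
set -e

# Work in a scratch repository so nothing existing is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# 1. gitattributes: map paths to a driver name ("json" is an arbitrary label).
echo '*.json diff=json' >.gitattributes

# 2. gitconfig: choose the algorithm for that driver.
git config diff.json.algorithm minimal

# Inspect what Git resolves for a sample path (the file need not exist).
attr=$(git check-attr diff -- data.json)
algo=$(git config diff.json.algorithm)
echo "$attr"
echo "$algo"
```

With a Git that includes this series, a plain "git diff" on a matching path
would then be expected to use the configured algorithm without any
--diff-algorithm flag.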

Changes since V3:

 * cleaned up documentation, typos
 * minor cleanup such as if statement ordering, and overly long lines

Changes since V2:

 * minor clean up and variable renaming
 * avoid parsing attribute files for the driver if the diff algorithm is set
   through the command line
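
The lookup order this implies (command line first, then the driver's
algorithm, then diff.algorithm) can be approximated outside of Git with a
small helper. This is only a sketch: resolve_algorithm is a hypothetical
function, not part of Git, and it ignores built-in drivers and the
diff.<driver>.command case.

```shell
#!/bin/sh
# resolve_algorithm PATH [CLI_ALGORITHM]
# Hypothetical helper mirroring the lookup order described in this series.
resolve_algorithm () {
	path=$1
	cli=$2

	# 1. A command-line choice wins and skips the attribute lookup.
	if test -n "$cli"
	then
		echo "$cli"
		return
	fi

	# 2. Otherwise use the driver named by the path's "diff" attribute.
	driver=$(git check-attr diff -- "$path" | sed 's/.*: diff: //')
	case "$driver" in
	unspecified|set|unset)
		;;
	*)
		if algo=$(git config "diff.$driver.algorithm")
		then
			echo "$algo"
			return
		fi
		;;
	esac

	# 3. Fall back to diff.algorithm, then to Git's default (myers).
	git config diff.algorithm || echo myers
}

# Usage in a scratch repository; the driver name "json" is an assumption.
repo=$(mktemp -d) && cd "$repo" && git init -q .
echo '*.json diff=json' >.gitattributes
git config diff.json.algorithm minimal
git config diff.algorithm patience

from_cli=$(resolve_algorithm data.json histogram)
from_driver=$(resolve_algorithm data.json)
from_config=$(resolve_algorithm README)
echo "$from_cli $from_driver $from_config"
```

Returning early in step 1 is the point of the V2 change: when the algorithm
is given on the command line, the attribute files are never consulted.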

Changes since V1:

 * utilize the existing diff.<driver>.* scheme where the driver is defined
   in gitattributes, but the algorithm is defined in the gitconfig.

To address some of the performance concerns raised about the previous series,
a benchmark shows that only a minor performance penalty is incurred now that
we no longer add an extra attributes parsing call:

$ echo "*.[ch] diff=other" >> .gitattributes
$ hyperfine -r 10 -L a git-bin-wrapper,git '{a} -c diff.other.algorithm=myers diff v2.0.0 v2.28.0'
Benchmark 1: git-bin-wrapper -c diff.other.algorithm=myers diff v2.0.0 v2.28.0
  Time (mean ± σ):     716.3 ms ±   3.8 ms    [User: 660.2 ms, System: 50.8 ms]
  Range (min … max):   709.8 ms … 720.6 ms    10 runs

Benchmark 2: git -c diff.other.algorithm=myers diff v2.0.0 v2.28.0
  Time (mean ± σ):     704.3 ms ±   2.9 ms    [User: 656.6 ms, System: 44.3 ms]
  Range (min … max):   700.1 ms … 708.6 ms    10 runs

Summary
  'git -c diff.other.algorithm=myers diff v2.0.0 v2.28.0' ran
    1.02 ± 0.01 times faster than 'git-bin-wrapper -c diff.other.algorithm=myers diff v2.0.0 v2.28.0'

John Cai (2):
  diff: consolidate diff algorithm option parsing
  diff: teach diff to read algorithm from diff driver

 Documentation/gitattributes.txt | 31 ++++++++++++
 diff.c                          | 90 ++++++++++++++++++++++++---------
 diff.h                          |  1 +
 t/lib-diff-alternative.sh       | 38 +++++++++++++-
 userdiff.c                      |  4 +-
 userdiff.h                      |  1 +
 6 files changed, 140 insertions(+), 25 deletions(-)


base-commit: c867e4fa180bec4750e9b54eb10f459030dbebfd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1452%2Fjohn-cai%2Fjc%2Fattr-diff-algo-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1452/john-cai/jc/attr-diff-algo-v4
Pull-Request: https://github.com/git/git/pull/1452

Range-diff vs v3:

 1:  816c47aa414 = 1:  816c47aa414 diff: consolidate diff algorithm option parsing
 2:  b330222ce83 ! 2:  77e66ab98fc diff: teach diff to read algorithm from diff driver
     @@ Commit message
          finally the diff.algorithm config.
      
          To enforce precedence order, use a new `ignore_driver_algorithm` member
     -    during options pasing to indicate the diff algorithm was set via command
     +    during options parsing to indicate the diff algorithm was set via command
          line args.
      
          Signed-off-by: John Cai <johncai86@gmail.com>
     @@ Documentation/gitattributes.txt: with the above configuration, i.e. `j-c-diff`,
      +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      +
      +The diff algorithm can be set through the `diff.algorithm` config key, but
     -+sometimes it may be helpful to set the diff algorithm by path. For example, one
     -+might wish to set a diff algorithm automatically for all `.json` files such that
     -+the user would not need to pass in a separate command line `--diff-algorithm`
     -+flag each time.
     ++sometimes it may be helpful to set the diff algorithm per path. For example,
     ++one may want to use the `minimal` diff algorithm for .json files, and the
     ++`histogram` for .c files, and so on without having to pass in the algorithm
     ++through the command line each time.
      +
      +First, in `.gitattributes`, assign the `diff` attribute for paths.
      +
     @@ Documentation/gitattributes.txt: with the above configuration, i.e. `j-c-diff`,
      +------------------------
      +
      +Then, define a "diff.<name>.algorithm" configuration to specify the diff
     -+algorithm, choosing from `meyers`, `patience`, `minimal`, or `histogram`.
     ++algorithm, choosing from `myers`, `patience`, `minimal`, or `histogram`.
      +
      +----------------------------------------------------------------
      +[diff "<name>"]
     @@ Documentation/gitattributes.txt: with the above configuration, i.e. `j-c-diff`,
      +git-show(1) and is used for the `--stat` output as well. The merge machinery
      +will not use the diff algorithm set through this method.
      +
     -+NOTE: If the `command` key also exists, then Git will treat this as an external
     -+diff and attempt to use the value set for `command` as an external program. For
     -+instance, the following config, combined with the above `.gitattributes` file,
     -+will result in `command` favored over `algorithm`.
     -+
     -+----------------------------------------------------------------
     -+[diff "<name>"]
     -+  command = j-c-diff
     -+  algorithm = histogram
     -+----------------------------------------------------------------
     ++NOTE: If `diff.<name>.command` is defined for path with the
     ++`diff=<name>` attribute, it is executed as an external diff driver
     ++(see above), and adding `diff.<name>.algorithm` has no effect, as the
     ++algorithm is not passed to the external diff driver.
       
       Defining a custom hunk-header
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     @@ diff.c: static void run_diff_cmd(const char *pgm,
       	}
      -	if (one && two)
      +	if (one && two) {
     -+		if (drv && !o->ignore_driver_algorithm && drv->algorithm)
     ++		if (!o->ignore_driver_algorithm && drv && drv->algorithm)
      +			set_diff_algorithm(o, drv->algorithm);
      +
       		builtin_diff(name, other ? other : name,
     @@ diff.c: static void run_diffstat(struct diff_filepair *p, struct diff_options *o
       	const char *other;
       
      +	if (!o->ignore_driver_algorithm) {
     -+		struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path);
     ++		struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index,
     ++								    p->one->path);
      +
     -+		if (drv && drv->algorithm) {
     ++		if (drv && drv->algorithm)
      +			set_diff_algorithm(o, drv->algorithm);
     -+		}
      +	}
      +
       	if (DIFF_PAIR_UNMERGED(p)) {
     @@ t/lib-diff-alternative.sh: index $file1..$file2 100644
      +
      +	test_expect_success "$STRATEGY diff command line precedence before attributes" '
      +		echo "file* diff=driver" >.gitattributes &&
     -+		git config diff.driver.algorithm meyers &&
     ++		git config diff.driver.algorithm myers &&
      +		test_must_fail git diff --no-index "--diff-algorithm=$STRATEGY" file1 file2 > output &&
      +		test_cmp expect output
      +	'

-- 
gitgitgadget
