* [PATCH v2 1/3] t/t3430: avoid undefined git diff behavior
2020-06-09 19:00 ` [PATCH v2 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
@ 2020-06-09 19:00 ` Chris Torek via GitGitGadget
2020-06-09 19:00 ` [PATCH v2 2/3] git diff: improve A...B merge-base handling Chris Torek via GitGitGadget
` (2 subsequent siblings)
3 siblings, 0 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-09 19:00 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
t/t3430-rebase-merges.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index a1bc3e20016..b454f400ebd 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -420,7 +420,7 @@ test_expect_success 'with --autosquash and --exec' '
git commit --fixup B B.t &&
write_script show.sh <<-\EOF &&
subject="$(git show -s --format=%s HEAD)"
- content="$(git diff HEAD^! | tail -n 1)"
+ content="$(git diff HEAD^ HEAD | tail -n 1)"
echo "$subject: $content"
EOF
test_tick &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 2/3] git diff: improve A...B merge-base handling
2020-06-09 19:00 ` [PATCH v2 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-09 19:00 ` [PATCH v2 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
@ 2020-06-09 19:00 ` Chris Torek via GitGitGadget
2020-06-09 22:52 ` Junio C Hamano
2020-06-09 19:00 ` [PATCH v2 3/3] Documentation: tweak git diff help slightly Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
3 siblings, 1 reply; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-09 19:00 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
To avoid all this, add a routine to catch the A...B case and verify that
there is at least one merge base, and that the arguments make sense.
As a side effect, produce a warning showing *which* merge base is being
used when there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
builtin/diff.c | 129 +++++++++++++++++++++++++++++++++-----
t/t4068-diff-symmetric.sh | 81 ++++++++++++++++++++++++
2 files changed, 195 insertions(+), 15 deletions(-)
create mode 100755 t/t4068-diff-symmetric.sh
diff --git a/builtin/diff.c b/builtin/diff.c
index 8537b17bd5e..0b6e63dbd02 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -6,6 +6,7 @@
#define USE_THE_INDEX_COMPATIBILITY_MACROS
#include "cache.h"
#include "config.h"
+#include "ewah/ewok.h"
#include "lockfile.h"
#include "color.h"
#include "commit.h"
@@ -254,6 +255,103 @@ static int builtin_diff_files(struct rev_info *revs, int argc, const char **argv
return run_diff_files(revs, options);
}
+struct symdiff {
+ struct bitmap *skip; /* bitmap of commit indices to skip, or NULL */
+ int warn; /* true if there were multiple merge bases */
+ int base, left, right; /* index of chosen merge base and left&right */
+};
+
+/*
+ * Check for symmetric-difference arguments, and if present, arrange
+ * everything we need to know to handle them correctly.
+ *
+ * For an actual symmetric diff, *symdiff is set this way:
+ *
+ * - its skip is non-NULL and marks *all* rev->pending.objects[i]
+ * indices that the caller should ignore (extra merge bases, of
+ * which there might be many, and A in A...B). Note that the
+ * chosen merge base and right side are NOT marked.
+ * - warn is set if there are multiple merge bases.
+ * - base, left, and right hold the merge base and left and
+ * right side indices, for warnings or errors.
+ *
+ * If there is no symmetric diff argument, sym->skip is NULL and
+ * sym->warn is cleared. The remaining fields are not set.
+ *
+ * If the user provides a symmetric diff with no merge base, or
+ * more than one range, we do a usage-exit.
+ */
+static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
+{
+ int i, lcount = 0, rcount = 0, basecount = 0;
+ int lpos = -1, rpos = -1, basepos = -1;
+ struct bitmap *map = NULL;
+
+ /*
+ * Use the whence fields to find merge bases and left and
+ * right parts of symmetric difference, so that we do not
+ * depend on the order that revisions are parsed. If there
+ * are any revs that aren't from these sources, we have a
+ * "git diff C A...B" or "git diff A...B C" case. Or we
+ * could even get "git diff A...B C...E", for instance.
+ *
+ * If we don't have just one merge base, we pick one
+ * at random.
+ *
+ * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
+ * so we must check for SYMMETRIC_LEFT too. The two arrays
+ * rev->pending.objects and rev->cmdline.rev are parallel.
+ */
+ for (i = 0; i < rev->cmdline.nr; i++) {
+ struct object *obj = rev->pending.objects[i].item;
+ switch (rev->cmdline.rev[i].whence) {
+ case REV_CMD_MERGE_BASE:
+ if (basepos < 0)
+ basepos = i;
+ basecount++;
+ break; /* do mark all bases */
+ case REV_CMD_LEFT:
+ if (obj->flags & SYMMETRIC_LEFT) {
+ lpos = i;
+ lcount++;
+ break; /* do mark A */
+ }
+ continue;
+ case REV_CMD_RIGHT:
+ rpos = i;
+ rcount++;
+ continue; /* don't mark B */
+ default:
+ continue;
+ }
+ if (map == NULL)
+ map = bitmap_new();
+ bitmap_set(map, i);
+ }
+
+ if (lcount == 0) { /* not a symmetric difference */
+ bitmap_free(map);
+ sym->warn = 0;
+ sym->skip = NULL;
+ return;
+ }
+
+ if (lcount != 1)
+ die(_("cannot use more than one symmetric difference"));
+
+ if (basecount == 0) {
+ const char *lname = rev->pending.objects[lpos].name;
+ const char *rname = rev->pending.objects[rpos].name;
+ die(_("%s...%s: no merge base"), lname, rname);
+ }
+ bitmap_unset(map, basepos); /* unmark the base we want */
+ sym->base = basepos;
+ sym->left = lpos;
+ sym->right = rpos;
+ sym->warn = basecount > 1;
+ sym->skip = map;
+}
+
int cmd_diff(int argc, const char **argv, const char *prefix)
{
int i;
@@ -263,6 +361,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
struct object_array_entry *blob[2];
int nongit = 0, no_index = 0;
int result = 0;
+ struct symdiff sdiff;
/*
* We could get N tree-ish in the rev.pending_objects list.
@@ -382,6 +481,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
}
}
+ symdiff_prepare(&rev, &sdiff);
for (i = 0; i < rev.pending.nr; i++) {
struct object_array_entry *entry = &rev.pending.objects[i];
struct object *obj = entry->item;
@@ -394,8 +494,9 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
die(_("invalid object '%s' given."), name);
if (obj->type == OBJ_COMMIT)
obj = &get_commit_tree(((struct commit *)obj))->object;
-
if (obj->type == OBJ_TREE) {
+ if (sdiff.skip && bitmap_get(sdiff.skip, i))
+ continue;
obj->flags |= flags;
add_object_array(obj, name, &ent);
} else if (obj->type == OBJ_BLOB) {
@@ -437,24 +538,22 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
usage(builtin_diff_usage);
else if (ent.nr == 1)
result = builtin_diff_index(&rev, argc, argv);
- else if (ent.nr == 2)
+ else if (ent.nr == 2) {
+ if (sdiff.warn) {
+ const char *lname = rev.pending.objects[sdiff.left].name;
+ const char *rname = rev.pending.objects[sdiff.right].name;
+ const char *basename = rev.pending.objects[sdiff.base].name;
+ warning(_("%s...%s: multiple merge bases, using %s"),
+ lname, rname, basename);
+ }
result = builtin_diff_tree(&rev, argc, argv,
&ent.objects[0], &ent.objects[1]);
- else if (ent.objects[0].item->flags & UNINTERESTING) {
- /*
- * diff A...B where there is at least one merge base
- * between A and B. We have ent.objects[0] ==
- * merge-base, ent.objects[ents-2] == A, and
- * ent.objects[ents-1] == B. Show diff between the
- * base and B. Note that we pick one merge base at
- * random if there are more than one.
- */
- result = builtin_diff_tree(&rev, argc, argv,
- &ent.objects[0],
- &ent.objects[ent.nr-1]);
- } else
+ } else {
+ if (sdiff.skip)
+ usage(builtin_diff_usage);
result = builtin_diff_combined(&rev, argc, argv,
ent.objects, ent.nr);
+ }
result = diff_result_code(&rev.diffopt, result);
if (1 < rev.diffopt.skip_stat_unmatch)
refresh_index_quietly();
diff --git a/t/t4068-diff-symmetric.sh b/t/t4068-diff-symmetric.sh
new file mode 100755
index 00000000000..7b5988933da
--- /dev/null
+++ b/t/t4068-diff-symmetric.sh
@@ -0,0 +1,81 @@
+#!/bin/sh
+
+test_description='behavior of diff with symmetric-diff setups'
+
+. ./test-lib.sh
+
+# build these situations:
+# - normal merge with one merge base (b1...b2);
+# - criss-cross merge ie 2 merge bases (b1...master);
+# - disjoint subgraph (orphan branch, b3...master).
+#
+# B---E <-- master
+# / \ /
+# A X
+# \ / \
+# C---D--G <-- br1
+# \ /
+# ---F <-- br2
+#
+# H <-- br3
+#
+# We put files into a few commits so that we can verify the
+# output as well.
+
+test_expect_success setup '
+ git commit --allow-empty -m A &&
+ echo b >b &&
+ git add b &&
+ git commit -m B &&
+ git checkout -b br1 HEAD^ &&
+ echo c >c &&
+ git add c &&
+ git commit -m C &&
+ git tag commit-C &&
+ git merge -m D master &&
+ git tag commit-D &&
+ git checkout master &&
+ git merge -m E commit-C &&
+ git checkout -b br2 commit-C &&
+ echo f >f &&
+ git add f &&
+ git commit -m F &&
+ git checkout br1 &&
+ git merge -m G br2 &&
+ git checkout --orphan br3 &&
+ git commit -m H
+'
+
+test_expect_success 'diff with one merge base' '
+ git diff commit-D...br1 >tmp &&
+ tail -1 tmp >actual &&
+ echo +f >expect &&
+ test_cmp expect actual
+'
+
+# The output (in tmp) can have +b or +c depending
+# on which merge base (commit B or C) is picked.
+# It should have one of those two, which comes out
+# to seven lines.
+test_expect_success 'diff with two merge bases' '
+ git diff br1...master >tmp 2>err &&
+ test_line_count = 7 tmp &&
+ test_line_count = 1 err
+'
+
+test_expect_success 'diff with no merge bases' '
+ test_must_fail git diff br2...br3 >tmp 2>err &&
+ test_i18ngrep "fatal: br2...br3: no merge base" err
+'
+
+test_expect_success 'diff with too many symmetric differences' '
+ test_must_fail git diff br1...master br2...br3 >tmp 2>err &&
+ test_i18ngrep "fatal: cannot use more than one symmetric difference" err
+'
+
+test_expect_success 'diff with symmetric difference and extraneous arg' '
+ test_must_fail git diff master br1...master >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v2 2/3] git diff: improve A...B merge-base handling
2020-06-09 19:00 ` [PATCH v2 2/3] git diff: improve A...B merge-base handling Chris Torek via GitGitGadget
@ 2020-06-09 22:52 ` Junio C Hamano
0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2020-06-09 22:52 UTC (permalink / raw)
To: Chris Torek via GitGitGadget; +Cc: git, Chris Torek
"Chris Torek via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
> +{
> + int i, lcount = 0, rcount = 0, basecount = 0;
> + int lpos = -1, rpos = -1, basepos = -1;
> + struct bitmap *map = NULL;
The logic around rcount and rpos in this function still smells
fishy. For example, rcount is counted up from 0 but its value is
never consulted, so we should be able to get rid of it.
For that matter, lcount and lpos look somewhat redundant. lpos
begins with -1 to signal "have not seen any left end of symmetric
range yet", and we won't allow more than two symmetric ranges
anyway, so we should be able to get rid of lcount and base the error
condition purely on lpos by
* when we see SYMMETRIC_LEFT, check if lpos is already non-negative,
and die otherwise right there. We don't need "if lcount != 1, die"
after the loop.
* after the loop, if lpos is still -1, we know we didn't see
symmetric difference.
> + /*
> + * Use the whence fields to find merge bases and left and
> + * right parts of symmetric difference, so that we do not
> + * depend on the order that revisions are parsed. If there
> + * are any revs that aren't from these sources, we have a
> + * "git diff C A...B" or "git diff A...B C" case. Or we
> + * could even get "git diff A...B C...E", for instance.
> + *
> + * If we don't have just one merge base, we pick one
> + * at random.
> + *
> + * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
> + * so we must check for SYMMETRIC_LEFT too. The two arrays
> + * rev->pending.objects and rev->cmdline.rev are parallel.
> + */
> + for (i = 0; i < rev->cmdline.nr; i++) {
> + struct object *obj = rev->pending.objects[i].item;
> + switch (rev->cmdline.rev[i].whence) {
> + case REV_CMD_MERGE_BASE:
> + if (basepos < 0)
> + basepos = i;
> + basecount++;
> + break; /* do mark all bases */
> + case REV_CMD_LEFT:
> + if (obj->flags & SYMMETRIC_LEFT) {
> + lpos = i;
> + lcount++;
> + break; /* do mark A */
> + }
> + continue;
> + case REV_CMD_RIGHT:
> + rpos = i;
> + rcount++;
> + continue; /* don't mark B */
It is unclear if we want to allow "git diff A..B C..D" (or
alternatively "git diff A...B C..D") and if so why.
It appears that you are allowing both, but I am not sure if that is
a good idea. Read a bit further below.
> + default:
> + continue;
> + }
> + if (map == NULL)
> + map = bitmap_new();
> + bitmap_set(map, i);
> + }
> +
> + if (lcount == 0) { /* not a symmetric difference */
> + bitmap_free(map);
> + sym->warn = 0;
> + sym->skip = NULL;
> + return;
> + }
> +
> + if (lcount != 1)
> + die(_("cannot use more than one symmetric difference"));
> +
> + if (basecount == 0) {
> + const char *lname = rev->pending.objects[lpos].name;
> + const char *rname = rev->pending.objects[rpos].name;
> + die(_("%s...%s: no merge base"), lname, rname);
When "git diff A...B C..D" is given, what do we want to do? A is
the only element marked with REV_CMD_LEFT, which is pointed at by
lpos and lcount gets incremented to become one. When we see B, we
make rpos point at it, but then later, when we see D, wouldn't rpos
get updated to point at it? What error message would we give when
we see no merge base then? We would want to say the symdiff between
A and B has no merge bases, but wouldn't we end up mentioning D
instead of B?
> + }
> + bitmap_unset(map, basepos); /* unmark the base we want */
> + sym->base = basepos;
> + sym->left = lpos;
> + sym->right = rpos;
> + sym->warn = basecount > 1;
> + sym->skip = map;
> +}
> @@ -394,8 +494,9 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> die(_("invalid object '%s' given."), name);
> if (obj->type == OBJ_COMMIT)
> obj = &get_commit_tree(((struct commit *)obj))->object;
> -
Do not lose this blank line. It does not make a difference to the
compiler, but it is semantically significant to human readers.
> if (obj->type == OBJ_TREE) {
> + if (sdiff.skip && bitmap_get(sdiff.skip, i))
> + continue;
> obj->flags |= flags;
> add_object_array(obj, name, &ent);
> } else if (obj->type == OBJ_BLOB) {
> diff --git a/t/t4068-diff-symmetric.sh b/t/t4068-diff-symmetric.sh
> new file mode 100755
> index 00000000000..7b5988933da
> --- /dev/null
> +++ b/t/t4068-diff-symmetric.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +...
> +test_expect_success setup '
> + git commit --allow-empty -m A &&
> + echo b >b &&
> + git add b &&
> + git commit -m B &&
> + git checkout -b br1 HEAD^ &&
> + echo c >c &&
> + git add c &&
> + git commit -m C &&
> + git tag commit-C &&
> + git merge -m D master &&
> + git tag commit-D &&
> + git checkout master &&
> + git merge -m E commit-C &&
> + git checkout -b br2 commit-C &&
> + echo f >f &&
> + git add f &&
> + git commit -m F &&
> + git checkout br1 &&
> + git merge -m G br2 &&
> + git checkout --orphan br3 &&
> + git commit -m H
> +'
> +
> +test_expect_success 'diff with one merge base' '
> + git diff commit-D...br1 >tmp &&
> + tail -1 tmp >actual &&
Let's make sure we spell "tail -n 1" (same for "head -1" -> "head -n 1").
I know there are a handful of existing offenders, but that does not mean
it is OK to make things worse.
> + echo +f >expect &&
> + test_cmp expect actual
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v2 3/3] Documentation: tweak git diff help slightly
2020-06-09 19:00 ` [PATCH v2 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-09 19:00 ` [PATCH v2 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
2020-06-09 19:00 ` [PATCH v2 2/3] git diff: improve A...B merge-base handling Chris Torek via GitGitGadget
@ 2020-06-09 19:00 ` Chris Torek via GitGitGadget
2020-06-09 23:01 ` Junio C Hamano
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
3 siblings, 1 reply; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-09 19:00 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
Update the manual page synopsis to include the three-dot notation
and the combined-diff option
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Document the usage for producing combined commits.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
Documentation/git-diff.txt | 21 +++++++++++++++++----
builtin/diff.c | 8 +++++++-
2 files changed, 24 insertions(+), 5 deletions(-)
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 37781cf1755..0bce278652a 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -11,15 +11,17 @@ SYNOPSIS
[verse]
'git diff' [<options>] [<commit>] [--] [<path>...]
'git diff' [<options>] --cached [<commit>] [--] [<path>...]
-'git diff' [<options>] <commit> <commit> [--] [<path>...]
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]
+'git diff' [<options>] <commit>...<commit> [--] [<path>...]
'git diff' [<options>] <blob> <blob>
'git diff' [<options>] --no-index [--] <path> <path>
DESCRIPTION
-----------
Show changes between the working tree and the index or a tree, changes
-between the index and a tree, changes between two trees, changes between
-two blob objects, or changes between two files on disk.
+between the index and a tree, changes between two trees, changes resulting
+from a merge, changes between two blob objects, or changes between two
+files on disk.
'git diff' [<options>] [--] [<path>...]::
@@ -67,6 +69,16 @@ two blob objects, or changes between two files on disk.
one side is omitted, it will have the same effect as
using HEAD instead.
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]::
+
+ This form is to view the results of a merge commit. The first
+ listed <commit> must be the merge itself; the remaining two or
+ more commits should be its parents. A convenient way to produce
+ the desired set of revisions is to use the {caret}@ suffix, i.e.,
+ "git diff master master^@". This is equivalent to running "git
+ show --format=" on the merge commit, e.g., "git show --format=
+ master".
+
'git diff' [<options>] <commit>\...<commit> [--] [<path>...]::
This form is to view the changes on the branch containing
@@ -196,7 +208,8 @@ linkgit:git-difftool[1],
linkgit:git-log[1],
linkgit:gitdiffcore[7],
linkgit:git-format-patch[1],
-linkgit:git-apply[1]
+linkgit:git-apply[1],
+linkgit:git-show[1]
GIT
---
diff --git a/builtin/diff.c b/builtin/diff.c
index 0b6e63dbd02..b333640082b 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -24,7 +24,13 @@
#define DIFF_NO_INDEX_IMPLICIT 2
static const char builtin_diff_usage[] =
-"git diff [<options>] [<commit> [<commit>]] [--] [<path>...]";
+"git diff [<options>] [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] --cached [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <commit> [<commit>...] <commit> [--] [<path>...]\n"
+" or: git diff [<options>] <commit>...<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <blob> <blob>]\n"
+" or: git diff [<options>] --no-index [--] <path> <path>]\n"
+COMMON_DIFF_OPTIONS_HELP;
static const char *blob_path(struct object_array_entry *entry)
{
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v2 3/3] Documentation: tweak git diff help slightly
2020-06-09 19:00 ` [PATCH v2 3/3] Documentation: tweak git diff help slightly Chris Torek via GitGitGadget
@ 2020-06-09 23:01 ` Junio C Hamano
0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2020-06-09 23:01 UTC (permalink / raw)
To: Chris Torek via GitGitGadget; +Cc: git, Chris Torek
"Chris Torek via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Chris Torek <chris.torek@gmail.com>
>
> Update the manual page synopsis to include the three-dot notation
> and the combined-diff option
Surely. That is "tweak ... slightly". Full-stop is missing here,
by the way.
> Make "git diff -h" print the same usage summary as the manual
> page synopsis, minus the "A..B" form, which is now discouraged.
Good.
> Document the usage for producing combined commits.
Yup, that is "while we are at it". The new text reads well, but it
appears that it is the more significant part of the change in this
patch now ;-)
> diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
> index 37781cf1755..0bce278652a 100644
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -11,15 +11,17 @@ SYNOPSIS
> [verse]
> 'git diff' [<options>] [<commit>] [--] [<path>...]
> 'git diff' [<options>] --cached [<commit>] [--] [<path>...]
> -'git diff' [<options>] <commit> <commit> [--] [<path>...]
> +'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]
> +'git diff' [<options>] <commit>...<commit> [--] [<path>...]
> 'git diff' [<options>] <blob> <blob>
> 'git diff' [<options>] --no-index [--] <path> <path>
>
> DESCRIPTION
> -----------
> Show changes between the working tree and the index or a tree, changes
> -between the index and a tree, changes between two trees, changes between
> -two blob objects, or changes between two files on disk.
> +between the index and a tree, changes between two trees, changes resulting
> +from a merge, changes between two blob objects, or changes between two
> +files on disk.
>
> 'git diff' [<options>] [--] [<path>...]::
>
> @@ -67,6 +69,16 @@ two blob objects, or changes between two files on disk.
> one side is omitted, it will have the same effect as
> using HEAD instead.
>
> +'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]::
> +
> + This form is to view the results of a merge commit. The first
> + listed <commit> must be the merge itself; the remaining two or
> + more commits should be its parents. A convenient way to produce
> + the desired set of revisions is to use the {caret}@ suffix, i.e.,
> + "git diff master master^@". This is equivalent to running "git
Don't we usually use `git diff master master^@` to mark up literal
examples with tt, instead of "git diff..." with double-quotes?
> + show --format=" on the merge commit, e.g., "git show --format=
> + master".
Likewise.
But more importantly, I think giving the exact equivalent is much
less important than keeping the explanation concise, simple and
clear, and the "empty format to omit the log part" is distracting
(after all, teaching how to squelch the log message part in the
"show" command is not the topic of this manpage).
For a merge commit `master`, this gives the same combined diff
as `git show master` does.
perhaps?
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v3 0/3] improve git-diff documentation and A...B handling
2020-06-09 19:00 ` [PATCH v2 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
` (2 preceding siblings ...)
2020-06-09 19:00 ` [PATCH v2 3/3] Documentation: tweak git diff help slightly Chris Torek via GitGitGadget
@ 2020-06-11 15:15 ` Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
` (3 more replies)
3 siblings, 4 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-11 15:15 UTC (permalink / raw)
To: git; +Cc: Chris Torek
git diff -h help is succinct, but perhaps too much so.
The symmetric-diff syntax, git diff A...B, is defined by the documentation
to compare the merge base of A and B to commit B. It does so just fine when
there is a merge base. It compares A and B directly if there is no merge
base, and it is overly forgiving of bad arguments after which it can produce
nonsensical diffs. It also behaves badly with other odd/incorrect usages,
such as git diff A..B C..D.
The first patch simply adjusts a test that will fail if the second patch is
accepted. The second patch adds special handling for the symmetric diff
syntax so that the option parsing works, plus a small test suite. The third
patch updates the documentation, including adding a section for combined
commits, and makes the help output more verbose (to match the SYNOPSIS and
provide common diff options like git-diff-files, for instance).
Changes since v1:
* updated commit messages
* rewrote prepare function
* added tests to reject bad two-dot usage as well
* removed A..B syntax from usage (and fixed typo)
* fixed up three-dot synopsis text
Chris Torek (3):
t/t3430: avoid undefined git diff behavior
git diff: improve range handling
Documentation: usage for diff combined commits
Documentation/git-diff.txt | 20 ++++--
builtin/diff.c | 132 +++++++++++++++++++++++++++++++++----
t/t3430-rebase-merges.sh | 2 +-
t/t4068-diff-symmetric.sh | 91 +++++++++++++++++++++++++
4 files changed, 226 insertions(+), 19 deletions(-)
create mode 100755 t/t4068-diff-symmetric.sh
base-commit: 20514004ddf1a3528de8933bc32f284e175e1012
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-804%2Fchris3torek%2Fcleanup-diff-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-804/chris3torek/cleanup-diff-v3
Pull-Request: https://github.com/git/git/pull/804
Range-diff vs v2:
1: 2ccaad645ff = 1: 2ccaad645ff t/t3430: avoid undefined git diff behavior
2: 100fa403477 ! 2: 60aed3f9d65 git diff: improve A...B merge-base handling
@@ Metadata
Author: Chris Torek <chris.torek@gmail.com>
## Commit message ##
- git diff: improve A...B merge-base handling
+ git diff: improve range handling
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
@@ Commit message
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
+ Likewise, using multiple two-dot ranges, or tossing extra
+ revision specifiers into the command line with two-dot ranges,
+ or mixing two and three dot ranges, all produce nonsense.
- To avoid all this, add a routine to catch the A...B case and verify that
- there is at least one merge base, and that the arguments make sense.
- As a side effect, produce a warning showing *which* merge base is being
- used when there are multiple choices; die if there is no merge base.
+ To avoid all this, add a routine to catch the range cases and
+ verify that that the arguments make sense. As a side effect,
+ produce a warning showing *which* merge base is being used when
+ there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
}
+struct symdiff {
-+ struct bitmap *skip; /* bitmap of commit indices to skip, or NULL */
-+ int warn; /* true if there were multiple merge bases */
-+ int base, left, right; /* index of chosen merge base and left&right */
++ struct bitmap *skip;
++ int warn;
++ const char *base, *left, *right;
+};
+
+/*
+ * Check for symmetric-difference arguments, and if present, arrange
-+ * everything we need to know to handle them correctly.
++ * everything we need to know to handle them correctly. As a bonus,
++ * weed out all bogus range-based revision specifications, e.g.,
++ * "git diff A..B C..D" or "git diff A..B C" get rejected.
+ *
+ * For an actual symmetric diff, *symdiff is set this way:
+ *
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
+ * which there might be many, and A in A...B). Note that the
+ * chosen merge base and right side are NOT marked.
+ * - warn is set if there are multiple merge bases.
-+ * - base, left, and right hold the merge base and left and
-+ * right side indices, for warnings or errors.
++ * - base, left, and right point to the names to use in a
++ * warning about multiple merge bases.
+ *
+ * If there is no symmetric diff argument, sym->skip is NULL and
+ * sym->warn is cleared. The remaining fields are not set.
-+ *
-+ * If the user provides a symmetric diff with no merge base, or
-+ * more than one range, we do a usage-exit.
+ */
+static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
+{
-+ int i, lcount = 0, rcount = 0, basecount = 0;
++ int i, is_symdiff = 0, basecount = 0, othercount = 0;
+ int lpos = -1, rpos = -1, basepos = -1;
+ struct bitmap *map = NULL;
+
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
+ basecount++;
+ break; /* do mark all bases */
+ case REV_CMD_LEFT:
++ if (lpos > 0)
++ usage(builtin_diff_usage);
++ lpos = i;
+ if (obj->flags & SYMMETRIC_LEFT) {
-+ lpos = i;
-+ lcount++;
++ is_symdiff = 1;
+ break; /* do mark A */
+ }
+ continue;
+ case REV_CMD_RIGHT:
++ if (rpos > 0)
++ usage(builtin_diff_usage);
+ rpos = i;
-+ rcount++;
+ continue; /* don't mark B */
-+ default:
++ case REV_CMD_PARENTS_ONLY:
++ case REV_CMD_REF:
++ case REV_CMD_REV:
++ othercount++;
+ continue;
+ }
+ if (map == NULL)
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
+ bitmap_set(map, i);
+ }
+
-+ if (lcount == 0) { /* not a symmetric difference */
++ /*
++ * Forbid any additional revs for both A...B and A..B.
++ */
++ if (lpos >= 0 && othercount > 0)
++ usage(builtin_diff_usage);
++
++ if (!is_symdiff) {
+ bitmap_free(map);
+ sym->warn = 0;
+ sym->skip = NULL;
+ return;
+ }
+
-+ if (lcount != 1)
-+ die(_("cannot use more than one symmetric difference"));
-+
-+ if (basecount == 0) {
-+ const char *lname = rev->pending.objects[lpos].name;
-+ const char *rname = rev->pending.objects[rpos].name;
-+ die(_("%s...%s: no merge base"), lname, rname);
-+ }
++ sym->left = rev->pending.objects[lpos].name;
++ sym->right = rev->pending.objects[rpos].name;
++ sym->base = rev->pending.objects[basepos].name;
++ if (basecount == 0)
++ die(_("%s...%s: no merge base"), sym->left, sym->right);
+ bitmap_unset(map, basepos); /* unmark the base we want */
-+ sym->base = basepos;
-+ sym->left = lpos;
-+ sym->right = rpos;
+ sym->warn = basecount > 1;
+ sym->skip = map;
+}
@@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
struct object_array_entry *entry = &rev.pending.objects[i];
struct object *obj = entry->item;
@@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
- die(_("invalid object '%s' given."), name);
- if (obj->type == OBJ_COMMIT)
obj = &get_commit_tree(((struct commit *)obj))->object;
--
+
if (obj->type == OBJ_TREE) {
+ if (sdiff.skip && bitmap_get(sdiff.skip, i))
+ continue;
@@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
result = builtin_diff_index(&rev, argc, argv);
- else if (ent.nr == 2)
+ else if (ent.nr == 2) {
-+ if (sdiff.warn) {
-+ const char *lname = rev.pending.objects[sdiff.left].name;
-+ const char *rname = rev.pending.objects[sdiff.right].name;
-+ const char *basename = rev.pending.objects[sdiff.base].name;
++ if (sdiff.warn)
+ warning(_("%s...%s: multiple merge bases, using %s"),
-+ lname, rname, basename);
-+ }
++ sdiff.left, sdiff.right, sdiff.base);
result = builtin_diff_tree(&rev, argc, argv,
&ent.objects[0], &ent.objects[1]);
- else if (ent.objects[0].item->flags & UNINTERESTING) {
@@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
- result = builtin_diff_tree(&rev, argc, argv,
- &ent.objects[0],
- &ent.objects[ent.nr-1]);
-- } else
-+ } else {
-+ if (sdiff.skip)
-+ usage(builtin_diff_usage);
+ } else
result = builtin_diff_combined(&rev, argc, argv,
ent.objects, ent.nr);
-+ }
- result = diff_result_code(&rev.diffopt, result);
- if (1 < rev.diffopt.skip_stat_unmatch)
- refresh_index_quietly();
## t/t4068-diff-symmetric.sh (new) ##
@@
@@ t/t4068-diff-symmetric.sh (new)
+
+test_expect_success 'diff with one merge base' '
+ git diff commit-D...br1 >tmp &&
-+ tail -1 tmp >actual &&
++ tail -n 1 tmp >actual &&
+ echo +f >expect &&
+ test_cmp expect actual
+'
@@ t/t4068-diff-symmetric.sh (new)
+
+test_expect_success 'diff with too many symmetric differences' '
+ test_must_fail git diff br1...master br2...br3 >tmp 2>err &&
-+ test_i18ngrep "fatal: cannot use more than one symmetric difference" err
++ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with symmetric difference and extraneous arg' '
@@ t/t4068-diff-symmetric.sh (new)
+ test_i18ngrep "usage" err
+'
+
++test_expect_success 'diff with two ranges' '
++ test_must_fail git diff master br1..master br2..br3 >tmp 2>err &&
++ test_i18ngrep "usage" err
++'
++
++test_expect_success 'diff with ranges and extra arg' '
++ test_must_fail git diff master br1..master commit-D >tmp 2>err &&
++ test_i18ngrep "usage" err
++'
++
+test_done
3: b9b4c6f113d ! 3: a7da92cd635 Documentation: tweak git diff help slightly
@@ Metadata
Author: Chris Torek <chris.torek@gmail.com>
## Commit message ##
- Documentation: tweak git diff help slightly
+ Documentation: usage for diff combined commits
- Update the manual page synopsis to include the three-dot notation
- and the combined-diff option
+ Document the usage for producing combined commits with "git diff".
+ This includes updating the synopsis section.
+
+ While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
- Document the usage for producing combined commits.
-
Signed-off-by: Chris Torek <chris.torek@gmail.com>
## Documentation/git-diff.txt ##
@@ Documentation/git-diff.txt: two blob objects, or changes between two files on di
+ This form is to view the results of a merge commit. The first
+ listed <commit> must be the merge itself; the remaining two or
+ more commits should be its parents. A convenient way to produce
-+ the desired set of revisions is to use the {caret}@ suffix, i.e.,
-+ "git diff master master^@". This is equivalent to running "git
-+ show --format=" on the merge commit, e.g., "git show --format=
-+ master".
++ the desired set of revisions is to use the {caret}@ suffix.
++ For instance, if `master` names a merge commit, `git diff master
++ master^@` gives the same combined diff as `git show master`.
+
'git diff' [<options>] <commit>\...<commit> [--] [<path>...]::
--
gitgitgadget
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v3 1/3] t/t3430: avoid undefined git diff behavior
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
@ 2020-06-11 15:15 ` Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 2/3] git diff: improve range handling Chris Torek via GitGitGadget
` (2 subsequent siblings)
3 siblings, 0 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-11 15:15 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
t/t3430-rebase-merges.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index a1bc3e20016..b454f400ebd 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -420,7 +420,7 @@ test_expect_success 'with --autosquash and --exec' '
git commit --fixup B B.t &&
write_script show.sh <<-\EOF &&
subject="$(git show -s --format=%s HEAD)"
- content="$(git diff HEAD^! | tail -n 1)"
+ content="$(git diff HEAD^ HEAD | tail -n 1)"
echo "$subject: $content"
EOF
test_tick &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 2/3] git diff: improve range handling
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
@ 2020-06-11 15:15 ` Chris Torek via GitGitGadget
2020-06-11 15:51 ` Chris Torek
2020-06-11 15:15 ` [PATCH v3 3/3] Documentation: usage for diff combined commits Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
3 siblings, 1 reply; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-11 15:15 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
builtin/diff.c | 124 ++++++++++++++++++++++++++++++++++----
t/t4068-diff-symmetric.sh | 91 ++++++++++++++++++++++++++++
2 files changed, 202 insertions(+), 13 deletions(-)
create mode 100755 t/t4068-diff-symmetric.sh
diff --git a/builtin/diff.c b/builtin/diff.c
index 8537b17bd5e..3d33a9231ae 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -6,6 +6,7 @@
#define USE_THE_INDEX_COMPATIBILITY_MACROS
#include "cache.h"
#include "config.h"
+#include "ewah/ewok.h"
#include "lockfile.h"
#include "color.h"
#include "commit.h"
@@ -254,6 +255,108 @@ static int builtin_diff_files(struct rev_info *revs, int argc, const char **argv
return run_diff_files(revs, options);
}
+struct symdiff {
+ struct bitmap *skip;
+ int warn;
+ const char *base, *left, *right;
+};
+
+/*
+ * Check for symmetric-difference arguments, and if present, arrange
+ * everything we need to know to handle them correctly. As a bonus,
+ * weed out all bogus range-based revision specifications, e.g.,
+ * "git diff A..B C..D" or "git diff A..B C" get rejected.
+ *
+ * For an actual symmetric diff, *symdiff is set this way:
+ *
+ * - its skip is non-NULL and marks *all* rev->pending.objects[i]
+ * indices that the caller should ignore (extra merge bases, of
+ * which there might be many, and A in A...B). Note that the
+ * chosen merge base and right side are NOT marked.
+ * - warn is set if there are multiple merge bases.
+ * - base, left, and right point to the names to use in a
+ * warning about multiple merge bases.
+ *
+ * If there is no symmetric diff argument, sym->skip is NULL and
+ * sym->warn is cleared. The remaining fields are not set.
+ */
+static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
+{
+ int i, is_symdiff = 0, basecount = 0, othercount = 0;
+ int lpos = -1, rpos = -1, basepos = -1;
+ struct bitmap *map = NULL;
+
+ /*
+ * Use the whence fields to find merge bases and left and
+ * right parts of symmetric difference, so that we do not
+ * depend on the order that revisions are parsed. If there
+ * are any revs that aren't from these sources, we have a
+ * "git diff C A...B" or "git diff A...B C" case. Or we
+ * could even get "git diff A...B C...E", for instance.
+ *
+ * If we don't have just one merge base, we pick one
+ * at random.
+ *
+ * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
+ * so we must check for SYMMETRIC_LEFT too. The two arrays
+ * rev->pending.objects and rev->cmdline.rev are parallel.
+ */
+ for (i = 0; i < rev->cmdline.nr; i++) {
+ struct object *obj = rev->pending.objects[i].item;
+ switch (rev->cmdline.rev[i].whence) {
+ case REV_CMD_MERGE_BASE:
+ if (basepos < 0)
+ basepos = i;
+ basecount++;
+ break; /* do mark all bases */
+ case REV_CMD_LEFT:
+ if (lpos > 0)
+ usage(builtin_diff_usage);
+ lpos = i;
+ if (obj->flags & SYMMETRIC_LEFT) {
+ is_symdiff = 1;
+ break; /* do mark A */
+ }
+ continue;
+ case REV_CMD_RIGHT:
+ if (rpos > 0)
+ usage(builtin_diff_usage);
+ rpos = i;
+ continue; /* don't mark B */
+ case REV_CMD_PARENTS_ONLY:
+ case REV_CMD_REF:
+ case REV_CMD_REV:
+ othercount++;
+ continue;
+ }
+ if (map == NULL)
+ map = bitmap_new();
+ bitmap_set(map, i);
+ }
+
+ /*
+ * Forbid any additional revs for both A...B and A..B.
+ */
+ if (lpos >= 0 && othercount > 0)
+ usage(builtin_diff_usage);
+
+ if (!is_symdiff) {
+ bitmap_free(map);
+ sym->warn = 0;
+ sym->skip = NULL;
+ return;
+ }
+
+ sym->left = rev->pending.objects[lpos].name;
+ sym->right = rev->pending.objects[rpos].name;
+ sym->base = rev->pending.objects[basepos].name;
+ if (basecount == 0)
+ die(_("%s...%s: no merge base"), sym->left, sym->right);
+ bitmap_unset(map, basepos); /* unmark the base we want */
+ sym->warn = basecount > 1;
+ sym->skip = map;
+}
+
int cmd_diff(int argc, const char **argv, const char *prefix)
{
int i;
@@ -263,6 +366,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
struct object_array_entry *blob[2];
int nongit = 0, no_index = 0;
int result = 0;
+ struct symdiff sdiff;
/*
* We could get N tree-ish in the rev.pending_objects list.
@@ -382,6 +486,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
}
}
+ symdiff_prepare(&rev, &sdiff);
for (i = 0; i < rev.pending.nr; i++) {
struct object_array_entry *entry = &rev.pending.objects[i];
struct object *obj = entry->item;
@@ -396,6 +501,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
obj = &get_commit_tree(((struct commit *)obj))->object;
if (obj->type == OBJ_TREE) {
+ if (sdiff.skip && bitmap_get(sdiff.skip, i))
+ continue;
obj->flags |= flags;
add_object_array(obj, name, &ent);
} else if (obj->type == OBJ_BLOB) {
@@ -437,21 +544,12 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
usage(builtin_diff_usage);
else if (ent.nr == 1)
result = builtin_diff_index(&rev, argc, argv);
- else if (ent.nr == 2)
+ else if (ent.nr == 2) {
+ if (sdiff.warn)
+ warning(_("%s...%s: multiple merge bases, using %s"),
+ sdiff.left, sdiff.right, sdiff.base);
result = builtin_diff_tree(&rev, argc, argv,
&ent.objects[0], &ent.objects[1]);
- else if (ent.objects[0].item->flags & UNINTERESTING) {
- /*
- * diff A...B where there is at least one merge base
- * between A and B. We have ent.objects[0] ==
- * merge-base, ent.objects[ents-2] == A, and
- * ent.objects[ents-1] == B. Show diff between the
- * base and B. Note that we pick one merge base at
- * random if there are more than one.
- */
- result = builtin_diff_tree(&rev, argc, argv,
- &ent.objects[0],
- &ent.objects[ent.nr-1]);
} else
result = builtin_diff_combined(&rev, argc, argv,
ent.objects, ent.nr);
diff --git a/t/t4068-diff-symmetric.sh b/t/t4068-diff-symmetric.sh
new file mode 100755
index 00000000000..dd60da0294e
--- /dev/null
+++ b/t/t4068-diff-symmetric.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='behavior of diff with symmetric-diff setups'
+
+. ./test-lib.sh
+
+# build these situations:
+# - normal merge with one merge base (b1...b2);
+# - criss-cross merge ie 2 merge bases (b1...master);
+# - disjoint subgraph (orphan branch, b3...master).
+#
+# B---E <-- master
+# / \ /
+# A X
+# \ / \
+# C---D--G <-- br1
+# \ /
+# ---F <-- br2
+#
+# H <-- br3
+#
+# We put files into a few commits so that we can verify the
+# output as well.
+
+test_expect_success setup '
+ git commit --allow-empty -m A &&
+ echo b >b &&
+ git add b &&
+ git commit -m B &&
+ git checkout -b br1 HEAD^ &&
+ echo c >c &&
+ git add c &&
+ git commit -m C &&
+ git tag commit-C &&
+ git merge -m D master &&
+ git tag commit-D &&
+ git checkout master &&
+ git merge -m E commit-C &&
+ git checkout -b br2 commit-C &&
+ echo f >f &&
+ git add f &&
+ git commit -m F &&
+ git checkout br1 &&
+ git merge -m G br2 &&
+ git checkout --orphan br3 &&
+ git commit -m H
+'
+
+test_expect_success 'diff with one merge base' '
+ git diff commit-D...br1 >tmp &&
+ tail -n 1 tmp >actual &&
+ echo +f >expect &&
+ test_cmp expect actual
+'
+
+# The output (in tmp) can have +b or +c depending
+# on which merge base (commit B or C) is picked.
+# It should have one of those two, which comes out
+# to seven lines.
+test_expect_success 'diff with two merge bases' '
+ git diff br1...master >tmp 2>err &&
+ test_line_count = 7 tmp &&
+ test_line_count = 1 err
+'
+
+test_expect_success 'diff with no merge bases' '
+ test_must_fail git diff br2...br3 >tmp 2>err &&
+ test_i18ngrep "fatal: br2...br3: no merge base" err
+'
+
+test_expect_success 'diff with too many symmetric differences' '
+ test_must_fail git diff br1...master br2...br3 >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with symmetric difference and extraneous arg' '
+ test_must_fail git diff master br1...master >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with two ranges' '
+ test_must_fail git diff master br1..master br2..br3 >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with ranges and extra arg' '
+ test_must_fail git diff master br1..master commit-D >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 3/3] Documentation: usage for diff combined commits
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
2020-06-11 15:15 ` [PATCH v3 2/3] git diff: improve range handling Chris Torek via GitGitGadget
@ 2020-06-11 15:15 ` Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
3 siblings, 0 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-11 15:15 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
Documentation/git-diff.txt | 20 ++++++++++++++++----
builtin/diff.c | 8 +++++++-
2 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 37781cf1755..1018110ddc2 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -11,15 +11,17 @@ SYNOPSIS
[verse]
'git diff' [<options>] [<commit>] [--] [<path>...]
'git diff' [<options>] --cached [<commit>] [--] [<path>...]
-'git diff' [<options>] <commit> <commit> [--] [<path>...]
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]
+'git diff' [<options>] <commit>...<commit> [--] [<path>...]
'git diff' [<options>] <blob> <blob>
'git diff' [<options>] --no-index [--] <path> <path>
DESCRIPTION
-----------
Show changes between the working tree and the index or a tree, changes
-between the index and a tree, changes between two trees, changes between
-two blob objects, or changes between two files on disk.
+between the index and a tree, changes between two trees, changes resulting
+from a merge, changes between two blob objects, or changes between two
+files on disk.
'git diff' [<options>] [--] [<path>...]::
@@ -67,6 +69,15 @@ two blob objects, or changes between two files on disk.
one side is omitted, it will have the same effect as
using HEAD instead.
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]::
+
+ This form is to view the results of a merge commit. The first
+ listed <commit> must be the merge itself; the remaining two or
+ more commits should be its parents. A convenient way to produce
+ the desired set of revisions is to use the {caret}@ suffix.
+ For instance, if `master` names a merge commit, `git diff master
+ master^@` gives the same combined diff as `git show master`.
+
'git diff' [<options>] <commit>\...<commit> [--] [<path>...]::
This form is to view the changes on the branch containing
@@ -196,7 +207,8 @@ linkgit:git-difftool[1],
linkgit:git-log[1],
linkgit:gitdiffcore[7],
linkgit:git-format-patch[1],
-linkgit:git-apply[1]
+linkgit:git-apply[1],
+linkgit:git-show[1]
GIT
---
diff --git a/builtin/diff.c b/builtin/diff.c
index 3d33a9231ae..c8a999f8089 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -24,7 +24,13 @@
#define DIFF_NO_INDEX_IMPLICIT 2
static const char builtin_diff_usage[] =
-"git diff [<options>] [<commit> [<commit>]] [--] [<path>...]";
+"git diff [<options>] [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] --cached [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <commit> [<commit>...] <commit> [--] [<path>...]\n"
+" or: git diff [<options>] <commit>...<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <blob> <blob>]\n"
+" or: git diff [<options>] --no-index [--] <path> <path>]\n"
+COMMON_DIFF_OPTIONS_HELP;
static const char *blob_path(struct object_array_entry *entry)
{
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v4 0/3] improve git-diff documentation and A...B handling
2020-06-11 15:15 ` [PATCH v3 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
` (2 preceding siblings ...)
2020-06-11 15:15 ` [PATCH v3 3/3] Documentation: usage for diff combined commits Chris Torek via GitGitGadget
@ 2020-06-12 16:19 ` Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
` (2 more replies)
3 siblings, 3 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-12 16:19 UTC (permalink / raw)
To: git; +Cc: Chris Torek
git diff -h help is succinct, but perhaps too much so.
The symmetric-diff syntax, git diff A...B, is defined by the documentation
to compare the merge base of A and B to commit B. It does so just fine when
there is a merge base. It compares A and B directly if there is no merge
base, and it is overly forgiving of bad arguments after which it can produce
nonsensical diffs. It also behaves badly with other odd/incorrect usages,
such as git diff A...B C..D.
The first patch simply adjusts a test that will fail if the second patch is
accepted. The second patch adds special handling for the symmetric and range
diff syntax so that the option parsing works, plus a small test suite. The
third patch updates the documentation, including adding a section for
combined commits, and makes the help output more verbose (to match the
SYNOPSIS and provide common diff options like git-diff-files, for instance).
Changes since v3:
* correct > / >= goof
* fix test nit per Philip Oakley
Chris Torek (3):
t/t3430: avoid undefined git diff behavior
git diff: improve range handling
Documentation: usage for diff combined commits
Documentation/git-diff.txt | 20 ++++--
builtin/diff.c | 132 +++++++++++++++++++++++++++++++++----
t/t3430-rebase-merges.sh | 2 +-
t/t4068-diff-symmetric.sh | 91 +++++++++++++++++++++++++
4 files changed, 226 insertions(+), 19 deletions(-)
create mode 100755 t/t4068-diff-symmetric.sh
base-commit: 20514004ddf1a3528de8933bc32f284e175e1012
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-804%2Fchris3torek%2Fcleanup-diff-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-804/chris3torek/cleanup-diff-v4
Pull-Request: https://github.com/git/git/pull/804
Range-diff vs v3:
1: 2ccaad645ff = 1: 2ccaad645ff t/t3430: avoid undefined git diff behavior
2: 60aed3f9d65 ! 2: 4fa6fba33b3 git diff: improve range handling
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
+ basecount++;
+ break; /* do mark all bases */
+ case REV_CMD_LEFT:
-+ if (lpos > 0)
++ if (lpos >= 0)
+ usage(builtin_diff_usage);
+ lpos = i;
+ if (obj->flags & SYMMETRIC_LEFT) {
@@ builtin/diff.c: static int builtin_diff_files(struct rev_info *revs, int argc, c
+ }
+ continue;
+ case REV_CMD_RIGHT:
-+ if (rpos > 0)
++ if (rpos >= 0)
+ usage(builtin_diff_usage);
+ rpos = i;
+ continue; /* don't mark B */
@@ t/t4068-diff-symmetric.sh (new)
+. ./test-lib.sh
+
+# build these situations:
-+# - normal merge with one merge base (b1...b2);
-+# - criss-cross merge ie 2 merge bases (b1...master);
-+# - disjoint subgraph (orphan branch, b3...master).
++# - normal merge with one merge base (br1...b2r);
++# - criss-cross merge ie 2 merge bases (br1...master);
++# - disjoint subgraph (orphan branch, br3...master).
+#
+# B---E <-- master
+# / \ /
3: a7da92cd635 = 3: d7bc9aca44b Documentation: usage for diff combined commits
--
gitgitgadget
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v4 1/3] t/t3430: avoid undefined git diff behavior
2020-06-12 16:19 ` [PATCH v4 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
@ 2020-06-12 16:19 ` Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 2/3] git diff: improve range handling Chris Torek via GitGitGadget
2020-06-12 16:20 ` [PATCH v4 3/3] Documentation: usage for diff combined commits Chris Torek via GitGitGadget
2 siblings, 0 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-12 16:19 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
t/t3430-rebase-merges.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index a1bc3e20016..b454f400ebd 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -420,7 +420,7 @@ test_expect_success 'with --autosquash and --exec' '
git commit --fixup B B.t &&
write_script show.sh <<-\EOF &&
subject="$(git show -s --format=%s HEAD)"
- content="$(git diff HEAD^! | tail -n 1)"
+ content="$(git diff HEAD^ HEAD | tail -n 1)"
echo "$subject: $content"
EOF
test_tick &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v4 2/3] git diff: improve range handling
2020-06-12 16:19 ` [PATCH v4 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
@ 2020-06-12 16:19 ` Chris Torek via GitGitGadget
2020-06-12 18:31 ` Junio C Hamano
2020-06-12 16:20 ` [PATCH v4 3/3] Documentation: usage for diff combined commits Chris Torek via GitGitGadget
2 siblings, 1 reply; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-12 16:19 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
builtin/diff.c | 124 ++++++++++++++++++++++++++++++++++----
t/t4068-diff-symmetric.sh | 91 ++++++++++++++++++++++++++++
2 files changed, 202 insertions(+), 13 deletions(-)
create mode 100755 t/t4068-diff-symmetric.sh
diff --git a/builtin/diff.c b/builtin/diff.c
index 8537b17bd5e..0f48b0d3e71 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -6,6 +6,7 @@
#define USE_THE_INDEX_COMPATIBILITY_MACROS
#include "cache.h"
#include "config.h"
+#include "ewah/ewok.h"
#include "lockfile.h"
#include "color.h"
#include "commit.h"
@@ -254,6 +255,108 @@ static int builtin_diff_files(struct rev_info *revs, int argc, const char **argv
return run_diff_files(revs, options);
}
+struct symdiff {
+ struct bitmap *skip;
+ int warn;
+ const char *base, *left, *right;
+};
+
+/*
+ * Check for symmetric-difference arguments, and if present, arrange
+ * everything we need to know to handle them correctly. As a bonus,
+ * weed out all bogus range-based revision specifications, e.g.,
+ * "git diff A..B C..D" or "git diff A..B C" get rejected.
+ *
+ * For an actual symmetric diff, *symdiff is set this way:
+ *
+ * - its skip is non-NULL and marks *all* rev->pending.objects[i]
+ * indices that the caller should ignore (extra merge bases, of
+ * which there might be many, and A in A...B). Note that the
+ * chosen merge base and right side are NOT marked.
+ * - warn is set if there are multiple merge bases.
+ * - base, left, and right point to the names to use in a
+ * warning about multiple merge bases.
+ *
+ * If there is no symmetric diff argument, sym->skip is NULL and
+ * sym->warn is cleared. The remaining fields are not set.
+ */
+static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
+{
+ int i, is_symdiff = 0, basecount = 0, othercount = 0;
+ int lpos = -1, rpos = -1, basepos = -1;
+ struct bitmap *map = NULL;
+
+ /*
+ * Use the whence fields to find merge bases and left and
+ * right parts of symmetric difference, so that we do not
+ * depend on the order that revisions are parsed. If there
+ * are any revs that aren't from these sources, we have a
+ * "git diff C A...B" or "git diff A...B C" case. Or we
+ * could even get "git diff A...B C...E", for instance.
+ *
+ * If we don't have just one merge base, we pick one
+ * at random.
+ *
+ * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
+ * so we must check for SYMMETRIC_LEFT too. The two arrays
+ * rev->pending.objects and rev->cmdline.rev are parallel.
+ */
+ for (i = 0; i < rev->cmdline.nr; i++) {
+ struct object *obj = rev->pending.objects[i].item;
+ switch (rev->cmdline.rev[i].whence) {
+ case REV_CMD_MERGE_BASE:
+ if (basepos < 0)
+ basepos = i;
+ basecount++;
+ break; /* do mark all bases */
+ case REV_CMD_LEFT:
+ if (lpos >= 0)
+ usage(builtin_diff_usage);
+ lpos = i;
+ if (obj->flags & SYMMETRIC_LEFT) {
+ is_symdiff = 1;
+ break; /* do mark A */
+ }
+ continue;
+ case REV_CMD_RIGHT:
+ if (rpos >= 0)
+ usage(builtin_diff_usage);
+ rpos = i;
+ continue; /* don't mark B */
+ case REV_CMD_PARENTS_ONLY:
+ case REV_CMD_REF:
+ case REV_CMD_REV:
+ othercount++;
+ continue;
+ }
+ if (map == NULL)
+ map = bitmap_new();
+ bitmap_set(map, i);
+ }
+
+ /*
+ * Forbid any additional revs for both A...B and A..B.
+ */
+ if (lpos >= 0 && othercount > 0)
+ usage(builtin_diff_usage);
+
+ if (!is_symdiff) {
+ bitmap_free(map);
+ sym->warn = 0;
+ sym->skip = NULL;
+ return;
+ }
+
+ sym->left = rev->pending.objects[lpos].name;
+ sym->right = rev->pending.objects[rpos].name;
+ sym->base = rev->pending.objects[basepos].name;
+ if (basecount == 0)
+ die(_("%s...%s: no merge base"), sym->left, sym->right);
+ bitmap_unset(map, basepos); /* unmark the base we want */
+ sym->warn = basecount > 1;
+ sym->skip = map;
+}
+
int cmd_diff(int argc, const char **argv, const char *prefix)
{
int i;
@@ -263,6 +366,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
struct object_array_entry *blob[2];
int nongit = 0, no_index = 0;
int result = 0;
+ struct symdiff sdiff;
/*
* We could get N tree-ish in the rev.pending_objects list.
@@ -382,6 +486,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
}
}
+ symdiff_prepare(&rev, &sdiff);
for (i = 0; i < rev.pending.nr; i++) {
struct object_array_entry *entry = &rev.pending.objects[i];
struct object *obj = entry->item;
@@ -396,6 +501,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
obj = &get_commit_tree(((struct commit *)obj))->object;
if (obj->type == OBJ_TREE) {
+ if (sdiff.skip && bitmap_get(sdiff.skip, i))
+ continue;
obj->flags |= flags;
add_object_array(obj, name, &ent);
} else if (obj->type == OBJ_BLOB) {
@@ -437,21 +544,12 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
usage(builtin_diff_usage);
else if (ent.nr == 1)
result = builtin_diff_index(&rev, argc, argv);
- else if (ent.nr == 2)
+ else if (ent.nr == 2) {
+ if (sdiff.warn)
+ warning(_("%s...%s: multiple merge bases, using %s"),
+ sdiff.left, sdiff.right, sdiff.base);
result = builtin_diff_tree(&rev, argc, argv,
&ent.objects[0], &ent.objects[1]);
- else if (ent.objects[0].item->flags & UNINTERESTING) {
- /*
- * diff A...B where there is at least one merge base
- * between A and B. We have ent.objects[0] ==
- * merge-base, ent.objects[ents-2] == A, and
- * ent.objects[ents-1] == B. Show diff between the
- * base and B. Note that we pick one merge base at
- * random if there are more than one.
- */
- result = builtin_diff_tree(&rev, argc, argv,
- &ent.objects[0],
- &ent.objects[ent.nr-1]);
} else
result = builtin_diff_combined(&rev, argc, argv,
ent.objects, ent.nr);
diff --git a/t/t4068-diff-symmetric.sh b/t/t4068-diff-symmetric.sh
new file mode 100755
index 00000000000..31d17a5af02
--- /dev/null
+++ b/t/t4068-diff-symmetric.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='behavior of diff with symmetric-diff setups'
+
+. ./test-lib.sh
+
+# build these situations:
+# - normal merge with one merge base (br1...b2r);
+# - criss-cross merge ie 2 merge bases (br1...master);
+# - disjoint subgraph (orphan branch, br3...master).
+#
+# B---E <-- master
+# / \ /
+# A X
+# \ / \
+# C---D--G <-- br1
+# \ /
+# ---F <-- br2
+#
+# H <-- br3
+#
+# We put files into a few commits so that we can verify the
+# output as well.
+
+test_expect_success setup '
+ git commit --allow-empty -m A &&
+ echo b >b &&
+ git add b &&
+ git commit -m B &&
+ git checkout -b br1 HEAD^ &&
+ echo c >c &&
+ git add c &&
+ git commit -m C &&
+ git tag commit-C &&
+ git merge -m D master &&
+ git tag commit-D &&
+ git checkout master &&
+ git merge -m E commit-C &&
+ git checkout -b br2 commit-C &&
+ echo f >f &&
+ git add f &&
+ git commit -m F &&
+ git checkout br1 &&
+ git merge -m G br2 &&
+ git checkout --orphan br3 &&
+ git commit -m H
+'
+
+test_expect_success 'diff with one merge base' '
+ git diff commit-D...br1 >tmp &&
+ tail -n 1 tmp >actual &&
+ echo +f >expect &&
+ test_cmp expect actual
+'
+
+# The output (in tmp) can have +b or +c depending
+# on which merge base (commit B or C) is picked.
+# It should have one of those two, which comes out
+# to seven lines.
+test_expect_success 'diff with two merge bases' '
+ git diff br1...master >tmp 2>err &&
+ test_line_count = 7 tmp &&
+ test_line_count = 1 err
+'
+
+test_expect_success 'diff with no merge bases' '
+ test_must_fail git diff br2...br3 >tmp 2>err &&
+ test_i18ngrep "fatal: br2...br3: no merge base" err
+'
+
+test_expect_success 'diff with too many symmetric differences' '
+ test_must_fail git diff br1...master br2...br3 >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with symmetric difference and extraneous arg' '
+ test_must_fail git diff master br1...master >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with two ranges' '
+ test_must_fail git diff master br1..master br2..br3 >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_expect_success 'diff with ranges and extra arg' '
+ test_must_fail git diff master br1..master commit-D >tmp 2>err &&
+ test_i18ngrep "usage" err
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v4 2/3] git diff: improve range handling
2020-06-12 16:19 ` [PATCH v4 2/3] git diff: improve range handling Chris Torek via GitGitGadget
@ 2020-06-12 18:31 ` Junio C Hamano
2020-06-12 19:25 ` Chris Torek
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2020-06-12 18:31 UTC (permalink / raw)
To: Chris Torek via GitGitGadget; +Cc: git, Chris Torek
"Chris Torek via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +struct symdiff {
> + struct bitmap *skip;
> + int warn;
> + const char *base, *left, *right;
> +};
> +
> +/*
> + * Check for symmetric-difference arguments, and if present, arrange
> + * everything we need to know to handle them correctly. As a bonus,
> + * weed out all bogus range-based revision specifications, e.g.,
> + * "git diff A..B C..D" or "git diff A..B C" get rejected.
> + *
> + * For an actual symmetric diff, *symdiff is set this way:
> + *
> + * - its skip is non-NULL and marks *all* rev->pending.objects[i]
> + * indices that the caller should ignore (extra merge bases, of
> + * which there might be many, and A in A...B). Note that the
> + * chosen merge base and right side are NOT marked.
> + * - warn is set if there are multiple merge bases.
> + * - base, left, and right point to the names to use in a
> + * warning about multiple merge bases.
> + *
> + * If there is no symmetric diff argument, sym->skip is NULL and
> + * sym->warn is cleared. The remaining fields are not set.
> + */
OK.
> +static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
> +{
> + int i, is_symdiff = 0, basecount = 0, othercount = 0;
> + int lpos = -1, rpos = -1, basepos = -1;
> + struct bitmap *map = NULL;
> +
> + /*
> + * Use the whence fields to find merge bases and left and
> + * right parts of symmetric difference, so that we do not
> + * depend on the order that revisions are parsed. If there
> + * are any revs that aren't from these sources, we have a
> + * "git diff C A...B" or "git diff A...B C" case. Or we
> + * could even get "git diff A...B C...E", for instance.
> + *
> + * If we don't have just one merge base, we pick one
> + * at random.
> + *
> + * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
> + * so we must check for SYMMETRIC_LEFT too. The two arrays
> + * rev->pending.objects and rev->cmdline.rev are parallel.
> + */
> + for (i = 0; i < rev->cmdline.nr; i++) {
> + struct object *obj = rev->pending.objects[i].item;
> + switch (rev->cmdline.rev[i].whence) {
> + case REV_CMD_MERGE_BASE:
> + if (basepos < 0)
> + basepos = i;
> + basecount++;
> + break; /* do mark all bases */
We find and use the first found merge base (i.e. "pick at random" as
promised in the comment before the function), but for warning, keep
track of how many merge bases there are.
> + case REV_CMD_LEFT:
> + if (lpos >= 0)
> + usage(builtin_diff_usage);
A range (either A..B or A...B) has already been seen, and we have
another, which is now rejected.
> + lpos = i;
> + if (obj->flags & SYMMETRIC_LEFT) {
> + is_symdiff = 1;
> + break; /* do mark A */
Inside this switch statement, "continue" is a sign that the caller
should use the rev, and "break" is a sign that the rev is to be
ignored. We obviously do not ignore "A" in ...
> + }
> + continue;
... "A..B" notation, so we "continue" here.
> + case REV_CMD_RIGHT:
> + if (rpos >= 0)
> + usage(builtin_diff_usage);
Here is the same "we reject having two or more ranges".
I actually suspect that this usage() would become dead code---we
would already have died when we saw the matching left end of the
second range (so this could become BUG(), even though usage() does
not hurt).
> + rpos = i;
> + continue; /* don't mark B */
And of course, whether "A..B" or "A...B", B will be used as the
"result" side of the diff, so won't be marked for skipping.
> + case REV_CMD_PARENTS_ONLY:
> + case REV_CMD_REF:
> + case REV_CMD_REV:
> + othercount++;
> + continue;
I wonder if we want to use "default" instead of these three
individual cases. Pros and cons?
- If we forgot to list a whence REV_CMD_* here, it will be silently
marked to be skipped with this code. With "default", it will be
counted to be diffed (which may trigger "giving too many revs to
be diffed" error from the diff machinery, which is good).
- With "default", when we add new type of whence to REV_CMD_* and
forget to adjust this code, it will be counted to be diffed.
With the current code, it will be skipped.
We probably could get the best of the both words by keeping the
above three for "counted in othercount and kept), and then add a
default arm to the switch() that just says
default:
BUG("forgot to handle %d",
rev->cmdline.rev[i].whence);
That way, every time we add a new type of whence, we would be forced
to think what should be done to them.
> + }
> + if (map == NULL)
> + map = bitmap_new();
> + bitmap_set(map, i);
> + }
> +
> + /*
> + * Forbid any additional revs for both A...B and A..B.
> + */
> + if (lpos >= 0 && othercount > 0)
> + usage(builtin_diff_usage);
Meaning "git diff A..B C" is bad. Reasonable.
> + if (!is_symdiff) {
> + bitmap_free(map);
It is not wrong per-se to free it unconditionally, but wouldn't it
be a bug if (map != NULL) at this point in the flow?
The merge bases would only be stuffed in the revs when A...B is
given, and we are not skipping anything involved in A..B or
non-range revs.
> + sym->warn = 0;
> + sym->skip = NULL;
Clearing these two fields are as promised to the callers in the
comment above, which is good.
> + return;
> + }
> +
> + sym->left = rev->pending.objects[lpos].name;
> + sym->right = rev->pending.objects[rpos].name;
> + sym->base = rev->pending.objects[basepos].name;
> + if (basecount == 0)
> + die(_("%s...%s: no merge base"), sym->left, sym->right);
Good.
> + bitmap_unset(map, basepos); /* unmark the base we want */
Hmph. You could
case REV_CMD_MERGE_BASE:
basecount++;
if (basepos < 0) {
basepos = i;
continue; /* keep this one */
}
break; /* skip all others */
and lose this unset(). I do not think it makes too much of a
difference, but it probably is easier to follow if we avoided this
"do something and then come back to correct" pattern.
> + sym->warn = basecount > 1;
> + sym->skip = map;
> +}
> +
> int cmd_diff(int argc, const char **argv, const char *prefix)
> {
> int i;
> @@ -263,6 +366,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> struct object_array_entry *blob[2];
> int nongit = 0, no_index = 0;
> int result = 0;
> + struct symdiff sdiff;
>
> /*
> * We could get N tree-ish in the rev.pending_objects list.
> @@ -382,6 +486,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> }
> }
>
> + symdiff_prepare(&rev, &sdiff);
> for (i = 0; i < rev.pending.nr; i++) {
> struct object_array_entry *entry = &rev.pending.objects[i];
> struct object *obj = entry->item;
> @@ -396,6 +501,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> obj = &get_commit_tree(((struct commit *)obj))->object;
>
> if (obj->type == OBJ_TREE) {
> + if (sdiff.skip && bitmap_get(sdiff.skip, i))
> + continue;
By the way, I cannot shake this feeling that, given that
rev.pending/cmdline.nr will not be an unreasonably large number, if
it is overkill to use the bitmap here. If I were writing this code,
I would have made symdiff_prepare() to fill a separate object array
by copying the elements to be used in the final "diff" out of the
rev.pending array and updated this loop to iterate over that array.
I am not saying that such an approach is better than the use of
bitmap code here. It just was a bit unexpected to see the bitmap
code used for set of objects that is typically less than a dozen.
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4 2/3] git diff: improve range handling
2020-06-12 18:31 ` Junio C Hamano
@ 2020-06-12 19:25 ` Chris Torek
0 siblings, 0 replies; 26+ messages in thread
From: Chris Torek @ 2020-06-12 19:25 UTC (permalink / raw)
To: Junio C Hamano, git
Ugh, forgot to tweak gmail reply to go to the mailing list. Also
I typed in a wrong word ("commit" should be "comment").
Corrected reply:
On Fri, Jun 12, 2020 at 11:31 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Chris Torek via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +struct symdiff {
> > + struct bitmap *skip;
> > + int warn;
> > + const char *base, *left, *right;
> > +};
> > +
> > +/*
> > + * Check for symmetric-difference arguments, and if present, arrange
> > + * everything we need to know to handle them correctly. As a bonus,
> > + * weed out all bogus range-based revision specifications, e.g.,
> > + * "git diff A..B C..D" or "git diff A..B C" get rejected.
> > + *
> > + * For an actual symmetric diff, *symdiff is set this way:
> > + *
> > + * - its skip is non-NULL and marks *all* rev->pending.objects[i]
> > + * indices that the caller should ignore (extra merge bases, of
> > + * which there might be many, and A in A...B). Note that the
> > + * chosen merge base and right side are NOT marked.
> > + * - warn is set if there are multiple merge bases.
> > + * - base, left, and right point to the names to use in a
> > + * warning about multiple merge bases.
> > + *
> > + * If there is no symmetric diff argument, sym->skip is NULL and
> > + * sym->warn is cleared. The remaining fields are not set.
> > + */
>
> OK.
>
> > +static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
> > +{
> > + int i, is_symdiff = 0, basecount = 0, othercount = 0;
> > + int lpos = -1, rpos = -1, basepos = -1;
> > + struct bitmap *map = NULL;
> > +
> > + /*
> > + * Use the whence fields to find merge bases and left and
> > + * right parts of symmetric difference, so that we do not
> > + * depend on the order that revisions are parsed. If there
> > + * are any revs that aren't from these sources, we have a
> > + * "git diff C A...B" or "git diff A...B C" case. Or we
> > + * could even get "git diff A...B C...E", for instance.
> > + *
> > + * If we don't have just one merge base, we pick one
> > + * at random.
> > + *
> > + * NB: REV_CMD_LEFT, REV_CMD_RIGHT are also used for A..B,
> > + * so we must check for SYMMETRIC_LEFT too. The two arrays
> > + * rev->pending.objects and rev->cmdline.rev are parallel.
> > + */
> > + for (i = 0; i < rev->cmdline.nr; i++) {
> > + struct object *obj = rev->pending.objects[i].item;
> > + switch (rev->cmdline.rev[i].whence) {
> > + case REV_CMD_MERGE_BASE:
> > + if (basepos < 0)
> > + basepos = i;
> > + basecount++;
> > + break; /* do mark all bases */
>
> We find and use the first found merge base (i.e. "pick at random" as
> promised in the comment before the function), but for warning, keep
> track of how many merge bases there are.
>
> > + case REV_CMD_LEFT:
> > + if (lpos >= 0)
> > + usage(builtin_diff_usage);
>
> A range (either A..B or A...B) has already been seen, and we have
> another, which is now rejected.
>
> > + lpos = i;
> > + if (obj->flags & SYMMETRIC_LEFT) {
> > + is_symdiff = 1;
> > + break; /* do mark A */
>
> Inside this switch statement, "continue" is a sign that the caller
> should use the rev, and "break" is a sign that the rev is to be
> ignored. We obviously do not ignore "A" in ...
>
> > + }
> > + continue;
>
> ... "A..B" notation, so we "continue" here.
>
> > + case REV_CMD_RIGHT:
> > + if (rpos >= 0)
> > + usage(builtin_diff_usage);
>
> Here is the same "we reject having two or more ranges".
>
> I actually suspect that this usage() would become dead code---we
> would already have died when we saw the matching left end of the
> second range (so this could become BUG(), even though usage() does
> not hurt).
Right - I considered it both ways and figured a second usage() was
simpler and straightforward, but I'm fine with either method.
> > + rpos = i;
> > + continue; /* don't mark B */
>
> And of course, whether "A..B" or "A...B", B will be used as the
> "result" side of the diff, so won't be marked for skipping.
>
> > + case REV_CMD_PARENTS_ONLY:
> > + case REV_CMD_REF:
> > + case REV_CMD_REV:
> > + othercount++;
> > + continue;
>
> I wonder if we want to use "default" instead of these three
> individual cases. Pros and cons?
>
> - If we forgot to list a whence REV_CMD_* here, it will be silently
> marked to be skipped with this code. With "default", it will be
> counted to be diffed (which may trigger "giving too many revs to
> be diffed" error from the diff machinery, which is good).
>
> - With "default", when we add new type of whence to REV_CMD_* and
> forget to adjust this code, it will be counted to be diffed.
> With the current code, it will be skipped.
>
> We probably could get the best of the both words by keeping the
> above three for "counted in othercount and kept), and then add a
> default arm to the switch() that just says
>
> default:
> BUG("forgot to handle %d",
> rev->cmdline.rev[i].whence);
I'm fine with this as well. I did it the way I did because at least some
compilers give a warning or error if you forgot an enum. Using default
(or adding one) defeats this, but the BUG method is reasonable.
> That way, every time we add a new type of whence, we would be forced
> to think what should be done to them.
>
> > + }
> > + if (map == NULL)
> > + map = bitmap_new();
> > + bitmap_set(map, i);
> > + }
> > +
> > + /*
> > + * Forbid any additional revs for both A...B and A..B.
> > + */
> > + if (lpos >= 0 && othercount > 0)
> > + usage(builtin_diff_usage);
>
> Meaning "git diff A..B C" is bad. Reasonable.
>
> > + if (!is_symdiff) {
> > + bitmap_free(map);
>
> It is not wrong per-se to free it unconditionally, but wouldn't it
> be a bug if (map != NULL) at this point in the flow?
Yes, because the A in A...B will have been marked. It didn't
seem worth a BUG() call though.
> The merge bases would only be stuffed in the revs when A...B is
> given, and we are not skipping anything involved in A..B or
> non-range revs.
>
> > + sym->warn = 0;
> > + sym->skip = NULL;
>
> Clearing these two fields are as promised to the callers in the
> comment above, which is good.
>
> > + return;
> > + }
> > +
> > + sym->left = rev->pending.objects[lpos].name;
> > + sym->right = rev->pending.objects[rpos].name;
> > + sym->base = rev->pending.objects[basepos].name;
> > + if (basecount == 0)
> > + die(_("%s...%s: no merge base"), sym->left, sym->right);
>
> Good.
>
> > + bitmap_unset(map, basepos); /* unmark the base we want */
>
> Hmph. You could
>
> case REV_CMD_MERGE_BASE:
> basecount++;
> if (basepos < 0) {
> basepos = i;
> continue; /* keep this one */
> }
> break; /* skip all others */
>
> and lose this unset(). I do not think it makes too much of a
> difference, but it probably is easier to follow if we avoided this
> "do something and then come back to correct" pattern.
OK, I'll change it to do this in the re-roll.
> > + sym->warn = basecount > 1;
> > + sym->skip = map;
> > +}
> > +
> > int cmd_diff(int argc, const char **argv, const char *prefix)
> > {
> > int i;
> > @@ -263,6 +366,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> > struct object_array_entry *blob[2];
> > int nongit = 0, no_index = 0;
> > int result = 0;
> > + struct symdiff sdiff;
> >
> > /*
> > * We could get N tree-ish in the rev.pending_objects list.
> > @@ -382,6 +486,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> > }
> > }
> >
> > + symdiff_prepare(&rev, &sdiff);
> > for (i = 0; i < rev.pending.nr; i++) {
> > struct object_array_entry *entry = &rev.pending.objects[i];
> > struct object *obj = entry->item;
> > @@ -396,6 +501,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> > obj = &get_commit_tree(((struct commit *)obj))->object;
> >
> > if (obj->type == OBJ_TREE) {
> > + if (sdiff.skip && bitmap_get(sdiff.skip, i))
> > + continue;
>
> By the way, I cannot shake this feeling that, given that
> rev.pending/cmdline.nr will not be an unreasonably large number, if
> it is overkill to use the bitmap here. If I were writing this code,
> I would have made symdiff_prepare() to fill a separate object array
> by copying the elements to be used in the final "diff" out of the
> rev.pending array and updated this loop to iterate over that array.
>
> I am not saying that such an approach is better than the use of
> bitmap code here. It just was a bit unexpected to see the bitmap
> code used for set of objects that is typically less than a dozen.
I am sure it is overkill -- but at the same time, it's already coded
and cheap enough to use that rolling our own separate array felt
worse. I can add a comment (NOT COMMIT) if you like.
Thanks,
Chris
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v4 3/3] Documentation: usage for diff combined commits
2020-06-12 16:19 ` [PATCH v4 0/3] improve git-diff documentation and A...B handling Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 1/3] t/t3430: avoid undefined git diff behavior Chris Torek via GitGitGadget
2020-06-12 16:19 ` [PATCH v4 2/3] git diff: improve range handling Chris Torek via GitGitGadget
@ 2020-06-12 16:20 ` Chris Torek via GitGitGadget
2 siblings, 0 replies; 26+ messages in thread
From: Chris Torek via GitGitGadget @ 2020-06-12 16:20 UTC (permalink / raw)
To: git; +Cc: Chris Torek, Chris Torek
From: Chris Torek <chris.torek@gmail.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
---
Documentation/git-diff.txt | 20 ++++++++++++++++----
builtin/diff.c | 8 +++++++-
2 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 37781cf1755..1018110ddc2 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -11,15 +11,17 @@ SYNOPSIS
[verse]
'git diff' [<options>] [<commit>] [--] [<path>...]
'git diff' [<options>] --cached [<commit>] [--] [<path>...]
-'git diff' [<options>] <commit> <commit> [--] [<path>...]
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]
+'git diff' [<options>] <commit>...<commit> [--] [<path>...]
'git diff' [<options>] <blob> <blob>
'git diff' [<options>] --no-index [--] <path> <path>
DESCRIPTION
-----------
Show changes between the working tree and the index or a tree, changes
-between the index and a tree, changes between two trees, changes between
-two blob objects, or changes between two files on disk.
+between the index and a tree, changes between two trees, changes resulting
+from a merge, changes between two blob objects, or changes between two
+files on disk.
'git diff' [<options>] [--] [<path>...]::
@@ -67,6 +69,15 @@ two blob objects, or changes between two files on disk.
one side is omitted, it will have the same effect as
using HEAD instead.
+'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]::
+
+ This form is to view the results of a merge commit. The first
+ listed <commit> must be the merge itself; the remaining two or
+ more commits should be its parents. A convenient way to produce
+ the desired set of revisions is to use the {caret}@ suffix.
+ For instance, if `master` names a merge commit, `git diff master
+ master^@` gives the same combined diff as `git show master`.
+
'git diff' [<options>] <commit>\...<commit> [--] [<path>...]::
This form is to view the changes on the branch containing
@@ -196,7 +207,8 @@ linkgit:git-difftool[1],
linkgit:git-log[1],
linkgit:gitdiffcore[7],
linkgit:git-format-patch[1],
-linkgit:git-apply[1]
+linkgit:git-apply[1],
+linkgit:git-show[1]
GIT
---
diff --git a/builtin/diff.c b/builtin/diff.c
index 0f48b0d3e71..b3d17340ee3 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -24,7 +24,13 @@
#define DIFF_NO_INDEX_IMPLICIT 2
static const char builtin_diff_usage[] =
-"git diff [<options>] [<commit> [<commit>]] [--] [<path>...]";
+"git diff [<options>] [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] --cached [<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <commit> [<commit>...] <commit> [--] [<path>...]\n"
+" or: git diff [<options>] <commit>...<commit>] [--] [<path>...]\n"
+" or: git diff [<options>] <blob> <blob>]\n"
+" or: git diff [<options>] --no-index [--] <path> <path>]\n"
+COMMON_DIFF_OPTIONS_HELP;
static const char *blob_path(struct object_array_entry *entry)
{
--
gitgitgadget
^ permalink raw reply related [flat|nested] 26+ messages in thread