* [PATCH] config: Introduce --patience config variable @ 2012-03-06 10:59 Michal Privoznik 2012-03-06 11:49 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Michal Privoznik @ 2012-03-06 10:59 UTC (permalink / raw) To: git; +Cc: gitster Some users like the patience diff more than the bare diff. However, specifying the '--patience' argument every time diff is to be used is impractical. Moreover, creating an alias doesn't play with other tools nice, e.g. git-show; Therefore we need a configuration variable to turn this on/off across whole git tools. --- Please keep me CC'ed as I am not signed into list. Documentation/diff-config.txt | 3 +++ diff.c | 9 +++++++++ 2 files changed, 12 insertions(+), 0 deletions(-) diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt index 1aed79e..b72b2fd 100644 --- a/Documentation/diff-config.txt +++ b/Documentation/diff-config.txt @@ -86,6 +86,9 @@ diff.mnemonicprefix:: diff.noprefix:: If set, 'git diff' does not show any source or destination prefix. +diff.patience: + If set, 'git diff' will use patience algorithm. + diff.renameLimit:: The number of files to consider when performing the copy/rename detection; equivalent to the 'git diff' option '-l'. diff --git a/diff.c b/diff.c index a1c06b5..8940d00 100644 --- a/diff.c +++ b/diff.c @@ -33,6 +33,7 @@ static int diff_mnemonic_prefix; static int diff_no_prefix; static int diff_dirstat_permille_default = 30; static struct diff_options default_diff_options; +static int diff_patience = 0; static char diff_colors[][COLOR_MAXLEN] = { GIT_COLOR_RESET, @@ -212,6 +213,11 @@ int git_diff_basic_config(const char *var, const char *value, void *cb) if (!prefixcmp(var, "submodule.")) return parse_submodule_config_option(var, value); + if (!strcmp(var, "diff.patience")) { + diff_patience = git_config_bool(var, value); + return 0; + } + return git_default_config(var, value, cb); } @@ -3202,6 +3208,9 @@ int diff_setup_done(struct diff_options *options) DIFF_OPT_SET(options, EXIT_WITH_STATUS); } + if (diff_patience) + DIFF_XDL_SET(options, PATIENCE_DIFF); + return 0; } -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 10:59 [PATCH] config: Introduce --patience config variable Michal Privoznik @ 2012-03-06 11:49 ` Jeff King 2012-03-06 13:01 ` Thomas Rast 2012-03-07 2:57 ` Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Jeff King @ 2012-03-06 11:49 UTC (permalink / raw) To: Michal Privoznik; +Cc: git, gitster On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote: > Some users like the patience diff more than the bare diff. However, > specifying the '--patience' argument every time diff is to be used > is impractical. Moreover, creating an alias doesn't play with other > tools nice, e.g. git-show; Therefore we need a configuration variable > to turn this on/off across whole git tools. The idea of turning on patience diff via config makes sense to me, and it is not a problem for plumbing tools to have patience diff on, since patience diffs are syntactically identical to non-patience diffs. So I think the intent is good. A few comments on the patch itself: > --- a/Documentation/diff-config.txt > +++ b/Documentation/diff-config.txt > @@ -86,6 +86,9 @@ diff.mnemonicprefix:: > diff.noprefix:: > If set, 'git diff' does not show any source or destination prefix. > > +diff.patience: > + If set, 'git diff' will use patience algorithm. > + Should this be a boolean? Or should we actually have a diff.algorithm option where you specify the algorithm you want (e.g., "diff.algorithm = patience")? That would free us up later to more easily add new values. In particular, I am thinking about --minimal. It is mutually exclusive with --patience, and is simply ignored if you use patience diff. we perhaps have "diff.algorithm" which can be one of "myers", "minimal" (which is really myers + the minimal flag), and "patience". Or I suppose you could think of it as "--minimal" being orthogonal to the algorithm chosen, and it is simply that "--minimal" does nothing (currently) with the patience algorithm. > @@ -3202,6 +3208,9 @@ int diff_setup_done(struct diff_options *options) > DIFF_OPT_SET(options, EXIT_WITH_STATUS); > } > > + if (diff_patience) > + DIFF_XDL_SET(options, PATIENCE_DIFF); > + > return 0; Hmm. Usually for a boolean config we would have "-1" mean "not specified", and otherwise 0/1 for true/false. But in your case, setting diff.patience to "false" will be the same as not setting it at all. I think this is probably OK. "false" is the default, so you would only want to specify it to override an earlier setting, but there is nothing earlier than config that you could possibly be overriding. Speaking of overriding, you may want to actually override the config option from the command line. You probably should also add a "--no-patience" option, so that one can turn off "diff.patience = true" for a particular invocation. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 11:49 ` Jeff King @ 2012-03-06 13:01 ` Thomas Rast 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast ` (2 more replies) 2012-03-07 2:57 ` Junio C Hamano 1 sibling, 3 replies; 23+ messages in thread From: Thomas Rast @ 2012-03-06 13:01 UTC (permalink / raw) To: Jeff King; +Cc: Michal Privoznik, git, gitster Jeff King <peff@peff.net> writes: > On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote: > >> --- a/Documentation/diff-config.txt >> +++ b/Documentation/diff-config.txt >> @@ -86,6 +86,9 @@ diff.mnemonicprefix:: >> diff.noprefix:: >> If set, 'git diff' does not show any source or destination prefix. >> >> +diff.patience: >> + If set, 'git diff' will use patience algorithm. >> + > > Should this be a boolean? Or should we actually have a diff.algorithm > option where you specify the algorithm you want (e.g., "diff.algorithm = > patience")? That would free us up later to more easily add new values. > > In particular, I am thinking about --minimal. It is mutually exclusive > with --patience, and is simply ignored if you use patience diff. > we perhaps have "diff.algorithm" which can be one of "myers", "minimal" > (which is really myers + the minimal flag), and "patience". Don't forget "histogram". I have no idea why it's not documented (evidently 8c912eea slipped through the review cracks) but --histogram is supported since 1.7.7. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/2] perf: compare diff algorithms 2012-03-06 13:01 ` Thomas Rast @ 2012-03-06 13:15 ` Thomas Rast 2012-03-06 13:15 ` [PATCH 2/2] Document the --histogram diff option Thomas Rast ` (2 more replies) 2012-03-06 13:30 ` [PATCH] config: Introduce --patience config variable Jeff King 2012-03-06 13:32 ` Michal Privoznik 2 siblings, 3 replies; 23+ messages in thread From: Thomas Rast @ 2012-03-06 13:15 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michal Privoznik, Jeff King 8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff was faster than both Myers and patience. We have since incorporated a performance testing framework, so add a test that compares the various diff tasks performed in a real 'log -p' workload. This does indeed show that histogram diff slightly beats Myers, while patience is much slower than the others. Signed-off-by: Thomas Rast <trast@student.ethz.ch> --- The 3000 is pretty arbitrary but makes for a nice test duration. I'm reluctant to put numbers into the message, since the whole point of the perf test framework is that you can easily get them too. But here's what I'm seeing: 4000.1: log -3000 (baseline) 0.04(0.02+0.01) 4000.2: log --raw -3000 (tree-only) 0.49(0.38+0.09) 4000.3: log -p -3000 (Myers) 1.93(1.75+0.17) 4000.4: log -p -3000 --histogram 1.90(1.74+0.15) 4000.5: log -p -3000 --patience 2.25(2.07+0.16) t/perf/p4000-diff-algorithms.sh | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100755 t/perf/p4000-diff-algorithms.sh diff --git a/t/perf/p4000-diff-algorithms.sh b/t/perf/p4000-diff-algorithms.sh new file mode 100755 index 0000000..d6e505c --- /dev/null +++ b/t/perf/p4000-diff-algorithms.sh @@ -0,0 +1,29 @@ +#!/bin/sh + +test_description="Tests diff generation performance" + +. ./perf-lib.sh + +test_perf_default_repo + +test_perf 'log -3000 (baseline)' ' + git log -1000 >/dev/null +' + +test_perf 'log --raw -3000 (tree-only)' ' + git log --raw -3000 >/dev/null +' + +test_perf 'log -p -3000 (Myers)' ' + git log -p -3000 >/dev/null +' + +test_perf 'log -p -3000 --histogram' ' + git log -p -3000 --histogram >/dev/null +' + +test_perf 'log -p -3000 --patience' ' + git log -p -3000 --patience >/dev/null +' + +test_done -- 1.7.9.2.467.g7fee4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/2] Document the --histogram diff option 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast @ 2012-03-06 13:15 ` Thomas Rast 2012-03-06 19:57 ` Junio C Hamano 2012-03-06 19:52 ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano 2012-03-10 7:13 ` René Scharfe 2 siblings, 1 reply; 23+ messages in thread From: Thomas Rast @ 2012-03-06 13:15 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michal Privoznik, Jeff King Signed-off-by: Thomas Rast <trast@student.ethz.ch> --- This is only the minimal update. I think in the long run, we should add a note saying why we support all of them. BUt off hand I didn't have any substantial evidence in favour of patience that could be used as an argument. Documentation/diff-options.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 87f0a5f..7d4566f 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -52,6 +52,9 @@ endif::git-format-patch[] --patience:: Generate a diff using the "patience diff" algorithm. +--histogram:: + Generate a diff using the "histogram diff" algorithm. + --stat[=<width>[,<name-width>[,<count>]]]:: Generate a diffstat. By default, as much space as necessary will be used for the filename part, and the rest for the graph -- 1.7.9.2.467.g7fee4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Document the --histogram diff option 2012-03-06 13:15 ` [PATCH 2/2] Document the --histogram diff option Thomas Rast @ 2012-03-06 19:57 ` Junio C Hamano 2012-03-06 20:42 ` Thomas Rast 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2012-03-06 19:57 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Michal Privoznik, Jeff King Thomas Rast <trast@student.ethz.ch> writes: > Signed-off-by: Thomas Rast <trast@student.ethz.ch> > --- > > This is only the minimal update. I think in the long run, we should > add a note saying why we support all of them. But off hand I didn't > have any substantial evidence in favour of patience that could be used > as an argument. Isn't the main argument made by proponents of patience diff is more readable output, and not performance? That line of argument relies on a fairly subjective test "which one is easier to read?", so it is hard to come up with a substantial evidence, unless somebody invests in A/B test. Thanks, will queue for 1.7.7.x maintenance track and upwards. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Document the --histogram diff option 2012-03-06 19:57 ` Junio C Hamano @ 2012-03-06 20:42 ` Thomas Rast 0 siblings, 0 replies; 23+ messages in thread From: Thomas Rast @ 2012-03-06 20:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King Junio C Hamano <gitster@pobox.com> writes: > Thomas Rast <trast@student.ethz.ch> writes: > >> Signed-off-by: Thomas Rast <trast@student.ethz.ch> >> --- >> >> This is only the minimal update. I think in the long run, we should >> add a note saying why we support all of them. But off hand I didn't >> have any substantial evidence in favour of patience that could be used >> as an argument. > > Isn't the main argument made by proponents of patience diff is more > readable output, and not performance? That line of argument relies > on a fairly subjective test "which one is easier to read?", so it is > hard to come up with a substantial evidence, unless somebody invests > in A/B test. Well, I was just too lazy to look up what I dimly remembered people had posted at some point: examples where patience beats Myers for readability. E.g., http://article.gmane.org/gmane.comp.version-control.git/104316 I don't think you need a blind test to justify that the patience result is more readable. So I think in the long run, the docs should say something like: --diff-algorithm={histogram|myers|minimal|patience}:: Choose a diff algorithm. The variants are as follows: + -- histogram:: This is the fastest algorithm, and thus the default. myers:: The classical Myers diff algorithm. <state a reason why myers would be useful> minimal:: Like 'myers', but spend extra time making sure that the diff is the shortest possible for the set of changes performed. patience:: The patience diff algorithm, which first matches unique lines with each other. This sometimes results in more readable (if longer) patches than the other algorithms. -- Or whatever -- magic is required to have a nested list in asciidoc. I can't be bothered to twiddle with that right now. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast 2012-03-06 13:15 ` [PATCH 2/2] Document the --histogram diff option Thomas Rast @ 2012-03-06 19:52 ` Junio C Hamano 2012-03-06 21:00 ` Thomas Rast 2012-03-10 7:13 ` René Scharfe 2 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2012-03-06 19:52 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Michal Privoznik, Jeff King Thomas Rast <trast@student.ethz.ch> writes: > 8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff > was faster than both Myers and patience. > > We have since incorporated a performance testing framework, so add a > test that compares the various diff tasks performed in a real 'log -p' > workload. This does indeed show that histogram diff slightly beats > Myers, while patience is much slower than the others. > > Signed-off-by: Thomas Rast <trast@student.ethz.ch> > --- Fun. I am getting this (probably unrelated to this patch), by the way: $ make perf make -C t/perf/ all make[1]: Entering directory `/srv/project/git/git.git/t/perf' rm -rf test-results ./run ... perf 4 - grep --cached, expensive regex: 1 2 3 ok # passed all 4 test(s) 1..4 Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5. BEGIN failed--compilation aborted at ./aggregate.perl line 5. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 19:52 ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano @ 2012-03-06 21:00 ` Thomas Rast 2012-03-06 21:18 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Thomas Rast @ 2012-03-06 21:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King Junio C Hamano <gitster@pobox.com> writes: > > I am getting this (probably unrelated to this patch), by the way: > > $ make perf > make -C t/perf/ all > make[1]: Entering directory `/srv/project/git/git.git/t/perf' > rm -rf test-results > ./run > ... > perf 4 - grep --cached, expensive regex: 1 2 3 ok > # passed all 4 test(s) > 1..4 > Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5. > BEGIN failed--compilation aborted at ./aggregate.perl line 5. It would seem that you are not installing Git.pm as part of your normal installation? aggregate.perl uses it, so you'd have to. Perhaps a Perl guru can tell us if it's possible to magically pull Git.pm from the build tree instead of the installed version... -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 21:00 ` Thomas Rast @ 2012-03-06 21:18 ` Junio C Hamano 2012-03-06 21:41 ` Jakub Narebski 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2012-03-06 21:18 UTC (permalink / raw) To: Thomas Rast; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King Thomas Rast <trast@inf.ethz.ch> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> >> I am getting this (probably unrelated to this patch), by the way: >> >> $ make perf >> make -C t/perf/ all >> make[1]: Entering directory `/srv/project/git/git.git/t/perf' >> rm -rf test-results >> ./run >> ... >> perf 4 - grep --cached, expensive regex: 1 2 3 ok >> # passed all 4 test(s) >> 1..4 >> Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5. >> BEGIN failed--compilation aborted at ./aggregate.perl line 5. > > It would seem that you are not installing Git.pm as part of your normal > installation? I actually am installing it in a quite vanilla way. I think our installation procedure places Git.pm in git specific perl library path where a simple invocation of "perl" that is git-unaware will not look into, and we make sure that our scripts still find the matching version of Git.pm by having "use lib" at the beginning that points at the right directory. But of course, this from a command line would not work: $ perl -MGit I do not expect it to, and for the ease of testing new versions, I prefer it not to work. In any case, you should be able to do anything under t/ _before_ installing, so relying on having Git.pm in normal @INC is a double no-no. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 21:18 ` Junio C Hamano @ 2012-03-06 21:41 ` Jakub Narebski 2012-03-07 12:44 ` Thomas Rast 0 siblings, 1 reply; 23+ messages in thread From: Jakub Narebski @ 2012-03-06 21:41 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Rast, Thomas Rast, git, Michal Privoznik, Jeff King Junio C Hamano <gitster@pobox.com> writes: > Thomas Rast <trast@inf.ethz.ch> writes: >> Junio C Hamano <gitster@pobox.com> writes: >> >>> >>> I am getting this (probably unrelated to this patch), by the way: >>> >>> $ make perf >>> make -C t/perf/ all >>> make[1]: Entering directory `/srv/project/git/git.git/t/perf' >>> rm -rf test-results >>> ./run >>> ... >>> perf 4 - grep --cached, expensive regex: 1 2 3 ok >>> # passed all 4 test(s) >>> 1..4 >>> Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5. >>> BEGIN failed--compilation aborted at ./aggregate.perl line 5. >> >> It would seem that you are not installing Git.pm as part of your normal >> installation? > > I actually am installing it in a quite vanilla way. > > I think our installation procedure places Git.pm in git specific > perl library path where a simple invocation of "perl" that is > git-unaware will not look into, and we make sure that our scripts > still find the matching version of Git.pm by having "use lib" at the > beginning that points at the right directory. > > But of course, this from a command line would not work: > > $ perl -MGit > > I do not expect it to, and for the ease of testing new versions, I > prefer it not to work. > > In any case, you should be able to do anything under t/ _before_ > installing, so relying on having Git.pm in normal @INC is a double > no-no. Thomas, take a look at how it is solved in 't/t9700/test.pl', used by 't/t9700-perl-git.sh': use lib (split(/:/, $ENV{GITPERLLIB})); -- Jakub Narebski ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 21:41 ` Jakub Narebski @ 2012-03-07 12:44 ` Thomas Rast 2012-03-07 17:45 ` Junio C Hamano 2012-03-07 18:03 ` Jakub Narebski 0 siblings, 2 replies; 23+ messages in thread From: Thomas Rast @ 2012-03-07 12:44 UTC (permalink / raw) To: Jakub Narebski Cc: Junio C Hamano, Thomas Rast, git, Michal Privoznik, Jeff King Jakub Narebski <jnareb@gmail.com> writes: > Junio C Hamano <gitster@pobox.com> writes: >> >> But of course, this from a command line would not work: >> >> $ perl -MGit >> >> I do not expect it to, and for the ease of testing new versions, I >> prefer it not to work. >> >> In any case, you should be able to do anything under t/ _before_ >> installing, so relying on having Git.pm in normal @INC is a double >> no-no. > > Thomas, take a look at how it is solved in 't/t9700/test.pl', used by > 't/t9700-perl-git.sh': > > use lib (split(/:/, $ENV{GITPERLLIB})); Hum. The problem is that the user may invoke aggregate.perl manually, and GITPERLLIB won't be set in that case. Is there a better solution than duplicating the logic that sets GITPERLLIB in test-lib.sh within aggregate.perl? -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-07 12:44 ` Thomas Rast @ 2012-03-07 17:45 ` Junio C Hamano 2012-03-07 18:03 ` Jakub Narebski 1 sibling, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2012-03-07 17:45 UTC (permalink / raw) To: Thomas Rast; +Cc: Jakub Narebski, Thomas Rast, git, Michal Privoznik, Jeff King Thomas Rast <trast@inf.ethz.ch> writes: > Jakub Narebski <jnareb@gmail.com> writes: > >> Thomas, take a look at how it is solved in 't/t9700/test.pl', used by >> 't/t9700-perl-git.sh': >> >> use lib (split(/:/, $ENV{GITPERLLIB})); > > Hum. The problem is that the user may invoke aggregate.perl manually, > and GITPERLLIB won't be set in that case. I would equate "manual invocation" with "the user knows what he is doing", so if that is the only problem, I think we are good. > Is there a better solution than duplicating the logic that sets > GITPERLLIB in test-lib.sh within aggregate.perl? Perhaps the part to figure out the directory layout can and should be split out of test-lib.sh into test-env.sh or something and be dot-sourced at the beginning of test-lib.sh; would that help? t/test-env.sh | 112 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ t/test-lib.sh | 118 ++++----------------------------------------------------- 2 files changed, 120 insertions(+), 110 deletions(-) diff --git a/t/test-env.sh b/t/test-env.sh new file mode 100644 index 0000000..1159c5a --- /dev/null +++ b/t/test-env.sh @@ -0,0 +1,112 @@ +# figure out the environment a test-related script is being run +# taken from test-lib.sh, Copyright (c) 2005 Junio C Hamano + +# The directory to find other helper pieces e.g. lib-gpg.sh of the +# test suite; usually t/ in the git source tree. +if test -z "$TEST_DIRECTORY" +then + # We allow tests to override this, in case they want to run tests + # outside of t/, e.g. for running tests on the test library + # itself. + TEST_DIRECTORY=$(pwd) +fi +GIT_BUILD_DIR="$TEST_DIRECTORY"/.. +GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt +GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git + +if test -n "$valgrind" +then + make_symlink () { + test -h "$2" && + test "$1" = "$(readlink "$2")" || { + # be super paranoid + if mkdir "$2".lock + then + rm -f "$2" && + ln -s "$1" "$2" && + rm -r "$2".lock + else + while test -d "$2".lock + do + say "Waiting for lock on $2." + sleep 1 + done + fi + } + } + + make_valgrind_symlink () { + # handle only executables, unless they are shell libraries that + # need to be in the exec-path. We will just use "#!" as a + # guess for a shell-script, since we have no idea what the user + # may have configured as the shell path. + test -x "$1" || + test "#!" = "$(head -c 2 <"$1")" || + return; + + base=$(basename "$1") + symlink_target=$GIT_BUILD_DIR/$base + # do not override scripts + if test -x "$symlink_target" && + test ! -d "$symlink_target" && + test "#!" != "$(head -c 2 < "$symlink_target")" + then + symlink_target=../valgrind.sh + fi + case "$base" in + *.sh|*.perl) + symlink_target=../unprocessed-script + esac + # create the link, or replace it if it is out of date + make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit + } + + # override all git executables in TEST_DIRECTORY/.. + GIT_VALGRIND=$TEST_DIRECTORY/valgrind + mkdir -p "$GIT_VALGRIND"/bin + for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-* + do + make_valgrind_symlink $file + done + # special-case the mergetools loadables + make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools" + OLDIFS=$IFS + IFS=: + for path in $PATH + do + ls "$path"/git-* 2> /dev/null | + while read file + do + make_valgrind_symlink "$file" + done + done + IFS=$OLDIFS + PATH=$GIT_VALGRIND/bin:$PATH + GIT_EXEC_PATH=$GIT_VALGRIND/bin + export GIT_VALGRIND +elif test -n "$GIT_TEST_INSTALLED" ; then + GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path) || + error "Cannot run git from $GIT_TEST_INSTALLED." + PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH + GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH} +else # normal case, use ../bin-wrappers only unless $with_dashes: + git_bin_dir="$GIT_BUILD_DIR/bin-wrappers" + if ! test -x "$git_bin_dir/git" ; then + if test -z "$with_dashes" ; then + say "$git_bin_dir/git is not executable; using GIT_EXEC_PATH" + fi + with_dashes=t + fi + PATH="$git_bin_dir:$PATH" + GIT_EXEC_PATH=$GIT_BUILD_DIR + if test -n "$with_dashes" ; then + PATH="$GIT_BUILD_DIR:$PATH" + fi +fi +GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt + +. "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS + +GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git + +export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GITPERLLIB diff --git a/t/test-lib.sh b/t/test-lib.sh index 30ed4d7..e892368 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -31,6 +31,13 @@ done,*) ;; esac +# Protect ourselves from common misconfiguration to export +# CDPATH into the environment +unset CDPATH +unset GREP_OPTIONS + +. ./test-env.sh + # Keep the original TERM for say_color ORIGINAL_TERM=$TERM @@ -72,12 +79,6 @@ export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME export EDITOR -# Protect ourselves from common misconfiguration to export -# CDPATH into the environment -unset CDPATH - -unset GREP_OPTIONS - case $(echo $GIT_TRACE |tr "[A-Z]" "[a-z]") in 1|2|true) echo "* warning: Some tests will not work if GIT_TRACE" \ @@ -380,117 +381,17 @@ test_done () { esac } -# Test the binaries we have just built. The tests are kept in -# t/ subdirectory and are run in 'trash directory' subdirectory. -if test -z "$TEST_DIRECTORY" -then - # We allow tests to override this, in case they want to run tests - # outside of t/, e.g. for running tests on the test library - # itself. - TEST_DIRECTORY=$(pwd) -fi if test -z "$TEST_OUTPUT_DIRECTORY" then # Similarly, override this to store the test-results subdir # elsewhere TEST_OUTPUT_DIRECTORY=$TEST_DIRECTORY fi -GIT_BUILD_DIR="$TEST_DIRECTORY"/.. -if test -n "$valgrind" -then - make_symlink () { - test -h "$2" && - test "$1" = "$(readlink "$2")" || { - # be super paranoid - if mkdir "$2".lock - then - rm -f "$2" && - ln -s "$1" "$2" && - rm -r "$2".lock - else - while test -d "$2".lock - do - say "Waiting for lock on $2." - sleep 1 - done - fi - } - } - - make_valgrind_symlink () { - # handle only executables, unless they are shell libraries that - # need to be in the exec-path. We will just use "#!" as a - # guess for a shell-script, since we have no idea what the user - # may have configured as the shell path. - test -x "$1" || - test "#!" = "$(head -c 2 <"$1")" || - return; - - base=$(basename "$1") - symlink_target=$GIT_BUILD_DIR/$base - # do not override scripts - if test -x "$symlink_target" && - test ! -d "$symlink_target" && - test "#!" != "$(head -c 2 < "$symlink_target")" - then - symlink_target=../valgrind.sh - fi - case "$base" in - *.sh|*.perl) - symlink_target=../unprocessed-script - esac - # create the link, or replace it if it is out of date - make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit - } - - # override all git executables in TEST_DIRECTORY/.. - GIT_VALGRIND=$TEST_DIRECTORY/valgrind - mkdir -p "$GIT_VALGRIND"/bin - for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-* - do - make_valgrind_symlink $file - done - # special-case the mergetools loadables - make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools" - OLDIFS=$IFS - IFS=: - for path in $PATH - do - ls "$path"/git-* 2> /dev/null | - while read file - do - make_valgrind_symlink "$file" - done - done - IFS=$OLDIFS - PATH=$GIT_VALGRIND/bin:$PATH - GIT_EXEC_PATH=$GIT_VALGRIND/bin - export GIT_VALGRIND -elif test -n "$GIT_TEST_INSTALLED" ; then - GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path) || - error "Cannot run git from $GIT_TEST_INSTALLED." - PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH - GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH} -else # normal case, use ../bin-wrappers only unless $with_dashes: - git_bin_dir="$GIT_BUILD_DIR/bin-wrappers" - if ! test -x "$git_bin_dir/git" ; then - if test -z "$with_dashes" ; then - say "$git_bin_dir/git is not executable; using GIT_EXEC_PATH" - fi - with_dashes=t - fi - PATH="$git_bin_dir:$PATH" - GIT_EXEC_PATH=$GIT_BUILD_DIR - if test -n "$with_dashes" ; then - PATH="$GIT_BUILD_DIR:$PATH" - fi -fi -GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt unset GIT_CONFIG GIT_CONFIG_NOSYSTEM=1 GIT_ATTR_NOSYSTEM=1 -export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_NOSYSTEM GIT_ATTR_NOSYSTEM +export GIT_CONFIG_NOSYSTEM GIT_ATTR_NOSYSTEM . "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS @@ -503,9 +404,6 @@ then GIT_TEST_CMP="$DIFF -u" fi fi - -GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git -export GITPERLLIB test -d "$GIT_BUILD_DIR"/templates/blt || { error "You haven't built things yet, have you?" } ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-07 12:44 ` Thomas Rast 2012-03-07 17:45 ` Junio C Hamano @ 2012-03-07 18:03 ` Jakub Narebski 2012-03-07 18:19 ` Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Jakub Narebski @ 2012-03-07 18:03 UTC (permalink / raw) To: Thomas Rast; +Cc: Junio C Hamano, Thomas Rast, git, Michal Privoznik, Jeff King Thomas Rast wrote: > Jakub Narebski <jnareb@gmail.com> writes: > > Junio C Hamano <gitster@pobox.com> writes: > >> > >> But of course, this from a command line would not work: > >> > >> $ perl -MGit > >> > >> I do not expect it to, and for the ease of testing new versions, I > >> prefer it not to work. > >> > >> In any case, you should be able to do anything under t/ _before_ > >> installing, so relying on having Git.pm in normal @INC is a double > >> no-no. > > > > Thomas, take a look at how it is solved in 't/t9700/test.pl', used by > > 't/t9700-perl-git.sh': > > > > use lib (split(/:/, $ENV{GITPERLLIB})); > > Hum. The problem is that the user may invoke aggregate.perl manually, > and GITPERLLIB won't be set in that case. > > Is there a better solution than duplicating the logic that sets > GITPERLLIB in test-lib.sh within aggregate.perl? Beside extracting logic that sets GITPERLLIB into separate file like in Junio proposal? You can always assume that it is fixed relative to perl/Git.pm, and use __DIR__ or $FindBin to make "use lib", e.g. use FindBin; use lib "$FindBin::Bin/../../perl"; -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-07 18:03 ` Jakub Narebski @ 2012-03-07 18:19 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2012-03-07 18:19 UTC (permalink / raw) To: Jakub Narebski; +Cc: Thomas Rast, Thomas Rast, git, Michal Privoznik, Jeff King Jakub Narebski <jnareb@gmail.com> writes: > Beside extracting logic that sets GITPERLLIB into separate file like > in Junio proposal? That was not even a proposal. The part that deals with valgrind is blatantly wrong (it shouldn't create symlinks in that code, it only should figure out where things should be). > You can always assume that it is fixed relative > to perl/Git.pm, and use __DIR__ or $FindBin to make "use lib", e.g. > > use FindBin; > use lib "$FindBin::Bin/../../perl"; Use of FindBin to find the location of the script is OK, but does using "../../perl" really work? We have this GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git to look for two places in test-lib.sh ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] perf: compare diff algorithms 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast 2012-03-06 13:15 ` [PATCH 2/2] Document the --histogram diff option Thomas Rast 2012-03-06 19:52 ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano @ 2012-03-10 7:13 ` René Scharfe 2 siblings, 0 replies; 23+ messages in thread From: René Scharfe @ 2012-03-10 7:13 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Junio C Hamano, Michal Privoznik, Jeff King Am 06.03.2012 14:15, schrieb Thomas Rast: > 8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff > was faster than both Myers and patience. > > We have since incorporated a performance testing framework, so add a > test that compares the various diff tasks performed in a real 'log -p' > workload. This does indeed show that histogram diff slightly beats > Myers, while patience is much slower than the others. > > Signed-off-by: Thomas Rast<trast@student.ethz.ch> > --- > > The 3000 is pretty arbitrary but makes for a nice test duration. > > I'm reluctant to put numbers into the message, since the whole point > of the perf test framework is that you can easily get them too. But > here's what I'm seeing: > > 4000.1: log -3000 (baseline) 0.04(0.02+0.01) > 4000.2: log --raw -3000 (tree-only) 0.49(0.38+0.09) > 4000.3: log -p -3000 (Myers) 1.93(1.75+0.17) > 4000.4: log -p -3000 --histogram 1.90(1.74+0.15) > 4000.5: log -p -3000 --patience 2.25(2.07+0.16) Just a data point: --histogram is slightly slower for me: Test this tree ----------------------------------------------------- 4000.1: log -3000 (baseline) 0.07(0.07+0.00) 4000.2: log --raw -3000 (tree-only) 0.35(0.31+0.04) 4000.3: log -p -3000 (Myers) 1.50(1.40+0.08) 4000.4: log -p -3000 --histogram 1.54(1.48+0.05) 4000.5: log -p -3000 --patience 1.79(1.71+0.06) (baseline with -3000) René ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 13:01 ` Thomas Rast 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast @ 2012-03-06 13:30 ` Jeff King 2012-03-06 13:32 ` Michal Privoznik 2 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2012-03-06 13:30 UTC (permalink / raw) To: Thomas Rast; +Cc: Michal Privoznik, git, gitster On Tue, Mar 06, 2012 at 02:01:42PM +0100, Thomas Rast wrote: > >> --- a/Documentation/diff-config.txt > >> +++ b/Documentation/diff-config.txt > >> @@ -86,6 +86,9 @@ diff.mnemonicprefix:: > >> diff.noprefix:: > >> If set, 'git diff' does not show any source or destination prefix. > >> > >> +diff.patience: > >> + If set, 'git diff' will use patience algorithm. > >> + > > > > Should this be a boolean? Or should we actually have a diff.algorithm > > option where you specify the algorithm you want (e.g., "diff.algorithm = > > patience")? That would free us up later to more easily add new values. > > > > In particular, I am thinking about --minimal. It is mutually exclusive > > with --patience, and is simply ignored if you use patience diff. > > we perhaps have "diff.algorithm" which can be one of "myers", "minimal" > > (which is really myers + the minimal flag), and "patience". > > Don't forget "histogram". I have no idea why it's not documented > (evidently 8c912eea slipped through the review cracks) but --histogram > is supported since 1.7.7. Ah, thanks. I had the vague feeling that we had a third algorithm already, but I didn't see it in the docs. So yeah, I really think this should be diff.algorithm, with a value of "myers", "patience", or "histogram" (and possibly "minimal", depending how we want to treat that). -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 13:01 ` Thomas Rast 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast 2012-03-06 13:30 ` [PATCH] config: Introduce --patience config variable Jeff King @ 2012-03-06 13:32 ` Michal Privoznik 2012-03-06 13:38 ` Matthieu Moy 2 siblings, 1 reply; 23+ messages in thread From: Michal Privoznik @ 2012-03-06 13:32 UTC (permalink / raw) To: Thomas Rast; +Cc: Jeff King, git, gitster On 06.03.2012 14:01, Thomas Rast wrote: > Jeff King <peff@peff.net> writes: > >> On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote: >> >>> --- a/Documentation/diff-config.txt >>> +++ b/Documentation/diff-config.txt >>> @@ -86,6 +86,9 @@ diff.mnemonicprefix:: >>> diff.noprefix:: >>> If set, 'git diff' does not show any source or destination prefix. >>> >>> +diff.patience: >>> + If set, 'git diff' will use patience algorithm. >>> + >> >> Should this be a boolean? Or should we actually have a diff.algorithm >> option where you specify the algorithm you want (e.g., "diff.algorithm = >> patience")? That would free us up later to more easily add new values. >> >> In particular, I am thinking about --minimal. It is mutually exclusive >> with --patience, and is simply ignored if you use patience diff. >> we perhaps have "diff.algorithm" which can be one of "myers", "minimal" >> (which is really myers + the minimal flag), and "patience". > > Don't forget "histogram". I have no idea why it's not documented > (evidently 8c912eea slipped through the review cracks) but --histogram > is supported since 1.7.7. > Okay guys. I'll got with diff.algorithm = [patience | minimal | histogram | myers] then. What I am not sure about is how to threat case when user have say algorithm = patience set in config but want to use myers. I guess we need --myers option then, don't we? Michal ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 13:32 ` Michal Privoznik @ 2012-03-06 13:38 ` Matthieu Moy 2012-03-06 14:09 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Matthieu Moy @ 2012-03-06 13:38 UTC (permalink / raw) To: Michal Privoznik; +Cc: Thomas Rast, Jeff King, git, gitster Michal Privoznik <mprivozn@redhat.com> writes: > Okay guys. I'll got with diff.algorithm = [patience | minimal | > histogram | myers] then. What I am not sure about is how to threat case > when user have say algorithm = patience set in config but want to use > myers. I guess we need --myers option then, don't we? At this point, maybe it's time to have --diff-algorithm=[patience|minimal|histogram|myers], and keep --patience, --minimal and --histogram just as compatibility aliases. Having one option per algorithm feels wrong ... -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 13:38 ` Matthieu Moy @ 2012-03-06 14:09 ` Jeff King 0 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2012-03-06 14:09 UTC (permalink / raw) To: Matthieu Moy; +Cc: Michal Privoznik, Thomas Rast, git, gitster On Tue, Mar 06, 2012 at 02:38:16PM +0100, Matthieu Moy wrote: > Michal Privoznik <mprivozn@redhat.com> writes: > > > Okay guys. I'll got with diff.algorithm = [patience | minimal | > > histogram | myers] then. What I am not sure about is how to threat case > > when user have say algorithm = patience set in config but want to use > > myers. I guess we need --myers option then, don't we? > > At this point, maybe it's time to have > --diff-algorithm=[patience|minimal|histogram|myers], and keep > --patience, --minimal and --histogram just as compatibility aliases. > > Having one option per algorithm feels wrong ... Yeah, that was going to be my suggestion, too. And explaining it that way in the docs will make it clear that "--histogram" overrides "--patience" and so forth. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-06 11:49 ` Jeff King 2012-03-06 13:01 ` Thomas Rast @ 2012-03-07 2:57 ` Junio C Hamano 2012-03-07 11:47 ` Jeff King 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2012-03-07 2:57 UTC (permalink / raw) To: Jeff King; +Cc: Michal Privoznik, git Jeff King <peff@peff.net> writes: > On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote: > >> Some users like the patience diff more than the bare diff. However, >> specifying the '--patience' argument every time diff is to be used >> is impractical. Moreover, creating an alias doesn't play with other >> tools nice, e.g. git-show; Therefore we need a configuration variable >> to turn this on/off across whole git tools. > > The idea of turning on patience diff via config makes sense to me, and > it is not a problem for plumbing tools to have patience diff on, since > patience diffs are syntactically identical to non-patience diffs. Even though I do not strongly object to the general conclusion, I have to point out that the last line above is a dangerous thought. If you change the default diff algorithm, you will invalidate long term caches that computed their keys based on the shape of patches produced in the past. The prime example being the rerere database, but I wouldn't be surprised if somebody has a notes tree to record patch ids for existing commits to speed up "git cherry" (hence "git rebase"). Also kup tool kernel.org folks use to optimize the uploading of inter-release diff relies on having a stable output from "git diff A..B", so that the uploader can run the command to produce a diff locally, GPG sign it, and upload only the signature and have the k.org side produce the same diff between two tags, without having to upload the huge patchfile over the wire. IOW, "syntactically identical so it is OK" is not the right thought process. "It may change the shape of the patch, which is a potential problem for various tools, but as long as users understand that, it should be allowed." is OK. Cached patch-id database for "git cherry" would be a local and does not affect the correctness, so I would consider breaking it is fine. About kup use case, the potential problem can be worked around as long as the receiving end keeps the setting vanilla (and I do not see any reason it wouldn't) and the uploader remembers to choose the matching variant when he locally generates the patch, so I think it would be also OK. Unnecessarily invalidating rerere database may be more frustrating, but that again is local and personal, so the end user may suffer worse than cached patch-id use case, but that hopefully is just one time pain. My preference however is to limit this to Porcelains only, though. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-07 2:57 ` Junio C Hamano @ 2012-03-07 11:47 ` Jeff King 2012-03-07 17:24 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2012-03-07 11:47 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michal Privoznik, git On Tue, Mar 06, 2012 at 06:57:42PM -0800, Junio C Hamano wrote: > > The idea of turning on patience diff via config makes sense to me, and > > it is not a problem for plumbing tools to have patience diff on, since > > patience diffs are syntactically identical to non-patience diffs. > > Even though I do not strongly object to the general conclusion, I > have to point out that the last line above is a dangerous thought. > > If you change the default diff algorithm, you will invalidate long > term caches that computed their keys based on the shape of patches > produced in the past. I see your point, though I don't think I'd use the word "dangerous" to describe the invalidation of a cache. In a true cache, we must be ready for cache misses, so the "danger" is causing extra cache misses. That might be non-optimal, depending on the cost of a miss, but it's not going to result in data loss. [This email ended up long; the "tl;dr" is: "I don't think it's that big a deal, but I don't really have a problem limiting this feature to just porcelain". Michal, the way to do that would be to move the config parsing into git_diff_ui_config instead of git_diff_basic_config.] > The prime example being the rerere database, Having said that about caches, I don't know that I'd call rerere a cache. The "miss" means the user has to resolve the merge. So whether it is a cache or a database of precious information depends on how much you value the user's input. Anyway, would this option actually affect rerere? AFAICT, rerere does not care about diff generation, but rather about the merge parameters. And it is smart enough to tell the merge subsystem to use standard parameters when generating rr-cache entries (e.g., setting merge.conflictstyle does not affect what is put in the rr-cache). > but I wouldn't be surprised if somebody has a notes tree to record > patch ids for existing commits to speed up "git cherry" (hence "git > rebase"). I do that[1]. However, any cache of patch-ids that does not record the diff parameters used is wrong[2]. You're unlikely to come up with a false positive due to the length of the patch-id hash, and a few cache misses here and there are fine. A cache that invalidates all entries when its cache parameters change[3] would be rendered useless by two commands flip-flopping between two different sets of diff parameters. But that is a quality-of-implementation issue for the cache. [1] Actually, I don't store it in notes, but rather in the mmap'd persistent key value store I developed for the generation number cache last summer (which we ended up not using). [2] Mine doesn't actually do this, and is wrong. Which is part of the reason I haven't sent it upstream. [3] The notes_cache code invalidates like this. Since the only user is the textconv cache, that strategy works well. For the generation number cache, it maintains multiple cache entries. > Also kup tool kernel.org folks use to optimize the uploading of > inter-release diff relies on having a stable output from "git diff > A..B", so that the uploader can run the command to produce a diff > locally, GPG sign it, and upload only the signature and have the k.org > side produce the same diff between two tags, without having to upload > the huge patchfile over the wire. That's a cute trick, but I think it is going to be inherently flaky. Does kup run the "git diff" itself (or, I would hope "git diff-tree")? If not, then they are at the mercy of the uploader adding custom diff parameters. I don't know about others, but I sometimes tweak the amount of context in a diff I send if it makes the diff more readable (e.g., if two hunks are close by and are more easily read as a single hunk, or if relevant context is one or two lines outside of a hunk). But even if the kup client runs a sanitized, well-known, plumbing version of the diff on both sides, they're still not guaranteed to produce the same patch. Different versions of git may produce different results. For example, we stopped defaulting to XDF_NEED_MINIMAL in 582aa00. Git v1.7.1 produces a different diff for some commits than v1.7.1.1. > IOW, "syntactically identical so it is OK" is not the right thought > process. "It may change the shape of the patch, which is a potential > problem for various tools, but as long as users understand that, it > should be allowed." is OK. I would argue that it is those tools which should be the ones to care. One of the great things about git is that it uses simple, standard formats like unified diff, and doesn't care if you used git to generate them. But the actual algorithm used to generate the diff is outside the scope of git-apply. So in theory the email you are grabbing with "git am" could be generated by GNU diff, or even some weird proprietary diff, and we don't care. Applying a patch-id or anything else that makes assumptions about the diff algorithm is inherently risky. For internal cases where one tool is generating both sides of the diff, it should be sure to specify vanilla parameters. > My preference however is to limit this to Porcelains only, though. I'd be OK with that. The "...it should be sure to specify vanilla parameters" above can be spelled as "should use the plumbing version". Though even that is not entirely foolproof (due to differing versions, as I mentioned above, but also I think we respect things like $GIT_DIFF_OPTS), it works OK in practice. And given that the main motivation is for porcelain users, I don't think it is hurting anyone to be conservative at first. If "kup" relies on the user to run the diff themselves for the feature you mentioned, then it may cause breakage there (because most people will run the porcelain "git diff"). But I think it is already broken by things like diff.mnemonicprefix, as well as break- and rename-detection parameters. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] config: Introduce --patience config variable 2012-03-07 11:47 ` Jeff King @ 2012-03-07 17:24 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2012-03-07 17:24 UTC (permalink / raw) To: Jeff King; +Cc: Michal Privoznik, git Jeff King <peff@peff.net> writes: > On Tue, Mar 06, 2012 at 06:57:42PM -0800, Junio C Hamano wrote: > >> > The idea of turning on patience diff via config makes sense to me, and >> > it is not a problem for plumbing tools to have patience diff on, since >> > patience diffs are syntactically identical to non-patience diffs. >> >> Even though I do not strongly object to the general conclusion, I >> have to point out that the last line above is a dangerous thought. >> >> If you change the default diff algorithm, you will invalidate long >> term caches that computed their keys based on the shape of patches >> produced in the past. > > I see your point, though I don't think I'd use the word "dangerous" to > describe the invalidation of a cache. I probably was not writing clearly enough to avoid getting misread. The "dangerous" does not refer to "invalidation of a cache". What I meant was that "The output is syntactically identical, so it is OK" is a dangerous way to think when assessing the risk of regression, because applying to a given preimage and producing the same postimage is not the *only* way the output is used. I think the executive summary is that we are in agreement; your analysis of potential regression coming from differences of the shape of the patch output (not applicability to a given preimage to produce the same postimage) seems to match what I said in the previous message. Especially in the kup case, it is a minority tool used by people who knows or can be easily taught in a tightly controlled environment, and it is fine as long as the users have a way to make sure two diffs run on both ends of the communication produce the same result (in an earlier discussion on k.org users list where kup was first discussed, the limitation of users having to use the same version as k.org has was noted and the users are already aware of it). ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2012-03-10 7:14 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-03-06 10:59 [PATCH] config: Introduce --patience config variable Michal Privoznik 2012-03-06 11:49 ` Jeff King 2012-03-06 13:01 ` Thomas Rast 2012-03-06 13:15 ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast 2012-03-06 13:15 ` [PATCH 2/2] Document the --histogram diff option Thomas Rast 2012-03-06 19:57 ` Junio C Hamano 2012-03-06 20:42 ` Thomas Rast 2012-03-06 19:52 ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano 2012-03-06 21:00 ` Thomas Rast 2012-03-06 21:18 ` Junio C Hamano 2012-03-06 21:41 ` Jakub Narebski 2012-03-07 12:44 ` Thomas Rast 2012-03-07 17:45 ` Junio C Hamano 2012-03-07 18:03 ` Jakub Narebski 2012-03-07 18:19 ` Junio C Hamano 2012-03-10 7:13 ` René Scharfe 2012-03-06 13:30 ` [PATCH] config: Introduce --patience config variable Jeff King 2012-03-06 13:32 ` Michal Privoznik 2012-03-06 13:38 ` Matthieu Moy 2012-03-06 14:09 ` Jeff King 2012-03-07 2:57 ` Junio C Hamano 2012-03-07 11:47 ` Jeff King 2012-03-07 17:24 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).