* git diff too slow for a file
@ 2010-03-29 1:42 SungHyun Nam
2010-04-17 15:52 ` René Scharfe
0 siblings, 1 reply; 12+ messages in thread
From: SungHyun Nam @ 2010-03-29 1:42 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 1250 bytes --]
Hello,
If I run a attached script for bunzipped attached files, I get:
(To reduce size, I removed many lines and bzipped.)
$ ./mk.sh
time diff -u x3 x4 >/dev/null 2>&1
real 0m0.011s
user 0m0.000s
sys 0m0.010s
time git diff >/dev/null 2>&1
real 0m0.193s
user 0m0.190s
sys 0m0.000s
$ git version
git version 1.7.0.2.273.gc2413
$ diff --version
diff (GNU diffutils) 2.8.1
...
Well, though the files are ascii file, they includes a random
hexa-decimal datas, so that I don't interest the diff result at
all. But the real problem is 'rebasing took so long if the file
was changed'. Because the git tree includes several such a file,
if they changed, rebase took some miniutes for every branch.
Such a branch includes a few lines of changes for a C source file,
though. Now I'm waiting an hour to finish rebasing all the
branches and yet a rebasing script is running... :-(
Please help!
Thanks,
namsh
The original file has more than 180000 lines and the result is:
$ ./mk.sh
time diff -u x3 x4 >/dev/null 2>&1
real 0m0.759s
user 0m0.740s
sys 0m0.010s
time git diff >/dev/null 2>&1
real 0m44.460s
user 0m44.390s
sys 0m0.030s
[-- Attachment #2: x3.bz2 --]
[-- Type: application/octet-stream, Size: 3348 bytes --]
[-- Attachment #3: x4.bz2 --]
[-- Type: application/octet-stream, Size: 2088 bytes --]
[-- Attachment #4: mk.sh --]
[-- Type: text/plain, Size: 298 bytes --]
#!/bin/bash
run_command()
{
echo "$@"
eval "$@"
}
run_command 'time diff -u x3 x4 >/dev/null 2>&1'
{
rm -rf u
mkdir u
cd u
cp ../x3 x
git init
git add .
git ci -m x
cp ../x4 x
} >/dev/null 2>&1
echo ''
run_command 'time git diff >/dev/null 2>&1'
exit 0
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-03-29 1:42 git diff too slow for a file SungHyun Nam
@ 2010-04-17 15:52 ` René Scharfe
2010-04-17 17:10 ` Junio C Hamano
2010-04-19 0:43 ` SungHyun Nam
0 siblings, 2 replies; 12+ messages in thread
From: René Scharfe @ 2010-04-17 15:52 UTC (permalink / raw)
To: SungHyun Nam; +Cc: git
Am 29.03.2010 03:42, schrieb SungHyun Nam:
> Hello,
>
> If I run a attached script for bunzipped attached files, I get:
> (To reduce size, I removed many lines and bzipped.)
>
> $ ./mk.sh
> time diff -u x3 x4 >/dev/null 2>&1
>
> real 0m0.011s
> user 0m0.000s
> sys 0m0.010s
>
> time git diff >/dev/null 2>&1
>
> real 0m0.193s
> user 0m0.190s
> sys 0m0.000s
>
> $ git version
> git version 1.7.0.2.273.gc2413
>
> $ diff --version
> diff (GNU diffutils) 2.8.1
> ...
>
> Well, though the files are ascii file, they includes a random
> hexa-decimal datas, so that I don't interest the diff result at
> all. But the real problem is 'rebasing took so long if the file
> was changed'. Because the git tree includes several such a file,
> if they changed, rebase took some miniutes for every branch.
> Such a branch includes a few lines of changes for a C source file,
> though. Now I'm waiting an hour to finish rebasing all the
> branches and yet a rebasing script is running... :-(
I can reproduce it; I concatenated your example files five times to get
meaningful timings (x1 = five times x3, x2 = five times x4).
The difference between GNU diff and git diff is that the latter is trying
hard to minimize the size of the diff. Each user of the xdiff library in
git turns on the XDF_NEED_MINIMAL flag, which makes it very expensive
(specifically the function xdl_split()).
The following patch is not meant for inclusion, but rather to start a
dicussion. Is XDF_NEED_MINIMAL a good default to have?
The patch removes XDF_NEED_MINIMAL and replaces it with XDF_QUICK, with
reversed meaning. XDF_QUICK is only set if the new option --quick is
given, so without it the old behaviour is retained. Some numbers:
$ time diff ../x1 ../x2 | diffstat
unknown |34208 ++++++++++++++++++++++++++++++++--------------------------------
1 file changed, 17104 insertions(+), 17104 deletions(-)
real 0m0.043s
user 0m0.030s
sys 0m0.010s
$ time git diff --stat
x |32082 ++++++++++++++++++++++++++++++++++----------------------------------
1 files changed, 16041 insertions(+), 16041 deletions(-)
real 0m2.176s
user 0m2.170s
sys 0m0.010s
$ time git diff --quick --stat
x |34208 ++++++++++++++++++++++++++++++++++----------------------------------
1 files changed, 17104 insertions(+), 17104 deletions(-)
real 0m0.064s
user 0m0.060s
sys 0m0.010s
---
builtin/blame.c | 2 +-
builtin/merge-file.c | 2 +-
builtin/merge-tree.c | 2 +-
builtin/rerere.c | 2 +-
combine-diff.c | 2 +-
diff.c | 12 +++++++-----
merge-file.c | 2 +-
xdiff/xdiff.h | 2 +-
xdiff/xdiffi.c | 2 +-
9 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index fc15863..8deeee1 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -39,7 +39,7 @@ static int show_root;
static int reverse;
static int blank_boundary;
static int incremental;
-static int xdl_opts = XDF_NEED_MINIMAL;
+static int xdl_opts;
static enum date_mode blame_date_mode = DATE_ISO8601;
static size_t blame_date_width;
diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index 610849a..b8e9e5b 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -25,7 +25,7 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
const char *names[3] = { NULL, NULL, NULL };
mmfile_t mmfs[3];
mmbuffer_t result = {NULL, 0};
- xmparam_t xmp = {{XDF_NEED_MINIMAL}};
+ xmparam_t xmp = {{0}};
int ret = 0, i = 0, to_stdout = 0;
int quiet = 0;
int nongit;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a4a4f2c..fc00d79 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -106,7 +106,7 @@ static void show_diff(struct merge_list *entry)
xdemitconf_t xecfg;
xdemitcb_t ecb;
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = show_outf;
diff --git a/builtin/rerere.c b/builtin/rerere.c
index 34f9ace..0048f9e 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -89,7 +89,7 @@ static int diff_two(const char *file1, const char *label1,
printf("--- a/%s\n+++ b/%s\n", label1, label2);
fflush(stdout);
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = outf;
diff --git a/combine-diff.c b/combine-diff.c
index 6162691..29779be 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -221,7 +221,7 @@ static void combine_diff(const unsigned char *parent, unsigned int mode,
parent_file.ptr = grab_blob(parent, mode, &sz);
parent_file.size = sz;
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
memset(&state, 0, sizeof(state));
state.nmask = nmask;
diff --git a/diff.c b/diff.c
index a1bf1e9..e32b47b 100644
--- a/diff.c
+++ b/diff.c
@@ -718,7 +718,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
memset(&xecfg, 0, sizeof(xecfg));
diff_words_fill(&diff_words->minus, &minus, diff_words->word_regex);
diff_words_fill(&diff_words->plus, &plus, diff_words->word_regex);
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
/* as only the hunk header will be parsed, we need a 0-context */
xecfg.ctxlen = 0;
xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words,
@@ -1744,7 +1744,7 @@ static void builtin_diff(const char *name_a,
check_blank_at_eof(&mf1, &mf2, &ecbdata);
ecbdata.file = o->file;
ecbdata.header = header.len ? &header : NULL;
- xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
+ xpp.flags = o->xdl_opts;
xecfg.ctxlen = o->context;
xecfg.interhunkctxlen = o->interhunkcontext;
xecfg.flags = XDL_EMIT_FUNCNAMES;
@@ -1834,7 +1834,7 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
- xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
+ xpp.flags = o->xdl_opts;
xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
&xpp, &xecfg, &ecb);
}
@@ -1883,7 +1883,7 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 1; /* at least one context line */
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
xdi_diff_outf(&mf1, &mf2, checkdiff_consume, &data,
&xpp, &xecfg, &ecb);
@@ -2816,6 +2816,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
DIFF_XDL_SET(options, IGNORE_WHITESPACE_AT_EOL);
else if (!strcmp(arg, "--patience"))
DIFF_XDL_SET(options, PATIENCE_DIFF);
+ else if (!strcmp(arg, "--quick"))
+ DIFF_XDL_SET(options, QUICK);
/* flags options */
else if (!strcmp(arg, "--binary")) {
@@ -3438,7 +3440,7 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
len2, p->two->path);
git_SHA1_Update(&ctx, buffer, len1);
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_FUNCNAMES;
xdi_diff_outf(&mf1, &mf2, patch_id_consume, &data,
diff --git a/merge-file.c b/merge-file.c
index c336c93..db4d0d5 100644
--- a/merge-file.c
+++ b/merge-file.c
@@ -66,7 +66,7 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
xdemitcb_t ecb;
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_COMMON;
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 711048e..3b8a962 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -28,7 +28,7 @@ extern "C" {
#endif /* #ifdef __cplusplus */
-#define XDF_NEED_MINIMAL (1 << 1)
+#define XDF_QUICK (1 << 1)
#define XDF_IGNORE_WHITESPACE (1 << 2)
#define XDF_IGNORE_WHITESPACE_CHANGE (1 << 3)
#define XDF_IGNORE_WHITESPACE_AT_EOL (1 << 4)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index da67c04..1f688cf 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -367,7 +367,7 @@ int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
dd2.rindex = xe->xdf2.rindex;
if (xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
- kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0, &xenv) < 0) {
+ kvdf, kvdb, (xpp->flags & XDF_QUICK) == 0, &xenv) < 0) {
xdl_free(kvd);
xdl_free_env(xe);
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-17 15:52 ` René Scharfe
@ 2010-04-17 17:10 ` Junio C Hamano
2010-04-18 18:01 ` René Scharfe
2010-04-19 0:43 ` SungHyun Nam
1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-04-17 17:10 UTC (permalink / raw)
To: René Scharfe; +Cc: SungHyun Nam, git
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Am 29.03.2010 03:42, schrieb SungHyun Nam:
>> ...
>> Well, though the files are ascii file, they includes a random
>> hexa-decimal datas, so that I don't interest the diff result at
>> all.
> ...
> The following patch is not meant for inclusion, but rather to start a
> dicussion. Is XDF_NEED_MINIMAL a good default to have?
This is a very valid question to ask. The choice of the default was done
without any benchmarking nor analysis on performance impact at all.
What we should do next would be to:
- see how much performance impact we have been getting from more normal
set of files (say, "git log -p" in the kernel archive) by our use of
MINIMAL; I suspect that git.git itself is too small to observe any
meaningful difference. We already _know_ that MINIMAL is more
expensive, so this is not very important, but it would be good to
know.
- inspect the difference of the quality of output for not using MINIMAL,
again for more normal set of files. We know that the quality does not
matter for pathological cases like the one in this thread --- the user
is not even "interested in the diff result at all".
Thanks for starting this.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-17 17:10 ` Junio C Hamano
@ 2010-04-18 18:01 ` René Scharfe
2010-04-20 7:40 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2010-04-18 18:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: SungHyun Nam, git
Am 17.04.2010 19:10, schrieb Junio C Hamano:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>> Am 29.03.2010 03:42, schrieb SungHyun Nam:
>>> ...
>>> Well, though the files are ascii file, they includes a random
>>> hexa-decimal datas, so that I don't interest the diff result at
>>> all.
>> ...
>> The following patch is not meant for inclusion, but rather to start a
>> dicussion. Is XDF_NEED_MINIMAL a good default to have?
>
> This is a very valid question to ask. The choice of the default was done
> without any benchmarking nor analysis on performance impact at all.
>
> What we should do next would be to:
>
> - see how much performance impact we have been getting from more normal
> set of files (say, "git log -p" in the kernel archive) by our use of
> MINIMAL; I suspect that git.git itself is too small to observe any
> meaningful difference. We already _know_ that MINIMAL is more
> expensive, so this is not very important, but it would be good to
> know.
>
> - inspect the difference of the quality of output for not using MINIMAL,
> again for more normal set of files. We know that the quality does not
> matter for pathological cases like the one in this thread --- the user
> is not even "interested in the diff result at all".
To see the what difference it makes to the output, I did the following:
git rev-list --no-merges HEAD |
while read commit
do
a=$(git show $commit | md5sum)
b=$(git show --quick $commit | md5sum)
test "$a" = "$b" || echo $commit
done
For git, it reports those 2 out of 18095 total commits as being
different with and without XDF_NEED_MINIMAL:
f522c9b5 717b8311
For Linux, these 161 out of 178107 commits are affected:
90d49b4f 83f3c715 3b5dd52a e97bd974 4e092d11 96b3c83d 4c96e893
1210db95 ca97b838 52bbe3c7 a3a4f7e1 de0710aa 8e730c15 b54f78a8
71034ba8 120a5d0d 1532ecea 0a18d7b5 3d0f8bc7 35c1b462 0f29f587
2cf71d2e 4d295db0 4953550a c0a5962f 5176fae4 51f94a7b 5d4f98a2
cb6492e4 4f774513 da0436e9 3772a991 20b09c29 dd4969a8 b4a4d568
a7fefd10 d93f87c8 9b7895ef a72bdb1c 96b8e145 b960074f 56bec294
f74df6fb 89cb7e7f e8324357 5ef3041e 833dfbe7 65f9f619 02227c28
56afef56 5db8dcc9 6261ab3a deee7c81 e7594072 2b82032c f1dc5600
060ae855 5503ac56 54168ed7 e2d1d6c0 1a40e23b d1310b2e 9c6bd790
8feceb67 9ac19a90 60f8b39c ef1a628d 95b86a9a 0391c828 a7707adf
3f038d80 f5b728a1 5915eb53 287ac01a c41f8cbd b497549a 6802e340
f8d79e79 34f80b04 c18487ee c9df406f 03718234 d10c2e46 584fffc8
624d5c51 6446a860 f1410647 1c45607a b21a15f6 0e078e2f 8e09f215
99ca4e58 8bf5e5ca 402aa76a 195a4ef6 a9de9248 20510f2f ae0b78d0
51219358 1a4f550a b2c258fb e2ebc74d 27c868c2 c3a2f0df 8e18257d
05ffdd7b 328d5752 92d7f7b0 39279cc3 4b19fcc3 f23a06f0 7699acd1
a1005012 eea221ce dcd0538f 48b4554a 56b6aeb0 e05d723f 3cee5a60
4d0b4af9 9e89dde2 483dfdb6 02f1175c a8dea4ec 28a6d671 933a27d3
572d3138 1450e6be 4ac4360b 9c7f852e 16a53ecc 5bb0b55a 2722971c
6a2900b6 c752666c 5c04a7b8 c92f222e a966f3e7 3ebc284d a1a5ea70
df694daa 7a88488b afbf30a2 b095c381 ea2b26e0 12d30d89 09af7b44
5e83d430 37448f7d 544393fe d203a7ec 73a25462 bd4c625c 330a115a
22e2c507 e9edcee0 303b86d9 47b5d69c 2d7edb92 cb624029 f4f051eb
I have briefly looked at a few of them. They were big and not obvious
with or without XDF_NEED_MINIMAL, but the flag clearly helped to cut
them down a bit.
XDF_NEED_MINIMAL doesn't seem to affect the overall runtime for the
Linux repo:
time git log --stat HEAD >/dev/null
real 4m37.378s
user 4m28.070s
sys 0m9.310s
time ../git/git log --stat --quick >/dev/null
real 4m37.239s
user 4m26.590s
sys 0m10.620s
The difference between the times for git's own repo are in the noise, too.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-17 15:52 ` René Scharfe
2010-04-17 17:10 ` Junio C Hamano
@ 2010-04-19 0:43 ` SungHyun Nam
1 sibling, 0 replies; 12+ messages in thread
From: SungHyun Nam @ 2010-04-19 0:43 UTC (permalink / raw)
To: René Scharfe; +Cc: git
René Scharfe wrote:
> Am 29.03.2010 03:42, schrieb SungHyun Nam:
>> Hello,
>>
>> If I run a attached script for bunzipped attached files, I get:
>> (To reduce size, I removed many lines and bzipped.)
>>
>> $ ./mk.sh
>> time diff -u x3 x4>/dev/null 2>&1
>>
>> real 0m0.011s
>> user 0m0.000s
>> sys 0m0.010s
>>
>> time git diff>/dev/null 2>&1
>>
>> real 0m0.193s
>> user 0m0.190s
>> sys 0m0.000s
>>
>> $ git version
>> git version 1.7.0.2.273.gc2413
>>
>> $ diff --version
>> diff (GNU diffutils) 2.8.1
>> ...
>>
>> Well, though the files are ascii file, they includes a random
>> hexa-decimal datas, so that I don't interest the diff result at
>> all. But the real problem is 'rebasing took so long if the file
>> was changed'. Because the git tree includes several such a file,
>> if they changed, rebase took some miniutes for every branch.
>> Such a branch includes a few lines of changes for a C source file,
>> though. Now I'm waiting an hour to finish rebasing all the
>> branches and yet a rebasing script is running... :-(
>
> I can reproduce it; I concatenated your example files five times to get
> meaningful timings (x1 = five times x3, x2 = five times x4).
>
> The difference between GNU diff and git diff is that the latter is trying
> hard to minimize the size of the diff. Each user of the xdiff library in
> git turns on the XDF_NEED_MINIMAL flag, which makes it very expensive
> (specifically the function xdl_split()).
>
> The following patch is not meant for inclusion, but rather to start a
> dicussion. Is XDF_NEED_MINIMAL a good default to have?
>
> The patch removes XDF_NEED_MINIMAL and replaces it with XDF_QUICK, with
> reversed meaning. XDF_QUICK is only set if the new option --quick is
> given, so without it the old behaviour is retained. Some numbers:
The patch is great for me. Thanks!
Added 'time git diff --quick' to the mk.sh and ran with a
original file (about 180000 lines):
$ ./mk.sh
time diff -u x3 x4 >/dev/null 2>&1
real 0m0.794s
user 0m0.720s
sys 0m0.010s
time git diff >/dev/null 2>&1
real 0m44.687s
user 0m44.670s
sys 0m0.020s
time git diff --quick >/dev/null 2>&1
real 0m1.853s
user 0m1.840s
sys 0m0.010s
Thanks!
namsh
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-18 18:01 ` René Scharfe
@ 2010-04-20 7:40 ` Junio C Hamano
2010-04-20 21:15 ` René Scharfe
2010-05-02 13:04 ` René Scharfe
0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-04-20 7:40 UTC (permalink / raw)
To: René Scharfe; +Cc: SungHyun Nam, git
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> For Linux, these 161 out of 178107 commits are affected:
>
> 90d49b4f 83f3c715 3b5dd52a e97bd974 4e092d11 96b3c83d 4c96e893
> ...
> 22e2c507 e9edcee0 303b86d9 47b5d69c 2d7edb92 cb624029 f4f051eb
>
> I have briefly looked at a few of them. They were big and not obvious
> with or without XDF_NEED_MINIMAL, but the flag clearly helped to cut
> them down a bit.
Thanks.
I am getting the same impression after staring some output.
Probably we should at least try to get rid of the use of MINIMAL
immediately after 1.7.1 and if nobody finds large discrepancies, aim to
ship 1.7.2 (and possibly 1.7.1.1) without even --quick/--slow options.
I expect that there will also be some differences in the blame output.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-20 7:40 ` Junio C Hamano
@ 2010-04-20 21:15 ` René Scharfe
2010-04-21 2:49 ` Junio C Hamano
2010-05-02 13:04 ` René Scharfe
1 sibling, 1 reply; 12+ messages in thread
From: René Scharfe @ 2010-04-20 21:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: SungHyun Nam, git
Am 20.04.2010 09:40, schrieb Junio C Hamano:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>> For Linux, these 161 out of 178107 commits are affected:
>>
>> 90d49b4f 83f3c715 3b5dd52a e97bd974 4e092d11 96b3c83d 4c96e893
>> ...
>> 22e2c507 e9edcee0 303b86d9 47b5d69c 2d7edb92 cb624029 f4f051eb
>>
>> I have briefly looked at a few of them. They were big and not obvious
>> with or without XDF_NEED_MINIMAL, but the flag clearly helped to cut
>> them down a bit.
>
> Thanks.
>
> I am getting the same impression after staring some output.
>
> Probably we should at least try to get rid of the use of MINIMAL
> immediately after 1.7.1 and if nobody finds large discrepancies, aim to
> ship 1.7.2 (and possibly 1.7.1.1) without even --quick/--slow options.
Turning XDF_NEED_MINIMAL off by default looks like the sane thing to do
in order to help the fringe cases without hurting the normal ones.
A --slow/--minimal/--try-harder option for git diff could come in handy
for longer patches, though. GNU diff has it, too (-d/--minimal).
> I expect that there will also be some differences in the blame output.
I haven't looked at the impact on blame, but additionally patch IDs are
going to change (for those patches where XDF_NEED_MINIMAL makes a
difference). Are they stored somewhere? Do we need to worry about them?
René
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-20 21:15 ` René Scharfe
@ 2010-04-21 2:49 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-04-21 2:49 UTC (permalink / raw)
To: René Scharfe; +Cc: SungHyun Nam, git, Paolo Bonzini
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Turning XDF_NEED_MINIMAL off by default looks like the sane thing to do
> in order to help the fringe cases without hurting the normal ones.
>
> A --slow/--minimal/--try-harder option for git diff could come in handy
> for longer patches, though. GNU diff has it, too (-d/--minimal).
>
>> I expect that there will also be some differences in the blame output.
>
> I haven't looked at the impact on blame, but additionally patch IDs are
> going to change (for those patches where XDF_NEED_MINIMAL makes a
> difference). Are they stored somewhere? Do we need to worry about them?
Patch-IDs are purely transient inside core-git as far as I know, but I
wouldn't surprised if people keep database of patch-IDs and corresponding
commits to somehow speed up change look-ups.
Theoretically, conflict IDs used to index the rerere database would also
be affected, but I don't think it is such a huge issue.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-04-20 7:40 ` Junio C Hamano
2010-04-20 21:15 ` René Scharfe
@ 2010-05-02 13:04 ` René Scharfe
2010-05-02 15:10 ` Junio C Hamano
1 sibling, 1 reply; 12+ messages in thread
From: René Scharfe @ 2010-05-02 13:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: SungHyun Nam, git
Am 20.04.2010 09:40, schrieb Junio C Hamano:
> Probably we should at least try to get rid of the use of MINIMAL
> immediately after 1.7.1 and if nobody finds large discrepancies, aim to
> ship 1.7.2 (and possibly 1.7.1.1) without even --quick/--slow options.
-- >8 --
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL. It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.
This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.
SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without. Rebasing became really slow.
This patch removes this flag from all callers. The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.
A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
builtin/blame.c | 2 +-
builtin/merge-file.c | 2 +-
builtin/merge-tree.c | 2 +-
builtin/rerere.c | 2 +-
combine-diff.c | 2 +-
diff.c | 10 +++++-----
merge-file.c | 2 +-
7 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index fc15863..8deeee1 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -39,7 +39,7 @@ static int show_root;
static int reverse;
static int blank_boundary;
static int incremental;
-static int xdl_opts = XDF_NEED_MINIMAL;
+static int xdl_opts;
static enum date_mode blame_date_mode = DATE_ISO8601;
static size_t blame_date_width;
diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index 610849a..b8e9e5b 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -25,7 +25,7 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
const char *names[3] = { NULL, NULL, NULL };
mmfile_t mmfs[3];
mmbuffer_t result = {NULL, 0};
- xmparam_t xmp = {{XDF_NEED_MINIMAL}};
+ xmparam_t xmp = {{0}};
int ret = 0, i = 0, to_stdout = 0;
int quiet = 0;
int nongit;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a4a4f2c..fc00d79 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -106,7 +106,7 @@ static void show_diff(struct merge_list *entry)
xdemitconf_t xecfg;
xdemitcb_t ecb;
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = show_outf;
diff --git a/builtin/rerere.c b/builtin/rerere.c
index 34f9ace..0048f9e 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -89,7 +89,7 @@ static int diff_two(const char *file1, const char *label1,
printf("--- a/%s\n+++ b/%s\n", label1, label2);
fflush(stdout);
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = outf;
diff --git a/combine-diff.c b/combine-diff.c
index 6162691..29779be 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -221,7 +221,7 @@ static void combine_diff(const unsigned char *parent, unsigned int mode,
parent_file.ptr = grab_blob(parent, mode, &sz);
parent_file.size = sz;
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
memset(&state, 0, sizeof(state));
state.nmask = nmask;
diff --git a/diff.c b/diff.c
index a1bf1e9..29e608b 100644
--- a/diff.c
+++ b/diff.c
@@ -718,7 +718,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
memset(&xecfg, 0, sizeof(xecfg));
diff_words_fill(&diff_words->minus, &minus, diff_words->word_regex);
diff_words_fill(&diff_words->plus, &plus, diff_words->word_regex);
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
/* as only the hunk header will be parsed, we need a 0-context */
xecfg.ctxlen = 0;
xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words,
@@ -1744,7 +1744,7 @@ static void builtin_diff(const char *name_a,
check_blank_at_eof(&mf1, &mf2, &ecbdata);
ecbdata.file = o->file;
ecbdata.header = header.len ? &header : NULL;
- xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
+ xpp.flags = o->xdl_opts;
xecfg.ctxlen = o->context;
xecfg.interhunkctxlen = o->interhunkcontext;
xecfg.flags = XDL_EMIT_FUNCNAMES;
@@ -1834,7 +1834,7 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
- xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
+ xpp.flags = o->xdl_opts;
xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
&xpp, &xecfg, &ecb);
}
@@ -1883,7 +1883,7 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 1; /* at least one context line */
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
xdi_diff_outf(&mf1, &mf2, checkdiff_consume, &data,
&xpp, &xecfg, &ecb);
@@ -3438,7 +3438,7 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
len2, p->two->path);
git_SHA1_Update(&ctx, buffer, len1);
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_FUNCNAMES;
xdi_diff_outf(&mf1, &mf2, patch_id_consume, &data,
diff --git a/merge-file.c b/merge-file.c
index c336c93..db4d0d5 100644
--- a/merge-file.c
+++ b/merge-file.c
@@ -66,7 +66,7 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
xdemitcb_t ecb;
memset(&xpp, 0, sizeof(xpp));
- xpp.flags = XDF_NEED_MINIMAL;
+ xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_COMMON;
--
1.7.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-05-02 13:04 ` René Scharfe
@ 2010-05-02 15:10 ` Junio C Hamano
2010-05-04 20:16 ` René Scharfe
0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-05-02 15:10 UTC (permalink / raw)
To: René Scharfe; +Cc: SungHyun Nam, git
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Ever since the xdiff library had been introduced to git, all its callers
> have used the flag XDF_NEED_MINIMAL. It makes sure that the smallest
> possible diff is produced, but that takes quite some time if there are
> lots of differences that can be expressed in multiple ways.
>
> This flag makes a difference for only 0.1% of the non-merge commits in
> the git repo of Linux, both in terms of diff size and execution time.
> The patches there are mostly nice and small.
>
> SungHyun Nam however reported a case in a different repo where a diff
> took more than 20 times longer to generate with XDF_NEED_MINIMAL than
> without. Rebasing became really slow.
>
> This patch removes this flag from all callers. The default of xdiff is
> saner because it has minimal to no impact in the normal case of small
> diffs and doesn't incur that much of a speed penalty for large ones.
Thanks, will queue.
> diff --git a/merge-file.c b/merge-file.c
> index c336c93..db4d0d5 100644
> --- a/merge-file.c
> +++ b/merge-file.c
> @@ -66,7 +66,7 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
> xdemitcb_t ecb;
>
> memset(&xpp, 0, sizeof(xpp));
> - xpp.flags = XDF_NEED_MINIMAL;
> + xpp.flags = 0;
> memset(&xecfg, 0, sizeof(xecfg));
> xecfg.ctxlen = 3;
> xecfg.flags = XDL_EMIT_COMMON;
When I wrote the message you are responding to, I tried to decide which is
better, to replace the assigned value like your patch does, or to remove
the assignment altogether as we have memset(&xpp, 0, sizeof(xpp)). And I
was somewhat torn.
While it expresses what the patch wants to do a lot clearer (and it also
marks the places the "later patch" needs to touch), the resulting code
becomes slightly harder to read, because future readers of the code are
left with an obvious "why do we assign 0 after clearing the whole thing?
is there anything subtle going on?" unanswered.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-05-02 15:10 ` Junio C Hamano
@ 2010-05-04 20:16 ` René Scharfe
2010-05-04 22:56 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2010-05-04 20:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: SungHyun Nam, git
Am 02.05.2010 17:10, schrieb Junio C Hamano:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>> diff --git a/merge-file.c b/merge-file.c
>> index c336c93..db4d0d5 100644
>> --- a/merge-file.c
>> +++ b/merge-file.c
>> @@ -66,7 +66,7 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
>> xdemitcb_t ecb;
>>
>> memset(&xpp, 0, sizeof(xpp));
>> - xpp.flags = XDF_NEED_MINIMAL;
>> + xpp.flags = 0;
>> memset(&xecfg, 0, sizeof(xecfg));
>> xecfg.ctxlen = 3;
>> xecfg.flags = XDL_EMIT_COMMON;
>
> When I wrote the message you are responding to, I tried to decide which is
> better, to replace the assigned value like your patch does, or to remove
> the assignment altogether as we have memset(&xpp, 0, sizeof(xpp)). And I
> was somewhat torn.
>
> While it expresses what the patch wants to do a lot clearer (and it also
> marks the places the "later patch" needs to touch), the resulting code
> becomes slightly harder to read, because future readers of the code are
> left with an obvious "why do we assign 0 after clearing the whole thing?
> is there anything subtle going on?" unanswered.
Well, I didn't do that because of a mix of laziness and caution. A
mechanical replacement is much less likely to introduce bugs..
But when I take a closer look at the surrounding code, I can't help but
ask if the flags really have be passed in such a complicated way.
How about the following, which makes xdi_diff*() take a simple flag
parameter instead, moving the code to handle xpparam_t into
xdiff-interface.c, which seems to be the proper place for it?
René
---
builtin/blame.c | 10 ++--------
builtin/merge-tree.c | 4 +---
builtin/rerere.c | 5 +----
combine-diff.c | 5 +----
diff.c | 25 +++++--------------------
merge-file.c | 5 +----
xdiff-interface.c | 18 +++++++++++-------
xdiff-interface.h | 8 ++++----
8 files changed, 26 insertions(+), 54 deletions(-)
diff --git a/builtin/blame.c b/builtin/blame.c
index 8deeee1..1c1c9e4 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -734,7 +734,6 @@ static int pass_blame_to_parent(struct scoreboard *sb,
int last_in_target;
mmfile_t file_p, file_o;
struct blame_chunk_cb_data d = { sb, target, parent, 0, 0 };
- xpparam_t xpp;
xdemitconf_t xecfg;
last_in_target = find_last_in_target(sb, target);
@@ -745,11 +744,9 @@ static int pass_blame_to_parent(struct scoreboard *sb,
fill_origin_blob(target, &file_o);
num_get_patch++;
- memset(&xpp, 0, sizeof(xpp));
- xpp.flags = xdl_opts;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 0;
- xdi_diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, &xpp, &xecfg);
+ xdi_diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, &xecfg, xdl_opts);
/* The rest (i.e. anything after tlno) are the same as the parent */
blame_chunk(sb, d.tlno, d.plno, last_in_target, target, parent);
@@ -876,7 +873,6 @@ static void find_copy_in_blob(struct scoreboard *sb,
int cnt;
mmfile_t file_o;
struct handle_split_cb_data d = { sb, ent, parent, split, 0, 0 };
- xpparam_t xpp;
xdemitconf_t xecfg;
/*
@@ -896,12 +892,10 @@ static void find_copy_in_blob(struct scoreboard *sb,
* file_o is a part of final image we are annotating.
* file_p partially may match that image.
*/
- memset(&xpp, 0, sizeof(xpp));
- xpp.flags = xdl_opts;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 1;
memset(split, 0, sizeof(struct blame_entry [3]));
- xdi_diff_hunks(file_p, &file_o, handle_split_cb, &d, &xpp, &xecfg);
+ xdi_diff_hunks(file_p, &file_o, handle_split_cb, &d, &xecfg, xdl_opts);
/* remainder, if any, all match the preimage */
handle_split(sb, ent, d.tlno, d.plno, ent->num_lines, parent, split);
}
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index fc00d79..d95cd1c 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -102,11 +102,9 @@ static void show_diff(struct merge_list *entry)
{
unsigned long size;
mmfile_t src, dst;
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
- xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = show_outf;
@@ -120,7 +118,7 @@ static void show_diff(struct merge_list *entry)
if (!dst.ptr)
size = 0;
dst.size = size;
- xdi_diff(&src, &dst, &xpp, &xecfg, &ecb);
+ xdi_diff(&src, &dst, &xecfg, &ecb, 0);
free(src.ptr);
free(dst.ptr);
}
diff --git a/builtin/rerere.c b/builtin/rerere.c
index 0048f9e..43b75fa 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -78,7 +78,6 @@ static int outf(void *dummy, mmbuffer_t *ptr, int nbuf)
static int diff_two(const char *file1, const char *label1,
const char *file2, const char *label2)
{
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
mmfile_t minus, plus;
@@ -88,12 +87,10 @@ static int diff_two(const char *file1, const char *label1,
printf("--- a/%s\n+++ b/%s\n", label1, label2);
fflush(stdout);
- memset(&xpp, 0, sizeof(xpp));
- xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
ecb.outf = outf;
- xdi_diff(&minus, &plus, &xpp, &xecfg, &ecb);
+ xdi_diff(&minus, &plus, &xecfg, &ecb, 0);
free(minus.ptr);
free(plus.ptr);
diff --git a/combine-diff.c b/combine-diff.c
index 29779be..5b29226 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -208,7 +208,6 @@ static void combine_diff(const unsigned char *parent, unsigned int mode,
{
unsigned int p_lno, lno;
unsigned long nmask = (1UL << n);
- xpparam_t xpp;
xdemitconf_t xecfg;
mmfile_t parent_file;
xdemitcb_t ecb;
@@ -220,8 +219,6 @@ static void combine_diff(const unsigned char *parent, unsigned int mode,
parent_file.ptr = grab_blob(parent, mode, &sz);
parent_file.size = sz;
- memset(&xpp, 0, sizeof(xpp));
- xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
memset(&state, 0, sizeof(state));
state.nmask = nmask;
@@ -231,7 +228,7 @@ static void combine_diff(const unsigned char *parent, unsigned int mode,
state.n = n;
xdi_diff_outf(&parent_file, result_file, consume_line, &state,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, 0);
free(parent_file.ptr);
/* Assign line numbers for this parent.
diff --git a/diff.c b/diff.c
index 29e608b..d95a928 100644
--- a/diff.c
+++ b/diff.c
@@ -698,7 +698,6 @@ static void diff_words_fill(struct diff_words_buffer *buffer, mmfile_t *out,
/* this executes the word diff on the accumulated buffers */
static void diff_words_show(struct diff_words_data *diff_words)
{
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
mmfile_t minus, plus;
@@ -714,15 +713,13 @@ static void diff_words_show(struct diff_words_data *diff_words)
diff_words->current_plus = diff_words->plus.text.ptr;
- memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
diff_words_fill(&diff_words->minus, &minus, diff_words->word_regex);
diff_words_fill(&diff_words->plus, &plus, diff_words->word_regex);
- xpp.flags = 0;
/* as only the hunk header will be parsed, we need a 0-context */
xecfg.ctxlen = 0;
xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, 0);
free(minus.ptr);
free(plus.ptr);
if (diff_words->current_plus != diff_words->plus.text.ptr +
@@ -1703,7 +1700,6 @@ static void builtin_diff(const char *name_a,
else {
/* Crazy xdl interfaces.. */
const char *diffopts = getenv("GIT_DIFF_OPTS");
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
struct emit_callback ecbdata;
@@ -1733,7 +1729,6 @@ static void builtin_diff(const char *name_a,
if (!pe)
pe = diff_funcname_pattern(two);
- memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
memset(&ecbdata, 0, sizeof(ecbdata));
ecbdata.label_path = lbl;
@@ -1744,7 +1739,6 @@ static void builtin_diff(const char *name_a,
check_blank_at_eof(&mf1, &mf2, &ecbdata);
ecbdata.file = o->file;
ecbdata.header = header.len ? &header : NULL;
- xpp.flags = o->xdl_opts;
xecfg.ctxlen = o->context;
xecfg.interhunkctxlen = o->interhunkcontext;
xecfg.flags = XDL_EMIT_FUNCNAMES;
@@ -1777,7 +1771,7 @@ static void builtin_diff(const char *name_a,
}
}
xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, o->xdl_opts);
if (DIFF_OPT_TST(o, COLOR_DIFF_WORDS))
free_diff_words_data(&ecbdata);
if (textconv_one)
@@ -1828,15 +1822,12 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
data->deleted = mf1.size;
} else {
/* Crazy xdl interfaces.. */
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
- memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
- xpp.flags = o->xdl_opts;
xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, o->xdl_opts);
}
free_and_return:
@@ -1876,16 +1867,13 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
goto free_and_return;
else {
/* Crazy xdl interfaces.. */
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
- memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 1; /* at least one context line */
- xpp.flags = 0;
xdi_diff_outf(&mf1, &mf2, checkdiff_consume, &data,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, 0);
if (data.ws_rule & WS_BLANK_AT_EOF) {
struct emit_callback ecbdata;
@@ -3378,14 +3366,12 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
data.ctx = &ctx;
for (i = 0; i < q->nr; i++) {
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
mmfile_t mf1, mf2;
struct diff_filepair *p = q->queue[i];
int len1, len2;
- memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
if (p->status == 0)
return error("internal diff status error");
@@ -3438,11 +3424,10 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
len2, p->two->path);
git_SHA1_Update(&ctx, buffer, len1);
- xpp.flags = 0;
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_FUNCNAMES;
xdi_diff_outf(&mf1, &mf2, patch_id_consume, &data,
- &xpp, &xecfg, &ecb);
+ &xecfg, &ecb, 0);
}
git_SHA1_Final(sha1, &ctx);
diff --git a/merge-file.c b/merge-file.c
index db4d0d5..6521561 100644
--- a/merge-file.c
+++ b/merge-file.c
@@ -61,12 +61,9 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
{
unsigned long size = f1->size < f2->size ? f1->size : f2->size;
void *ptr = xmalloc(size);
- xpparam_t xpp;
xdemitconf_t xecfg;
xdemitcb_t ecb;
- memset(&xpp, 0, sizeof(xpp));
- xpp.flags = 0;
memset(&xecfg, 0, sizeof(xecfg));
xecfg.ctxlen = 3;
xecfg.flags = XDL_EMIT_COMMON;
@@ -76,7 +73,7 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
res->size = 0;
ecb.priv = res;
- return xdi_diff(f1, f2, &xpp, &xecfg, &ecb);
+ return xdi_diff(f1, f2, &xecfg, &ecb, 0);
}
void *merge_file(const char *path, struct blob *base, struct blob *our, struct blob *their, unsigned long *size)
diff --git a/xdiff-interface.c b/xdiff-interface.c
index ca5e3fb..6457a5b 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -126,20 +126,24 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
b->size -= trimmed - recovered;
}
-int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *xecb)
+int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xdemitconf_t const *xecfg,
+ xdemitcb_t *xecb, int xpp_flags)
{
+ xpparam_t xpp;
mmfile_t a = *mf1;
mmfile_t b = *mf2;
+ memset(&xpp, 0, sizeof(xpp));
+ xpp.flags = xpp_flags;
+
trim_common_tail(&a, &b, xecfg->ctxlen);
- return xdl_diff(&a, &b, xpp, xecfg, xecb);
+ return xdl_diff(&a, &b, &xpp, xecfg, xecb);
}
int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
xdiff_emit_consume_fn fn, void *consume_callback_data,
- xpparam_t const *xpp,
- xdemitconf_t const *xecfg, xdemitcb_t *xecb)
+ xdemitconf_t const *xecfg, xdemitcb_t *xecb, int xpp_flags)
{
int ret;
struct xdiff_emit_state state;
@@ -150,7 +154,7 @@ int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
xecb->outf = xdiff_outf;
xecb->priv = &state;
strbuf_init(&state.remainder, 0);
- ret = xdi_diff(mf1, mf2, xpp, xecfg, xecb);
+ ret = xdi_diff(mf1, mf2, xecfg, xecb, xpp_flags);
strbuf_release(&state.remainder);
return ret;
}
@@ -185,7 +189,7 @@ static int process_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
int xdi_diff_hunks(mmfile_t *mf1, mmfile_t *mf2,
xdiff_emit_hunk_consume_fn fn, void *consume_callback_data,
- xpparam_t const *xpp, xdemitconf_t *xecfg)
+ xdemitconf_t *xecfg, int xpp_flags)
{
struct xdiff_emit_hunk_state state;
xdemitcb_t ecb;
@@ -196,7 +200,7 @@ int xdi_diff_hunks(mmfile_t *mf1, mmfile_t *mf2,
state.consume_callback_data = consume_callback_data;
xecfg->emit_func = (void (*)())process_diff;
ecb.priv = &state;
- return xdi_diff(mf1, mf2, xpp, xecfg, &ecb);
+ return xdi_diff(mf1, mf2, xecfg, &ecb, xpp_flags);
}
int read_mmfile(mmfile_t *ptr, const char *filename)
diff --git a/xdiff-interface.h b/xdiff-interface.h
index abba70c..05adeb3 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -6,14 +6,14 @@
typedef void (*xdiff_emit_consume_fn)(void *, char *, unsigned long);
typedef void (*xdiff_emit_hunk_consume_fn)(void *, long, long, long);
-int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *ecb);
+int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xdemitconf_t const *xecfg,
+ xdemitcb_t *ecb, int xpp_flags);
int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
xdiff_emit_consume_fn fn, void *consume_callback_data,
- xpparam_t const *xpp,
- xdemitconf_t const *xecfg, xdemitcb_t *xecb);
+ xdemitconf_t const *xecfg, xdemitcb_t *xecb, int xpp_flags);
int xdi_diff_hunks(mmfile_t *mf1, mmfile_t *mf2,
xdiff_emit_hunk_consume_fn fn, void *consume_callback_data,
- xpparam_t const *xpp, xdemitconf_t *xecfg);
+ xdemitconf_t *xecfg, int xpp_flags);
int parse_hunk_header(char *line, int len,
int *ob, int *on,
int *nb, int *nn);
--
1.7.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: git diff too slow for a file
2010-05-04 20:16 ` René Scharfe
@ 2010-05-04 22:56 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-05-04 22:56 UTC (permalink / raw)
To: René Scharfe; +Cc: SungHyun Nam, git
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> But when I take a closer look at the surrounding code, I can't help but
> ask if the flags really have be passed in such a complicated way.
>
> How about the following, which makes xdi_diff*() take a simple flag
> parameter instead, moving the code to handle xpparam_t into
> xdiff-interface.c, which seems to be the proper place for it?
This looks very sensible. Your patch doesn't touch xdiff/ proper but only
the thin interface layer, so we don't have to worry about deviating from
the upstream even further.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2010-05-04 22:57 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-29 1:42 git diff too slow for a file SungHyun Nam
2010-04-17 15:52 ` René Scharfe
2010-04-17 17:10 ` Junio C Hamano
2010-04-18 18:01 ` René Scharfe
2010-04-20 7:40 ` Junio C Hamano
2010-04-20 21:15 ` René Scharfe
2010-04-21 2:49 ` Junio C Hamano
2010-05-02 13:04 ` René Scharfe
2010-05-02 15:10 ` Junio C Hamano
2010-05-04 20:16 ` René Scharfe
2010-05-04 22:56 ` Junio C Hamano
2010-04-19 0:43 ` SungHyun Nam
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.