* [PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup @ 2014-02-13 14:02 Kirill Smelkov 2014-02-13 14:02 ` [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov 2014-02-13 14:02 ` [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov 0 siblings, 2 replies; 10+ messages in thread From: Kirill Smelkov @ 2014-02-13 14:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kirill Smelkov Here go combine-diff speedup patches in form of first reworking diff tree-walker to work in general case - when a commit have several parents, not only one - we are traversing all 1+nparent trees in parallel. Then we are taking advantage of the new diff tree-walker for speeding up combine-diff, which for linux.git results in ~14 times speedup. I understand v1.9.0 is going to be released first, but wanted to finally send the patches, so that people could start reviewing them. Please apply on top of ks/tree-diff-more and thanks beforehand, Kirill Kirill Smelkov (2): tree-diff: rework diff_tree() to generate diffs for multiparent cases as well combine-diff: speed it up, by using multiparent diff tree-walker directly combine-diff.c | 85 +++++++++- diff.c | 2 + diff.h | 10 ++ tree-diff.c | 501 +++++++++++++++++++++++++++++++++++++++++++++++++-------- 4 files changed, 529 insertions(+), 69 deletions(-) -- 1.9.rc1.181.g641f458 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-13 14:02 [PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov @ 2014-02-13 14:02 ` Kirill Smelkov 2014-02-13 19:51 ` Junio C Hamano 2014-02-13 14:02 ` [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov 1 sibling, 1 reply; 10+ messages in thread From: Kirill Smelkov @ 2014-02-13 14:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kirill Smelkov Previously diff_tree(), which is now named __diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path sets intersection) I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(A,X1...Xn) calculation scheme ------------------------------- D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn) (regarding resulting paths set) D(A,Xj) - diff between A..Xj D(A,X1...Xn) - combined diff from A to parents X1,...,Xn We start from all trees, which are sorted, and compare their entries in lock-step: A X1 Xn - - - |a| |x1| |xn| |-| |--| ... |--| i = argmin(x1...xn) | | | | | | |-| |--| |--| |.| |. | |. | . . . . . . at any time there could be 3 cases: 1) a < xi; 2) a > xi; 3) a = xi. Schematic deduction of what every case means, and what to do, follows: 1) a < xi -> ∀j a ∉ Xj -> "+a" ∈ D(A,Xj) -> D += "+a"; a↓ 2) a > xi 2.1) ∃j: xj > xi -> "-xi" ∉ D(A,Xj) -> D += ø; ∀ xk=xi xk↓ 2.2) ∀j xj = xi -> xj ∉ A -> "-xj" ∈ D(A,Xj) -> D += "-xi"; ∀j xj↓ 3) a = xi 3.1) ∃j: xj > xi -> "+a" ∈ D(A,Xj) -> only xk=xi remains to investigate 3.2) xj = xi -> investigate δ(a,xj) | | v 3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø -> ⎧δ(a,xk) - if xk=xi -> D += ⎨ ⎩"+a" - if xk>xi in any case a↓ ∀ xk=xi xk↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ | | | | a = b -> investigate δ(a,b) a↓ b↓ |-| |-| |.| |.| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker __diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_p0only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path *) which, if set to !NULL, is called for every generated path. (see new compat __diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous __diff_tree_sha1() code according to "D(A,X1...Xn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(*): 1) For loops, i = 0; do { ... } while (++i < nparent); is used instead of for (i = 0; i < nparent; ++i) ... because for the former case, the compiler have to emit additional prologue code which checks for i >= nparent case before entering the loop. As we require nparent must be >0, that additional overhead conflicts with the "runs not slower for nparent=1 case than before" goal. 2) alloca(), for small arrays, is used for the same reason - if we change it to xmalloc()/free() the timings get worse 3) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 4) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We efficient hand-made realloc-style __path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 5) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch (729f7e8e in pu) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-5) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> --- diff.c | 1 + diff.h | 10 ++ tree-diff.c | 501 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------- 3 files changed, 448 insertions(+), 64 deletions(-) diff --git a/diff.c b/diff.c index 8e4a6a9..cda4aa8 100644 --- a/diff.c +++ b/diff.c @@ -3216,6 +3216,7 @@ void diff_setup(struct diff_options *options) options->context = diff_context_default; DIFF_OPT_SET(options, RENAME_EMPTY); + /* pathchange left =NULL by default */ options->change = diff_change; options->add_remove = diff_addremove; options->use_color = diff_use_color_default; diff --git a/diff.h b/diff.h index bf834ff..b59e8d6 100644 --- a/diff.h +++ b/diff.h @@ -15,6 +15,10 @@ struct diff_filespec; struct userdiff_driver; struct sha1_array; struct commit; +struct combine_diff_path; + +typedef int (*pathchange_fn_t)(struct diff_options *options, + struct combine_diff_path *path); typedef void (*change_fn_t)(struct diff_options *options, unsigned old_mode, unsigned new_mode, @@ -157,6 +161,7 @@ struct diff_options { int close_file; struct pathspec pathspec; + pathchange_fn_t pathchange; change_fn_t change; add_remove_fn_t add_remove; diff_format_fn_t format_callback; @@ -189,6 +194,11 @@ const char *diff_line_prefix(struct diff_options *); extern const char mime_boundary_leader[]; +extern +struct combine_diff_path *diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parent_sha1, int nparent, + struct strbuf *base, struct diff_options *opt); extern int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt); extern int diff_tree_sha1(const unsigned char *old, const unsigned char *new, diff --git a/tree-diff.c b/tree-diff.c index ab61a0a..92d4087 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -7,6 +7,25 @@ #include "tree.h" /* + * internal mode marker, saying a tree entry != entry of tp[imin] + * (see __diff_tree_paths for what it means there) + * + * it *must* not overlap with any valid modes, and we will update/use/emit + * entry for diff only with it unset. Only non-overlapping to valid modes is + * required, because mode in tree_desc, comes here canonicalized via + * canon_mode(). + * + * the definition assumes unsigned is at least 32 bits. + */ +#define S_IFXMIN_NEQ 0x80000000 + + +static struct combine_diff_path *__diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt); + +/* * Compare two tree entries, taking into account only path/S_ISDIR(mode), * but not their sha1's. * @@ -33,72 +52,148 @@ static int tree_entry_pathcmp(struct tree_desc *t1, struct tree_desc *t2) } -/* convert path, t1/t2 -> opt->diff_*() callbacks */ -static void emit_diff(struct diff_options *opt, struct strbuf *path, - struct tree_desc *t1, struct tree_desc *t2) +/* convert path -> opt->diff_*() callbacks + * + * emits diff to parent0 only. + */ +static int emit_diff_p0only(struct diff_options *opt, struct combine_diff_path *p) { - unsigned int mode1 = t1 ? t1->entry.mode : 0; - unsigned int mode2 = t2 ? t2->entry.mode : 0; - - if (mode1 && mode2) { - opt->change(opt, mode1, mode2, t1->entry.sha1, t2->entry.sha1, - 1, 1, path->buf, 0, 0); + struct combine_diff_parent *p0 = &p->parent[0]; + if (p->mode && p0->mode) { + opt->change(opt, p0->mode, p->mode, p0->sha1, p->sha1, + 1, 1, p->path, 0, 0); } else { const unsigned char *sha1; unsigned int mode; int addremove; - if (mode2) { + if (p->mode) { addremove = '+'; - sha1 = t2->entry.sha1; - mode = mode2; + sha1 = p->sha1; + mode = p->mode; } else { addremove = '-'; - sha1 = t1->entry.sha1; - mode = mode1; + sha1 = p0->sha1; + mode = p0->mode; } - opt->add_remove(opt, addremove, mode, sha1, 1, path->buf, 0); + opt->add_remove(opt, addremove, mode, sha1, 1, p->path, 0); } + + return 0; /* = no need to keep allocated combine_diff_path */ } -/* new path should be added to diff +/* + * Make a new combine_diff_path from path/mode/sha1 + * and append it to paths list tail. + * + * Memory for created elements could be reused: + * + * - if last->next == NULL, the memory is allocated; + * + * - if last->next != NULL, it is assumed that p=last->next was returned + * earlier by this function, and p->next was *not* modified. + * The memory is then reused from p. + * + * so for clients, + * + * - if you do need to keep the element + * + * p = __path_appendnew(p, ...); + * process(p); + * p->next = NULL; + * + * - if you don't need to keep the element after processing + * + * pprev = p; + * p = __path_appendnew(p, ...); + * process(p); + * p = pprev; + * ; don't forget to free tail->next in the end + * + * p->parent[] remains uninitialized. + */ +static struct combine_diff_path *__path_appendnew(struct combine_diff_path *last, + int nparent, const struct strbuf *base, const char *path, int pathlen, + unsigned mode, const unsigned char *sha1) +{ + struct combine_diff_path *p; + int len = base->len + pathlen; + int alloclen = combine_diff_path_size(nparent, len); + + /* if last->next is !NULL - it is a pre-allocated memory, we can reuse */ + p = last->next; + if (p && (alloclen > (int)p->next)) { + free(p); + p = NULL; + } + + if (!p) { + p = xmalloc(alloclen); + + /* until we go to it next round, .next holds how many bytes we + * allocated (for faster realloc - we don't need copying old data). + */ + p->next = (struct combine_diff_path *)alloclen; + } + + last->next = p; + + p->path = (char *)&(p->parent[nparent]); + memcpy(p->path, base->buf, base->len); + memcpy(p->path + base->len, path, pathlen); + p->path[len] = 0; + p->mode = mode; + hashcpy(p->sha1, sha1 ? sha1 : null_sha1); + + return p; +} + +/* new path should be added to combine diff * * 3 cases on how/when it should be called and behaves: * - * !t1, t2 -> path added, parent lacks it - * t1, !t2 -> path removed from parent - * t1, t2 -> path modified + * t, !tp -> path added, all parents lack it + * !t, tp -> path removed from all parents + * t, tp -> path modified/added + * (M for tp[i]=tp[imin], A otherwise) */ -static void show_path(struct strbuf *base, struct diff_options *opt, - struct tree_desc *t1, struct tree_desc *t2) +static struct combine_diff_path *emit_path(struct combine_diff_path *p, + struct strbuf *base, struct diff_options *opt, int nparent, + struct tree_desc *t, struct tree_desc *tp, + int imin) { unsigned mode; const char *path; + const unsigned char *sha1; int pathlen; int old_baselen = base->len; - int isdir, recurse = 0, emitthis = 1; + int i, isdir, recurse = 0, emitthis = 1; /* at least something has to be valid */ - assert(t1 || t2); + assert(t || tp); - if (t2) { + if (t) { /* path present in resulting tree */ - tree_entry_extract(t2, &path, &mode); - pathlen = tree_entry_len(&t2->entry); + sha1 = tree_entry_extract(t, &path, &mode); + pathlen = tree_entry_len(&t->entry); isdir = S_ISDIR(mode); } else { - /* a path was removed - take path from parent. Also take - * mode from parent, to decide on recursion. + /* a path was removed - take path from imin parent. Also take + * mode from that parent, to decide on recursion(1). + * + * 1) all modes for tp[k]=tp[imin] should be the same wrt + * S_ISDIR, thanks to base_name_compare(). */ - tree_entry_extract(t1, &path, &mode); - pathlen = tree_entry_len(&t1->entry); + tree_entry_extract(&tp[imin], &path, &mode); + pathlen = tree_entry_len(&tp[imin].entry); isdir = S_ISDIR(mode); + sha1 = NULL; mode = 0; } @@ -107,18 +202,78 @@ static void show_path(struct strbuf *base, struct diff_options *opt, emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE); } - strbuf_add(base, path, pathlen); + if (emitthis) { + int keep; + struct combine_diff_path *pprev = p; + p = __path_appendnew(p, nparent, base, path, pathlen, mode, sha1); - if (emitthis) - emit_diff(opt, base, t1, t2); + i = 0; do { + /* tp[i] is valid, if present and if tp[i]==tp[imin] - + * otherwise, we should ignore it. + */ + int tpi_valid = tp && !(tp[i].entry.mode & S_IFXMIN_NEQ); + + const unsigned char *sha1_i; + unsigned mode_i; + + p->parent[i].status = + !t ? DIFF_STATUS_DELETED : + tpi_valid ? + DIFF_STATUS_MODIFIED : + DIFF_STATUS_ADDED; + + if (tpi_valid) { + sha1_i = tp[i].entry.sha1; + mode_i = tp[i].entry.mode; + } + else { + sha1_i = NULL; + mode_i = 0; + } + + p->parent[i].mode = mode_i; + hashcpy(p->parent[i].sha1, sha1_i ? sha1_i : null_sha1); + } while (++i < nparent); + + keep = 1; + if (opt->pathchange) + keep = opt->pathchange(opt, p); + + /* If a path was filtered or consumed - we don't need to add it + * to the list and can reuse its memory, leaving it as + * pre-allocated element on the tail. + * + * On the other hand, if path needs to be kept, we need to + * correct its .next to NULL, as it was pre-initialized to how + * much memory was allocated. + * + * see __path_appendnew() for details. + */ + if (!keep) + p = pprev; + else + p->next = NULL; + } if (recurse) { + const unsigned char **parents_sha1; + + parents_sha1 = alloca(nparent * sizeof(parents_sha1[0])); + i = 0; do { + /* same rule as in emitthis */ + int tpi_valid = tp && !(tp[i].entry.mode & S_IFXMIN_NEQ); + + parents_sha1[i] = tpi_valid ? tp[i].entry.sha1 + : NULL; + } while (++i < nparent); + + strbuf_add(base, path, pathlen); strbuf_addch(base, '/'); - __diff_tree_sha1(t1 ? t1->entry.sha1 : NULL, - t2 ? t2->entry.sha1 : NULL, base, opt); + p = __diff_tree_paths(p, sha1, parents_sha1, nparent, base, opt); } strbuf_setlen(base, old_baselen); + return p; } static void skip_uninteresting(struct tree_desc *t, struct strbuf *base, @@ -137,59 +292,259 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base, } } -int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, - struct strbuf *base, struct diff_options *opt) + +/* generate paths for combined diff D(sha1,parents_sha1[]) + * + * Resulting paths are appended to combine_diff_path linked list, and also, are + * emitted on the go via opt->pathchange() callback, so it is possible to + * process the result as batch or incrementally. + * + * The paths are generated scanning new tree and all parents trees + * simultaneously, similarly to what diff_tree() was doing for 2 trees. + * The theory behind such scan is as follows: + * + * + * D(A,X1...Xn) calculation scheme + * ------------------------------- + * + * D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn) (regarding resulting paths set) + * + * D(A,Xj) - diff between A..Xj + * D(A,X1...Xn) - combined diff from A to parents X1,...,Xn + * + * + * We start from all trees, which are sorted, and compare their entries in + * lock-step: + * + * A X1 Xn + * - - - + * |a| |x1| |xn| + * |-| |--| ... |--| i = argmin(x1...xn) + * | | | | | | + * |-| |--| |--| + * |.| |. | |. | + * . . . + * . . . + * + * at any time there could be 3 cases: + * + * 1) a < xi; + * 2) a > xi; + * 3) a = xi. + * + * Schematic deduction of what every case means, and what to do, follows: + * + * 1) a < xi -> ∀j a ∉ Xj -> "+a" ∈ D(A,Xj) -> D += "+a"; a↓ + * + * 2) a > xi + * + * 2.1) ∃j: xj > xi -> "-xi" ∉ D(A,Xj) -> D += ø; ∀ xk=xi xk↓ + * 2.2) ∀j xj = xi -> xj ∉ A -> "-xj" ∈ D(A,Xj) -> D += "-xi"; ∀j xj↓ + * + * 3) a = xi + * + * 3.1) ∃j: xj > xi -> "+a" ∈ D(A,Xj) -> only xk=xi remains to investigate + * 3.2) xj = xi -> investigate δ(a,xj) + * | + * | + * v + * + * 3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø -> + * + * ⎧δ(a,xk) - if xk=xi + * -> D += ⎨ + * ⎩"+a" - if xk>xi + * + * + * in any case a↓ ∀ xk=xi xk↓ + * + * + * ~~~~~~~~ + * + * NOTE + * + * Usual diff D(A,B) is by definition the same as combined diff D(A,[B]), + * so this diff paths generator can, and is used, for plain diffs + * generation too. + * + * Please keep attention to the common D(A,[B]) case when working on the + * code, in order not to slow it down. + * + * NOTE + * nparent must be > 0. + */ + + +/* ∀ xk=xi xk↓ */ +static inline void update_tp_entries(struct tree_desc *tp, int nparent) { - struct tree_desc t1, t2; - void *t1tree, *t2tree; + int i; + i = 0; do { + if (!(tp[i].entry.mode & S_IFXMIN_NEQ)) + update_tree_entry(&tp[i]); + } while (++i < nparent); +} + +static struct combine_diff_path *__diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt) +{ + struct tree_desc t, *tp; + void *ttree, **tptree; + int i; - t1tree = fill_tree_descriptor(&t1, old); - t2tree = fill_tree_descriptor(&t2, new); + tp = alloca(nparent * sizeof(tp[0])); + tptree = alloca(nparent * sizeof(tptree[0])); + + /* load parents first, as they are probably already cached. + * + * ( log_tree_diff() parses commit->parent before calling here via + * diff_tree_sha1(parent, commit) ) + */ + i = 0; do { + tptree[i] = fill_tree_descriptor(&tp[i], parents_sha1[i]); + } while (++i < nparent); + ttree = fill_tree_descriptor(&t, sha1); /* Enable recursion indefinitely */ opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE); for (;;) { - int cmp; + int imin, cmp; if (diff_can_quit_early(opt)) break; + if (opt->pathspec.nr) { - skip_uninteresting(&t1, base, opt); - skip_uninteresting(&t2, base, opt); + skip_uninteresting(&t, base, opt); + for (i = 0; i < nparent; i++) + skip_uninteresting(&tp[i], base, opt); } - if (!t1.size && !t2.size) - break; - cmp = tree_entry_pathcmp(&t1, &t2); + /* comparing is finished when all trees are done */ + if (!t.size) { + int done = 1; + i = 0; do { + if (tp[i].size) { + done = 0; + break; + } + } while (++i < nparent); + if (done) + break; + } - /* t1 = t2 */ - if (cmp == 0) { - if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) || - hashcmp(t1.entry.sha1, t2.entry.sha1) || - (t1.entry.mode != t2.entry.mode)) - show_path(base, opt, &t1, &t2); - update_tree_entry(&t1); - update_tree_entry(&t2); + /* lookup imin = argmin(x1...xn), + * mark entries whether they =tp[imin] along the way + */ + imin = 0; + tp[0].entry.mode &= ~S_IFXMIN_NEQ; + + for (i = 1; i < nparent; ++i) { + cmp = tree_entry_pathcmp(&tp[i], &tp[imin]); + if (cmp < 0) { + imin = i; + tp[i].entry.mode &= ~S_IFXMIN_NEQ; + } + else if (cmp == 0) { + tp[i].entry.mode &= ~S_IFXMIN_NEQ; + } + else { + tp[i].entry.mode |= S_IFXMIN_NEQ; + } } - /* t1 < t2 */ + /* fixup markings for entries before imin */ + for (i = 0; i < imin; ++i) + tp[i].entry.mode |= S_IFXMIN_NEQ; /* x[i] > x[imin] */ + + + + /* compare a vs x[imin] */ + cmp = tree_entry_pathcmp(&t, &tp[imin]); + + /* a = xi */ + if (cmp == 0) { + /* are either xk > xi or diff(a,xk) != ø ? */ + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER)) { + i = 0; do { + /* x[i] > x[imin] */ + if (tp[i].entry.mode & S_IFXMIN_NEQ) + continue; + + /* diff(a,xk) != ø */ + if (hashcmp(t.entry.sha1, tp[i].entry.sha1) || + (t.entry.mode != tp[i].entry.mode)) + continue; + + goto skip_emit_t_tp; + } while (++i < nparent); + } + + /* D += {δ(a,xk) if xk=xi; "+a" if xk > xi} */ + p = emit_path(p, base, opt, nparent, + &t, tp, imin); + + skip_emit_t_tp: + /* a↓, ∀ xk=ximin xk↓ */ + update_tree_entry(&t); + update_tp_entries(tp, nparent); + } + + /* a < xi */ else if (cmp < 0) { - show_path(base, opt, &t1, /*t2=*/NULL); - update_tree_entry(&t1); + /* D += "+a" */ + p = emit_path(p, base, opt, nparent, + &t, /*tp=*/NULL, -1); + + /* a↓ */ + update_tree_entry(&t); } - /* t1 > t2 */ + /* a > xi */ else { - show_path(base, opt, /*t1=*/NULL, &t2); - update_tree_entry(&t2); + /* ∀j xj=ximin -> D += "-xi" */ + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER)) { + i = 0; do { + if (tp[i].entry.mode & S_IFXMIN_NEQ) + goto skip_emit_tp; + } while (++i < nparent); + } + + p = emit_path(p, base, opt, nparent, + /*t=*/NULL, tp, imin); + + skip_emit_tp: + /* ∀ xk=ximin xk↓ */ + update_tp_entries(tp, nparent); } } - free(t2tree); - free(t1tree); - return 0; + free(ttree); + for (i = nparent-1; i >= 0; i--) + free(tptree[i]); + + return p; +} + +struct combine_diff_path *diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt) +{ + p = __diff_tree_paths(p, sha1, parents_sha1, nparent, base, opt); + + /* free pre-allocated last element, if any + * (see __path_appendnew() for details about why) + */ + if (p->next) { + free(p->next); + p->next = NULL; + } + + return p; } /* @@ -300,6 +655,24 @@ static void try_to_follow_renames(const unsigned char *old, const unsigned char q->nr = 1; } +int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt) +{ + struct combine_diff_path phead, *p; + const unsigned char *parents_sha1[1] = {old}; + + phead.next = NULL; + opt->pathchange = emit_diff_p0only; + diff_tree_paths(&phead, new, parents_sha1, 1, base, opt); + + for (p = phead.next; p;) { + struct combine_diff_path *pprev = p; + p = p->next; + free(pprev); + } + + return 0; +} + int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base_str, struct diff_options *opt) { struct strbuf base; -- 1.9.rc1.181.g641f458 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-13 14:02 ` [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov @ 2014-02-13 19:51 ` Junio C Hamano 2014-02-14 12:15 ` Kirill Smelkov 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2014-02-13 19:51 UTC (permalink / raw) To: Kirill Smelkov; +Cc: git Kirill Smelkov <kirr@mns.spb.ru> writes: > + /* until we go to it next round, .next holds how many bytes we > + * allocated (for faster realloc - we don't need copying old data). > + */ > + p->next = (struct combine_diff_path *)alloclen; I am getting this here: tree-diff.c: In function '__path_appendnew': tree-diff.c:140:13: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-13 19:51 ` Junio C Hamano @ 2014-02-14 12:15 ` Kirill Smelkov 2014-02-14 17:37 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Kirill Smelkov @ 2014-02-14 12:15 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Feb 13, 2014 at 11:51:19AM -0800, Junio C Hamano wrote: > Kirill Smelkov <kirr@mns.spb.ru> writes: > > > + /* until we go to it next round, .next holds how many bytes we > > + * allocated (for faster realloc - we don't need copying old data). > > + */ > > + p->next = (struct combine_diff_path *)alloclen; > > I am getting this here: > > tree-diff.c: In function '__path_appendnew': > tree-diff.c:140:13: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] Ah, sorry, I only tested on 32 bits, and this indeed could be a valid warning on systems where sizeof(ptr) != sizeof(int). In our case, alloclen is small enough that the code should be valid, and we can avoid the warning, via proper usage of intptr_t. Please find corrected patch below. Thanks, Kirill ---- 8< ---- From: Kirill Smelkov <kirr@mns.spb.ru> Subject: [PATCH v2 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously diff_tree(), which is now named __diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path sets intersection) I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(A,X1...Xn) calculation scheme ------------------------------- D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn) (regarding resulting paths set) D(A,Xj) - diff between A..Xj D(A,X1...Xn) - combined diff from A to parents X1,...,Xn We start from all trees, which are sorted, and compare their entries in lock-step: A X1 Xn - - - |a| |x1| |xn| |-| |--| ... |--| i = argmin(x1...xn) | | | | | | |-| |--| |--| |.| |. | |. | . . . . . . at any time there could be 3 cases: 1) a < xi; 2) a > xi; 3) a = xi. Schematic deduction of what every case means, and what to do, follows: 1) a < xi -> ∀j a ∉ Xj -> "+a" ∈ D(A,Xj) -> D += "+a"; a↓ 2) a > xi 2.1) ∃j: xj > xi -> "-xi" ∉ D(A,Xj) -> D += ø; ∀ xk=xi xk↓ 2.2) ∀j xj = xi -> xj ∉ A -> "-xj" ∈ D(A,Xj) -> D += "-xi"; ∀j xj↓ 3) a = xi 3.1) ∃j: xj > xi -> "+a" ∈ D(A,Xj) -> only xk=xi remains to investigate 3.2) xj = xi -> investigate δ(a,xj) | | v 3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø -> ⎧δ(a,xk) - if xk=xi -> D += ⎨ ⎩"+a" - if xk>xi in any case a↓ ∀ xk=xi xk↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ | | | | a = b -> investigate δ(a,b) a↓ b↓ |-| |-| |.| |.| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker __diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_p0only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path *) which, if set to !NULL, is called for every generated path. (see new compat __diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous __diff_tree_sha1() code according to "D(A,X1...Xn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(*): 1) For loops, i = 0; do { ... } while (++i < nparent); is used instead of for (i = 0; i < nparent; ++i) ... because for the former case, the compiler have to emit additional prologue code which checks for i >= nparent case before entering the loop. As we require nparent must be >0, that additional overhead conflicts with the "runs not slower for nparent=1 case than before" goal. 2) alloca(), for small arrays, is used for the same reason - if we change it to xmalloc()/free() the timings get worse 3) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 4) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We efficient hand-made realloc-style __path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 5) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. >From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch (729f7e8e in pu) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-5) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> --- diff.c | 1 + diff.h | 10 ++ tree-diff.c | 510 ++++++++++++++++++++++++++++++++++++++++++++++++++++-------- 3 files changed, 457 insertions(+), 64 deletions(-) Changes since v1: - fixed p->next assignment/"dereference" as integer, via proper use of intptr_t; - reworked multiline comments to start and end with /* and */ on separate lines. diff --git a/diff.c b/diff.c index 8e4a6a9..cda4aa8 100644 --- a/diff.c +++ b/diff.c @@ -3216,6 +3216,7 @@ void diff_setup(struct diff_options *options) options->context = diff_context_default; DIFF_OPT_SET(options, RENAME_EMPTY); + /* pathchange left =NULL by default */ options->change = diff_change; options->add_remove = diff_addremove; options->use_color = diff_use_color_default; diff --git a/diff.h b/diff.h index bf834ff..b59e8d6 100644 --- a/diff.h +++ b/diff.h @@ -15,6 +15,10 @@ struct diff_filespec; struct userdiff_driver; struct sha1_array; struct commit; +struct combine_diff_path; + +typedef int (*pathchange_fn_t)(struct diff_options *options, + struct combine_diff_path *path); typedef void (*change_fn_t)(struct diff_options *options, unsigned old_mode, unsigned new_mode, @@ -157,6 +161,7 @@ struct diff_options { int close_file; struct pathspec pathspec; + pathchange_fn_t pathchange; change_fn_t change; add_remove_fn_t add_remove; diff_format_fn_t format_callback; @@ -189,6 +194,11 @@ const char *diff_line_prefix(struct diff_options *); extern const char mime_boundary_leader[]; +extern +struct combine_diff_path *diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parent_sha1, int nparent, + struct strbuf *base, struct diff_options *opt); extern int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt); extern int diff_tree_sha1(const unsigned char *old, const unsigned char *new, diff --git a/tree-diff.c b/tree-diff.c index ab61a0a..2b7c991 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -7,6 +7,25 @@ #include "tree.h" /* + * internal mode marker, saying a tree entry != entry of tp[imin] + * (see __diff_tree_paths for what it means there) + * + * it *must* not overlap with any valid modes, and we will update/use/emit + * entry for diff only with it unset. Only non-overlapping to valid modes is + * required, because mode in tree_desc, comes here canonicalized via + * canon_mode(). + * + * the definition assumes unsigned is at least 32 bits. + */ +#define S_IFXMIN_NEQ 0x80000000 + + +static struct combine_diff_path *__diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt); + +/* * Compare two tree entries, taking into account only path/S_ISDIR(mode), * but not their sha1's. * @@ -33,72 +52,152 @@ static int tree_entry_pathcmp(struct tree_desc *t1, struct tree_desc *t2) } -/* convert path, t1/t2 -> opt->diff_*() callbacks */ -static void emit_diff(struct diff_options *opt, struct strbuf *path, - struct tree_desc *t1, struct tree_desc *t2) +/* + * convert path -> opt->diff_*() callbacks + * + * emits diff to parent0 only. + */ +static int emit_diff_p0only(struct diff_options *opt, struct combine_diff_path *p) { - unsigned int mode1 = t1 ? t1->entry.mode : 0; - unsigned int mode2 = t2 ? t2->entry.mode : 0; - - if (mode1 && mode2) { - opt->change(opt, mode1, mode2, t1->entry.sha1, t2->entry.sha1, - 1, 1, path->buf, 0, 0); + struct combine_diff_parent *p0 = &p->parent[0]; + if (p->mode && p0->mode) { + opt->change(opt, p0->mode, p->mode, p0->sha1, p->sha1, + 1, 1, p->path, 0, 0); } else { const unsigned char *sha1; unsigned int mode; int addremove; - if (mode2) { + if (p->mode) { addremove = '+'; - sha1 = t2->entry.sha1; - mode = mode2; + sha1 = p->sha1; + mode = p->mode; } else { addremove = '-'; - sha1 = t1->entry.sha1; - mode = mode1; + sha1 = p0->sha1; + mode = p0->mode; } - opt->add_remove(opt, addremove, mode, sha1, 1, path->buf, 0); + opt->add_remove(opt, addremove, mode, sha1, 1, p->path, 0); } + + return 0; /* = no need to keep allocated combine_diff_path */ } -/* new path should be added to diff +/* + * Make a new combine_diff_path from path/mode/sha1 + * and append it to paths list tail. + * + * Memory for created elements could be reused: + * + * - if last->next == NULL, the memory is allocated; + * + * - if last->next != NULL, it is assumed that p=last->next was returned + * earlier by this function, and p->next was *not* modified. + * The memory is then reused from p. + * + * so for clients, + * + * - if you do need to keep the element + * + * p = __path_appendnew(p, ...); + * process(p); + * p->next = NULL; + * + * - if you don't need to keep the element after processing + * + * pprev = p; + * p = __path_appendnew(p, ...); + * process(p); + * p = pprev; + * ; don't forget to free tail->next in the end + * + * p->parent[] remains uninitialized. + */ +static struct combine_diff_path *__path_appendnew(struct combine_diff_path *last, + int nparent, const struct strbuf *base, const char *path, int pathlen, + unsigned mode, const unsigned char *sha1) +{ + struct combine_diff_path *p; + int len = base->len + pathlen; + int alloclen = combine_diff_path_size(nparent, len); + + /* if last->next is !NULL - it is a pre-allocated memory, we can reuse */ + p = last->next; + if (p && (alloclen > (intptr_t)p->next)) { + free(p); + p = NULL; + } + + if (!p) { + p = xmalloc(alloclen); + + /* + * until we go to it next round, .next holds how many bytes we + * allocated (for faster realloc - we don't need copying old data). + */ + p->next = (struct combine_diff_path *)(intptr_t)alloclen; + } + + last->next = p; + + p->path = (char *)&(p->parent[nparent]); + memcpy(p->path, base->buf, base->len); + memcpy(p->path + base->len, path, pathlen); + p->path[len] = 0; + p->mode = mode; + hashcpy(p->sha1, sha1 ? sha1 : null_sha1); + + return p; +} + +/* + * new path should be added to combine diff * * 3 cases on how/when it should be called and behaves: * - * !t1, t2 -> path added, parent lacks it - * t1, !t2 -> path removed from parent - * t1, t2 -> path modified + * t, !tp -> path added, all parents lack it + * !t, tp -> path removed from all parents + * t, tp -> path modified/added + * (M for tp[i]=tp[imin], A otherwise) */ -static void show_path(struct strbuf *base, struct diff_options *opt, - struct tree_desc *t1, struct tree_desc *t2) +static struct combine_diff_path *emit_path(struct combine_diff_path *p, + struct strbuf *base, struct diff_options *opt, int nparent, + struct tree_desc *t, struct tree_desc *tp, + int imin) { unsigned mode; const char *path; + const unsigned char *sha1; int pathlen; int old_baselen = base->len; - int isdir, recurse = 0, emitthis = 1; + int i, isdir, recurse = 0, emitthis = 1; /* at least something has to be valid */ - assert(t1 || t2); + assert(t || tp); - if (t2) { + if (t) { /* path present in resulting tree */ - tree_entry_extract(t2, &path, &mode); - pathlen = tree_entry_len(&t2->entry); + sha1 = tree_entry_extract(t, &path, &mode); + pathlen = tree_entry_len(&t->entry); isdir = S_ISDIR(mode); } else { - /* a path was removed - take path from parent. Also take - * mode from parent, to decide on recursion. + /* + * a path was removed - take path from imin parent. Also take + * mode from that parent, to decide on recursion(1). + * + * 1) all modes for tp[k]=tp[imin] should be the same wrt + * S_ISDIR, thanks to base_name_compare(). */ - tree_entry_extract(t1, &path, &mode); - pathlen = tree_entry_len(&t1->entry); + tree_entry_extract(&tp[imin], &path, &mode); + pathlen = tree_entry_len(&tp[imin].entry); isdir = S_ISDIR(mode); + sha1 = NULL; mode = 0; } @@ -107,18 +206,80 @@ static void show_path(struct strbuf *base, struct diff_options *opt, emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE); } - strbuf_add(base, path, pathlen); + if (emitthis) { + int keep; + struct combine_diff_path *pprev = p; + p = __path_appendnew(p, nparent, base, path, pathlen, mode, sha1); - if (emitthis) - emit_diff(opt, base, t1, t2); + i = 0; do { + /* + * tp[i] is valid, if present and if tp[i]==tp[imin] - + * otherwise, we should ignore it. + */ + int tpi_valid = tp && !(tp[i].entry.mode & S_IFXMIN_NEQ); + + const unsigned char *sha1_i; + unsigned mode_i; + + p->parent[i].status = + !t ? DIFF_STATUS_DELETED : + tpi_valid ? + DIFF_STATUS_MODIFIED : + DIFF_STATUS_ADDED; + + if (tpi_valid) { + sha1_i = tp[i].entry.sha1; + mode_i = tp[i].entry.mode; + } + else { + sha1_i = NULL; + mode_i = 0; + } + + p->parent[i].mode = mode_i; + hashcpy(p->parent[i].sha1, sha1_i ? sha1_i : null_sha1); + } while (++i < nparent); + + keep = 1; + if (opt->pathchange) + keep = opt->pathchange(opt, p); + + /* + * If a path was filtered or consumed - we don't need to add it + * to the list and can reuse its memory, leaving it as + * pre-allocated element on the tail. + * + * On the other hand, if path needs to be kept, we need to + * correct its .next to NULL, as it was pre-initialized to how + * much memory was allocated. + * + * see __path_appendnew() for details. + */ + if (!keep) + p = pprev; + else + p->next = NULL; + } if (recurse) { + const unsigned char **parents_sha1; + + parents_sha1 = alloca(nparent * sizeof(parents_sha1[0])); + i = 0; do { + /* same rule as in emitthis */ + int tpi_valid = tp && !(tp[i].entry.mode & S_IFXMIN_NEQ); + + parents_sha1[i] = tpi_valid ? tp[i].entry.sha1 + : NULL; + } while (++i < nparent); + + strbuf_add(base, path, pathlen); strbuf_addch(base, '/'); - __diff_tree_sha1(t1 ? t1->entry.sha1 : NULL, - t2 ? t2->entry.sha1 : NULL, base, opt); + p = __diff_tree_paths(p, sha1, parents_sha1, nparent, base, opt); } strbuf_setlen(base, old_baselen); + return p; } static void skip_uninteresting(struct tree_desc *t, struct strbuf *base, @@ -137,59 +298,262 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base, } } -int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, - struct strbuf *base, struct diff_options *opt) + +/* + * generate paths for combined diff D(sha1,parents_sha1[]) + * + * Resulting paths are appended to combine_diff_path linked list, and also, are + * emitted on the go via opt->pathchange() callback, so it is possible to + * process the result as batch or incrementally. + * + * The paths are generated scanning new tree and all parents trees + * simultaneously, similarly to what diff_tree() was doing for 2 trees. + * The theory behind such scan is as follows: + * + * + * D(A,X1...Xn) calculation scheme + * ------------------------------- + * + * D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn) (regarding resulting paths set) + * + * D(A,Xj) - diff between A..Xj + * D(A,X1...Xn) - combined diff from A to parents X1,...,Xn + * + * + * We start from all trees, which are sorted, and compare their entries in + * lock-step: + * + * A X1 Xn + * - - - + * |a| |x1| |xn| + * |-| |--| ... |--| i = argmin(x1...xn) + * | | | | | | + * |-| |--| |--| + * |.| |. | |. | + * . . . + * . . . + * + * at any time there could be 3 cases: + * + * 1) a < xi; + * 2) a > xi; + * 3) a = xi. + * + * Schematic deduction of what every case means, and what to do, follows: + * + * 1) a < xi -> ∀j a ∉ Xj -> "+a" ∈ D(A,Xj) -> D += "+a"; a↓ + * + * 2) a > xi + * + * 2.1) ∃j: xj > xi -> "-xi" ∉ D(A,Xj) -> D += ø; ∀ xk=xi xk↓ + * 2.2) ∀j xj = xi -> xj ∉ A -> "-xj" ∈ D(A,Xj) -> D += "-xi"; ∀j xj↓ + * + * 3) a = xi + * + * 3.1) ∃j: xj > xi -> "+a" ∈ D(A,Xj) -> only xk=xi remains to investigate + * 3.2) xj = xi -> investigate δ(a,xj) + * | + * | + * v + * + * 3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø -> + * + * ⎧δ(a,xk) - if xk=xi + * -> D += ⎨ + * ⎩"+a" - if xk>xi + * + * + * in any case a↓ ∀ xk=xi xk↓ + * + * + * ~~~~~~~~ + * + * NOTE + * + * Usual diff D(A,B) is by definition the same as combined diff D(A,[B]), + * so this diff paths generator can, and is used, for plain diffs + * generation too. + * + * Please keep attention to the common D(A,[B]) case when working on the + * code, in order not to slow it down. + * + * NOTE + * nparent must be > 0. + */ + + +/* ∀ xk=xi xk↓ */ +static inline void update_tp_entries(struct tree_desc *tp, int nparent) +{ + int i; + i = 0; do { + if (!(tp[i].entry.mode & S_IFXMIN_NEQ)) + update_tree_entry(&tp[i]); + } while (++i < nparent); +} + +static struct combine_diff_path *__diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt) { - struct tree_desc t1, t2; - void *t1tree, *t2tree; + struct tree_desc t, *tp; + void *ttree, **tptree; + int i; - t1tree = fill_tree_descriptor(&t1, old); - t2tree = fill_tree_descriptor(&t2, new); + tp = alloca(nparent * sizeof(tp[0])); + tptree = alloca(nparent * sizeof(tptree[0])); + + /* + * load parents first, as they are probably already cached. + * + * ( log_tree_diff() parses commit->parent before calling here via + * diff_tree_sha1(parent, commit) ) + */ + i = 0; do { + tptree[i] = fill_tree_descriptor(&tp[i], parents_sha1[i]); + } while (++i < nparent); + ttree = fill_tree_descriptor(&t, sha1); /* Enable recursion indefinitely */ opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE); for (;;) { - int cmp; + int imin, cmp; if (diff_can_quit_early(opt)) break; + if (opt->pathspec.nr) { - skip_uninteresting(&t1, base, opt); - skip_uninteresting(&t2, base, opt); + skip_uninteresting(&t, base, opt); + for (i = 0; i < nparent; i++) + skip_uninteresting(&tp[i], base, opt); } - if (!t1.size && !t2.size) - break; - cmp = tree_entry_pathcmp(&t1, &t2); + /* comparing is finished when all trees are done */ + if (!t.size) { + int done = 1; + i = 0; do { + if (tp[i].size) { + done = 0; + break; + } + } while (++i < nparent); + if (done) + break; + } + + /* + * lookup imin = argmin(x1...xn), + * mark entries whether they =tp[imin] along the way + */ + imin = 0; + tp[0].entry.mode &= ~S_IFXMIN_NEQ; + + for (i = 1; i < nparent; ++i) { + cmp = tree_entry_pathcmp(&tp[i], &tp[imin]); + if (cmp < 0) { + imin = i; + tp[i].entry.mode &= ~S_IFXMIN_NEQ; + } + else if (cmp == 0) { + tp[i].entry.mode &= ~S_IFXMIN_NEQ; + } + else { + tp[i].entry.mode |= S_IFXMIN_NEQ; + } + } + + /* fixup markings for entries before imin */ + for (i = 0; i < imin; ++i) + tp[i].entry.mode |= S_IFXMIN_NEQ; /* x[i] > x[imin] */ + - /* t1 = t2 */ - if (cmp == 0) { - if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) || - hashcmp(t1.entry.sha1, t2.entry.sha1) || - (t1.entry.mode != t2.entry.mode)) - show_path(base, opt, &t1, &t2); - update_tree_entry(&t1); - update_tree_entry(&t2); + /* compare a vs x[imin] */ + cmp = tree_entry_pathcmp(&t, &tp[imin]); + + /* a = xi */ + if (cmp == 0) { + /* are either xk > xi or diff(a,xk) != ø ? */ + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER)) { + i = 0; do { + /* x[i] > x[imin] */ + if (tp[i].entry.mode & S_IFXMIN_NEQ) + continue; + + /* diff(a,xk) != ø */ + if (hashcmp(t.entry.sha1, tp[i].entry.sha1) || + (t.entry.mode != tp[i].entry.mode)) + continue; + + goto skip_emit_t_tp; + } while (++i < nparent); + } + + /* D += {δ(a,xk) if xk=xi; "+a" if xk > xi} */ + p = emit_path(p, base, opt, nparent, + &t, tp, imin); + + skip_emit_t_tp: + /* a↓, ∀ xk=ximin xk↓ */ + update_tree_entry(&t); + update_tp_entries(tp, nparent); } - /* t1 < t2 */ + /* a < xi */ else if (cmp < 0) { - show_path(base, opt, &t1, /*t2=*/NULL); - update_tree_entry(&t1); + /* D += "+a" */ + p = emit_path(p, base, opt, nparent, + &t, /*tp=*/NULL, -1); + + /* a↓ */ + update_tree_entry(&t); } - /* t1 > t2 */ + /* a > xi */ else { - show_path(base, opt, /*t1=*/NULL, &t2); - update_tree_entry(&t2); + /* ∀j xj=ximin -> D += "-xi" */ + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER)) { + i = 0; do { + if (tp[i].entry.mode & S_IFXMIN_NEQ) + goto skip_emit_tp; + } while (++i < nparent); + } + + p = emit_path(p, base, opt, nparent, + /*t=*/NULL, tp, imin); + + skip_emit_tp: + /* ∀ xk=ximin xk↓ */ + update_tp_entries(tp, nparent); } } - free(t2tree); - free(t1tree); - return 0; + free(ttree); + for (i = nparent-1; i >= 0; i--) + free(tptree[i]); + + return p; +} + +struct combine_diff_path *diff_tree_paths( + struct combine_diff_path *p, const unsigned char *sha1, + const unsigned char **parents_sha1, int nparent, + struct strbuf *base, struct diff_options *opt) +{ + p = __diff_tree_paths(p, sha1, parents_sha1, nparent, base, opt); + + /* + * free pre-allocated last element, if any + * (see __path_appendnew() for details about why) + */ + if (p->next) { + free(p->next); + p->next = NULL; + } + + return p; } /* @@ -300,6 +664,24 @@ static void try_to_follow_renames(const unsigned char *old, const unsigned char q->nr = 1; } +int __diff_tree_sha1(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt) +{ + struct combine_diff_path phead, *p; + const unsigned char *parents_sha1[1] = {old}; + + phead.next = NULL; + opt->pathchange = emit_diff_p0only; + diff_tree_paths(&phead, new, parents_sha1, 1, base, opt); + + for (p = phead.next; p;) { + struct combine_diff_path *pprev = p; + p = p->next; + free(pprev); + } + + return 0; +} + int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base_str, struct diff_options *opt) { struct strbuf base; -- 1.9.rc1.181.g641f458 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-14 12:15 ` Kirill Smelkov @ 2014-02-14 17:37 ` Junio C Hamano 2014-02-16 8:08 ` Kirill Smelkov 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2014-02-14 17:37 UTC (permalink / raw) To: Kirill Smelkov; +Cc: git Kirill Smelkov <kirr@mns.spb.ru> writes: > ---- 8< ---- > From: Kirill Smelkov <kirr@mns.spb.ru> > Subject: [PATCH v2 1/2] tree-diff: rework diff_tree() to generate diffs for > multiparent cases as well > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit The last three do not belong here. Only From, Date and Subject are relevant and taken as in-body headers. For that matter, as the From and Subject above are exactly the same as what you have on the e-mail header, you do not want any of them. I'll strip them out here, so no need to resend. Thanks. > Previously diff_tree(), which is now named __diff_tree_sha1(), was That name with two leading underscores is a rather unfortunate, especially for a function that is not a file scope static. No way to rename this to something more sensible? > That impedance mismatch *hurts* *performance* *badly* for generating > combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path Please avoid referring to a commit that is not in 'master' by its object name. It can be reworked later and get a different name. > That slowness comes from the fact that currently, while generating > combined diff, a lot of time is spent computing diff(commit,commit^2) > just to only then intersect that huge diff to almost small set of files > from diff(commit,commit^1). Good observation. > |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ > |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ > | | | | a = b -> investigate δ(a,b) a↓ b↓ In both the "n-parallel" and "diff-tree", when an entry 'a' is a tree, I take this "D(A,B) += +a" to mean (recursively) adding all the paths within 'a' to the result as addition. Sounds sensible. > D(A,B) > > is by definition the same as combined diff > > D(A,[B]), > > so if we could rework the code for common case and make it be not slower > for nparent=1 case, usual diff(t1,t2) generation will not be slower, and > multiparent diff tree-walker would greatly benefit generating > combine-diff. OK. > What we do is as follows: > > 1) diff tree-walker __diff_tree_sha1() is internally reworked to be > a paths generator (new name diff_tree_paths()), with each generated path > being `struct combine_diff_path` with info for path, new sha1,mode and for > every parent which sha1,mode it was in it. > > 2) From that info, we can still generate usual diff queue with > struct diff_filepairs, via "exporting" generated > combine_diff_path, if we know we run for nparent=1 case. > (see emit_diff() which is now named emit_diff_p0only()) s/p0/first_parent_/; perhaps? > 3) In order for diff_can_quit_early(), which checks > > DIFF_OPT_TST(opt, HAS_CHANGES)) > > to work, that exporting have to be happening not in bulk, but > incrementally, one diff path at a time. Good thinking. > Some notes(*): > > 1) For loops, > > i = 0; do { ... } while (++i < nparent); > > is used instead of > > for (i = 0; i < nparent; ++i) > ... > > because for the former case, the compiler have to emit additional > prologue code which checks for i >= nparent case before entering the > loop. > > As we require nparent must be >0, that additional overhead > conflicts with the "runs not slower for nparent=1 case than before" > goal. Unfortunate. I'd rather see us stick to more readable and familiar form for maintainability if this were not measurable. > 2) alloca(), for small arrays, is used for the same reason - if we change > it to xmalloc()/free() the timings get worse Do you see any use of it outside compat/? I thought we specifically avoid alloca() for portability. Also we do not use variable-length-arrays on the stack either, I think. > 3) For every parent tree, we need to keep a tag, whether entry from that > parent equals to entry from minimal parent. For performance reasons I'm > keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Unfortunate, but I do not see another place to keep this information offhand (nor implement this approach without keeping that piece of information). > P.S. and combined diff is not some exotic/for-play-only stuff - for No need to convince us about that ;-) > example for a program I write to represent Git archives as readonly > filesystem, there is initial scan with > > `git log --reverse --raw --no-abbrev --no-renames -c` > > to extract log of what was created/changed when, as a result building a > map > > {} sha1 -> in which commit (and date) a content was added > > that `-c` means also show combined diff for merges, and without them, if > a merge is non-trivial (merges changes from two parents with both having > separate changes to a file), or an evil one, the map will not be full, > i.e. some valid sha1 would be absent from it. > > That case was my initial motivation for combined diffs speedup. I wonder if this machinery can be reused for "log -m" as well (or perhaps you do that already?). After all, by performing a single parallel scan, you are gathering all the necessary information to let you pretend that you did N pairwise diff-tree. > diff --git a/tree-diff.c b/tree-diff.c > index ab61a0a..2b7c991 100644 > --- a/tree-diff.c > +++ b/tree-diff.c > @@ -7,6 +7,25 @@ > #include "tree.h" > > /* > + * internal mode marker, saying a tree entry != entry of tp[imin] > + * (see __diff_tree_paths for what it means there) > + * > + * it *must* not overlap with any valid modes, and we will update/use/emit > + * entry for diff only with it unset. Only non-overlapping to valid modes is > + * required, because mode in tree_desc, comes here canonicalized via > + * canon_mode(). > + * > + * the definition assumes unsigned is at least 32 bits. > + */ > +#define S_IFXMIN_NEQ 0x80000000 To allow better coordination across multiple codepaths that deal with modes, I am wondering if this should be defined in cache.h where made-up S_FIGITLINK and S_IFINVALID are defined (note the comment that is there, as well). > +static struct combine_diff_path *__diff_tree_paths( > + struct combine_diff_path *p, const unsigned char *sha1, > + const unsigned char **parents_sha1, int nparent, > + struct strbuf *base, struct diff_options *opt); Most of our code do not name private helper functions with leading underscores. I do like the direction this is going, but it looks to me that "struct combine_diff" is now misnamed, because it no longer is about combined diff. You are introducing a good framework for n-way diff, and producing combined diff (i.e. -c or --cc) is now merely one way to use that framework. We may want to clean these names up after this series settles---perhaps "struct nway_diff" or something. > + > +/* > * Compare two tree entries, taking into account only path/S_ISDIR(mode), > * but not their sha1's. > * > @@ -33,72 +52,152 @@ static int tree_entry_pathcmp(struct tree_desc *t1, struct tree_desc *t2) > } > > > -/* convert path, t1/t2 -> opt->diff_*() callbacks */ > -static void emit_diff(struct diff_options *opt, struct strbuf *path, > - struct tree_desc *t1, struct tree_desc *t2) > +/* > + * convert path -> opt->diff_*() callbacks > + * > + * emits diff to parent0 only. Please call that "first parent". > + */ "Returns 0 to tell the caller that we are done with p and it can be freed" or something? > +static int emit_diff_p0only(struct diff_options *opt, struct combine_diff_path *p) > { > ... > + > + return 0; /* = no need to keep allocated combine_diff_path */ Curious; what is that equal sign in the comment? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-14 17:37 ` Junio C Hamano @ 2014-02-16 8:08 ` Kirill Smelkov 2014-02-18 21:29 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Kirill Smelkov @ 2014-02-16 8:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: kirr, git On Fri, Feb 14, 2014 at 09:37:00AM -0800, Junio C Hamano wrote: > Kirill Smelkov <kirr@mns.spb.ru> writes: > > > Previously diff_tree(), which is now named __diff_tree_sha1(), was > > That name with two leading underscores is a rather unfortunate, > especially for a function that is not a file scope static. No way > to rename this to something more sensible? I agree. In preparatory patches I thought this will go away, but it stayed. I'll try to come up with something reasonable. > > That impedance mismatch *hurts* *performance* *badly* for generating > > combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path > > Please avoid referring to a commit that is not in 'master' by its > object name. It can be reworked later and get a different name. I agree, this makes sense. Is it ok to refer to nearby commits, by say HEAD~3, if we know we are referring to 3-times previous commit? > > That slowness comes from the fact that currently, while generating > > combined diff, a lot of time is spent computing diff(commit,commit^2) > > just to only then intersect that huge diff to almost small set of files > > from diff(commit,commit^1). > > Good observation. > > > |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ > > |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ > > | | | | a = b -> investigate δ(a,b) a↓ b↓ > > In both the "n-parallel" and "diff-tree", when an entry 'a' is a > tree, I take this "D(A,B) += +a" to mean (recursively) adding all > the paths within 'a' to the result as addition. Sounds sensible. Correct. > > D(A,B) > > > > is by definition the same as combined diff > > > > D(A,[B]), > > > > so if we could rework the code for common case and make it be not slower > > for nparent=1 case, usual diff(t1,t2) generation will not be slower, and > > multiparent diff tree-walker would greatly benefit generating > > combine-diff. > > OK. Thanks. My first goal was to demonstrate it is doable - i.e. we could join two diff tree-walkers into generalized one, and that approach would be sound and have chances to be accepted. > > What we do is as follows: > > > > 1) diff tree-walker __diff_tree_sha1() is internally reworked to be > > a paths generator (new name diff_tree_paths()), with each generated path > > being `struct combine_diff_path` with info for path, new sha1,mode and for > > every parent which sha1,mode it was in it. > > > > 2) From that info, we can still generate usual diff queue with > > struct diff_filepairs, via "exporting" generated > > combine_diff_path, if we know we run for nparent=1 case. > > (see emit_diff() which is now named emit_diff_p0only()) > > s/p0/first_parent_/; perhaps? Ok. > > 3) In order for diff_can_quit_early(), which checks > > > > DIFF_OPT_TST(opt, HAS_CHANGES)) > > > > to work, that exporting have to be happening not in bulk, but > > incrementally, one diff path at a time. > > Good thinking. Yes. This requirement made the diff-paths producer be real generator, which emits paths incrementally, which is imho, could be a good design for making the component more reusable. > > Some notes(*): > > > > 1) For loops, > > > > i = 0; do { ... } while (++i < nparent); > > > > is used instead of > > > > for (i = 0; i < nparent; ++i) > > ... > > > > because for the former case, the compiler have to emit additional > > prologue code which checks for i >= nparent case before entering the > > loop. > > > > As we require nparent must be >0, that additional overhead > > conflicts with the "runs not slower for nparent=1 case than before" > > goal. > > Unfortunate. I'd rather see us stick to more readable and familiar > form for maintainability if this were not measurable. The most effect on performance were avoiding mallocs and reduce register pressure. This too, was measurable, but of lower impact. If giving away some part of percent for nparent=1 case is ok, this could be back to usual for loops. By the way, I find it a bit unfortunate, we don't have some for-loop form with post-conditions in C, only do-while. Another observation, is that it would be good to say to compiler e.g. assume nparent > 0; and then in e.g. for (i = 0; i < nparent; i++) ... the compiler could know it should not do pre-conditions checks before entering the loop. I've verified, that such behaviour, at least with gcc, could be achieved with if (nparent <= 0) return; in function prologue - then the compiler deduces, at least with O2, that after return point, nparent is >0, but this adds slight overhead on each entry to function, and we have as many calls as there would be recursions. To me, the do-while is still readable, but if fors are preferred, I'll re-measure what it costs and come back. > > 2) alloca(), for small arrays, is used for the same reason - if we change > > it to xmalloc()/free() the timings get worse > > Do you see any use of it outside compat/? > > I thought we specifically avoid alloca() for portability. Also we > do not use variable-length-arrays on the stack either, I think. No, no usage outside compat/ and I knew alloca and VLAs are not used in Git codebase for portability, and I understand alloca will be criticized, but wanted to start the discussion rolling. I've actually started without alloca, and used xmalloc/free for [nparent] vectors, but the impact was measurable, so it just had to be changed to something more optimal. For me, personally, alloca is ok, but I understand there could be portability issues (by the way, what compiler/system Git cares about does not have working alloca?). Thats why I propose we do the following 1. at configure time, determine, do we have working alloca, and define #define HAVE_ALLOCA if yes. 2. in code #ifdef HAVE_ALLOCA # define xalloca(size) (alloca(size)) # define xalloca_free(p) do {} while(0) #else # define xalloca(size) (xmalloc(size)) # define xalloca_free(p) (free(p)) #endif and use it like func() { p = xalloca(size); ... xalloca_free(p); } This way, for systems, where alloca is available, we'll have optimal on-stack allocations with fast executions. On the other hand, on systems, where alloca is not available, this gracefully fallbacks to xmalloc/free. Please tell me what you think. > > 3) For every parent tree, we need to keep a tag, whether entry from that > > parent equals to entry from minimal parent. For performance reasons I'm > > keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. > > Unfortunate, but I do not see another place to keep this > information offhand (nor implement this approach without keeping > that piece of information). Me neither. I've actually started with separate on-stack (via alloca) [nparent] vector for such tags, and removing it was measurable. > > P.S. and combined diff is not some exotic/for-play-only stuff - for > > No need to convince us about that ;-) Thanks ;) May I ask to please keep my "motivation" part in the commit log for historical reasons? > > example for a program I write to represent Git archives as readonly > > filesystem, there is initial scan with > > > > `git log --reverse --raw --no-abbrev --no-renames -c` > > > > to extract log of what was created/changed when, as a result building a > > map > > > > {} sha1 -> in which commit (and date) a content was added > > > > that `-c` means also show combined diff for merges, and without them, if > > a merge is non-trivial (merges changes from two parents with both having > > separate changes to a file), or an evil one, the map will not be full, > > i.e. some valid sha1 would be absent from it. > > > > That case was my initial motivation for combined diffs speedup. > > I wonder if this machinery can be reused for "log -m" as well (or > perhaps you do that already?). After all, by performing a single > parallel scan, you are gathering all the necessary information to > let you pretend that you did N pairwise diff-tree. Unfortunately, as it is now, no, and let me explain why: The reason that is not true, is that we omit recursing into directories, if we know D(A,some-parent) for that path is empty. That means we don't calculate D(A,any-other-parents) for that path and subpaths. More structured description is that combined diff and "log -m", which could be though as all diffs D(A,Pi) are different things: - the combined diff is D(A,B) generalization based on "^" (sets intersection) operator, and - log -m, aka "all diffs" is D(A,B) generalization based on "v" (sets union) operator. Intersection means, we can omit calculating parts from other sets, if we know some set does not have an element (remember "don't recurse into subdirectories"?), and unioning does not have this property. It does so happen, that "^" case (combine-diff) is more interesting, because in the end it allows to see new information - the diff a merge itself introduces. "log -m" does not have this property and is no more interesting to what plain diff(HEAD,HEAD^n) can provide - in other words it's just a convenience. Now, the diff tree-walker could be generalized once more, to allow clients specify, which diffs combination operator to use - intersection or unioning, but I doubt that for unioning case that would add significant speedup - we can't reduce any diff generation based on another diff and the only saving is that we traverse resulting commit tree once, but for some cases that could be maybe slower, say if result and some parents don't have a path and some parent does, we'll be recursing into that path and do more work compared to plain D(A,Pi) for Pi that lacks the path. In short: it could be generalized more, if needed, but I propose we first establish the ground with generalizing to just combine-diff. > > diff --git a/tree-diff.c b/tree-diff.c > > index ab61a0a..2b7c991 100644 > > --- a/tree-diff.c > > +++ b/tree-diff.c > > @@ -7,6 +7,25 @@ > > #include "tree.h" > > > > /* > > + * internal mode marker, saying a tree entry != entry of tp[imin] > > + * (see __diff_tree_paths for what it means there) > > + * > > + * it *must* not overlap with any valid modes, and we will update/use/emit > > + * entry for diff only with it unset. Only non-overlapping to valid modes is > > + * required, because mode in tree_desc, comes here canonicalized via > > + * canon_mode(). > > + * > > + * the definition assumes unsigned is at least 32 bits. > > + */ > > +#define S_IFXMIN_NEQ 0x80000000 > > To allow better coordination across multiple codepaths that deal > with modes, I am wondering if this should be defined in cache.h > where made-up S_FIGITLINK and S_IFINVALID are defined (note the > comment that is there, as well). I knew about S_IFGITLINK and S_IFINVALID being located in cache.h with comments. The reason I decided to place S_IFXMIN_NEQ here, is that it is local tag - it is used locally in diff-tree, and neither is coming in, nor coming out of it in set state. So putting it in cache.h would mean we'll reserve the bit for something which is used only temporarily and that bit would be not available for other temporary uses. On the other hand, it would be better, to mark in cache.h that some bits could be used temporarily for application specific things. Maybe provide a mask there or something similar to S_IFUSR1, S_IFUSR2 :) S_IFXMIN_NEQ is just has no meaning for code outside of tree-diff.c ... What do you think? > > +static struct combine_diff_path *__diff_tree_paths( > > + struct combine_diff_path *p, const unsigned char *sha1, > > + const unsigned char **parents_sha1, int nparent, > > + struct strbuf *base, struct diff_options *opt); > > Most of our code do not name private helper functions with leading > underscores. What would be a good name here? In contrast to __diff_tree_sha1, it is static, and adding some prefix/suffix to the name is maybe adding more noise than value? We have diff_tree_paths() which does the setup and cleanup and this worker __diff_tree_paths... Some names come to me as diff_tree_paths_raw, or diff_tree_paths_worker, but having Linux heritage, I like __diff_tree_paths more. What would be a good approach for Git style here? > I do like the direction this is going, but it looks to me that > "struct combine_diff" is now misnamed, because it no longer is about > combined diff. You are introducing a good framework for n-way diff, > and producing combined diff (i.e. -c or --cc) is now merely one way > to use that framework. We may want to clean these names up after > this series settles---perhaps "struct nway_diff" or something. Thanks. It was my main question whether this reworking will be a good/accepted idea or not. I agree about `struct combine_diff_path` name and that it should be changed afterwards (my idea was `struct diff_path` for it). Given that we had diff_filepair for n=2 diffs, it would logical to call it as maybe diff_nway, diff_filevector, or something else but maybe emphasizing n/vector/... in the name is not so good in the long run, and using, for generalized diff just diff_file, or diff_path is better. Naming is important and I don't settled on it however, yet... > > + > > +/* > > * Compare two tree entries, taking into account only path/S_ISDIR(mode), > > * but not their sha1's. > > * > > @@ -33,72 +52,152 @@ static int tree_entry_pathcmp(struct tree_desc *t1, struct tree_desc *t2) > > } > > > > > > -/* convert path, t1/t2 -> opt->diff_*() callbacks */ > > -static void emit_diff(struct diff_options *opt, struct strbuf *path, > > - struct tree_desc *t1, struct tree_desc *t2) > > +/* > > + * convert path -> opt->diff_*() callbacks > > + * > > + * emits diff to parent0 only. > > Please call that "first parent". Ok. > > + */ > > "Returns 0 to tell the caller that we are done with p and it can be > freed" or something? > > > +static int emit_diff_p0only(struct diff_options *opt, struct combine_diff_path *p) > > { > > ... > > + > > + return 0; /* = no need to keep allocated combine_diff_path */ > > Curious; what is that equal sign in the comment? It stands for "that means", "i.e." or something similar. In other words it is semantically equal to your comment from the above. I'm open to reworking the commenting style for it to reads more well. Should I? ~~~~ Thanks again for reviewing this and for generally accepting the approach. Please tell me your thoughts. Based on your input, I'll try to come up with something improved on monday. Thanks, Kirill ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well 2014-02-16 8:08 ` Kirill Smelkov @ 2014-02-18 21:29 ` Junio C Hamano 0 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2014-02-18 21:29 UTC (permalink / raw) To: Kirill Smelkov; +Cc: kirr, git Kirill Smelkov <kirr@navytux.spb.ru> writes: >> > 2) alloca(), for small arrays, is used for the same reason - if we change >> > it to xmalloc()/free() the timings get worse >> >> Do you see any use of it outside compat/? >> >> I thought we specifically avoid alloca() for portability. Also we >> do not use variable-length-arrays on the stack either, I think. > > No, no usage outside compat/ and I knew alloca and VLAs are not used in > Git codebase for portability, and I understand alloca will be > criticized, but wanted to start the discussion rolling. > > I've actually started without alloca, and used xmalloc/free for > [nparent] vectors, but the impact was measurable, so it just had to be > changed to something more optimal. > > For me, personally, alloca is ok, but I understand there could be > portability issues (by the way, what compiler/system Git cares about > does not have working alloca?). Thats why I propose we do the following > > 1. at configure time, determine, do we have working alloca, and define > > #define HAVE_ALLOCA > > if yes. > > 2. in code > > #ifdef HAVE_ALLOCA > # define xalloca(size) (alloca(size)) > # define xalloca_free(p) do {} while(0) > #else > # define xalloca(size) (xmalloc(size)) > # define xalloca_free(p) (free(p)) > #endif > > and use it like > > func() { > p = xalloca(size); > ... > > xalloca_free(p); > } > > This way, for systems, where alloca is available, we'll have optimal > on-stack allocations with fast executions. On the other hand, on > systems, where alloca is not available, this gracefully fallbacks to > xmalloc/free. > > Please tell me what you think. I guess the above is clean enough, and we cannot do better than that, if we want to use alloca() on platforms where we can. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly 2014-02-13 14:02 [PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov 2014-02-13 14:02 ` [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov @ 2014-02-13 14:02 ` Kirill Smelkov 2014-02-13 19:55 ` Junio C Hamano 1 sibling, 1 reply; 10+ messages in thread From: Kirill Smelkov @ 2014-02-13 14:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kirill Smelkov As was recently shown (c839f1bd "combine-diff: optimize combine_diff_path sets intersection"), combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(*), we can call multiparent tree-walker directly and get ready paths. (*) because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> --- combine-diff.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---- diff.c | 1 + 2 files changed, 81 insertions(+), 5 deletions(-) diff --git a/combine-diff.c b/combine-diff.c index 1732dfd..ddf7495 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -1303,7 +1303,7 @@ static const char *path_path(void *obj) /* find set of paths that every parent touches */ -static struct combine_diff_path *find_paths(const unsigned char *sha1, +static struct combine_diff_path *find_paths_generic(const unsigned char *sha1, const struct sha1_array *parents, struct diff_options *opt) { struct combine_diff_path *paths = NULL; @@ -1316,6 +1316,7 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, /* tell diff_tree to emit paths in sorted (=tree) order */ opt->orderfile = NULL; + /* D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (wrt paths) */ for (i = 0; i < num_parent; i++) { /* * show stat against the first parent even when doing @@ -1346,6 +1347,35 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, } +/* + * find set of paths that everybody touches, assuming diff is run without + * rename/copy detection, etc, comparing all trees simultaneously (= faster). + */ +static struct combine_diff_path *find_paths_multitree( + const unsigned char *sha1, const struct sha1_array *parents, + struct diff_options *opt) +{ + int i, nparent = parents->nr; + const unsigned char **parents_sha1; + struct combine_diff_path paths_head; + struct strbuf base; + + parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0])); + for (i = 0; i < nparent; i++) + parents_sha1[i] = parents->sha1[i]; + + /* fake list head, so worker can assume it is non-NULL */ + paths_head.next = NULL; + + strbuf_init(&base, PATH_MAX); + diff_tree_paths(&paths_head, sha1, parents_sha1, nparent, &base, opt); + + strbuf_release(&base); + free(parents_sha1); + return paths_head.next; +} + + void diff_tree_combined(const unsigned char *sha1, const struct sha1_array *parents, int dense, @@ -1355,6 +1385,7 @@ void diff_tree_combined(const unsigned char *sha1, struct diff_options diffopts; struct combine_diff_path *p, *paths; int i, num_paths, needsep, show_log_first, num_parent = parents->nr; + int need_generic_pathscan; /* nothing to do, if no parents */ if (!num_parent) @@ -1377,11 +1408,55 @@ void diff_tree_combined(const unsigned char *sha1, /* find set of paths that everybody touches * - * NOTE find_paths() also handles --stat, as it computes - * diff(sha1,parent_i) for all i to do the job, specifically - * for parent0. + * NOTE + * + * Diffcore transformations are bound to diff_filespec and logic + * comparing two entries - i.e. they do not apply directly to combine + * diff. + * + * If some of such transformations is requested - we launch generic + * path scanning, which works significantly slower compared to + * simultaneous all-trees-in-one-go scan in find_paths_multitree(). + * + * TODO some of the filters could be ported to work on + * combine_diff_paths - i.e. all functionality that skips paths, so in + * theory, we could end up having only multitree path scanning. + * + * NOTE please keep this semantically in sync with diffcore_std() */ - paths = find_paths(sha1, parents, &diffopts); + need_generic_pathscan = opt->skip_stat_unmatch || + DIFF_OPT_TST(opt, FOLLOW_RENAMES) || + opt->break_opt != -1 || + opt->detect_rename || + opt->pickaxe || + opt->filter; + + + if (need_generic_pathscan) { + /* NOTE generic case also handles --stat, as it computes + * diff(sha1,parent_i) for all i to do the job, specifically + * for parent0. + */ + paths = find_paths_generic(sha1, parents, &diffopts); + } + else { + paths = find_paths_multitree(sha1, parents, &diffopts); + + /* show stat against the first parent even + * when doing combined diff. + */ + int stat_opt = (opt->output_format & + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT)); + if (stat_opt) { + diffopts.output_format = stat_opt; + + diff_tree_sha1(parents->sha1[0], sha1, "", &diffopts); + diffcore_std(&diffopts); + if (opt->orderfile) + diffcore_order(opt->orderfile); + diff_flush(&diffopts); + } + } /* find out number of surviving paths */ for (num_paths = 0, p = paths; p; p = p->next) diff --git a/diff.c b/diff.c index cda4aa8..f2fff46 100644 --- a/diff.c +++ b/diff.c @@ -4764,6 +4764,7 @@ void diffcore_fix_diff_index(struct diff_options *options) void diffcore_std(struct diff_options *options) { + /* NOTE please keep the following in sync with diff_tree_combined() */ if (options->skip_stat_unmatch) diffcore_skip_stat_unmatch(options); if (!options->found_follow) { -- 1.9.rc1.181.g641f458 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly 2014-02-13 14:02 ` [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov @ 2014-02-13 19:55 ` Junio C Hamano 2014-02-14 12:16 ` Kirill Smelkov 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2014-02-13 19:55 UTC (permalink / raw) To: Kirill Smelkov; +Cc: git Kirill Smelkov <kirr@mns.spb.ru> writes: > + if (need_generic_pathscan) { > + /* NOTE generic case also handles --stat, as it computes > + * diff(sha1,parent_i) for all i to do the job, specifically > + * for parent0. > + */ > + paths = find_paths_generic(sha1, parents, &diffopts); > + } > + else { > + paths = find_paths_multitree(sha1, parents, &diffopts); > + > + /* show stat against the first parent even > + * when doing combined diff. > + */ > + int stat_opt = (opt->output_format & > + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT)); /* * We see decl-after-stmt here. * Also please have slash-asterisk and asterisk-slash * at the beginning and the end of a multi-line comment * block on their own line. */ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly 2014-02-13 19:55 ` Junio C Hamano @ 2014-02-14 12:16 ` Kirill Smelkov 0 siblings, 0 replies; 10+ messages in thread From: Kirill Smelkov @ 2014-02-14 12:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Feb 13, 2014 at 11:55:08AM -0800, Junio C Hamano wrote: > Kirill Smelkov <kirr@mns.spb.ru> writes: > > > + if (need_generic_pathscan) { > > + /* NOTE generic case also handles --stat, as it computes > > + * diff(sha1,parent_i) for all i to do the job, specifically > > + * for parent0. > > + */ > > + paths = find_paths_generic(sha1, parents, &diffopts); > > + } > > + else { > > + paths = find_paths_multitree(sha1, parents, &diffopts); > > + > > + /* show stat against the first parent even > > + * when doing combined diff. > > + */ > > + int stat_opt = (opt->output_format & > > + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT)); > > /* > * We see decl-after-stmt here. > * Also please have slash-asterisk and asterisk-slash > * at the beginning and the end of a multi-line comment > * block on their own line. > */ Sorry, and thanks for noticing. I usually compile with -Wall, but it seems it is not enough without explicitly specifying -std=c89. Comments corrected and the decl-after-stmt fixed, and this time I've compiled with `-std=c89 -pedantic -Wall -Wextra` to assure no new warnings are introduced. Please apply and thanks beforehand, Kirill ---- 8< ---- From: Kirill Smelkov <kirr@mns.spb.ru> Subject: [PATCH v2 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly As was recently shown (c839f1bd "combine-diff: optimize combine_diff_path sets intersection"), combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(*), we can call multiparent tree-walker directly and get ready paths. (*) because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> --- combine-diff.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---- diff.c | 1 + 2 files changed, 84 insertions(+), 5 deletions(-) Chages since v1: - fixed declaration-after-statement, and reworked multiline comments to start and end with /* and */ on separate lines. diff --git a/combine-diff.c b/combine-diff.c index 1732dfd..12764fb 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -1303,7 +1303,7 @@ static const char *path_path(void *obj) /* find set of paths that every parent touches */ -static struct combine_diff_path *find_paths(const unsigned char *sha1, +static struct combine_diff_path *find_paths_generic(const unsigned char *sha1, const struct sha1_array *parents, struct diff_options *opt) { struct combine_diff_path *paths = NULL; @@ -1316,6 +1316,7 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, /* tell diff_tree to emit paths in sorted (=tree) order */ opt->orderfile = NULL; + /* D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (wrt paths) */ for (i = 0; i < num_parent; i++) { /* * show stat against the first parent even when doing @@ -1346,6 +1347,35 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, } +/* + * find set of paths that everybody touches, assuming diff is run without + * rename/copy detection, etc, comparing all trees simultaneously (= faster). + */ +static struct combine_diff_path *find_paths_multitree( + const unsigned char *sha1, const struct sha1_array *parents, + struct diff_options *opt) +{ + int i, nparent = parents->nr; + const unsigned char **parents_sha1; + struct combine_diff_path paths_head; + struct strbuf base; + + parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0])); + for (i = 0; i < nparent; i++) + parents_sha1[i] = parents->sha1[i]; + + /* fake list head, so worker can assume it is non-NULL */ + paths_head.next = NULL; + + strbuf_init(&base, PATH_MAX); + diff_tree_paths(&paths_head, sha1, parents_sha1, nparent, &base, opt); + + strbuf_release(&base); + free(parents_sha1); + return paths_head.next; +} + + void diff_tree_combined(const unsigned char *sha1, const struct sha1_array *parents, int dense, @@ -1355,6 +1385,7 @@ void diff_tree_combined(const unsigned char *sha1, struct diff_options diffopts; struct combine_diff_path *p, *paths; int i, num_paths, needsep, show_log_first, num_parent = parents->nr; + int need_generic_pathscan; /* nothing to do, if no parents */ if (!num_parent) @@ -1377,11 +1408,58 @@ void diff_tree_combined(const unsigned char *sha1, /* find set of paths that everybody touches * - * NOTE find_paths() also handles --stat, as it computes - * diff(sha1,parent_i) for all i to do the job, specifically - * for parent0. + * NOTE + * + * Diffcore transformations are bound to diff_filespec and logic + * comparing two entries - i.e. they do not apply directly to combine + * diff. + * + * If some of such transformations is requested - we launch generic + * path scanning, which works significantly slower compared to + * simultaneous all-trees-in-one-go scan in find_paths_multitree(). + * + * TODO some of the filters could be ported to work on + * combine_diff_paths - i.e. all functionality that skips paths, so in + * theory, we could end up having only multitree path scanning. + * + * NOTE please keep this semantically in sync with diffcore_std() */ - paths = find_paths(sha1, parents, &diffopts); + need_generic_pathscan = opt->skip_stat_unmatch || + DIFF_OPT_TST(opt, FOLLOW_RENAMES) || + opt->break_opt != -1 || + opt->detect_rename || + opt->pickaxe || + opt->filter; + + + if (need_generic_pathscan) { + /* + * NOTE generic case also handles --stat, as it computes + * diff(sha1,parent_i) for all i to do the job, specifically + * for parent0. + */ + paths = find_paths_generic(sha1, parents, &diffopts); + } + else { + int stat_opt; + paths = find_paths_multitree(sha1, parents, &diffopts); + + /* + * show stat against the first parent even + * when doing combined diff. + */ + stat_opt = (opt->output_format & + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT)); + if (stat_opt) { + diffopts.output_format = stat_opt; + + diff_tree_sha1(parents->sha1[0], sha1, "", &diffopts); + diffcore_std(&diffopts); + if (opt->orderfile) + diffcore_order(opt->orderfile); + diff_flush(&diffopts); + } + } /* find out number of surviving paths */ for (num_paths = 0, p = paths; p; p = p->next) diff --git a/diff.c b/diff.c index cda4aa8..f2fff46 100644 --- a/diff.c +++ b/diff.c @@ -4764,6 +4764,7 @@ void diffcore_fix_diff_index(struct diff_options *options) void diffcore_std(struct diff_options *options) { + /* NOTE please keep the following in sync with diff_tree_combined() */ if (options->skip_stat_unmatch) diffcore_skip_stat_unmatch(options); if (!options->found_follow) { -- 1.9.rc1.181.g641f458 ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-02-18 21:29 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-02-13 14:02 [PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov 2014-02-13 14:02 ` [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov 2014-02-13 19:51 ` Junio C Hamano 2014-02-14 12:15 ` Kirill Smelkov 2014-02-14 17:37 ` Junio C Hamano 2014-02-16 8:08 ` Kirill Smelkov 2014-02-18 21:29 ` Junio C Hamano 2014-02-13 14:02 ` [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov 2014-02-13 19:55 ` Junio C Hamano 2014-02-14 12:16 ` Kirill Smelkov
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.