From: Kirill Smelkov <kirr@navytux.spb.ru>
To: Junio C Hamano <gitster@pobox.com>
Cc: kirr@mns.spb.ru, git@vger.kernel.org
Subject: Re: [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion
Date: Thu, 27 Mar 2014 18:22:07 +0400 [thread overview]
Message-ID: <20140327142207.GB17333@mini.zxlink> (raw)
In-Reply-To: <20140325092320.GC3777@mini.zxlink>
On Tue, Mar 25, 2014 at 01:23:20PM +0400, Kirill Smelkov wrote:
> On Mon, Mar 24, 2014 at 02:43:36PM -0700, Junio C Hamano wrote:
> > Kirill Smelkov <kirr@mns.spb.ru> writes:
> >
> > > instead of allocating it all the time for every subtree in
> > > __diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all
> > > callee just use it in stacking style, without memory allocations.
> > >
> > > This should be faster, and for me this change gives the following
> > > slight speedups for
> > >
> > > git log --raw --no-abbrev --no-renames --format='%H'
> > >
> > > navy.git linux.git v3.10..v3.11
> > >
> > > before 0.618s 1.903s
> > > after 0.611s 1.889s
> > > speedup 1.1% 0.7%
> > >
> > > Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
> > > ---
> > >
> > > Changes since v1:
> > >
> > > - don't need to touch diff.h, as the function we are changing became static.
> > >
> > > tree-diff.c | 36 ++++++++++++++++++------------------
> > > 1 file changed, 18 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/tree-diff.c b/tree-diff.c
> > > index aea0297..c76821d 100644
> > > --- a/tree-diff.c
> > > +++ b/tree-diff.c
> > > @@ -115,7 +115,7 @@ static void show_path(struct strbuf *base, struct diff_options *opt,
> > > if (recurse) {
> > > strbuf_addch(base, '/');
> > > __diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
> > > - t2 ? t2->entry.sha1 : NULL, base->buf, opt);
> > > + t2 ? t2->entry.sha1 : NULL, base, opt);
> > > }
> > >
> > > strbuf_setlen(base, old_baselen);
> >
> > I was scratching my head for a while, after seeing that there does
> > not seem to be any *new* code added by this patch in order to
> > store-away the original length and restore the singleton base buffer
> > to the original length after using addch/addstr to extend it.
> >
> > But I see that the code has already been prepared to do this
> > conversion. I wonder why we didn't do this earlier ;-)
>
> The conversion to reusing memory started in 48932677 "diff-tree: convert
> base+baselen to writable strbuf" which allowed to avoid "quite a bit of
> malloc() and memcpy()", but for this to work allocation at diff_tree()
> entry had to be there.
>
> In particular it had to be there, because diff_tree() accepted base as C
> string, not strbuf, and since diff_tree() was calling itself
> recursively - oops - new allocation on every subtree.
>
> I've opened the door for avoiding allocations via splitting diff_tree
> into high-level and low-level parts. The high-level part still accepts
> `char *base`, but low-level function operates on strbuf and recurses
> into low-level self.
>
> The high-level diff_tree_sha1() still allocates memory for every
> diff(tree1,tree2), but that is significantly lower compared to
> allocating memory on every subtree...
>
> The lesson here is: better use strbuf for api unless there is a reason
> not to.
>
>
> > Looks good. Thanks.
>
> Thanks.
Thanks again. Here it goes adjusted to __diff_tree_sha1 -> ll_diff_tree_sha1 renaming:
(please keep author email)
---- 8< ----
From: Kirill Smelkov <kirr@mns.spb.ru>
Date: Mon, 24 Feb 2014 20:21:48 +0400
Subject: [PATCH v3] tree-diff: reuse base str(buf) memory on sub-tree recursion
instead of allocating it all the time for every subtree in
ll_diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all
callee just use it in stacking style, without memory allocations.
This should be faster, and for me this change gives the following
slight speedups for
git log --raw --no-abbrev --no-renames --format='%H'
navy.git linux.git v3.10..v3.11
before 0.618s 1.903s
after 0.611s 1.889s
speedup 1.1% 0.7%
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
---
Changes since v2:
- adjust to __diff_tree_sha1 -> ll_diff_tree_sha1 renaming.
Changes since v1:
- don't need to touch diff.h, as the function we are changing became
static.
tree-diff.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/tree-diff.c b/tree-diff.c
index 7fbb022..8c8bde6 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -8,7 +8,7 @@
static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char *new,
- const char *base_str, struct diff_options *opt);
+ struct strbuf *base, struct diff_options *opt);
/*
* Compare two tree entries, taking into account only path/S_ISDIR(mode),
@@ -123,7 +123,7 @@ static void show_path(struct strbuf *base, struct diff_options *opt,
if (recurse) {
strbuf_addch(base, '/');
ll_diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
- t2 ? t2->entry.sha1 : NULL, base->buf, opt);
+ t2 ? t2->entry.sha1 : NULL, base, opt);
}
strbuf_setlen(base, old_baselen);
@@ -146,12 +146,10 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base,
}
static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char *new,
- const char *base_str, struct diff_options *opt)
+ struct strbuf *base, struct diff_options *opt)
{
struct tree_desc t1, t2;
void *t1tree, *t2tree;
- struct strbuf base;
- int baselen = strlen(base_str);
t1tree = fill_tree_descriptor(&t1, old);
t2tree = fill_tree_descriptor(&t2, new);
@@ -159,17 +157,14 @@ static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char *new,
/* Enable recursion indefinitely */
opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
- strbuf_init(&base, PATH_MAX);
- strbuf_add(&base, base_str, baselen);
-
for (;;) {
int cmp;
if (diff_can_quit_early(opt))
break;
if (opt->pathspec.nr) {
- skip_uninteresting(&t1, &base, opt);
- skip_uninteresting(&t2, &base, opt);
+ skip_uninteresting(&t1, base, opt);
+ skip_uninteresting(&t2, base, opt);
}
if (!t1.size && !t2.size)
break;
@@ -181,7 +176,7 @@ static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char *new,
if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
hashcmp(t1.entry.sha1, t2.entry.sha1) ||
(t1.entry.mode != t2.entry.mode))
- show_path(&base, opt, &t1, &t2);
+ show_path(base, opt, &t1, &t2);
update_tree_entry(&t1);
update_tree_entry(&t2);
@@ -189,18 +184,17 @@ static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char *new,
/* t1 < t2 */
else if (cmp < 0) {
- show_path(&base, opt, &t1, /*t2=*/NULL);
+ show_path(base, opt, &t1, /*t2=*/NULL);
update_tree_entry(&t1);
}
/* t1 > t2 */
else {
- show_path(&base, opt, /*t1=*/NULL, &t2);
+ show_path(base, opt, /*t1=*/NULL, &t2);
update_tree_entry(&t2);
}
}
- strbuf_release(&base);
free(t2tree);
free(t1tree);
return 0;
@@ -217,7 +211,7 @@ static inline int diff_might_be_rename(void)
!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
}
-static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
+static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt)
{
struct diff_options diff_opts;
struct diff_queue_struct *q = &diff_queued_diff;
@@ -314,13 +308,19 @@ static void try_to_follow_renames(const unsigned char *old, const unsigned char
q->nr = 1;
}
-int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
+int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base_str, struct diff_options *opt)
{
+ struct strbuf base;
int retval;
- retval = ll_diff_tree_sha1(old, new, base, opt);
- if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
- try_to_follow_renames(old, new, base, opt);
+ strbuf_init(&base, PATH_MAX);
+ strbuf_addstr(&base, base_str);
+
+ retval = ll_diff_tree_sha1(old, new, &base, opt);
+ if (!*base_str && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
+ try_to_follow_renames(old, new, &base, opt);
+
+ strbuf_release(&base);
return retval;
}
--
1.9.rc0.143.g6fd479e
next prev parent reply other threads:[~2014-03-27 14:18 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-24 16:21 [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov
2014-02-24 16:21 ` [PATCH 01/19] combine-diff: move show_log_first logic/action out of paths scanning Kirill Smelkov
2014-02-24 16:21 ` [PATCH 02/19] combine-diff: move changed-paths scanning logic into its own function Kirill Smelkov
2014-02-24 16:21 ` [PATCH 03/19] tree-diff: no need to manually verify that there is no mode change for a path Kirill Smelkov
2014-02-24 16:21 ` [PATCH 04/19] tree-diff: no need to pass match to skip_uninteresting() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 05/19] tree-diff: show_tree() is not needed Kirill Smelkov
2014-02-24 16:21 ` [PATCH 06/19] tree-diff: consolidate code for emitting diffs and recursion in one place Kirill Smelkov
2014-02-24 16:21 ` [PATCH 07/19] tree-diff: don't assume compare_tree_entry() returns -1,0,1 Kirill Smelkov
2014-02-24 16:21 ` [PATCH 08/19] tree-diff: move all action-taking code out of compare_tree_entry() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 09/19] tree-diff: rename compare_tree_entry -> tree_entry_pathcmp Kirill Smelkov
2014-02-24 16:21 ` [PATCH 10/19] tree-diff: show_path prototype is not needed anymore Kirill Smelkov
2014-02-24 16:21 ` [PATCH 11/19] tree-diff: simplify tree_entry_pathcmp Kirill Smelkov
2014-03-24 21:25 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH 12/19] tree-diff: remove special-case diff-emitting code for empty-tree cases Kirill Smelkov
2014-03-24 21:18 ` Junio C Hamano
2014-03-25 9:20 ` Kirill Smelkov
2014-03-25 17:45 ` Junio C Hamano
2014-03-26 18:32 ` Kirill Smelkov
2014-03-25 22:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 13/19] tree-diff: diff_tree() should now be static Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based Kirill Smelkov
2014-03-24 21:36 ` Junio C Hamano
2014-03-25 9:22 ` Kirill Smelkov
2014-03-25 17:46 ` Junio C Hamano
2014-03-26 19:52 ` Kirill Smelkov
2014-03-26 21:34 ` Junio C Hamano
2014-03-27 14:24 ` Kirill Smelkov
2014-03-27 18:48 ` Junio C Hamano
2014-03-27 19:43 ` Kirill Smelkov
2014-03-28 6:52 ` Johannes Sixt
2014-03-28 17:06 ` Junio C Hamano
2014-03-28 17:46 ` Johannes Sixt
2014-03-28 18:36 ` Junio C Hamano
2014-03-28 19:08 ` Johannes Sixt
2014-03-28 19:27 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 15/19] tree-diff: no need to call "full" diff_tree_sha1 from show_path() Kirill Smelkov
2014-03-27 14:21 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion Kirill Smelkov
2014-03-24 21:43 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov
2014-03-27 14:22 ` Kirill Smelkov [this message]
2014-02-24 16:21 ` [PATCH 17/19] Portable alloca for Git Kirill Smelkov
2014-02-28 10:58 ` Thomas Schwinge
2014-02-28 13:44 ` Erik Faye-Lund
2014-02-28 13:50 ` Erik Faye-Lund
2014-02-28 17:00 ` Kirill Smelkov
2014-02-28 17:19 ` Erik Faye-Lund
2014-03-05 9:31 ` Kirill Smelkov
2014-03-24 21:47 ` Junio C Hamano
2014-03-27 14:22 ` Kirill Smelkov
2014-04-09 12:48 ` Kirill Smelkov
2014-04-09 13:01 ` Erik Faye-Lund
2014-04-10 17:30 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH v2 18/19] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov
2014-03-27 14:23 ` Kirill Smelkov
2014-04-04 18:42 ` Junio C Hamano
2014-04-06 21:46 ` Kirill Smelkov
2014-04-07 17:29 ` Junio C Hamano
2014-04-07 20:26 ` Kirill Smelkov
2014-04-07 18:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 19/19] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov
2014-02-24 23:43 ` [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Duy Nguyen
2014-02-25 10:38 ` Kirill Smelkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140327142207.GB17333@mini.zxlink \
--to=kirr@navytux.spb.ru \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=kirr@mns.spb.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).