From: Kirill Smelkov <kirr@navytux.spb.ru>
To: Junio C Hamano <gitster@pobox.com>
Cc: kirr@mns.spb.ru, git@vger.kernel.org
Subject: Re: [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion
Date: Tue, 25 Mar 2014 13:23:20 +0400 [thread overview]
Message-ID: <20140325092320.GC3777@mini.zxlink> (raw)
In-Reply-To: <xmqq61n3p913.fsf@gitster.dls.corp.google.com>
On Mon, Mar 24, 2014 at 02:43:36PM -0700, Junio C Hamano wrote:
> Kirill Smelkov <kirr@mns.spb.ru> writes:
>
> > instead of allocating it all the time for every subtree in
> > __diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all
> > callee just use it in stacking style, without memory allocations.
> >
> > This should be faster, and for me this change gives the following
> > slight speedups for
> >
> > git log --raw --no-abbrev --no-renames --format='%H'
> >
> > navy.git linux.git v3.10..v3.11
> >
> > before 0.618s 1.903s
> > after 0.611s 1.889s
> > speedup 1.1% 0.7%
> >
> > Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
> > ---
> >
> > Changes since v1:
> >
> > - don't need to touch diff.h, as the function we are changing became static.
> >
> > tree-diff.c | 36 ++++++++++++++++++------------------
> > 1 file changed, 18 insertions(+), 18 deletions(-)
> >
> > diff --git a/tree-diff.c b/tree-diff.c
> > index aea0297..c76821d 100644
> > --- a/tree-diff.c
> > +++ b/tree-diff.c
> > @@ -115,7 +115,7 @@ static void show_path(struct strbuf *base, struct diff_options *opt,
> > if (recurse) {
> > strbuf_addch(base, '/');
> > __diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
> > - t2 ? t2->entry.sha1 : NULL, base->buf, opt);
> > + t2 ? t2->entry.sha1 : NULL, base, opt);
> > }
> >
> > strbuf_setlen(base, old_baselen);
>
> I was scratching my head for a while, after seeing that there does
> not seem to be any *new* code added by this patch in order to
> store-away the original length and restore the singleton base buffer
> to the original length after using addch/addstr to extend it.
>
> But I see that the code has already been prepared to do this
> conversion. I wonder why we didn't do this earlier ;-)
The conversion to reusing memory started in 48932677 "diff-tree: convert
base+baselen to writable strbuf" which allowed to avoid "quite a bit of
malloc() and memcpy()", but for this to work allocation at diff_tree()
entry had to be there.
In particular it had to be there, because diff_tree() accepted base as C
string, not strbuf, and since diff_tree() was calling itself
recursively - oops - new allocation on every subtree.
I've opened the door for avoiding allocations via splitting diff_tree
into high-level and low-level parts. The high-level part still accepts
`char *base`, but low-level function operates on strbuf and recurses
into low-level self.
The high-level diff_tree_sha1() still allocates memory for every
diff(tree1,tree2), but that is significantly lower compared to
allocating memory on every subtree...
The lesson here is: better use strbuf for api unless there is a reason
not to.
> Looks good. Thanks.
Thanks.
> > @@ -138,12 +138,10 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base,
> > }
> >
> > static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > - const char *base_str, struct diff_options *opt)
> > + struct strbuf *base, struct diff_options *opt)
> > {
> > struct tree_desc t1, t2;
> > void *t1tree, *t2tree;
> > - struct strbuf base;
> > - int baselen = strlen(base_str);
> >
> > t1tree = fill_tree_descriptor(&t1, old);
> > t2tree = fill_tree_descriptor(&t2, new);
> > @@ -151,17 +149,14 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > /* Enable recursion indefinitely */
> > opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
> >
> > - strbuf_init(&base, PATH_MAX);
> > - strbuf_add(&base, base_str, baselen);
> > -
> > for (;;) {
> > int cmp;
> >
> > if (diff_can_quit_early(opt))
> > break;
> > if (opt->pathspec.nr) {
> > - skip_uninteresting(&t1, &base, opt);
> > - skip_uninteresting(&t2, &base, opt);
> > + skip_uninteresting(&t1, base, opt);
> > + skip_uninteresting(&t2, base, opt);
> > }
> > if (!t1.size && !t2.size)
> > break;
> > @@ -173,7 +168,7 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
> > hashcmp(t1.entry.sha1, t2.entry.sha1) ||
> > (t1.entry.mode != t2.entry.mode))
> > - show_path(&base, opt, &t1, &t2);
> > + show_path(base, opt, &t1, &t2);
> >
> > update_tree_entry(&t1);
> > update_tree_entry(&t2);
> > @@ -181,18 +176,17 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> >
> > /* t1 < t2 */
> > else if (cmp < 0) {
> > - show_path(&base, opt, &t1, /*t2=*/NULL);
> > + show_path(base, opt, &t1, /*t2=*/NULL);
> > update_tree_entry(&t1);
> > }
> >
> > /* t1 > t2 */
> > else {
> > - show_path(&base, opt, /*t1=*/NULL, &t2);
> > + show_path(base, opt, /*t1=*/NULL, &t2);
> > update_tree_entry(&t2);
> > }
> > }
> >
> > - strbuf_release(&base);
> > free(t2tree);
> > free(t1tree);
> > return 0;
> > @@ -209,7 +203,7 @@ static inline int diff_might_be_rename(void)
> > !DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
> > }
> >
> > -static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> > +static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt)
> > {
> > struct diff_options diff_opts;
> > struct diff_queue_struct *q = &diff_queued_diff;
> > @@ -306,13 +300,19 @@ static void try_to_follow_renames(const unsigned char *old, const unsigned char
> > q->nr = 1;
> > }
> >
> > -int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> > +int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base_str, struct diff_options *opt)
> > {
> > + struct strbuf base;
> > int retval;
> >
> > - retval = __diff_tree_sha1(old, new, base, opt);
> > - if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
> > - try_to_follow_renames(old, new, base, opt);
> > + strbuf_init(&base, PATH_MAX);
> > + strbuf_addstr(&base, base_str);
> > +
> > + retval = __diff_tree_sha1(old, new, &base, opt);
> > + if (!*base_str && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
> > + try_to_follow_renames(old, new, &base, opt);
> > +
> > + strbuf_release(&base);
> >
> > return retval;
> > }
next prev parent reply other threads:[~2014-03-25 9:27 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-24 16:21 [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov
2014-02-24 16:21 ` [PATCH 01/19] combine-diff: move show_log_first logic/action out of paths scanning Kirill Smelkov
2014-02-24 16:21 ` [PATCH 02/19] combine-diff: move changed-paths scanning logic into its own function Kirill Smelkov
2014-02-24 16:21 ` [PATCH 03/19] tree-diff: no need to manually verify that there is no mode change for a path Kirill Smelkov
2014-02-24 16:21 ` [PATCH 04/19] tree-diff: no need to pass match to skip_uninteresting() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 05/19] tree-diff: show_tree() is not needed Kirill Smelkov
2014-02-24 16:21 ` [PATCH 06/19] tree-diff: consolidate code for emitting diffs and recursion in one place Kirill Smelkov
2014-02-24 16:21 ` [PATCH 07/19] tree-diff: don't assume compare_tree_entry() returns -1,0,1 Kirill Smelkov
2014-02-24 16:21 ` [PATCH 08/19] tree-diff: move all action-taking code out of compare_tree_entry() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 09/19] tree-diff: rename compare_tree_entry -> tree_entry_pathcmp Kirill Smelkov
2014-02-24 16:21 ` [PATCH 10/19] tree-diff: show_path prototype is not needed anymore Kirill Smelkov
2014-02-24 16:21 ` [PATCH 11/19] tree-diff: simplify tree_entry_pathcmp Kirill Smelkov
2014-03-24 21:25 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH 12/19] tree-diff: remove special-case diff-emitting code for empty-tree cases Kirill Smelkov
2014-03-24 21:18 ` Junio C Hamano
2014-03-25 9:20 ` Kirill Smelkov
2014-03-25 17:45 ` Junio C Hamano
2014-03-26 18:32 ` Kirill Smelkov
2014-03-25 22:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 13/19] tree-diff: diff_tree() should now be static Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based Kirill Smelkov
2014-03-24 21:36 ` Junio C Hamano
2014-03-25 9:22 ` Kirill Smelkov
2014-03-25 17:46 ` Junio C Hamano
2014-03-26 19:52 ` Kirill Smelkov
2014-03-26 21:34 ` Junio C Hamano
2014-03-27 14:24 ` Kirill Smelkov
2014-03-27 18:48 ` Junio C Hamano
2014-03-27 19:43 ` Kirill Smelkov
2014-03-28 6:52 ` Johannes Sixt
2014-03-28 17:06 ` Junio C Hamano
2014-03-28 17:46 ` Johannes Sixt
2014-03-28 18:36 ` Junio C Hamano
2014-03-28 19:08 ` Johannes Sixt
2014-03-28 19:27 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 15/19] tree-diff: no need to call "full" diff_tree_sha1 from show_path() Kirill Smelkov
2014-03-27 14:21 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion Kirill Smelkov
2014-03-24 21:43 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov [this message]
2014-03-27 14:22 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH 17/19] Portable alloca for Git Kirill Smelkov
2014-02-28 10:58 ` Thomas Schwinge
2014-02-28 13:44 ` Erik Faye-Lund
2014-02-28 13:50 ` Erik Faye-Lund
2014-02-28 17:00 ` Kirill Smelkov
2014-02-28 17:19 ` Erik Faye-Lund
2014-03-05 9:31 ` Kirill Smelkov
2014-03-24 21:47 ` Junio C Hamano
2014-03-27 14:22 ` Kirill Smelkov
2014-04-09 12:48 ` Kirill Smelkov
2014-04-09 13:01 ` Erik Faye-Lund
2014-04-10 17:30 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH v2 18/19] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov
2014-03-27 14:23 ` Kirill Smelkov
2014-04-04 18:42 ` Junio C Hamano
2014-04-06 21:46 ` Kirill Smelkov
2014-04-07 17:29 ` Junio C Hamano
2014-04-07 20:26 ` Kirill Smelkov
2014-04-07 18:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 19/19] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov
2014-02-24 23:43 ` [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Duy Nguyen
2014-02-25 10:38 ` Kirill Smelkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140325092320.GC3777@mini.zxlink \
--to=kirr@navytux.spb.ru \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=kirr@mns.spb.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).