All of lore.kernel.org
 help / color / mirror / Atom feed
From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Christian Couder <chriscool@tuxfamily.org>,
	Hariom Verma <hariom18599@gmail.com>
Subject: Re: [PATCH] [GSOC] ref-filter: use single strbuf for all output
Date: Tue, 6 Apr 2021 17:49:47 +0800	[thread overview]
Message-ID: <CAOLTT8TwjRyT6MK_ekEx9APBv7jn17JRKj=mJQMO5Sk-DgHA-A@mail.gmail.com> (raw)
In-Reply-To: <YGuMaxoYgRkUR1sa@coredump.intra.peff.net>

Jeff King <peff@peff.net> 于2021年4月6日周二 上午6:17写道:
>
> On Mon, Apr 05, 2021 at 02:01:19PM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > When we use `git for-each-ref`, every ref will call
> > `show_ref_array_item()` and allocate its own final strbuf
> > and error strbuf. Instead, we can provide two single strbuf:
> > final_buf and error_buf that get reused for each output.
> >
> > When run it 100 times:
> >
> > $ git for-each-ref
> >
> > on git.git :
> >
> > 3.19s user
> > 3.88s system
> > 35% cpu
> > 20.199 total
> >
> > to:
> >
> > 2.89s user
> > 4.00s system
> > 34% cpu
> > 19.741 total
>
> That's a bigger performance improvement than I'd expect from this. I'm
> having trouble reproducing it here (I get the same time before and
> after). I also notice that you don't seem to be CPU bound, and we spend
> most of our time on system CPU (so program startup stuff, not the loop
> you're optimizing).
>
> I think a more interesting test is timing a single invocation with a
> large number of refs. If you are planning to do a lot of work on the
> formatting code, it might be worth adding such a test into t/perf (both
> to show off results, but also to catch any regressions after we make
> things faster).
>

It makes sense. A lot of refs can be convincing. Just like the number of
objects measured in `cat-files` is large enough.

But this is the first time I use `t/perf/*` and there is a little problem.
It seem like whatever I run single script like `sh ./p0007-write-cache.sh`
or just `make` or `./run ${HOME}/git -- ./p0002-read-cache.sh` , these
tests will fail.

> >     This patch learned Jeff King's optimization measures in git
> >     cat-file(79ed0a5): using a single strbuf for all objects output Instead
> >     of allocating a large number of small strbuf for every object.
>
> I do think this is a good direction (for all the reasons laid out in
> 79ed0a5), though it wasn't actually the part I was most worried about
> for ref-filter performance. The bigger issue in ref-filter is that each
> individual atom will generally have its own allocated string, and then
> we'll format all of the atoms into the final strbuf output. In most
> cases we could avoid those intermediate copies entirely.
>

Yes! In `ref-filter` we set object info in `v->s` and then append them to
current `stack` buffer, and finally set in `final_buf`, the copy of the string
is expensive. I don’t know if the optimization should start by removing the
stack buffer?

> I do think this would be a useful optimization to have in addition,
> though. As for the patch itself, I looked over the review that Eric
> gave, and he already said everything I would have. :)
>

I think it should be optimized, It will reduce the overhead of malloc and
free, but it is not obvious enough.

Yes, there is a lot of bad code in my patch.

> -Peff

Thanks.
--
ZheNing Hu

  reply	other threads:[~2021-04-06  9:50 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-05 14:01 [PATCH] [GSOC] ref-filter: use single strbuf for all output ZheNing Hu via GitGitGadget
2021-04-05 17:05 ` Eric Sunshine
2021-04-06  8:53   ` ZheNing Hu
2021-04-05 21:02 ` Derrick Stolee
2021-04-06  8:58   ` ZheNing Hu
2021-04-05 22:17 ` Jeff King
2021-04-06  9:49   ` ZheNing Hu [this message]
2021-04-06 10:35     ` ZheNing Hu
2021-04-06 14:00       ` Jeff King
2021-04-06 14:35         ` ZheNing Hu
2021-04-06 18:34 ` René Scharfe
2021-04-07 13:57   ` ZheNing Hu
2021-04-07 15:26 ` [PATCH v2] " ZheNing Hu via GitGitGadget
2021-04-07 20:31   ` Junio C Hamano
2021-04-08 12:05     ` ZheNing Hu
2021-04-07 21:27   ` Jeff King
2021-04-08 12:18     ` ZheNing Hu
2021-04-08 14:32       ` Jeff King
2021-04-08 14:43         ` ZheNing Hu
2021-04-08 14:51           ` Jeff King
2021-04-08 15:12             ` ZheNing Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOLTT8TwjRyT6MK_ekEx9APBv7jn17JRKj=mJQMO5Sk-DgHA-A@mail.gmail.com' \
    --to=adlternative@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.