From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Git List" <git@vger.kernel.org>, "Junio C Hamano" <gitster@pobox.com>, "Christian Couder" <christian.couder@gmail.com>, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH 2/2] ref-filter: implement "quick" formats
Date: Sun, 5 Sep 2021 16:20:07 +0800
Message-ID: <CAOLTT8QYe3PBPxSH8CYY+FatSfT7C5m6nccR2xMZ1yxSDFh5OQ@mail.gmail.com>
In-Reply-To: <YTNps0YBOaRNvPzk@coredump.intra.peff.net>

Jeff King <peff@peff.net> wrote on Sat, Sep 4, 2021 at 8:42 PM:
>
> Some commonly-used formats can be output _much_ faster than going
> through the usual atom-formatting code. E.g., "%(objectname) %(refname)"
> can just be a simple printf. This commit detects a few easy cases and
> uses a hard-coded output function instead.
>
> Note two things about the implementation:
>
>  - we could probably go further here. E.g., %(refname:lstrip) should be
>    easy-ish to optimize, too. Likewise, mixed-text like "delete
>    %(refname)" would be nice to have.
>
>  - the code is repetitive and enumerates all the cases, and it feels
>    like we ought to be able to generalize it more. But that's exactly
>    what the current formatting tries to do!
>
> So this whole thing is pretty horrible, and is a hack around the
> slowness of the whole used_atom system. It _should_ be possible to
> refactor that system to have roughly the same cost, but this will serve
> in the meantime.
>
> Here are some numbers ("stream" is Git with the streaming optimization
> from the previous commit, and "quick" is this commit):
>
>   Benchmark #1: ./git.stream for-each-ref --format='%(objectname) %(refname)'
>     Time (mean ± σ):     229.2 ms ±   6.6 ms    [User: 228.3 ms, System: 0.9 ms]
>     Range (min … max):   220.4 ms … 242.6 ms    13 runs
>
>   Benchmark #2: ./git.quick for-each-ref --format='%(objectname) %(refname)'
>     Time (mean ± σ):      94.8 ms ±   2.2 ms    [User: 93.5 ms, System: 1.4 ms]
>     Range (min … max):    90.8 ms …  98.2 ms    32 runs
>
>   Summary
>     './git.quick for-each-ref --format='%(objectname) %(refname)'' ran
>       2.42 ± 0.09 times faster than './git.stream for-each-ref --format='%(objectname) %(refname)''
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  ref-filter.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  ref-filter.h | 13 +++++++++++
>  2 files changed, 76 insertions(+)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 17b78b1d30..1efa3aadc8 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1009,6 +1009,37 @@ static int reject_atom(enum atom_type atom_type)
>  	return atom_type == ATOM_REST;
>  }
>
> +static void set_up_quick_format(struct ref_format *format)
> +{
> +	/* quick formats don't handle any special quoting */
> +	if (format->quote_style)
> +		return;
> +
> +	/*
> +	 * no atoms at all; this should be uncommon in real life, but it may be
> +	 * interesting for benchmarking
> +	 */
> +	if (!used_atom_cnt) {
> +		format->quick = REF_FORMAT_QUICK_VERBATIM;
> +		return;
> +	}
> +
> +	/*
> +	 * It's tempting to look at used_atom here, but it's wrong to do so: we
> +	 * need not only to be sure of the needed atoms, but also their order
> +	 * and any verbatim parts of the format. So instead, let's just
> +	 * hard-code some specific formats.
> +	 */
> +	if (!strcmp(format->format, "%(refname)"))
> +		format->quick = REF_FORMAT_QUICK_REFNAME;
> +	else if (!strcmp(format->format, "%(objectname)"))
> +		format->quick = REF_FORMAT_QUICK_OBJECTNAME;
> +	else if (!strcmp(format->format, "%(refname) %(objectname)"))
> +		format->quick = REF_FORMAT_QUICK_REFNAME_OBJECTNAME;
> +	else if (!strcmp(format->format, "%(objectname) %(refname)"))
> +		format->quick = REF_FORMAT_QUICK_OBJECTNAME_REFNAME;
> +}
> +
>  /*
>   * Make sure the format string is well formed, and parse out
>   * the used atoms.
> @@ -1047,6 +1078,9 @@ int verify_ref_format(struct ref_format *format)
>  	}
>  	if (format->need_color_reset_at_eol && !want_color(format->use_color))
>  		format->need_color_reset_at_eol = 0;
> +
> +	set_up_quick_format(format);
> +
>  	return 0;
>  }
>
> @@ -2617,6 +2651,32 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
>  	}
>  }
>
> +static int quick_ref_format(const struct ref_format *format,
> +			    const char *refname,
> +			    const struct object_id *oid)
> +{
> +	switch(format->quick) {
> +	case REF_FORMAT_QUICK_NONE:
> +		return -1;
> +	case REF_FORMAT_QUICK_VERBATIM:
> +		printf("%s\n", format->format);
> +		return 0;
> +	case REF_FORMAT_QUICK_REFNAME:
> +		printf("%s\n", refname);
> +		return 0;
> +	case REF_FORMAT_QUICK_OBJECTNAME:
> +		printf("%s\n", oid_to_hex(oid));
> +		return 0;
> +	case REF_FORMAT_QUICK_REFNAME_OBJECTNAME:
> +		printf("%s %s\n", refname, oid_to_hex(oid));
> +		return 0;
> +	case REF_FORMAT_QUICK_OBJECTNAME_REFNAME:
> +		printf("%s %s\n", oid_to_hex(oid), refname);
> +		return 0;
> +	}
> +	BUG("unknown ref_format_quick value: %d", format->quick);
> +}
> +

So as a fast path, we actually avoid format_ref_array_item() when the
format uses only %(objectname) and %(refname). But the problem is that
it's not very elegant (it relies on string comparison), and it does
nothing to optimize other atoms that require in-depth parsing.
I remember the "fast path" that Ævar used last time, and it seems that
Junio doesn't like such fast paths. [1][2]

>  int format_ref_array_item(struct ref_array_item *info,
>  			   struct ref_format *format,
>  			   struct strbuf *final_buf,
> @@ -2670,6 +2730,9 @@ void pretty_print_ref(const char *name, const struct object_id *oid,
>  	struct strbuf output = STRBUF_INIT;
>  	struct strbuf err = STRBUF_INIT;
>
> +	if (!quick_ref_format(format, name, oid))
> +		return;
> +
>  	ref_item = new_ref_array_item(name, oid);
>  	ref_item->kind = ref_kind_from_refname(name);
>  	if (format_ref_array_item(ref_item, format, &output, &err))
> diff --git a/ref-filter.h b/ref-filter.h
> index ecea1837a2..fde5c3a1cb 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -87,6 +87,19 @@ struct ref_format {
>
>  	/* Internal state to ref-filter */
>  	int need_color_reset_at_eol;
> +
> +	/*
> +	 * Set by verify_ref_format(); if not NONE, we can skip the usual
> +	 * formatting and use an optimized routine.
> +	 */
> +	enum ref_format_quick {
> +		REF_FORMAT_QUICK_NONE = 0,
> +		REF_FORMAT_QUICK_VERBATIM,
> +		REF_FORMAT_QUICK_REFNAME,
> +		REF_FORMAT_QUICK_OBJECTNAME,
> +		REF_FORMAT_QUICK_REFNAME_OBJECTNAME,
> +		REF_FORMAT_QUICK_OBJECTNAME_REFNAME,
> +	} quick;
>  };
>
>  #define REF_FORMAT_INIT { .use_color = -1 }
> --
> 2.33.0.618.g5b11852304

[1]: https://lore.kernel.org/git/5903d02324f3275b3aa442bb3ca2602564c543dc.1626363626.git.gitgitgadget@gmail.com/
[2]: https://lore.kernel.org/git/87eecf8ork.fsf@evledraar.gmail.com/
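
To make the fast-path idea from this message concrete, here is a minimal
standalone sketch in C. It is not the patch's code, only an illustration
under stated assumptions: the names quick_kind, classify_format() and
quick_format_item() are hypothetical, the object ID is passed in as an
already-hex string instead of going through oid_to_hex(), output goes to
a caller-supplied buffer rather than printf(), and the VERBATIM case is
left out. What it tries to show is that the string comparisons run once
per format string, while the per-ref work is a single switch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the patch's enum ref_format_quick. */
enum quick_kind {
	QUICK_NONE = 0,
	QUICK_REFNAME,
	QUICK_OBJECTNAME,
	QUICK_REFNAME_OBJECTNAME,
	QUICK_OBJECTNAME_REFNAME,
};

/*
 * Classify a format string. This is meant to run once per format, so
 * the strcmp() calls are not repeated for every ref.
 */
static enum quick_kind classify_format(const char *format)
{
	if (!strcmp(format, "%(refname)"))
		return QUICK_REFNAME;
	if (!strcmp(format, "%(objectname)"))
		return QUICK_OBJECTNAME;
	if (!strcmp(format, "%(refname) %(objectname)"))
		return QUICK_REFNAME_OBJECTNAME;
	if (!strcmp(format, "%(objectname) %(refname)"))
		return QUICK_OBJECTNAME_REFNAME;
	return QUICK_NONE;
}

/*
 * Write one output line for a ref into buf. Returns the value of
 * snprintf() on the fast path, or -1 when the caller has to fall back
 * to the generic formatter.
 */
static int quick_format_item(enum quick_kind kind, const char *refname,
			     const char *hex_oid, char *buf, size_t len)
{
	switch (kind) {
	case QUICK_NONE:
		return -1;
	case QUICK_REFNAME:
		return snprintf(buf, len, "%s\n", refname);
	case QUICK_OBJECTNAME:
		return snprintf(buf, len, "%s\n", hex_oid);
	case QUICK_REFNAME_OBJECTNAME:
		return snprintf(buf, len, "%s %s\n", refname, hex_oid);
	case QUICK_OBJECTNAME_REFNAME:
		return snprintf(buf, len, "%s %s\n", hex_oid, refname);
	}
	assert(!"unknown quick_kind value");
	return -1;
}
```

A caller would classify the format once, then call quick_format_item()
for each ref and fall back to the generic formatter whenever it returns
-1. A format such as "delete %(refname)" still takes the slow path in
this sketch, which is the limitation raised in the thread.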