From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: ZheNing Hu <adlternative@gmail.com>
Subject: [PATCH 2/2] ref-filter: implement "quick" formats
Date: Sat, 4 Sep 2021 08:42:27 -0400
Message-ID: <YTNps0YBOaRNvPzk@coredump.intra.peff.net>
In-Reply-To: <YTNpQ7Od1U/5i0R7@coredump.intra.peff.net>

Some commonly-used formats can be output _much_ faster than going
through the usual atom-formatting code. E.g., "%(objectname) %(refname)"
can just be a simple printf. This commit detects a few easy cases and
uses a hard-coded output function instead.

Note two things about the implementation:

 - we could probably go further here. E.g., %(refname:lstrip=<N>) should
   be easy-ish to optimize, too. Likewise, mixed-text formats like
   "delete %(refname)" would be nice to have.

 - the code is repetitive and enumerates all the cases, and it feels
   like we ought to be able to generalize it more. But that's exactly
   what the current formatting tries to do!

So this whole thing is pretty horrible, and is a hack around the
slowness of the whole used_atom system. It _should_ be possible to
refactor that system to cost roughly the same as these hard-coded
paths, but this will serve in the meantime.
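
For readers who want the shape of the idea without the surrounding
ref-filter plumbing, here is a minimal standalone sketch. It is not the
code in this patch: the names quick_kind, detect_quick() and
quick_format() are hypothetical, the object id is passed as an
already-hex string, and output goes to a caller-supplied buffer rather
than stdout so that it is easy to check. The verbatim (no atoms) case
is left out. What it shows is the two-step structure: classify the
whole format string once, then do a single printf-style call per ref.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the enum ref_format_quick values. */
enum quick_kind {
	QUICK_NONE = 0,
	QUICK_REFNAME,
	QUICK_OBJECTNAME,
	QUICK_REFNAME_OBJECTNAME,
	QUICK_OBJECTNAME_REFNAME
};

/*
 * Classify the format once, up front. We compare the entire string
 * rather than the set of atoms, because order and literal text matter.
 */
static enum quick_kind detect_quick(const char *fmt)
{
	if (!strcmp(fmt, "%(refname)"))
		return QUICK_REFNAME;
	if (!strcmp(fmt, "%(objectname)"))
		return QUICK_OBJECTNAME;
	if (!strcmp(fmt, "%(refname) %(objectname)"))
		return QUICK_REFNAME_OBJECTNAME;
	if (!strcmp(fmt, "%(objectname) %(refname)"))
		return QUICK_OBJECTNAME_REFNAME;
	return QUICK_NONE;
}

/*
 * Per-ref fast path. Returns the snprintf() result on success, or -1
 * if the caller has to fall back to the general formatter.
 */
static int quick_format(enum quick_kind kind,
			const char *refname, const char *hex,
			char *out, size_t outlen)
{
	switch (kind) {
	case QUICK_REFNAME:
		return snprintf(out, outlen, "%s\n", refname);
	case QUICK_OBJECTNAME:
		return snprintf(out, outlen, "%s\n", hex);
	case QUICK_REFNAME_OBJECTNAME:
		return snprintf(out, outlen, "%s %s\n", refname, hex);
	case QUICK_OBJECTNAME_REFNAME:
		return snprintf(out, outlen, "%s %s\n", hex, refname);
	case QUICK_NONE:
		break;
	}
	return -1;
}
```

A caller would run detect_quick() once after parsing the format, then
try quick_format() for each ref and only build the full atom machinery
when it returns -1.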

Here are some numbers ("stream" is Git with the streaming optimization
from the previous commit, and "quick" is this commit):

  Benchmark #1: ./git.stream for-each-ref --format='%(objectname) %(refname)'
    Time (mean ± σ):     229.2 ms ±   6.6 ms    [User: 228.3 ms, System: 0.9 ms]
    Range (min … max):   220.4 ms … 242.6 ms    13 runs

  Benchmark #2: ./git.quick for-each-ref --format='%(objectname) %(refname)'
    Time (mean ± σ):      94.8 ms ±   2.2 ms    [User: 93.5 ms, System: 1.4 ms]
    Range (min … max):    90.8 ms …  98.2 ms    32 runs

  Summary
    './git.quick for-each-ref --format='%(objectname) %(refname)'' ran
      2.42 ± 0.09 times faster than './git.stream for-each-ref --format='%(objectname) %(refname)''

Signed-off-by: Jeff King <peff@peff.net>
---
 ref-filter.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 ref-filter.h | 13 +++++++++++
 2 files changed, 76 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 17b78b1d30..1efa3aadc8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1009,6 +1009,37 @@ static int reject_atom(enum atom_type atom_type)
 	return atom_type == ATOM_REST;
 }
 
+static void set_up_quick_format(struct ref_format *format)
+{
+	/* quick formats don't handle any special quoting */
+	if (format->quote_style)
+		return;
+
+	/*
+	 * no atoms at all; this should be uncommon in real life, but it may be
+	 * interesting for benchmarking
+	 */
+	if (!used_atom_cnt) {
+		format->quick = REF_FORMAT_QUICK_VERBATIM;
+		return;
+	}
+
+	/*
+	 * It's tempting to look at used_atom here, but it's wrong to do so: we
+	 * need not only to be sure of the needed atoms, but also their order
+	 * and any verbatim parts of the format. So instead, let's just
+	 * hard-code some specific formats.
+	 */
+	if (!strcmp(format->format, "%(refname)"))
+		format->quick = REF_FORMAT_QUICK_REFNAME;
+	else if (!strcmp(format->format, "%(objectname)"))
+		format->quick = REF_FORMAT_QUICK_OBJECTNAME;
+	else if (!strcmp(format->format, "%(refname) %(objectname)"))
+		format->quick = REF_FORMAT_QUICK_REFNAME_OBJECTNAME;
+	else if (!strcmp(format->format, "%(objectname) %(refname)"))
+		format->quick = REF_FORMAT_QUICK_OBJECTNAME_REFNAME;
+}
+
 /*
  * Make sure the format string is well formed, and parse out
  * the used atoms.
@@ -1047,6 +1078,9 @@ int verify_ref_format(struct ref_format *format)
 	}
 	if (format->need_color_reset_at_eol && !want_color(format->use_color))
 		format->need_color_reset_at_eol = 0;
+
+	set_up_quick_format(format);
+
 	return 0;
 }
 
@@ -2617,6 +2651,32 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 	}
 }
 
+static int quick_ref_format(const struct ref_format *format,
+			    const char *refname,
+			    const struct object_id *oid)
+{
+	switch(format->quick) {
+	case REF_FORMAT_QUICK_NONE:
+		return -1;
+	case REF_FORMAT_QUICK_VERBATIM:
+		printf("%s\n", format->format);
+		return 0;
+	case REF_FORMAT_QUICK_REFNAME:
+		printf("%s\n", refname);
+		return 0;
+	case REF_FORMAT_QUICK_OBJECTNAME:
+		printf("%s\n", oid_to_hex(oid));
+		return 0;
+	case REF_FORMAT_QUICK_REFNAME_OBJECTNAME:
+		printf("%s %s\n", refname, oid_to_hex(oid));
+		return 0;
+	case REF_FORMAT_QUICK_OBJECTNAME_REFNAME:
+		printf("%s %s\n", oid_to_hex(oid), refname);
+		return 0;
+	}
+	BUG("unknown ref_format_quick value: %d", format->quick);
+}
+
 int format_ref_array_item(struct ref_array_item *info,
 			  struct ref_format *format,
 			  struct strbuf *final_buf,
@@ -2670,6 +2730,9 @@ void pretty_print_ref(const char *name, const struct object_id *oid,
 	struct strbuf output = STRBUF_INIT;
 	struct strbuf err = STRBUF_INIT;
 
+	if (!quick_ref_format(format, name, oid))
+		return;
+
 	ref_item = new_ref_array_item(name, oid);
 	ref_item->kind = ref_kind_from_refname(name);
 	if (format_ref_array_item(ref_item, format, &output, &err))
diff --git a/ref-filter.h b/ref-filter.h
index ecea1837a2..fde5c3a1cb 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -87,6 +87,19 @@ struct ref_format {
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
+
+	/*
+	 * Set by verify_ref_format(); if not NONE, we can skip the usual
+	 * formatting and use an optimized routine.
+	 */
+	enum ref_format_quick {
+		REF_FORMAT_QUICK_NONE = 0,
+		REF_FORMAT_QUICK_VERBATIM,
+		REF_FORMAT_QUICK_REFNAME,
+		REF_FORMAT_QUICK_OBJECTNAME,
+		REF_FORMAT_QUICK_REFNAME_OBJECTNAME,
+		REF_FORMAT_QUICK_OBJECTNAME_REFNAME,
+	} quick;
 };
 
 #define REF_FORMAT_INIT { .use_color = -1 }
-- 
2.33.0.618.g5b11852304
