All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Anders Waldenborg <anders@0x63.nu>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	christian.couder@gmail.com, peff@peff.net,
	jonathantanmy@google.com
Subject: Re: [PATCH 0/5] pretty format %(trailers): improve machine readability
Date: Mon, 07 Dec 2020 09:53:22 +0100	[thread overview]
Message-ID: <87h7oyxail.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <87wnxwp15o.fsf@0x63.nu>


On Sat, Dec 05 2020, Anders Waldenborg wrote:

> Ævar Arnfjörð Bjarmason writes:
>
>> I started writing this on top of "master", but then saw the
>> outstanding series of other miscellaneous fixes to this
>> facility[1]. This is on top of that topic & rebased on master.
>>
>> Anders, any plans to re-roll yours? Otherwise the conflicts I'd have
>> on mine are easy to fix, so I can also submit it as a stand-alone.
>
> Yes, I have plans to do that. But have yet to carve out the required
> time from my copious spare time to actually do it.
>
> So please don't hold your breath waiting for me to do that.

Thanks. I sent a v2 of mine yesterday as
https://lore.kernel.org/git/20201206002449.31452-1-avarab@gmail.com/

As noted there the merge conflict with yours is trivial, so hopefully it
won't cause you much hassle if you re-roll while it's outstanding.

>> This series comes out of a discussion at work today (well, yesterday
>> at this point) where someone wanted to parse %(trailers) output. As
>> noted in 3/5 doing this is rather tedious now if you're trying to
>> unambiguously grap trailers as a stream of key-value pairs.
>>
>> So this series adds a "key_value_separator" and "keyonly" parameters,
>> and fixes a few bugs I saw along the way.
>
> Interesting. When adding "valueonly" I never consider it being used
> without "key". The trick you are doing with separate keyonly and
> valueonly is quite clever.
>
> I've only been doing machine parsing for explicit keys, things like:
> "%cn%x00%x00%an%x00%x00%(trailers:key=Reviewed-By,valueonly,unfold,separator=%x00)%x00%x00%(trailers:key=Backport-Reviewed-By,valueonly,unfold,separator=%x00)"
> (double-NUL to separate field, single-NUL to separate values within field).
>
> But I can't help wonder that if the goal just is to have a nice machine
> parsable format maybe it would be easier (both for user and
> implementation) to have a separate placeholder for "machine readable
> trailers" which by default emits in a format suitable for machine
> parsing. Something like a new "%(ztrailers)" (but with a better name)
> which simply emits a sequence of "<KEY> NUL <VAL> NUL" for each trailer

I think it's a bit tricky to make something general in the middle of all
the custom format printf-likes in the pretty format. E.g. some users
might want to use \0 as a delimiter for key-values, others \0\0
etc. because they used \0, or the other way around.

Maybe if there's a reason to extend the optimization it could be smarter
about detecting that you only wanted some fixed-string separator and
nothing else custom?

B.t.w. I tried just deleting the optimization for testing and it slowed
down by around 8% on linux.git according to an extended
p4205-log-pretty-formats.sh.

Looking at the code I wonder if there aren't other lower hanging
optimizations, e.g. it seems we call find_separator() on multiple passes
instead of saving it away, e.g. in the format_trailers_from_commit()
entry point if there's any custom options such as "unfold".

I also wonder if memory allocation is a bottleneck in the "git log"
path, but didn't have time to refactor & test it. For each commit the
walking machinery eventually calls the trailer.c code, which allocates &
free()'s internal structures that could be re-used for parsing the next
commit.

  reply	other threads:[~2020-12-07  8:54 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-25 21:26 [PATCH 00/21] trailer fixes Anders Waldenborg
2020-10-25 21:26 ` [PATCH 01/21] trailer: change token_{from,matches}_item into taking conf_info Anders Waldenborg
2020-10-26 11:56   ` Christian Couder
2020-10-25 21:26 ` [PATCH 02/21] trailer: don't use 'struct arg_item' for storing config Anders Waldenborg
2020-10-25 21:26 ` [PATCH 03/21] doc: mention canonicalization in git i-t manual Anders Waldenborg
2020-10-26 12:14   ` Christian Couder
2020-10-25 21:26 ` [PATCH 04/21] pretty: allow using aliases in %(trailer:key=xyz) Anders Waldenborg
2020-10-26 12:38   ` Christian Couder
2020-10-25 21:26 ` [PATCH 05/21] trailer: rename 'free_all' to 'free_all_trailer_items' Anders Waldenborg
2020-10-26 12:42   ` Christian Couder
2020-11-10 19:52     ` Jeff King
2020-10-25 21:26 ` [PATCH 06/21] t4205: add test for trailer in log with nonstandard separator Anders Waldenborg
2020-10-26 12:43   ` Christian Couder
2020-11-09 22:12     ` Anders Waldenborg
2020-11-10  7:55       ` Christian Couder
2020-11-10 19:54       ` Jeff King
2020-10-25 21:26 ` [PATCH 07/21] trailer: simplify 'arg_item' lifetime Anders Waldenborg
2020-10-25 21:26 ` [PATCH 08/21] trailer: keep track of conf in trailer_item Anders Waldenborg
2020-11-10 19:58   ` Jeff King
2020-10-25 21:26 ` [PATCH 09/21] trailer: refactor print_tok_val into taking item Anders Waldenborg
2020-10-25 21:26 ` [PATCH 10/21] trailer: move trailer token canonicalization print time Anders Waldenborg
2020-10-25 21:26 ` [PATCH 11/21] trailer: remember separator used in input Anders Waldenborg
2020-10-25 21:26 ` [PATCH 12/21] trailer: handle configured nondefault separators explicitly Anders Waldenborg
2020-11-10 20:06   ` Jeff King
2020-10-25 21:26 ` [PATCH 13/21] trailer: add option to make canonicalization optional Anders Waldenborg
2020-11-10 20:10   ` Jeff King
2020-10-25 21:26 ` [PATCH 14/21] trailer: move skipping of blank lines to own loop when finding trailer Anders Waldenborg
2020-10-25 21:26 ` [PATCH 15/21] trailer: factor out classify_trailer_line Anders Waldenborg
2020-10-25 21:26 ` [PATCH 16/21] t7513: add failing test for configured trailing line classification Anders Waldenborg
2020-10-25 21:26 ` [PATCH 17/21] trailer: don't treat line with prefix of known trailer as known Anders Waldenborg
2020-11-10 20:16   ` Jeff King
2020-10-25 21:26 ` [PATCH 18/21] trailer: factor out config lookup to separate function Anders Waldenborg
2020-10-25 21:26 ` [PATCH 19/21] trailer: move config lookup out of parse_trailer Anders Waldenborg
2020-10-25 21:26 ` [PATCH 20/21] trailer: add failing tests for matching trailers against input Anders Waldenborg
2020-10-25 21:26 ` [PATCH 21/21] trailer: only do prefix matching for configured trailers on commandline Anders Waldenborg
2020-11-10  7:44 ` [PATCH 00/21] trailer fixes Christian Couder
2020-12-05  1:39 ` [PATCH 0/5] pretty format %(trailers): improve machine readability Ævar Arnfjörð Bjarmason
2020-12-05 18:18   ` Anders Waldenborg
2020-12-07  8:53     ` Ævar Arnfjörð Bjarmason [this message]
2020-12-06  0:24   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2020-12-09 15:52     ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2020-12-10 10:48       ` Christian Couder
2020-12-10 19:00         ` Junio C Hamano
2020-12-09 15:52     ` [PATCH v3 1/5] pretty format %(trailers) test: split a long line Ævar Arnfjörð Bjarmason
2020-12-09 15:52     ` [PATCH v3 2/5] pretty format %(trailers) doc: avoid repetition Ævar Arnfjörð Bjarmason
2020-12-10 19:01       ` Junio C Hamano
2020-12-09 15:52     ` [PATCH v3 3/5] pretty-format %(trailers): fix broken standalone "valueonly" Ævar Arnfjörð Bjarmason
2020-12-09 15:52     ` [PATCH v3 4/5] pretty format %(trailers): add a "keyonly" Ævar Arnfjörð Bjarmason
2020-12-09 15:52     ` [PATCH v3 5/5] pretty format %(trailers): add a "key_value_separator" Ævar Arnfjörð Bjarmason
2020-12-06  0:24   ` [PATCH v2 1/5] pretty format %(trailers) test: split a long line Ævar Arnfjörð Bjarmason
2020-12-06  0:24   ` [PATCH v2 2/5] pretty format %(trailers) doc: avoid repetition Ævar Arnfjörð Bjarmason
2020-12-07  9:09     ` Christian Couder
2020-12-06  0:24   ` [PATCH v2 3/5] pretty-format %(trailers): fix broken standalone "valueonly" Ævar Arnfjörð Bjarmason
2020-12-06  0:24   ` [PATCH v2 4/5] pretty format %(trailers): add a "keyonly" Ævar Arnfjörð Bjarmason
2020-12-07  9:17     ` Christian Couder
2020-12-06  0:24   ` [PATCH v2 5/5] pretty format %(trailers): add a "key_value_separator" Ævar Arnfjörð Bjarmason
2020-12-05  1:39 ` [PATCH 1/5] pretty format %(trailers) test: split a long line Ævar Arnfjörð Bjarmason
2020-12-05  1:39 ` [PATCH 2/5] pretty format %(trailers): avoid needless repetition Ævar Arnfjörð Bjarmason
2020-12-05  5:43   ` Christian Couder
2020-12-05  1:39 ` [PATCH 3/5] pretty format %(trailers): add a "keyonly" Ævar Arnfjörð Bjarmason
2020-12-05  6:11   ` Christian Couder
2020-12-05 12:26     ` Ævar Arnfjörð Bjarmason
2020-12-05  1:39 ` [PATCH 4/5] pretty-format %(trailers): fix broken standalone "valueonly" Ævar Arnfjörð Bjarmason
2020-12-05  6:46   ` Christian Couder
2020-12-05  1:39 ` [PATCH 5/5] pretty format %(trailers): add a "key_value_separator" Ævar Arnfjörð Bjarmason
2020-12-05  7:13   ` Christian Couder
2020-12-05  8:49     ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7oyxail.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=anders@0x63.nu \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.