From: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Christian Couder" <christian.couder@gmail.com>,
"Hariom Verma" <hariom18599@gmail.com>,
"Bagas Sanjaya" <bagasdotme@gmail.com>,
"Jeff King" <peff@peff.net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Philip Oakley" <philipoakley@iee.email>,
"ZheNing Hu" <adlternative@gmail.com>
Subject: [PATCH 00/27] [GSOC] [RFC] cat-file: reuse ref-filter logic
Date: Fri, 13 Aug 2021 08:22:43 +0000
Message-ID: <pull.1016.git.1628842990.gitgitgadget@gmail.com>
This patch series makes cat-file reuse the ref-filter logic and, at the
same time, carries out some performance optimizations. Its previous version
is here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/#t
zh/ref-filter-raw-data still seems to be sitting in the next branch
(because Git is at rc2), so for now I want to show some recent performance
optimizations first.
Changes since the last version:
1. Use free_global_resource() to avoid memory leaks.
2. Skip parse_object_buffer(), which brings a 12.5% performance
improvement.
3. Merge the two for loops in grab_person(), which brings a 2% performance
improvement.
4. Remove strlen() from find_subpos().
5. Introduce xstrvfmt_len() and xstrfmt_len().
6. Remove the second parsing in format_ref_array_item(), which brings a
1.9% performance improvement.
7. Introduce ref_filter_slopbuf to replace xstrdup("").
8. Add a deref member to struct used_atom to simplify the logic of the
program.
9. Introduce symref_atom_parser() to make the program logic more concise.
10. Use switch/case instead of if/else to improve the readability of the
code.
11. Reuse the final buffer, which brings a 2% performance improvement.
12. Add a need_get_object_info flag to reduce memory comparisons.
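
One plausible reading of items 4 and 5 is that a formatting helper which
reports the length of its result lets callers skip a later strlen(). The
block below is only a minimal sketch of that idea in portable C. The name
fmt_with_len() and its signature are hypothetical; it is not the
xstrfmt_len() introduced by this series, whose interface is not shown in
this message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (NOT Git's xstrfmt_len()): format into a freshly
 * allocated buffer and report the string length through *len, so the
 * caller does not need to call strlen() on the result.
 * Returns NULL on formatting or allocation failure.
 */
char *fmt_with_len(size_t *len, const char *fmt, ...)
{
	va_list ap, ap_copy;
	int needed;
	char *buf;

	assert(len != NULL && fmt != NULL);

	va_start(ap, fmt);
	va_copy(ap_copy, ap);
	/* First pass: measure only; vsnprintf returns the length needed. */
	needed = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (needed < 0) {
		va_end(ap_copy);
		return NULL;
	}

	buf = malloc((size_t)needed + 1);
	if (!buf) {
		va_end(ap_copy);
		return NULL;
	}
	/* Second pass: write the string, including the trailing NUL. */
	vsnprintf(buf, (size_t)needed + 1, fmt, ap_copy);
	va_end(ap_copy);

	*len = (size_t)needed;
	return buf;
}
```

A caller would write something like
`char *s = fmt_with_len(&n, "%s:%d", name, value);` and then use n directly
instead of recomputing it with strlen(s); the buffer is released with
free(s).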
This is the result of the performance test after I did some optimization:

Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.07+0.00)   0.09(0.08+0.01) +12.5%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.02)   0.08(0.06+0.02) +33.3%
1006.4: cat-file --batch                    0.49(0.46+0.02)   0.50(0.47+0.03) +2.0%
1006.5: cat-file --batch with atoms         0.47(0.45+0.01)   0.49(0.47+0.02) +4.3%

We can see that, with the current patches, the performance of git cat-file
--batch is very close to upstream/master. The effect of the optimizations
on git cat-file --batch-check is less obvious, because the measured
difference is affected by noise and may fall anywhere in the range of
+12.5% to +50.0%. From an optimistic point of view, the execution time of
git cat-file --batch-check is itself relatively short, so it is no
surprise that the optimization is hard to see.
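
The percentages in the table can be reproduced by hand. The sketch below
assumes that the last column is the relative change in elapsed time going
from upstream/master to this tree, so a positive value means this tree took
longer; the function name relative_change_percent() is made up for
illustration.

```c
#include <assert.h>

/*
 * Relative change of an elapsed time, in percent, going from
 * `base` (upstream/master) to `patched` (this tree).
 * A positive result means the patched run took longer.
 */
double relative_change_percent(double base, double patched)
{
	assert(base > 0.0);
	return (patched - base) / base * 100.0;
}
```

With the elapsed times from the table, (0.08, 0.09) gives about 12.5,
(0.06, 0.08) about 33.3, (0.49, 0.50) about 2.0 and (0.47, 0.49) about 4.3.
The times are printed with a resolution of 0.01 s, so on a 0.08 s run a
single step already amounts to 12.5%, which is consistent with the large
noise seen for --batch-check.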
As GSOC is about to end and this patch series will probably need
adjusting for some time, I can only hope that it will be accepted in the
future.
Note: The first part of this patch series duplicates the content of
zh/ref-filter-raw-data.
ZheNing Hu (27):
[GSOC] ref-filter: add obj-type check in grab contents
[GSOC] ref-filter: add %(raw) atom
[GSOC] ref-filter: --format=%(raw) support --perl
[GSOC] ref-filter: use non-const ref_format in *_atom_parser()
[GSOC] ref-filter: add %(rest) atom
[GSOC] ref-filter: pass get_object() return value to their callers
[GSOC] ref-filter: introduce free_ref_array_item_value() function
[GSOC] ref-filter: add cat_file_mode to ref_format
[GSOC] ref-filter: modify the error message and value in get_object
[GSOC] cat-file: add has_object_file() check
[GSOC] cat-file: change batch_objects parameter name
[GSOC] cat-file: create p1006-cat-file.sh
[GSOC] cat-file: reuse ref-filter logic
[GSOC] cat-file: reuse err buf in batch_object_write()
[GSOC] cat-file: re-implement --textconv, --filters options
[GSOC] ref-filter: remove grab_oid() function
[GSOC] ref-filter: performance optimization by skip
parse_object_buffer
[GSOC] ref-filter: use atom_type and merge two for loop in grab_person
[GSOC] ref-filter: remove strlen from find_subpos
[GSOC] ref-filter: introducing xstrvfmt_len() and xstrfmt_len()
[GSOC] ref-filter: remove second parsing in format_ref_array_item
[GSOC] ref-filter: introduction ref_filter_slopbuf
[GSOC] ref-filter: add deref member to struct used_atom
[GSOC] ref-filter: introduce symref_atom_parser()
[GSOC] ref-filter: use switch case instread of if else
[GSOC] ref-filter: reuse finnal buffer if no stack need
[GSOC] ref-filter: add need_get_object_info flag to struct expand_data
 Documentation/git-cat-file.txt     |   6 +
 Documentation/git-for-each-ref.txt |   9 +
 builtin/branch.c                   |   2 +
 builtin/cat-file.c                 | 275 +++------
 builtin/for-each-ref.c             |   3 +-
 builtin/tag.c                      |   4 +-
 builtin/verify-tag.c               |   2 +
 quote.c                            |  17 +
 quote.h                            |   1 +
 ref-filter.c                       | 902 +++++++++++++++++++----------
 ref-filter.h                       |  30 +-
 strbuf.c                           |  21 +
 strbuf.h                           |   6 +
 t/perf/p1006-cat-file.sh           |  28 +
 t/t1006-cat-file.sh                | 239 ++++++++
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++
 t/t6301-for-each-ref-errors.sh     |   2 +-
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 20 files changed, 1283 insertions(+), 511 deletions(-)
create mode 100755 t/perf/p1006-cat-file.sh
base-commit: 5d213e46bb7b880238ff5ea3914e940a50ae9369
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1016%2Fadlternative%2Fcat-file-reuse-ref-filter-logic-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1016/adlternative/cat-file-reuse-ref-filter-logic-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1016
--
gitgitgadget