From: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Hariom Verma" <hariom18599@gmail.com>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Philip Oakley" <philipoakley@iee.email>,
	"ZheNing Hu" <adlternative@gmail.com>
Subject: [PATCH 00/27] [GSOC] [RFC] cat-file: reuse ref-filter logic
Date: Fri, 13 Aug 2021 08:22:43 +0000
Message-ID: <pull.1016.git.1628842990.gitgitgadget@gmail.com>

This patch series makes cat-file reuse the ref-filter logic and, at the same
time, carries out some performance optimizations. The previous version is
here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/#t

It seems that zh/ref-filter-raw-data is still sitting in the next branch
(because Git is at rc2), so for now I want to show some recent performance
optimizations first.

Changes since the last version:

 1.  Use free_global_resource() to avoid memory leaks.
 2.  Skip parse_object_buffer(), which brings a 12.5% performance improvement.
 3.  Merge the two for loops in grab_person(), which brings a 2% performance
     improvement.
 4.  Remove strlen() from find_subpos().
 5.  Introduce xstrvfmt_len() and xstrfmt_len().
 6.  Remove the second parsing in format_ref_array_item(), which brings a 1.9%
     performance improvement.
 7.  Introduce ref_filter_slopbuf to replace xstrdup("").
 8.  Add a deref member to struct used_atom to simplify the program logic.
 9.  Introduce symref_atom_parser() to make the program logic more concise.
 10. Use switch/case instead of if/else to improve the readability of the
     code.
 11. Reuse the final buffer, which brings a 2% performance improvement.
 12. Add a need_get_object_info flag to reduce memory comparisons.
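
Items 4 and 5 share one idea: when a formatting routine already knows how
many bytes it produced, it can hand that length back so that callers need not
call strlen() again. The following is only a sketch of that idea in plain C;
fmt_with_len() is a hypothetical name used for illustration and is not the
xstrfmt_len() added by this series, whose signature may differ.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the series' xstrfmt_len()): format into a
 * freshly allocated string and report its length through *len, so the
 * caller does not have to run strlen() on the result.
 * Returns NULL on formatting or allocation failure; *len is only
 * written on success. The caller frees the returned buffer.
 */
static char *fmt_with_len(size_t *len, const char *fmt, ...)
{
	va_list ap, cp;
	char *buf;
	int n;

	assert(len != NULL && fmt != NULL);

	va_start(ap, fmt);
	va_copy(cp, ap);
	n = vsnprintf(NULL, 0, fmt, cp);	/* first pass: measure only */
	va_end(cp);
	if (n < 0) {
		va_end(ap);
		return NULL;
	}

	buf = malloc((size_t)n + 1);
	if (buf) {
		vsnprintf(buf, (size_t)n + 1, fmt, ap);	/* second pass: write */
		*len = (size_t)n;
	}
	va_end(ap);
	return buf;
}
```

A caller would write something like
`s = fmt_with_len(&len, "%s %d", type, size);` and then pass `s` and `len`
together to whatever consumes the string.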

These are the performance test results after the optimizations:

Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.07+0.00)   0.09(0.08+0.01) +12.5%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.02)   0.08(0.06+0.02) +33.3%
1006.4: cat-file --batch                    0.49(0.46+0.02)   0.50(0.47+0.03) +2.0%
1006.5: cat-file --batch with atoms         0.47(0.45+0.01)   0.49(0.47+0.02) +4.3%


We can see that the performance of git cat-file --batch with this series is
very close to upstream/master. The result for git cat-file --batch-check is
less clear, because the measurement is affected by noise: the difference may
fall anywhere in the range of +12.5% to +50.0%. From an optimistic point of
view, the execution time of git cat-file --batch-check is itself relatively
short, so the effect of the optimization is naturally hard to see.
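
For reference, the percentages in the last column of the table can be
reproduced from the elapsed times. This is a minimal sketch, assuming the
column is computed as (this tree - upstream) / upstream on the first number
of each cell; relative_change_pct() is a hypothetical helper and not part of
the perf suite.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative change of `cur` against `base`, in
 * percent. A positive result means `cur` took longer than `base`.
 */
static double relative_change_pct(double base, double cur)
{
	assert(base > 0.0);
	return (cur - base) / base * 100.0;
}

/*
 * Applied to the table above:
 *   relative_change_pct(0.08, 0.09)  is about 12.5  (1006.2)
 *   relative_change_pct(0.06, 0.08)  is about 33.3  (1006.3)
 *   relative_change_pct(0.49, 0.50)  is about  2.0  (1006.4)
 *   relative_change_pct(0.47, 0.49)  is about  4.3  (1006.5)
 */
```

With elapsed times this short, a single 0.01s step on a 0.08s baseline is
already 12.5%, while the same step on a 0.49s baseline is about 2%.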

As GSoC is about to end and this patch series will probably need adjusting
for some time, I can only hope that it will be accepted in the future.

Note: The first part of this patch series duplicates content that belongs to
zh/ref-filter-raw-data.

ZheNing Hu (27):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom
  [GSOC] ref-filter: pass get_object() return value to their callers
  [GSOC] ref-filter: introduce free_ref_array_item_value() function
  [GSOC] ref-filter: add cat_file_mode to ref_format
  [GSOC] ref-filter: modify the error message and value in get_object
  [GSOC] cat-file: add has_object_file() check
  [GSOC] cat-file: change batch_objects parameter name
  [GSOC] cat-file: create p1006-cat-file.sh
  [GSOC] cat-file: reuse ref-filter logic
  [GSOC] cat-file: reuse err buf in batch_object_write()
  [GSOC] cat-file: re-implement --textconv, --filters options
  [GSOC] ref-filter: remove grab_oid() function
  [GSOC] ref-filter: performance optimization by skip
    parse_object_buffer
  [GSOC] ref-filter: use atom_type and merge two for loop in grab_person
  [GSOC] ref-filter: remove strlen from find_subpos
  [GSOC] ref-filter: introducing xstrvfmt_len() and xstrfmt_len()
  [GSOC] ref-filter: remove second parsing in format_ref_array_item
  [GSOC] ref-filter: introduction ref_filter_slopbuf
  [GSOC] ref-filter: add deref member to struct used_atom
  [GSOC] ref-filter: introduce symref_atom_parser()
  [GSOC] ref-filter: use switch case instread of if else
  [GSOC] ref-filter: reuse finnal buffer if no stack need
  [GSOC] ref-filter: add need_get_object_info flag to struct expand_data

 Documentation/git-cat-file.txt     |   6 +
 Documentation/git-for-each-ref.txt |   9 +
 builtin/branch.c                   |   2 +
 builtin/cat-file.c                 | 275 +++------
 builtin/for-each-ref.c             |   3 +-
 builtin/tag.c                      |   4 +-
 builtin/verify-tag.c               |   2 +
 quote.c                            |  17 +
 quote.h                            |   1 +
 ref-filter.c                       | 902 +++++++++++++++++++----------
 ref-filter.h                       |  30 +-
 strbuf.c                           |  21 +
 strbuf.h                           |   6 +
 t/perf/p1006-cat-file.sh           |  28 +
 t/t1006-cat-file.sh                | 239 ++++++++
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++
 t/t6301-for-each-ref-errors.sh     |   2 +-
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 20 files changed, 1283 insertions(+), 511 deletions(-)
 create mode 100755 t/perf/p1006-cat-file.sh


base-commit: 5d213e46bb7b880238ff5ea3914e940a50ae9369
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1016%2Fadlternative%2Fcat-file-reuse-ref-filter-logic-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1016/adlternative/cat-file-reuse-ref-filter-logic-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1016
-- 
gitgitgadget
