All of lore.kernel.org
 help / color / mirror / Atom feed
* [GSoC] Git Blog 13
@ 2021-08-15  7:08 ZheNing Hu
  2021-08-15 10:28 ` Christian Couder
  0 siblings, 1 reply; 3+ messages in thread
From: ZheNing Hu @ 2021-08-15  7:08 UTC (permalink / raw)
  To: Git List, Junio C Hamano, Hariom verma, Christian Couder,
	Ævar Arnfjörð Bjarmason

My thirteenth week blog finished:
The web version is here:
https://adlternative.github.io/GSOC-Git-Blog-13/

### Project Progress

This week I continue to try to optimize the performance of `git
cat-file --batch`.

You can see them here:

```bash
git fetch git@github.com:adlternative/git.git cat-file-reuse-ref-filter-logic
git rev-list  2c6ce95c82..8591897fbc
```

Or here:

[[PATCH 00/27] [GSOC] [RFC] cat-file: reuse ref-filter
logic](https://lore.kernel.org/git/pull.1016.git.1628842990.gitgitgadget@gmail.com/)

Several of these commits are critical:

* 31acac9fde `[GSOC] ref-filter: remove second parsing in
format_ref_array_item`:
It use a `parsed_atom_list` to save the parsed state after
we calling verify_ref_format(), so we can reduce second
parsing in format_ref_array_item(), which brings 1.9%
performance improvement.
* 4602b62a92 `[GSOC] ref-filter: reuse finnal buffer if no
stack need`:It can reduce some unnecessary copies,
which bring 2% performance improvement.

This time I made sure that there was not too much noise,
to ensure the stability of this performance test:

#### Test Examples:
* upstream/master: `5d213e46bb (tag: v2.33.0-rc2, upstream/master) Git 2.33-rc2`
* 898e36a92b (before performance optimization): `898e36a92b [GSOC]
cat-file: re-implement --textconv, --filters options`
* this tree (after performance optimization): `8591897fbc (HEAD ->
cat-file-reuse-ref-filter-logic) [GSOC] ref-filter: add
need_get_object_info flag to struct expand_data`


#### Test Results:
```bash
$ GIT_PERF_REPEAT_COUNT=50  GIT_PERF_MAKE_OPTS=-j8 ./run
upstream/master . ./p1006-cat-file.sh

Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.06+0.01)
0.08(0.07+0.01) +0.0%
1006.3: cat-file --batch-check with atoms   0.06(0.05+0.00)
0.07(0.06+0.01) +16.7%
1006.4: cat-file --batch                    0.48(0.45+0.03)
0.50(0.46+0.03) +4.2%
1006.5: cat-file --batch with atoms         0.47(0.43+0.03)
0.49(0.46+0.02) +4.3%

$ GIT_PERF_REPEAT_COUNT=50  GIT_PERF_MAKE_OPTS=-j8 ./run
upstream/master 898e36a92b ./p1006-cat-file.sh

Test                                        upstream/master   898e36a92b
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.07+0.00)
0.09(0.09+0.00) +12.5%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.01)
0.07(0.05+0.02) +16.7%
1006.4: cat-file --batch                    0.48(0.44+0.03)
0.60(0.58+0.02) +25.0%
1006.5: cat-file --batch with atoms         0.47(0.44+0.02)
0.58(0.56+0.02) +23.4%
```

The performance of `git cat-file --batch-check` is very close
to `upstream/master`!

The performance difference of `git cat-file --batch` has also
changed from 25.0% to 4.2%!

This result is far better than my expectations, I have reason to
believe that the performance of `git cat-file --batch` can be
improved again!

Good job!

I think the key to continuing to optimize is still to reduce unnecessary copies.

### Additional advice

During the optimization process this week, I found that I might want
to use a `strbuf_move()` function, although I did not adopt it in my work
(because it did not allow my work to be greatly optimized), but I think it might
be useful in some situations: We don’t want to copy the data of strbuf,
but just want to move its buf pointer:

```c
void strbuf_move(struct strbuf *sb, struct strbuf *sb2)
{
strbuf_release(sb);
*sb = *sb2;
strbuf_init(sb2, 0);
}
```

Yes, it's like `std::move()` in c++.
[std::move](https://en.cppreference.com/w/cpp/utility/move)
Maybe we can use strbuf_move() in some scenarios.

Thanks, Git!
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GSoC] Git Blog 13
  2021-08-15  7:08 [GSoC] Git Blog 13 ZheNing Hu
@ 2021-08-15 10:28 ` Christian Couder
  2021-08-16  9:13   ` ZheNing Hu
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Couder @ 2021-08-15 10:28 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Git List, Junio C Hamano, Hariom verma,
	Ævar Arnfjörð Bjarmason

On Sun, Aug 15, 2021 at 9:08 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> My thirteenth week blog finished:
> The web version is here:
> https://adlternative.github.io/GSOC-Git-Blog-13/

Thanks!

> ### Project Progress
>
> This week I continue to try to optimize the performance of `git
> cat-file --batch`.
>
> You can see them here:


> Test                                        upstream/master   this tree
> ------------------------------------------------------------------------------------
> 1006.2: cat-file --batch-check              0.08(0.06+0.01)
> 0.08(0.07+0.01) +0.0%
> 1006.3: cat-file --batch-check with atoms   0.06(0.05+0.00)
> 0.07(0.06+0.01) +16.7%
> 1006.4: cat-file --batch                    0.48(0.45+0.03)
> 0.50(0.46+0.03) +4.2%
> 1006.5: cat-file --batch with atoms         0.47(0.43+0.03)
> 0.49(0.46+0.02) +4.3%


> The performance of `git cat-file --batch-check` is very close
> to `upstream/master`!
>
> The performance difference of `git cat-file --batch` has also
> changed from 25.0% to 4.2%!
>
> This result is far better than my expectations, I have reason to
> believe that the performance of `git cat-file --batch` can be
> improved again!
>
> Good job!

Yeah, great!

> I think the key to continuing to optimize is still to reduce unnecessary copies.

Do you have data or hints about this?

Have you looked at why there is still +16.7% in the cat-file
--batch-check with atoms case? Maybe solving this would improve things
in the other cases too.

> ### Additional advice
>
> During the optimization process this week, I found that I might want
> to use a `strbuf_move()` function, although I did not adopt it in my work
> (because it did not allow my work to be greatly optimized), but I think it might
> be useful in some situations: We don’t want to copy the data of strbuf,
> but just want to move its buf pointer:

Yeah, if we are sure that it improves performance, that might be worth it.

About your patch series
(https://lore.kernel.org/git/pull.1016.git.1628842990.gitgitgadget@gmail.com/),
I wonder if it might be possible and better to have some performance
improvements before the commit that uses the ref-filter code
("cat-file: reuse ref-filter logic"). If you are going this way, it
might make sense to add a few performance tests, if some don't already
exist, to show the effect of these performance improvements on the
ref-filter code when it's used by other commands like for-each-ref.

This could show that these performance improvements are worth doing
even if we didn't want to reuse the ref-filter code in cat-file. And
then perhaps these performance improvements could be merged as part of
one or more small patch series, before the patch series that reuses
the ref-filter code in cat-file.

Best,
Christian.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GSoC] Git Blog 13
  2021-08-15 10:28 ` Christian Couder
@ 2021-08-16  9:13   ` ZheNing Hu
  0 siblings, 0 replies; 3+ messages in thread
From: ZheNing Hu @ 2021-08-16  9:13 UTC (permalink / raw)
  To: Christian Couder
  Cc: Git List, Junio C Hamano, Hariom verma,
	Ævar Arnfjörð Bjarmason

Christian Couder <christian.couder@gmail.com> 于2021年8月15日周日 下午6:28写道:
>
> > I think the key to continuing to optimize is still to reduce unnecessary copies.
>
> Do you have data or hints about this?
>

Here is the result of perf diff with git cat-file --batch-check
--batch-all-objects
between upstream/master and cat-file-reuse-ref-filter-logic:

# Event 'cycles:u'
#
# Baseline  Delta Abs  Shared Object     Symbol
# ........  .........  ................  ..................................
#
    18.08%     -2.11%  libz.so.1.2.11    [.] 0x0000000000008dde
               +2.09%  libc-2.33.so      [.] malloc_consolidate
    21.65%     -1.70%  git               [.] unpack_object_header_buffer
     5.62%     -1.57%  git               [.] use_pack
     0.59%     +1.49%  libc-2.33.so      [.] _int_free
               +1.40%  git               [.] get_object
    13.62%     -1.39%  libz.so.1.2.11    [.] inflate
     1.54%     -0.98%  libc-2.33.so      [.] __strchrnul_avx2
               +0.90%  git               [.] format_ref_array_item
               +0.82%  git               [.] populate_value
     0.77%     +0.74%  libc-2.33.so      [.] __strlen_avx2
     0.37%     +0.68%  libc-2.33.so      [.] cfree@GLIBC_2.2.5
               +0.56%  git               [.] append_literal.isra.0
     0.32%     +0.53%  libc-2.33.so      [.] malloc
     1.09%     -0.48%  git               [.] get_size_from_delta
     0.77%     -0.46%  git               [.] oid_array_for_each_unique
     2.48%     -0.45%  git               [.] get_delta_base
     0.20%     +0.41%  libc-2.33.so      [.] _IO_fwrite
               +0.41%  git               [.] batch_object_write.constprop.0
               +0.40%  git               [.] quote_formatting
               +0.39%  git               [.] get_ref_atom_value
     0.31%     +0.38%  git               [.] strbuf_add
               +0.36%  libc-2.33.so      [.] __libc_calloc
     0.41%     +0.35%  [unknown]         [k] 0xffffffff84c00158
     0.58%     -0.33%  libc-2.33.so      [.] unlink_chunk.constprop.0
     1.04%     -0.31%  libc-2.33.so      [.] __GI___qsort_r
               +0.30%  git               [.] 0x000000000001cd80
               +0.30%  git               [.] append_atom
     3.23%     +0.30%  libc-2.33.so      [.] __memmove_avx_unaligned_erms

malloc_consolidate(), _int_free(), malloc(), __libc_calloc(), they
take up a larger
proportion of time (through gprof, we can also see that the number of
calls is also
increasing), it shows that our memory allocation is worth optimizing.

__memmove_avx_unaligned_erms() take up a larger proportion of time,
it shows that we have some unnecessary copies.

Note that 4602b62 ([GSOC] ref-filter: reuse finnal buffer if no stack
need) is trying
to alleviate this problem, and the results show that it does have a 2%
optimization.

So today I added two new commits to verify my guess: We need to reduce copying.
See: https://github.com/adlternative/git/commits/cat-file-reuse-ref-filter-logic-temp-5


Test                                        upstream/master   this
tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.08(0.06+0.01)
0.08(0.06+0.01) +0.0%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.01)
0.07(0.06+0.01) +16.7%
1006.4: cat-file --batch                    0.49(0.47+0.02)
0.47(0.46+0.00) -4.1%
1006.5: cat-file --batch with atoms         0.47(0.45+0.01)
0.46(0.46+0.00) -2.1%

This is good news. The performance of our git cat-file --batch is
better than before!

> Have you looked at why there is still +16.7% in the cat-file
> --batch-check with atoms case? Maybe solving this would improve things
> in the other cases too.
>

I am continuing to think about it, I guess it is still a
malloc/free/memmove problem.
The execution time of git cat-file --batch-check is relatively short,
and the disadvantages
of using ref-filter logic will be more prominent.

> > ### Additional advice
> >
> > During the optimization process this week, I found that I might want
> > to use a `strbuf_move()` function, although I did not adopt it in my work
> > (because it did not allow my work to be greatly optimized), but I think it might
> > be useful in some situations: We don’t want to copy the data of strbuf,
> > but just want to move its buf pointer:
>
> Yeah, if we are sure that it improves performance, that might be worth it.
>
> About your patch series
> (https://lore.kernel.org/git/pull.1016.git.1628842990.gitgitgadget@gmail.com/),
> I wonder if it might be possible and better to have some performance
> improvements before the commit that uses the ref-filter code
> ("cat-file: reuse ref-filter logic"). If you are going this way, it
> might make sense to add a few performance tests, if some don't already
> exist, to show the effect of these performance improvements on the
> ref-filter code when it's used by other commands like for-each-ref.
>
> This could show that these performance improvements are worth doing
> even if we didn't want to reuse the ref-filter code in cat-file. And
> then perhaps these performance improvements could be merged as part of
> one or more small patch series, before the patch series that reuses
> the ref-filter code in cat-file.
>

Ah, this is estimated to take a lot of effort. I feel that it will
take a long time
for these optimization patches to be merged into the master.

I tried to add a performance test for for-each-ref, but the effect is very
insignificant, because its execution time is too short.

> Best,
> Christian.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-16  9:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-15  7:08 [GSoC] Git Blog 13 ZheNing Hu
2021-08-15 10:28 ` Christian Couder
2021-08-16  9:13   ` ZheNing Hu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.