git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pack-objects.h: remove outdated pahole results
@ 2022-06-28 18:30 Taylor Blau
  2022-06-28 20:03 ` Derrick Stolee
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Taylor Blau @ 2022-06-28 18:30 UTC (permalink / raw)
  To: git; +Cc: gitster, pclouds

The size and padding of `struct object_entry` is an important factor in
determining the memory usage of `pack-objects`. For this reason,
3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
2018-04-14) added a comment containing some information from pahole
indicating the size and padding of that struct.

Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
(pack-objects: fix performance issues on packing large deltas,
2018-07-22), despite the size of this struct changing many times since
that commit.

To see just how often the size of object_entry changes, I skimmed the
first-parent history with this script:

    for sha in $(git rev-list --first-parent --reverse 9ac3f0e..)
    do
      echo -n "$sha "
      git checkout -q $sha
      make -s pack-objects.o 2>/dev/null
      pahole -C object_entry pack-objects.o | sed -n \
        -e 's/\/\* size: \([0-9]*\).*/size \1/p' \
        -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs
    done | uniq -f1

In between each merge, the size of object_entry changes too often to
record every instance here. But the important merges (along with their
corresponding sizes and bit paddings) in chronological order are:

    ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4
    29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9
    3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8
    33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8

(indicating that the current size of the struct is 96 bytes, with 8
padding bits).

Even though this comment was written in a good spirit, it is updated
infrequently enough that is serves to confuse rather than to encourage
contributors to update the appropriate values when the modify the
definition of object_entry.

For that reason, eliminate the confusion by removing the comment
altogether.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Noticed this while reverting an old topic out of GitHub's fork, and
realized that this comment was severely out-of-date.

 pack-objects.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/pack-objects.h b/pack-objects.h
index 393b9db546..579476687c 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -116,16 +116,6 @@ struct object_entry {
 	unsigned dfs_state:OE_DFS_STATE_BITS;
 	unsigned depth:OE_DEPTH_BITS;
 	unsigned ext_base:1; /* delta_idx points outside packlist */
-
-	/*
-	 * pahole results on 64-bit linux (gcc and clang)
-	 *
-	 *   size: 80, bit_padding: 9 bits
-	 *
-	 * and on 32-bit (gcc)
-	 *
-	 *   size: 76, bit_padding: 9 bits
-	 */
 };

 struct packing_data {
--
2.37.0.1.g1379af2e9d

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-06-28 18:30 [PATCH] pack-objects.h: remove outdated pahole results Taylor Blau
@ 2022-06-28 20:03 ` Derrick Stolee
  2022-06-28 21:04 ` brian m. carlson
  2022-07-01 18:16 ` Jeff King
  2 siblings, 0 replies; 7+ messages in thread
From: Derrick Stolee @ 2022-06-28 20:03 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: gitster, pclouds

On 6/28/2022 2:30 PM, Taylor Blau wrote:
> The size and padding of `struct object_entry` is an important factor in
> determining the memory usage of `pack-objects`. For this reason,
> 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
> 2018-04-14) added a comment containing some information from pahole
> indicating the size and padding of that struct.
> 
> Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
> (pack-objects: fix performance issues on packing large deltas,
> 2018-07-22), despite the size of this struct changing many times since
> that commit.
...
> For that reason, eliminate the confusion by removing the comment
> altogether.

> -
> -	/*
> -	 * pahole results on 64-bit linux (gcc and clang)
> -	 *
> -	 *   size: 80, bit_padding: 9 bits
> -	 *
> -	 * and on 32-bit (gcc)
> -	 *
> -	 *   size: 76, bit_padding: 9 bits
> -	 */
>  };

I will shed no tears over this being gone. Thanks!

-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-06-28 18:30 [PATCH] pack-objects.h: remove outdated pahole results Taylor Blau
  2022-06-28 20:03 ` Derrick Stolee
@ 2022-06-28 21:04 ` brian m. carlson
  2022-06-28 21:26   ` Taylor Blau
  2022-07-01 18:16 ` Jeff King
  2 siblings, 1 reply; 7+ messages in thread
From: brian m. carlson @ 2022-06-28 21:04 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, pclouds

[-- Attachment #1: Type: text/plain, Size: 2366 bytes --]

On 2022-06-28 at 18:30:20, Taylor Blau wrote:
> The size and padding of `struct object_entry` is an important factor in
> determining the memory usage of `pack-objects`. For this reason,
> 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
> 2018-04-14) added a comment containing some information from pahole
> indicating the size and padding of that struct.
> 
> Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
> (pack-objects: fix performance issues on packing large deltas,
> 2018-07-22), despite the size of this struct changing many times since
> that commit.
> 
> To see just how often the size of object_entry changes, I skimmed the
> first-parent history with this script:
> 
>     for sha in $(git rev-list --first-parent --reverse 9ac3f0e..)
>     do
>       echo -n "$sha "
>       git checkout -q $sha
>       make -s pack-objects.o 2>/dev/null
>       pahole -C object_entry pack-objects.o | sed -n \
>         -e 's/\/\* size: \([0-9]*\).*/size \1/p' \
>         -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs
>     done | uniq -f1
> 
> In between each merge, the size of object_entry changes too often to
> record every instance here. But the important merges (along with their
> corresponding sizes and bit paddings) in chronological order are:
> 
>     ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4
>     29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9
>     3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8
>     33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8
> 
> (indicating that the current size of the struct is 96 bytes, with 8
> padding bits).
> 
> Even though this comment was written in a good spirit, it is updated
> infrequently enough that is serves to confuse rather than to encourage

I think you wanted to say, "that it serves:.

> contributors to update the appropriate values when the modify the
> definition of object_entry.
> 
> For that reason, eliminate the confusion by removing the comment
> altogether.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>

I agree with your rationale and that we should remove this.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-06-28 21:04 ` brian m. carlson
@ 2022-06-28 21:26   ` Taylor Blau
  0 siblings, 0 replies; 7+ messages in thread
From: Taylor Blau @ 2022-06-28 21:26 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Taylor Blau, git, gitster, pclouds

On Tue, Jun 28, 2022 at 09:04:27PM +0000, brian m. carlson wrote:
> > Even though this comment was written in a good spirit, it is updated
> > infrequently enough that is serves to confuse rather than to encourage
>
> I think you wanted to say, "that it serves:.

Oops. Thanks for pointing it out.

> > contributors to update the appropriate values when the modify the
> > definition of object_entry.
> >
> > For that reason, eliminate the confusion by removing the comment
> > altogether.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
>
> I agree with your rationale and that we should remove this.

I'm glad that you agree.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-06-28 18:30 [PATCH] pack-objects.h: remove outdated pahole results Taylor Blau
  2022-06-28 20:03 ` Derrick Stolee
  2022-06-28 21:04 ` brian m. carlson
@ 2022-07-01 18:16 ` Jeff King
  2022-07-01 19:48   ` Taylor Blau
  2 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2022-07-01 18:16 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, pclouds

On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote:

> Even though this comment was written in a good spirit, it is updated
> infrequently enough that is serves to confuse rather than to encourage
> contributors to update the appropriate values when the modify the
> definition of object_entry.
> 
> For that reason, eliminate the confusion by removing the comment
> altogether.

I agree the actual numbers aren't helping anybody. We _could_ leave a
comment that says "we store a lot of these in memory; be careful of
where and how you add new fields to avoid increasing the struct size".
And then people can run "pahole" before and after their changes.

But then that is also true of other structs (like "struct object"), and
we do not bother there. So it probably is fine not to annotate this
specifically.

Speaking of which, I suspect quite a lot of memory could be saved if
"pack-objects --revs" freed the object structs it allocates during its
traversal. Unless we're generating bitmaps, I don't think they get used
again after the initial packing list is generated. At peak you'd
still be storing all of the object_entry structs alongside the objects
as you finish the traversal, but it wouldn't overlap with any memory
used for the delta search, and of course we'd be at that peak for a much
smaller time.

Not a blocker for your patch obviously, but maybe a fun experiment in an
adjacent area. Possibly even an ambitious #leftoverbits opportunity. :)

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-07-01 18:16 ` Jeff King
@ 2022-07-01 19:48   ` Taylor Blau
  2022-07-01 21:05     ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Taylor Blau @ 2022-07-01 19:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, pclouds

On Fri, Jul 01, 2022 at 02:16:26PM -0400, Jeff King wrote:
> On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote:
>
> > Even though this comment was written in a good spirit, it is updated
> > infrequently enough that is serves to confuse rather than to encourage
> > contributors to update the appropriate values when the modify the
> > definition of object_entry.
> >
> > For that reason, eliminate the confusion by removing the comment
> > altogether.
>
> I agree the actual numbers aren't helping anybody. We _could_ leave a
> comment that says "we store a lot of these in memory; be careful of
> where and how you add new fields to avoid increasing the struct size".
> And then people can run "pahole" before and after their changes.
>
> But then that is also true of other structs (like "struct object"), and
> we do not bother there. So it probably is fine not to annotate this
> specifically.

We have such a comment at the very type of the block comment above
`struct object_entry`'s definition:

    "The size of struct nearly determines pack-object's memory
    consumption. This struct is packed tight for that reason. When you
    add or reorder something in this struct, think a bit about this".

thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to
shrink struct object_entry, 2018-04-14).

> Speaking of which, I suspect quite a lot of memory could be saved if
> "pack-objects --revs" freed the object structs it allocates during its
> traversal. Unless we're generating bitmaps, I don't think they get used
> again after the initial packing list is generated. At peak you'd
> still be storing all of the object_entry structs alongside the objects
> as you finish the traversal, but it wouldn't overlap with any memory
> used for the delta search, and of course we'd be at that peak for a much
> smaller time.
>
> Not a blocker for your patch obviously, but maybe a fun experiment in an
> adjacent area. Possibly even an ambitious #leftoverbits opportunity. :)

Challenge accepted! ;-)

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pack-objects.h: remove outdated pahole results
  2022-07-01 19:48   ` Taylor Blau
@ 2022-07-01 21:05     ` Jeff King
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2022-07-01 21:05 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, pclouds

On Fri, Jul 01, 2022 at 03:48:19PM -0400, Taylor Blau wrote:

> > I agree the actual numbers aren't helping anybody. We _could_ leave a
> > comment that says "we store a lot of these in memory; be careful of
> > where and how you add new fields to avoid increasing the struct size".
> > And then people can run "pahole" before and after their changes.
> >
> > But then that is also true of other structs (like "struct object"), and
> > we do not bother there. So it probably is fine not to annotate this
> > specifically.
> 
> We have such a comment at the very type of the block comment above
> `struct object_entry`'s definition:
> 
>     "The size of struct nearly determines pack-object's memory
>     consumption. This struct is packed tight for that reason. When you
>     add or reorder something in this struct, think a bit about this".
> 
> thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to
> shrink struct object_entry, 2018-04-14).

Oh, indeed. Then I withdraw my (non-)complaint. :)

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-01 21:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-28 18:30 [PATCH] pack-objects.h: remove outdated pahole results Taylor Blau
2022-06-28 20:03 ` Derrick Stolee
2022-06-28 21:04 ` brian m. carlson
2022-06-28 21:26   ` Taylor Blau
2022-07-01 18:16 ` Jeff King
2022-07-01 19:48   ` Taylor Blau
2022-07-01 21:05     ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).