All of lore.kernel.org
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Eric Wong <e@80x24.org>, git@vger.kernel.org
Cc: Jeff King <peff@peff.net>
Subject: Re: [PATCH 3/5] make object_directory.loose_objects_subdir_seen a bitmap
Date: Sun, 27 Jun 2021 12:23:18 +0200	[thread overview]
Message-ID: <496545dc-e372-401c-13f4-daa7ee765d39@web.de> (raw)
In-Reply-To: <20210627024718.25383-4-e@80x24.org>

Am 27.06.21 um 04:47 schrieb Eric Wong:
> There's no point in using 8 bits per-directory when 1 bit
> will do.  This saves us 224 bytes per object directory, which
> ends up being 22MB when dealing with 100K alternates.

The point was simplicity under the assumption that the number of
repositories is low -- for most users it's only one.  That obviously
doesn't hold for your use case anymore. :)

A compact representation should also reduce dcache misses, so this
should be a win for the single-repo case as well.

> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  object-file.c  | 10 +++++++---
>  object-store.h |  2 +-
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index 6be43c2b60..2c8b9c05f9 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2463,12 +2463,16 @@ struct oid_array *odb_loose_cache(struct object_directory *odb,
>  {
>  	int subdir_nr = oid->hash[0];
>  	struct strbuf buf = STRBUF_INIT;
> +	size_t BM_SIZE = sizeof(odb->loose_objects_subdir_seen[0]) * CHAR_BIT;

With that name I'd expect the variable to contain the number of bytes or
bits in the whole bitmap.  And to not be a variable at all, but rather a
macro.  Perhaps word_bits?

bitsizeof() does the same and is slightly shorter.

> +	uint32_t *bitmap;

Ah, you call the array items bitmap, which they are.  Hmm.  I rather
think of the whole thing as a bitmap and its uint32_t elements as words.
Does it matter?  Not sure.

> +	uint32_t bit = 1 << (subdir_nr % BM_SIZE);

I'd call that mask, but bit is fine as well..

Anyway, it would look something like this:

	size_t word_bits = bitsizeof(odb->loose_objects_subdir_seen[0]);
	size_t word_index = subdir_nr / word_bits;
	size_t mask = 1 << (subdir_nr % word_bits);

>
>  	if (subdir_nr < 0 ||
> -	    subdir_nr >= ARRAY_SIZE(odb->loose_objects_subdir_seen))
> +	    subdir_nr >= ARRAY_SIZE(odb->loose_objects_subdir_seen) * BM_SIZE)

bitsizeof(odb->loose_objects_subdir_seen) would be easier to read and
understand, I think.

>  		BUG("subdir_nr out of range");
>
> -	if (odb->loose_objects_subdir_seen[subdir_nr])
> +	bitmap = &odb->loose_objects_subdir_seen[subdir_nr / BM_SIZE];
> +	if (*bitmap & bit)
>  		return &odb->loose_objects_cache[subdir_nr];
>
>  	strbuf_addstr(&buf, odb->path);
> @@ -2476,7 +2480,7 @@ struct oid_array *odb_loose_cache(struct object_directory *odb,
>  				    append_loose_object,
>  				    NULL, NULL,
>  				    &odb->loose_objects_cache[subdir_nr]);
> -	odb->loose_objects_subdir_seen[subdir_nr] = 1;
> +	*bitmap |= bit;
>  	strbuf_release(&buf);
>  	return &odb->loose_objects_cache[subdir_nr];
>  }
> diff --git a/object-store.h b/object-store.h
> index 20c1cedb75..8fcddf3e65 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -22,7 +22,7 @@ struct object_directory {
>  	 *
>  	 * Be sure to call odb_load_loose_cache() before using.
>  	 */
> -	char loose_objects_subdir_seen[256];
> +	uint32_t loose_objects_subdir_seen[8]; /* 256 bits */

Perhaps	DIV_ROUND_UP(256, bitsizeof(uint32_t))?  The comment explains
it nicely already, though.

>  	struct oid_array loose_objects_cache[256];
>
>  	/*
>

Summary: Good idea, the implementation looks correct, I stumbled
over some of the names, bitsizeof() could be used.

René

  reply	other threads:[~2021-06-27 10:23 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24  0:58 [PATCH] speed up alt_odb_usable() with many alternates Eric Wong
2021-06-27  2:47 ` [PATCH 0/5] optimizations for many odb alternates Eric Wong
2021-06-27  2:47   ` [PATCH 1/5] speed up alt_odb_usable() with many alternates Eric Wong
2021-06-27  2:47   ` [PATCH 2/5] avoid strlen via strbuf_addstr in link_alt_odb_entry Eric Wong
2021-06-27  2:47   ` [PATCH 3/5] make object_directory.loose_objects_subdir_seen a bitmap Eric Wong
2021-06-27 10:23     ` René Scharfe [this message]
2021-06-28 23:09       ` Eric Wong
2021-06-27  2:47   ` [PATCH 4/5] oidcpy_with_padding: constify `src' arg Eric Wong
2021-06-27  2:47   ` [PATCH 5/5] oidtree: a crit-bit tree for odb_loose_cache Eric Wong
2021-06-29 14:42     ` Junio C Hamano
2021-06-29 20:17       ` Eric Wong
2021-06-29 20:53   ` [PATCH v2 0/5] optimizations for many alternates Eric Wong
2021-07-07 23:10     ` [PATCH v3 " Eric Wong
2021-07-07 23:10     ` [PATCH v3 1/5] speed up alt_odb_usable() with " Eric Wong
2021-07-08  0:20       ` Junio C Hamano
2021-07-08  1:14         ` Eric Wong
2021-07-08  4:30           ` Junio C Hamano
2021-07-07 23:10     ` [PATCH v3 2/5] avoid strlen via strbuf_addstr in link_alt_odb_entry Eric Wong
2021-07-08  4:57       ` Junio C Hamano
2021-07-07 23:10     ` [PATCH v3 3/5] make object_directory.loose_objects_subdir_seen a bitmap Eric Wong
2021-07-07 23:10     ` [PATCH v3 4/5] oidcpy_with_padding: constify `src' arg Eric Wong
2021-07-07 23:10     ` [PATCH v3 5/5] oidtree: a crit-bit tree for odb_loose_cache Eric Wong
2021-06-29 20:53   ` [PATCH v2 1/5] speed up alt_odb_usable() with many alternates Eric Wong
2021-07-03 10:05     ` René Scharfe
2021-07-04  9:02       ` René Scharfe
2021-07-06 23:01       ` Eric Wong
2021-06-29 20:53   ` [PATCH v2 2/5] avoid strlen via strbuf_addstr in link_alt_odb_entry Eric Wong
2021-06-29 20:53   ` [PATCH v2 3/5] make object_directory.loose_objects_subdir_seen a bitmap Eric Wong
2021-06-29 20:53   ` [PATCH v2 4/5] oidcpy_with_padding: constify `src' arg Eric Wong
2021-06-29 20:53   ` [PATCH v2 5/5] oidtree: a crit-bit tree for odb_loose_cache Eric Wong
2021-07-04  9:02     ` René Scharfe
2021-07-06 23:21       ` Eric Wong
2021-07-04  9:32     ` Ævar Arnfjörð Bjarmason
2021-07-07 23:12       ` Eric Wong
2021-08-06 15:31     ` Andrzej Hunt
2021-08-06 17:53       ` René Scharfe
2021-08-07 22:49         ` Eric Wong
2021-08-09  1:35           ` Carlo Arenas
2021-08-09  1:38             ` [PATCH/RFC 0/3] pedantic errors in next Carlo Marcelo Arenas Belón
2021-08-09  1:38               ` [PATCH/RFC 1/3] oidtree: avoid nested struct oidtree_node Carlo Marcelo Arenas Belón
2021-08-09  1:38               ` [PATCH/RFC 2/3] object-store: avoid extra ';' from KHASH_INIT Carlo Marcelo Arenas Belón
2021-08-09 15:53                 ` Junio C Hamano
2021-08-09  1:38               ` [PATCH/RFC 3/3] ci: run a pedantic build as part of the GitHub workflow Carlo Marcelo Arenas Belón
2021-08-09 10:50                 ` Bagas Sanjaya
2021-08-09 22:03                   ` Carlo Arenas
2021-08-09 14:56                 ` Phillip Wood
2021-08-09 22:48                   ` Carlo Arenas
2021-08-10 15:24                     ` Phillip Wood
2021-08-10 18:25                       ` Junio C Hamano
2021-08-30 11:36                   ` Ævar Arnfjörð Bjarmason
2021-08-31 20:28                     ` Carlo Arenas
2021-08-31 20:51                       ` Ævar Arnfjörð Bjarmason
2021-08-31 23:54                         ` Carlo Arenas
2021-09-01  1:52                           ` Jeff King
2021-09-01 17:55                             ` Junio C Hamano
2021-08-30 11:40                 ` Ævar Arnfjörð Bjarmason
2021-09-01  9:19                 ` [RFC PATCH v2 0/4] developer: support pedantic Carlo Marcelo Arenas Belón
2021-09-01  9:19                   ` [RFC PATCH v2 1/4] developer: retire USE_PARENS_AROUND_GETTEXT_N support Carlo Marcelo Arenas Belón
2021-09-01  9:19                   ` [RFC PATCH v2 2/4] developer: enable pedantic by default Carlo Marcelo Arenas Belón
2021-09-01  9:19                   ` [RFC PATCH v2 3/4] developer: add an alternative script for detecting broken N_() Carlo Marcelo Arenas Belón
2021-09-01  9:19                   ` [RFC PATCH v2 4/4] developer: move detect-compiler out of the main directory Carlo Marcelo Arenas Belón
2021-09-01 10:10                   ` [RFC PATCH v2 0/4] developer: support pedantic Jeff King
2021-09-01 11:25                     ` [PATCH] gettext: remove optional non-standard parens in N_() definition Ævar Arnfjörð Bjarmason
2021-09-01 17:31                       ` Eric Sunshine
2021-09-02  9:13                       ` Jeff King
2021-09-02 19:19                       ` Junio C Hamano
2021-09-01 11:27                     ` [RFC PATCH v2 0/4] developer: support pedantic Ævar Arnfjörð Bjarmason
2021-09-01 18:03                       ` Carlo Arenas
2021-09-03 17:02                   ` [PATCH v3 0/3] support pedantic in developer mode Carlo Marcelo Arenas Belón
2021-09-03 17:02                     ` [PATCH v3 1/3] gettext: remove optional non-standard parens in N_() definition Carlo Marcelo Arenas Belón
2021-09-10 15:39                       ` Ævar Arnfjörð Bjarmason
2021-09-03 17:02                     ` [PATCH v3 2/3] win32: allow building with pedantic mode enabled Carlo Marcelo Arenas Belón
2021-09-03 18:47                       ` René Scharfe
2021-09-03 20:13                         ` Carlo Marcelo Arenas Belón
2021-09-03 20:32                           ` Junio C Hamano
2021-09-03 20:38                           ` René Scharfe
2021-09-04  9:37                             ` René Scharfe
2021-09-04 14:42                               ` Carlo Arenas
2021-09-27 23:04                       ` Jonathan Tan
2021-09-28  0:30                         ` Carlo Arenas
2021-09-28 16:50                           ` Jonathan Tan
2021-09-28 17:37                           ` Junio C Hamano
2021-09-28 20:16                             ` Jonathan Tan
2021-09-29  1:00                               ` Carlo Arenas
2021-09-29 15:55                                 ` Junio C Hamano
2021-09-03 17:02                     ` [PATCH v3 3/3] developer: enable pedantic by default Carlo Marcelo Arenas Belón
2021-09-05  7:54                     ` [PATCH v3 0/3] support pedantic in developer mode Ævar Arnfjörð Bjarmason
2021-08-09 16:44               ` [PATCH/RFC 0/3] pedantic errors in next Junio C Hamano
2021-08-09 20:10                 ` Eric Wong
2021-08-10  6:16                 ` Carlo Marcelo Arenas Belón
2021-08-10 19:30                   ` René Scharfe
2021-08-10 23:49                     ` Carlo Arenas
2021-08-11  0:57                       ` Carlo Arenas
2021-08-11 14:57                       ` René Scharfe
2021-08-11 17:20                         ` Junio C Hamano
2021-08-10 18:59             ` [PATCH v2 5/5] oidtree: a crit-bit tree for odb_loose_cache René Scharfe
2021-08-10 19:40           ` René Scharfe
2021-08-14 20:00       ` [PATCH] oidtree: avoid unaligned access to crit-bit tree René Scharfe
2021-08-16 19:11         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=496545dc-e372-401c-13f4-daa7ee765d39@web.de \
    --to=l.s.r@web.de \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.