git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Victoria Dye <vdye@github.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Phillip Wood <phillip.wood123@gmail.com>
Subject: [PATCH 2/3] read-cache: read on-disk entries in byte order
Date: Wed, 28 Sep 2022 00:21:14 -0400
Message-ID: <YzPLuizoOlDuPu4G@coredump.intra.peff.net>
In-Reply-To: <YzPLBN09zzlTdNgc@coredump.intra.peff.net>

An index entry starts with stat data and mode, followed by an oid,
followed by flags, followed by the entry name. We parse this out of
order:

  1. we must read the flags to know how long the name is

  2. we must know how long the name is in order to allocate the
     struct cache_entry, since it has a FLEX_ARRAY

  3. we must allocate the cache_entry in order to parse the stat_data
     and oid into it

This makes the parser a little hard to follow, because we have to access
the flags using an offset, rather than walking through the entry in byte
order.

We can break the cyclic dependency by parsing the stat_data, etc., into
temporary variables, then allocating the cache_entry and copying the
parsed values in. This sets us up for simplifying the parsing in the
next commit.
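
A minimal sketch of that two-phase shape, using a made-up record layout
(be32 mtime, be32 size, be16 flags, then the name bytes). Every toy_* name
and the layout itself are assumptions for illustration, not Git's index
format or API. The sketch also walks a pointer for brevity, whereas this
patch still computes each field with offsetof().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified entry; NOT Git's struct cache_entry. */
struct toy_entry {
	uint32_t mtime;
	uint32_t size;
	uint16_t flags;
	size_t namelen;
	char name[];	/* flexible array member, so size depends on namelen */
};

#define TOY_NAMEMASK 0x0fff
#define TOY_FIXED_LEN 10	/* 4 (mtime) + 4 (size) + 2 (flags) */

static uint32_t toy_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint16_t toy_get_be16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns a malloc'd entry, or NULL if the buffer is too short. */
static struct toy_entry *toy_parse(const unsigned char *ondisk, size_t avail)
{
	uint32_t mtime, size;
	uint16_t flags;
	size_t len;
	struct toy_entry *e;

	assert(ondisk != NULL);
	if (avail < TOY_FIXED_LEN)
		return NULL;

	/* Phase 1: read fixed-size fields, in byte order, into temporaries. */
	mtime = toy_get_be32(ondisk); ondisk += 4;
	size = toy_get_be32(ondisk); ondisk += 4;
	flags = toy_get_be16(ondisk); ondisk += 2;
	len = flags & TOY_NAMEMASK;
	if (avail - TOY_FIXED_LEN < len)
		return NULL;

	/* Phase 2: the name length is known, so allocate and copy in. */
	e = malloc(sizeof(*e) + len + 1);
	if (!e)
		return NULL;
	e->mtime = mtime;
	e->size = size;
	e->flags = flags & ~TOY_NAMEMASK;
	e->namelen = len;
	memcpy(e->name, ondisk, len);
	e->name[len] = '\0';
	return e;
}
```

The temporaries cost one extra copy of a few fixed-size words, but no
field has to be fetched "ahead" of the read position just to size the
allocation.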

The downside is that we're copying the data an extra time. It's not very
much data, and it's all fixed size, so the compiler should be able to do
a reasonable job of optimizing here. But I didn't time the potential
impact.

Note one subtlety in the patch: besides reordering the flags/name versus
stat data, we adjust the order within the stat data to match the on-disk
order (notably both fields of each cache_time struct are adjacent).
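
To make the reordering concrete, here is a small sketch that compares the
old and new read orders by offset. The toy_* structs are hypothetical
stand-ins that assume each cache_time is a pair of 32-bit fields (sec, then
nsec) and that ctime precedes mtime on disk, as the offsetof() calls in the
diff suggest; they are not copied from Git's headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the on-disk stat layout (an assumption). */
struct toy_cache_time {
	uint32_t sec;
	uint32_t nsec;
};

struct toy_ondisk_stat {
	struct toy_cache_time ctime;
	struct toy_cache_time mtime;
};

#define TOY_OFF(field, part) \
	(offsetof(struct toy_ondisk_stat, field) + \
	 offsetof(struct toy_cache_time, part))

/* Order used before the patch: both sec fields, then both nsec fields. */
static const size_t toy_old_order[] = {
	TOY_OFF(ctime, sec), TOY_OFF(mtime, sec),
	TOY_OFF(ctime, nsec), TOY_OFF(mtime, nsec),
};

/* Order used after the patch: each cache_time read as an adjacent pair. */
static const size_t toy_new_order[] = {
	TOY_OFF(ctime, sec), TOY_OFF(ctime, nsec),
	TOY_OFF(mtime, sec), TOY_OFF(mtime, nsec),
};

/* Returns 1 if every offset is strictly greater than the one before it. */
static int toy_offsets_ascending(const size_t *offs, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (offs[i] <= offs[i - 1])
			return 0;
	return 1;
}
```

Under those assumptions only the new order is ascending, which is what
lets a later change replace the offsets with a pointer that simply
advances through the entry.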

Signed-off-by: Jeff King <peff@peff.net>
---
 read-cache.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index d16eb97906..efb9efa5ee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1879,11 +1879,14 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 					    unsigned long *ent_size,
 					    const struct cache_entry *previous_ce)
 {
+	struct stat_data sd;
+	unsigned int mode;
+	struct object_id oid;
 	struct cache_entry *ce;
 	size_t len;
 	const char *name;
 	const unsigned hashsz = the_hash_algo->rawsz;
-	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
+	const char *flagsp;
 	unsigned int flags;
 	size_t copy_len = 0;
 	/*
@@ -1895,6 +1898,24 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 	 */
 	int expand_name_field = version == 4;
 
+	sd.sd_ctime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
+							+ offsetof(struct cache_time, sec));
+	sd.sd_ctime.nsec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
+							 + offsetof(struct cache_time, nsec));
+	sd.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
+							+ offsetof(struct cache_time, sec));
+	sd.sd_mtime.nsec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
+							 + offsetof(struct cache_time, nsec));
+	sd.sd_dev   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, dev));
+	sd.sd_ino   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ino));
+	mode        = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mode));
+	sd.sd_uid   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, uid));
+	sd.sd_gid   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, gid));
+	sd.sd_size  = get_be32(ondisk + offsetof(struct ondisk_cache_entry, size));
+
+	oidread(&oid, (const unsigned char *)ondisk + offsetof(struct ondisk_cache_entry, data));
+	flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
+
 	/* On-disk flags are just 16 bits */
 	flags = get_be16(flagsp);
 	len = flags & CE_NAMEMASK;
@@ -1934,25 +1955,12 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 	}
 
 	ce = mem_pool__ce_alloc(ce_mem_pool, len);
-
-	ce->ce_stat_data.sd_ctime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
-							+ offsetof(struct cache_time, sec));
-	ce->ce_stat_data.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
-							+ offsetof(struct cache_time, sec));
-	ce->ce_stat_data.sd_ctime.nsec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
-							 + offsetof(struct cache_time, nsec));
-	ce->ce_stat_data.sd_mtime.nsec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
-							 + offsetof(struct cache_time, nsec));
-	ce->ce_stat_data.sd_dev   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, dev));
-	ce->ce_stat_data.sd_ino   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ino));
-	ce->ce_mode  = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mode));
-	ce->ce_stat_data.sd_uid   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, uid));
-	ce->ce_stat_data.sd_gid   = get_be32(ondisk + offsetof(struct ondisk_cache_entry, gid));
-	ce->ce_stat_data.sd_size  = get_be32(ondisk + offsetof(struct ondisk_cache_entry, size));
+	ce->ce_stat_data = sd;
+	ce->ce_mode = mode;
 	ce->ce_flags = flags & ~CE_NAMEMASK;
 	ce->ce_namelen = len;
 	ce->index = 0;
-	oidread(&ce->oid, (const unsigned char *)ondisk + offsetof(struct ondisk_cache_entry, data));
+	oidcpy(&ce->oid, &oid);
 
 	if (expand_name_field) {
 		if (copy_len)
-- 
2.38.0.rc2.615.g4fac75f9e3

