git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Victoria Dye <vdye@github.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Phillip Wood <phillip.wood123@gmail.com>
Subject: vd/fix-unaligned-read-index-v4, was Re: What's cooking in git.git (Sep 2022, #08; Tue, 27)
Date: Wed, 28 Sep 2022 00:18:12 -0400	[thread overview]
Message-ID: <YzPLBN09zzlTdNgc@coredump.intra.peff.net> (raw)
In-Reply-To: <be8f11f2-c4ad-0542-066b-3bbc99a16dae@github.com>

On Tue, Sep 27, 2022 at 06:52:02PM -0700, Victoria Dye wrote:

> Junio C Hamano wrote:
> > * vd/fix-unaligned-read-index-v4 (2022-09-23) 1 commit
> >  - read-cache: avoid misaligned reads in index v4
> > 
> >  The codepath that reads from the index v4 had unaligned memory
> >  accesses, which has been corrected.
> > 
> >  Expecting a reroll.
> >  cf. <Yy4nkEnhuzt2iH+R@coredump.intra.peff.net>
> >  cf. <bb3a2470-7ff5-e4a6-040a-96e0e3833978@gmail.com>
> >  source: <pull.1366.git.1663962236069.gitgitgadget@gmail.com>
> > 
> 
> How drastic an update were you expecting for this re-roll? To keep the fix
> minimal (that is, focused on 'create_from_disk()'), I was planning to just
> add some comments explaining the implementation (in response to [1]). If the
> goal is to get this merged quickly, I'd want to avoid a larger refactor
> (suggested in [2] & [3]), since doing so would either make the
> implementations of "read from disk" ('create_from_disk()') and "write to
> disk" ('copy_cache_entry_to_ondisk()') different/difficult to compare, or
> would involve a more invasive refactor to update both functions [4].

The first "cf" there is my initial request for a v2, but I since
retracted that. I have no objection to adding more comments, but I am
happy enough without them (like Junio, it may be that I'm overly
familiar with how I expect our get_be() functions to handle alignment).

I think even if we want to go further in the near term, it is still best
to build it on top of your basic fix. And we can take that fix and make
further refactoring (if any) its own topic.

I don't think even that basic fix needs to happen before the release,
though.  While it is a slight regression that SANITIZE=undefined does
not pass in the 2.38 release candidates:

  - the bug really was there all along. It's just that a new test
    triggers it.

  - I don't think it's a sign of actual problems; we are forming an
    unaligned pointer via a funny cast, which is technically UB, but I
    don't think any real-world platforms actually care, since we
    dereference it only via get_be16().

So it's mostly just a minor annoyance for running the tests; we're
probably better not to change any code, even trivially, this late in the
release cycle.

I was going to point to my branch with commits that are slightly less
rough than what I posted, for somebody to work on it later if
interested. But after applying a minimal amount of polish, I think they
are pretty reasonable, so I'll actually share them here.

IMHO it would be OK to proceed with them even if we don't redo the
writing side. But I can see the argument that they should come together.

I'm also OK if we just drop it. It was a fun puzzle to make it all work,
and I do like the result. But arguably it's just churn, and in
particular I haven't measured the re-ordering patch to see if it
produces a performance difference. I doubt it does, but it's important
enough that we should get a clear answer.

So here are my refactoring patches for the read side, built on top of
your earlier patch. They resolve the "yuck" comment by storing and
copying individual fields (which is much less verbose than you'd think
because of struct assignment). I left the pointer type as signed; there
are still come casts, but the same ones that were necessary before my
series.

  [1/3]: pack-bitmap: make read_be32() public
  [2/3]: read-cache: read on-disk entries in byte order
  [3/3]: read-cache: use read_be32() in create_from_disk()

 compat/bswap.h | 22 ++++++++++++++++++
 pack-bitmap.c  | 12 ----------
 read-cache.c   | 60 +++++++++++++++++++++++++-------------------------
 3 files changed, 52 insertions(+), 42 deletions(-)

-Peff

  reply	other threads:[~2022-09-28  4:18 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-27 21:11 What's cooking in git.git (Sep 2022, #08; Tue, 27) Junio C Hamano
2022-09-28  1:52 ` Victoria Dye
2022-09-28  4:18   ` Jeff King [this message]
2022-09-28  4:19     ` [PATCH 1/3] pack-bitmap: make read_be32() public Jeff King
2022-09-28  4:21     ` [PATCH 2/3] read-cache: read on-disk entries in byte order Jeff King
2022-09-29 11:27       ` Jeff King
2022-09-29 15:47         ` Junio C Hamano
2022-09-28  4:23     ` [PATCH 3/3] read-cache: use read_be32() in create_from_disk() Jeff King
2022-09-28 16:41     ` vd/fix-unaligned-read-index-v4, was Re: What's cooking in git.git (Sep 2022, #08; Tue, 27) Junio C Hamano
2022-09-28 17:01       ` Ævar Arnfjörð Bjarmason
2022-09-28 17:41         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YzPLBN09zzlTdNgc@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=phillip.wood123@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).