From: Jeff King <peff@peff.net>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>,
	Han-Wen Nienhuys <hanwen@google.com>, git <git@vger.kernel.org>,
	Han-Wen Nienhuys <hanwenn@gmail.com>
Subject: Re: [PATCH 1/2] bswap.h: drop unaligned loads
Date: Fri, 25 Sep 2020 00:56:20 -0400
Message-ID: <20200925045620.GA3110076@coredump.intra.peff.net>
In-Reply-To: <b4b9475f-9c84-1889-835d-9f6e81198e5b@web.de>

On Fri, Sep 25, 2020 at 12:02:38AM +0200, René Scharfe wrote:

> Older versions of gcc and clang didn't see through the shifting
> put_be32() implementation.  If you go back further there are also
> versions that didn't optimize the shifting get_be32().  And the latest
> icc still can't do that.
> 
> gcc 10.2 just optimizes all functions to a bswap and a mov.  Can't do
> any better than that, can you?
> 
> But why do we then see a difference in our benchmark results?  Not sure,
> but https://www.godbolt.org/z/7xh8ao shows that gcc is shuffling some
> instructions around depending on the implementation.  Switch to clang if
> you want to see more vigorous shuffling.

We do redefine ntohl(), etc., in compat/bswap.h. Looking at those
definitions, I'd expect the result to end up as a bswap instruction,
though. And indeed, feeding them to godbolt produces the same output
you got.
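
For readers following along, here is a rough sketch of the kind of
shift-and-mask byte swap that modern gcc and clang typically recognize
and collapse into a single bswap instruction. The names sketch_swab32()
and sketch_swab32_check() are hypothetical and chosen for illustration;
this is not the actual code in compat/bswap.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from compat/bswap.h): reverse the byte
 * order of a 32-bit value using only shifts and masks.
 */
static inline uint32_t sketch_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) <<  8) |
	       ((v & 0x00ff0000u) >>  8) |
	       ((v & 0xff000000u) >> 24);
}

/* Sanity check: a known pattern, and swapping twice is the identity. */
static inline void sketch_swab32_check(void)
{
	assert(sketch_swab32(0x11223344u) == 0x44332211u);
	assert(sketch_swab32(sketch_swab32(0xdeadbeefu)) == 0xdeadbeefu);
}
```

Whether a given compiler version turns this into one instruction is
exactly the kind of thing worth checking on godbolt rather than
assuming.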

It does sound like older compilers were benefiting from the unaligned
versions. Building with gcc-4.8 (from debian jessie in a docker
container on the same machine), I get ~6.25s with the unaligned load
versus ~6.6s with the bit-shifting code. So that's the opposite of how
the newer compiler behaves.
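
To make the two variants concrete, the following is a minimal sketch of
a bit-shifting big-endian load/store next to a load-then-swap version.
All sketch_* names are hypothetical. The load-then-swap variant here
uses memcpy() rather than a pointer cast so that it stays within
defined behavior; that substitution is an assumption made for this
illustration and is not how the unaligned macros under discussion were
written.

```c
#include <arpa/inet.h>	/* ntohl(); POSIX */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit-shifting load: works for any alignment and any host endianness. */
static inline uint32_t sketch_get_be32_shift(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint32_t)p[0] << 24 |
	       (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] <<  8 |
	       (uint32_t)p[3];
}

/* Load-then-swap: copy four raw bytes, then convert from network order. */
static inline uint32_t sketch_get_be32_load(const void *ptr)
{
	uint32_t v;
	memcpy(&v, ptr, sizeof(v));
	return ntohl(v);
}

/* Bit-shifting store, the counterpart of sketch_get_be32_shift(). */
static inline void sketch_put_be32_shift(void *ptr, uint32_t value)
{
	unsigned char *p = ptr;
	p[0] = (unsigned char)(value >> 24);
	p[1] = (unsigned char)(value >> 16);
	p[2] = (unsigned char)(value >>  8);
	p[3] = (unsigned char)(value >>  0);
}

/* Both loads must agree, even at a deliberately misaligned offset. */
static inline void sketch_be32_check(void)
{
	unsigned char buf[8] = { 0 };

	sketch_put_be32_shift(buf + 1, 0x12345678u);
	assert(buf[1] == 0x12 && buf[4] == 0x78);
	assert(sketch_get_be32_shift(buf + 1) == 0x12345678u);
	assert(sketch_get_be32_load(buf + 1) == sketch_get_be32_shift(buf + 1));
}
```

Both loads return the same value; any timing difference comes down to
what the compiler emits for each form and how it schedules the
surrounding instructions.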

I also benchmarked clang-8 (which your results showed doesn't handle
the shifting version well). It is likewise just _slightly_ slower after
my patch (11.47s versus 11.57s).

Given that newer compilers behave the opposite way, and the overall
small magnitude of the impact, I'm still comfortable with the change.
It's nice to have a better understanding of how the compiler is
impacting it (even if I am still confused how anything at all changes on
the newer compilers). Thanks for digging.

-Peff
