All of lore.kernel.org
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Git List <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/5] add SWAP macro
Date: Mon, 30 Jan 2017 22:46:19 +0100	[thread overview]
Message-ID: <6760493c-3684-b8bb-2c01-6621b8417246@web.de> (raw)
In-Reply-To: <alpine.DEB.2.20.1701301806591.3469@virtualbox>

Am 30.01.2017 um 21:48 schrieb Johannes Schindelin:
> So I tried to verify that Visual C optimizes this well, and oh my deity,
> this was not easy. In Debug mode, it does not optimize, i.e. the memcpy()
> will be called, even for simple 32-bit integers. In Release mode, Visual
> Studio's defaults turn on "whole-program optimization" which means that
> the only swapping that is going on in the end is that the meaning of two
> registers is swapped, i.e. the SWAP() macro is expanded to... no assembler
> code at all.
>
> After trying this and that and something else, I finally ended up with the
> file-scope optimized SWAP() resulting in this assembly code:
>
> 00007FF791311000  mov         eax,dword ptr [rcx]
> 00007FF791311002  mov         r8d,dword ptr [rdx]
> 00007FF791311005  mov         dword ptr [rcx],r8d
> 00007FF791311008  mov         dword ptr [rdx],eax

This looks good.

> However, recent events (including some discussions on this mailing list
> which had unfortunate consequences) made me trust my instincts more. And
> my instincts tell me that it would be unwise to replace code that swaps
> primitive types in the straight-forward way by code that relies on
> advanced compiler optimization to generate efficient code, otherwise
> producing very suboptimal code.

I don't know how difficult it was to arrive at the result above, but I 
wouldn't call inlining memcpy(3) an advanced optimization (anymore?), 
since the major compilers seem to be doing that.

The SWAP in prio-queue.c seems to be the one with the biggest 
performance impact.  Or perhaps it's the one in lookup_object()?  The 
former is easier to measure, though.

Here's what I get with CFLAGS="-builtin -O2" (best of five):

$ time ./t/helper/test-prio-queue $(seq 100000 -1 1) dump >/dev/null

real    0m0.142s
user    0m0.120s
sys     0m0.020s

And this is with CFLAGS="-no-builtin -O2":

$ time ./t/helper/test-prio-queue $(seq 100000 -1 1) dump >/dev/null

real    0m0.170s
user    0m0.156s
sys     0m0.012s

Hmm.  Not nice, but also not prohibitively slow.

> The commit you quoted embarrasses me, and I have no excuse for it. I would
> love to see that myswap() ugliness fixed by replacing it with a construct
> that is simpler, and generates good code even without any smart compiler.

I don't see a way to do that without adding a type parameter.

René

  reply	other threads:[~2017-01-30 21:46 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-28 21:13 [PATCH 0/5] introduce SWAP macro René Scharfe
2017-01-28 21:38 ` [PATCH 1/5] add " René Scharfe
2017-01-30 15:39   ` Johannes Schindelin
2017-01-30 16:48     ` René Scharfe
2017-01-30 20:48       ` Johannes Schindelin
2017-01-30 21:46         ` René Scharfe [this message]
2017-01-31 12:13           ` Johannes Schindelin
2017-01-31 21:02             ` René Scharfe
2017-02-01  0:44               ` Ramsay Jones
2017-02-01 11:39               ` Johannes Schindelin
2017-01-30 16:01   ` Johannes Schindelin
2017-01-30 16:59     ` René Scharfe
2017-01-30 18:41     ` Johannes Sixt
2017-01-30 21:03       ` Johannes Schindelin
2017-01-30 22:09         ` René Scharfe
2017-01-30 22:21           ` Brandon Williams
2017-01-31 21:03             ` René Scharfe
2017-01-31 21:35               ` Jeff King
2017-01-31 22:29                 ` Junio C Hamano
2017-01-31 22:36                   ` Jeff King
2017-02-01 11:28                 ` Johannes Schindelin
2017-02-01 11:47                   ` Jeff King
2017-02-01 18:06                     ` René Scharfe
2017-02-01 18:33                       ` Junio C Hamano
2017-02-07 22:04                         ` René Scharfe
2017-02-07 22:30                           ` Junio C Hamano
2017-02-08 15:14                             ` Johannes Schindelin
2017-01-31 12:03           ` Johannes Schindelin
2017-04-24 11:29   ` Jeff King
2017-04-24 11:49     ` Jeff King
2017-04-24 13:13       ` Duy Nguyen
2017-04-28 17:04     ` René Scharfe
2017-04-28 21:49       ` Jeff King
2017-04-29 18:16         ` René Scharfe
2017-04-30  3:11           ` Jeff King
2017-05-02  5:29             ` René Scharfe
2017-01-28 21:40 ` [PATCH 2/5] apply: use " René Scharfe
2017-01-28 21:40 ` [PATCH 3/5] " René Scharfe
2017-01-30 16:03   ` Johannes Schindelin
2017-01-30 17:18     ` René Scharfe
2017-01-30 22:22   ` Junio C Hamano
2017-01-31 21:02     ` René Scharfe
2017-01-28 21:41 ` [PATCH 4/5] diff: " René Scharfe
2017-01-30 16:04   ` Johannes Schindelin
2017-01-30 17:26     ` René Scharfe
2017-01-30 22:22   ` Junio C Hamano
2017-01-28 21:42 ` [PATCH 5/5] graph: " René Scharfe
2017-01-30 16:16   ` Johannes Schindelin
2017-01-30 17:41     ` René Scharfe
2017-01-30 23:20 ` [PATCH 0/5] introduce " Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6760493c-3684-b8bb-2c01-6621b8417246@web.de \
    --to=l.s.r@web.de \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.