All of lore.kernel.org
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>,
	linux-toolchains@vger.kernel.org, Will Deacon <will@kernel.org>,
	Paul McKenney <paulmck@kernel.org>,
	linux-kernel@vger.kernel.org, stern@rowland.harvard.edu,
	parri.andrea@gmail.com, boqun.feng@gmail.com, npiggin@gmail.com,
	dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr,
	akiyks@gmail.com, dlustig@nvidia.com, joel@joelfernandes.org,
	torvalds@linux-foundation.org
Subject: Re: Control Dependencies vs C Compilers
Date: Wed, 7 Oct 2020 12:30:33 +0200	[thread overview]
Message-ID: <20201007103033.GB6550@1wt.eu> (raw)
In-Reply-To: <20201007093243.GB2628@hirez.programming.kicks-ass.net>

On Wed, Oct 07, 2020 at 11:32:43AM +0200, Peter Zijlstra wrote:
> A branch that cannot be optimized away and prohibits lifting stores
> over. One possible suggestion would be allowing the volatile keyword as
> a qualifier to if.
> 
> 	x = *foo;
> 	volatile if (x > 42)
> 		*bar = 1;
> 
> This would tell the compiler that the condition is special in that it
> must emit a conditional branch instruction and that it must not lift
> stores (or sequence points) over it.

This test is interesting, because if foo and bar are of the same type,
nothing prevents them from aliasing and the compiler cannot make wild
guesses on them (i.e. they may be plain memory as well as memory-mapped
registers).

Extending it like this shows a difference between the use of volatile
and __atomic_{load,store}_n. While both are correct in that each access
is properly performed, for an unknown reason the compiler decided to
implement two distinct branches in the atomic case and to inflate the
code:

$ gcc -v
gcc version 9.3.0 (GCC)

$ cat foo-volatile.c
long foobar(long *foo, long *bar)
{
        *(volatile long *)bar = 10;
        if (*(volatile long *)foo <= 42)
                *(volatile long *)bar = 64;
        if (*(volatile long *)foo > 42)
                *(volatile long *)bar = 0;
        return *(volatile long *)bar;
}
$ gcc -c -O2 foo-volatile.c
$ objdump -dr foo-volatile.o 
0000000000000000 <foobar>:
   0:   48 c7 06 0a 00 00 00    movq   $0xa,(%rsi)
   7:   48 8b 07                mov    (%rdi),%rax
   a:   48 83 f8 2a             cmp    $0x2a,%rax
   e:   7f 07                   jg     17 <foobar+0x17>
  10:   48 c7 06 40 00 00 00    movq   $0x40,(%rsi)
  17:   48 8b 07                mov    (%rdi),%rax
  1a:   48 83 f8 2a             cmp    $0x2a,%rax
  1e:   7e 07                   jle    27 <foobar+0x27>
  20:   48 c7 06 00 00 00 00    movq   $0x0,(%rsi)
  27:   48 8b 06                mov    (%rsi),%rax
  2a:   c3                      retq   


$ cat foo-atomic.c  
long foobar(long *foo, long *bar)
{
        __atomic_store_n(bar, 10, __ATOMIC_RELAXED);
        if (__atomic_load_n(foo, __ATOMIC_RELAXED) <= 42)
                __atomic_store_n(bar, 64, __ATOMIC_RELAXED);
        if (__atomic_load_n(foo, __ATOMIC_RELAXED) > 42)
                __atomic_store_n(bar, 0, __ATOMIC_RELAXED);
        return __atomic_load_n(bar, __ATOMIC_RELAXED);
}
$ objdump -dr foo-atomic.o 
0000000000000000 <foobar>:
   0:   48 c7 06 0a 00 00 00    movq   $0xa,(%rsi)
   7:   48 8b 07                mov    (%rdi),%rax
   a:   48 83 f8 2a             cmp    $0x2a,%rax
   e:   7e 10                   jle    20 <foobar+0x20>
  10:   48 8b 07                mov    (%rdi),%rax
  13:   48 83 f8 2a             cmp    $0x2a,%rax
  17:   7f 17                   jg     30 <foobar+0x30>
  19:   48 8b 06                mov    (%rsi),%rax
  1c:   c3                      retq   
  1d:   0f 1f 00                nopl   (%rax)
  20:   48 c7 06 40 00 00 00    movq   $0x40,(%rsi)
  27:   48 8b 07                mov    (%rdi),%rax
  2a:   48 83 f8 2a             cmp    $0x2a,%rax
  2e:   7e e9                   jle    19 <foobar+0x19>
  30:   48 c7 06 00 00 00 00    movq   $0x0,(%rsi)
  37:   48 8b 06                mov    (%rsi),%rax
  3a:   c3                      retq   

When building at -Os both produce the same code as the volatile version
above. It *seems* to me that the volatile version always produces more
optimal code, but is it always correct ? This is just an illustration of
how tricky this can currently be and how confusing it can sometimes be
for the developer to make sure the desired code is emitted in a few
special cases. And just for this, having the compiler support more easily
predictable constructs would be a nice improvement.

Willy

      parent reply	other threads:[~2020-10-07 10:31 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-06 11:47 Control Dependencies vs C Compilers Peter Zijlstra
2020-10-06 12:37 ` David Laight
2020-10-06 12:49   ` Willy Tarreau
2020-10-06 13:31   ` Peter Zijlstra
2020-10-06 14:23     ` stern
2020-10-06 14:43       ` Peter Zijlstra
2020-10-06 15:16         ` Nick Clifton
2020-10-06 15:37           ` David Laight
2020-10-06 15:50             ` Paul E. McKenney
2020-10-06 16:10               ` Willy Tarreau
2020-10-06 16:22                 ` David Laight
2020-10-06 16:31                   ` Paul E. McKenney
2020-10-06 15:07     ` David Laight
2020-10-06 21:20 ` Florian Weimer
2020-10-07  9:32   ` Peter Zijlstra
2020-10-07 10:20     ` Florian Weimer
2020-10-07 11:50       ` Peter Zijlstra
2020-10-07 17:11         ` Paul E. McKenney
2020-10-07 21:07           ` Peter Zijlstra
2020-10-07 21:20             ` Paul E. McKenney
2020-10-07 10:30     ` Willy Tarreau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201007103033.GB6550@1wt.eu \
    --to=w@1wt.eu \
    --cc=akiyks@gmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dlustig@nvidia.com \
    --cc=fweimer@redhat.com \
    --cc=j.alglave@ucl.ac.uk \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=luc.maranget@inria.fr \
    --cc=npiggin@gmail.com \
    --cc=parri.andrea@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stern@rowland.harvard.edu \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.