llvm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Alan Stern <stern@rowland.harvard.edu>
To: "Paul Heidekrüger" <paul.heidekrueger@in.tum.de>
Cc: paulmck@kernel.org, will@kernel.org, peterz@infradead.org,
	boqun.feng@gmail.com, parri.andrea@gmail.com,
	linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
	elver@google.com, charalampos.mainas@gmail.com,
	pramod.bhatotia@in.tum.de
Subject: Re: Potentially Broken Address Dependency via test_bit() When Compiling With Clang
Date: Thu, 28 Oct 2021 10:34:46 -0400	[thread overview]
Message-ID: <20211028143446.GA1351384@rowland.harvard.edu> (raw)
In-Reply-To: <YXqZm6XTlMGDSpMT@Pauls-MacBook-Pro>

On Thu, Oct 28, 2021 at 02:37:47PM +0200, Paul Heidekrüger wrote:
> On Wed, Oct 27, 2021 at 10:27:20AM -0400, Alan Stern wrote:
> > On Wed, Oct 27, 2021 at 12:19:48PM +0200, Paul Heidekrüger wrote:

> > > Address dependency in source code, lines 373 - 375 in fs/afs/addr_list.c:
> > > 
> > > > [...]
> > > >   index = READ_ONCE(ac->alist->preferred);
> > > >   if (test_bit(index, &set))
> > > >     goto selected;
> > > > [...]
> > > 
> > > where test_bit() expands to the following in 
> > > include/asm-generic/bitops/non-atomic.h, lines 115 - 122:
> > > 
> > > > static __always_inline int
> > > > arch_test_bit(unsigned int nr, const volatile unsigned long *addr)
> > > > {
> > > >   return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
> > > > }
> > > > #define test_bit arch_test_bit

> > However, I can't follow the IR code.  Can you please explain in ordinary 
> > English how the LLVM compiler manages to lose track of this dependency?
> > 
> > Alan Stern
> 
> Here's what we think might be going on:
> - In 'arch_test_bit()', 'addr[BIT_WORD(nr)]' expands to 'addr[(nr) / 64]'.
> - Since 'addr' points to an 'unsigned long', any other result than '0' for
>   '(nr) / 64' would be out of bounds and therefore undefined.
> - We assume LLVM is able to figure this out and use it to get rid of the
>   address computation all together.

Ah, that explains it.  Yes, when set is a single unsigned long (or an 
array of length 1), the address dependency is only syntactic, not 
semantic.  As a result, we should expect that compilers will sometimes 
not preserve it.

The danger, of course, is that people relying on an ordering prescribed 
by the LKMM may get fooled because (unbeknownst to them) the dependency 
in question is not semantic.  It would be great if a static checker 
could detect such things -- but this would require some way for us to 
inform the checker about when the code does rely on a dependency 
ordering.

> We ran some experiments to see how optimisations behave when 'set' is in fact
> an array and / or in global scope.
> 
> 1. Insert a 'barrier()' in 'arch_test_bit()' before the 'return':
> The dependency gets broken.
> 
> 2. Make 'set' an 'unsigned long' array of size '42', keep local scope: 
> The dependency gets preserved.
> 
> 3. Keep 'set' as 'unsigend long', move to global scope: 
> The dependency gets preserved.

That one's a little strange.  I don't see why the scope should make any 
difference, so long as the compiler knows the actual type and length.

> 4. Make 'set' an 'unsigned long' array of size '42', move to global scope: 
> The dependency gets preserved.

Alan

  reply	other threads:[~2021-10-28 14:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-27 10:19 Potentially Broken Address Dependency via test_bit() When Compiling With Clang Paul Heidekrüger
2021-10-27 11:56 ` David Laight
2021-10-27 12:17 ` Peter Zijlstra
2021-10-27 12:24   ` Marco Elver
2021-10-27 12:34   ` Peter Zijlstra
2021-10-27 14:27 ` Alan Stern
2021-10-28 12:37   ` Paul Heidekrüger
2021-10-28 14:34     ` Alan Stern [this message]
2021-11-02 18:35       ` Paul Heidekrüger
2021-11-02 19:01         ` Alan Stern
2021-11-04 18:16           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211028143446.GA1351384@rowland.harvard.edu \
    --to=stern@rowland.harvard.edu \
    --cc=boqun.feng@gmail.com \
    --cc=charalampos.mainas@gmail.com \
    --cc=elver@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=parri.andrea@gmail.com \
    --cc=paul.heidekrueger@in.tum.de \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pramod.bhatotia@in.tum.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).