From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726642AbgF3UMy (ORCPT ); Tue, 30 Jun 2020 16:12:54 -0400 Date: Tue, 30 Jun 2020 22:12:43 +0200 From: Peter Zijlstra Subject: Re: [PATCH 00/22] add support for Clang LTO Message-ID: <20200630201243.GD4817@hirez.programming.kicks-ass.net> References: <20200624203200.78870-1-samitolvanen@google.com> <20200624211540.GS4817@hirez.programming.kicks-ass.net> <20200625080313.GY4817@hirez.programming.kicks-ass.net> <20200625082433.GC117543@hirez.programming.kicks-ass.net> <20200625085745.GD117543@hirez.programming.kicks-ass.net> <20200630191931.GA884155@elver.google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200630191931.GA884155@elver.google.com> Sender: linux-kbuild-owner@vger.kernel.org List-ID: To: Marco Elver Cc: Nick Desaulniers , Sami Tolvanen , Masahiro Yamada , Will Deacon , Greg Kroah-Hartman , "Paul E. McKenney" , Kees Cook , clang-built-linux , Kernel Hardening , linux-arch , Linux ARM , Linux Kbuild mailing list , LKML , linux-pci@vger.kernel.org, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" On Tue, Jun 30, 2020 at 09:19:31PM +0200, Marco Elver wrote: > I was asked for input on this, and after a few days digging through some > history, thought I'd comment. Hope you don't mind. Not at all, being the one that asked :-) > First of all, I agree with the concerns, but not because of LTO. > > To set the stage better, and summarize the fundamental problem again: > we're in the unfortunate situation that no compiler today has a way to > _efficiently_ deal with C11's memory_order_consume > [https://lwn.net/Articles/588300/]. If we did, we could just use that > and be done with it. But, sadly, that doesn't seem possible right now -- > compilers just say consume==acquire. I'm not convinced C11 memory_order_consume would actually work for us, even if it would work. That is, given: https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/ only pointers can have consume, but like I pointed out, we have code that relies on dependent loads from integers. > Will suggests doing the same in the > kernel: https://lkml.kernel.org/r/20200630173734.14057-19-will@kernel.org PowerPC would need a similar thing, it too will not preserve causality for control dependecies. > What we're most worried about right now is the existence of compiler > transformations that could break data dependencies by e.g. turning them > into control dependencies. Correct. > If this is a real worry, I don't think LTO is the magical feature that > will uncover those optimizations. If these compiler transformations are > real, they also exist in a normal build! Agreed, _however_ with the caveat that LTO could make them more common. After all, with whole program analysis, the compiler might be able to more easily determine that our pointer @ptr is only ever assigned the values of &A, &B or &C, while without that visibility it would not be able to determine this. Once it knows @ptr has a limited number of determined values, the conversion into control dependencies becomes much more likely. > And if we are worried about them, we need to stop relying on dependent > load ordering across the board; or switch to -O0 for everything. > Clearly, we don't want either. Agreed. > Why do we think LTO is special? As argued above, whole-program analysis would make it more likely. But I agree the fundamental problem exists independent from LTO. > But as far as we can tell, there is no evidence of the dreaded "data > dependency to control dependency" conversion with LTO that isn't there > in non-LTO builds, if it's even there at all. Has the data to control > dependency conversion been encountered in the wild? If not, is the > resulting reaction an overreaction? If so, we need to be careful blaming > LTO for something that it isn't even guilty of. It is mostly paranoia; in a large part driven by the fact that even if such a conversion were to be done, it could go a very long time without actually causing problems, and longer still for such problems to be traced back to such an 'optimization'. That is, the collective hurt from debugging too many ordering issues. > So, we are probably better off untangling LTO from the story: > > 1. LTO or no LTO does not matter. The LTO series should not get tangled > up with memory model issues. > > 2. The memory model question and problems need to be answered and > addressed separately. > > Thoughts? How hard would it be to creates something that analyzes a build and looks for all 'dependent load -> control dependency' transformations headed by a volatile (and/or from asm) load and issues a warning for them? This would give us an indication of how valuable this transformation is for the kernel. I'm hoping/expecting it's vanishingly rare, but what do I know.