From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DCCB3ECAAA2
	for ; Thu, 25 Aug 2022 22:12:31 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S244302AbiHYWM3 (ORCPT ); Thu, 25 Aug 2022 18:12:29 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43530 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S244122AbiHYWMY (ORCPT ); Thu, 25 Aug 2022 18:12:24 -0400
Received: from gate.crashing.org (gate.crashing.org [63.228.1.57])
	by lindbergh.monkeyblade.net (Postfix) with ESMTP id A177AB2D80;
	Thu, 25 Aug 2022 15:12:21 -0700 (PDT)
Received: from gate.crashing.org (localhost.localdomain [127.0.0.1])
	by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 27PLvwJJ030264;
	Thu, 25 Aug 2022 16:57:58 -0500
Received: (from segher@localhost)
	by gate.crashing.org (8.14.1/8.14.1/Submit) id 27PLvtxj030255;
	Thu, 25 Aug 2022 16:57:55 -0500
X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f
Date: Thu, 25 Aug 2022 16:57:54 -0500
From: Segher Boessenkool
To: Linus Torvalds
Cc: Alexander Potapenko , Matthew Wilcox , Thomas Gleixner ,
	Alexander Viro , Alexei Starovoitov , Andrew Morton ,
	Andrey Konovalov , Andy Lutomirski , Arnd Bergmann ,
	Borislav Petkov , Christoph Hellwig , Christoph Lameter ,
	David Rientjes , Dmitry Vyukov , Eric Dumazet ,
	Greg Kroah-Hartman , Herbert Xu , Ilya Leoshkevich ,
	Ingo Molnar , Jens Axboe , Joonsoo Kim , Kees Cook ,
	Marco Elver , Mark Rutland , "Michael S.
	Tsirkin" , Pekka Enberg , Peter Zijlstra , Petr Mladek ,
	Steven Rostedt , Vasily Gorbik , Vegard Nossum , Vlastimil Babka ,
	kasan-dev , Linux Memory Management List , Linux-Arch , LKML
Subject: Re: [PATCH v4 44/45] mm: fs: initialize fsdata passed to write_begin/write_end interface
Message-ID: <20220825215754.GI25951@gate.crashing.org>
References: <20220701142310.2188015-1-glider@google.com> <20220701142310.2188015-45-glider@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.2.3i
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 25, 2022 at 09:33:18AM -0700, Linus Torvalds wrote:
> On Thu, Aug 25, 2022 at 8:40 AM Alexander Potapenko wrote:
> >
> > On Mon, Jul 4, 2022 at 10:07 PM Matthew Wilcox wrote:
> > >
> > > ... wait, passing an uninitialised variable to a function *which doesn't
> > > actually use it* is now UB? What genius came up with that rule? What
> > > purpose does it serve?
> > >
> >
> > There is a discussion at [1], with Segher pointing out a reason for
> > this rule [2] and Linus requesting that we should be warning about the
> > cases where uninitialized variables are passed by value.
>
> I think Matthew was actually more wondering how that UB rule came to be.
>
> Personally, I pretty much despise *all* cases of "undefined behavior",

Let me start by saying you're not alone.

But some UB *cannot* be worked around by compilers (we cannot solve the
halting problem), and some is very expensive to work around (initialising
huge structures is a typical example).

Many (if not most) instances of undefined behaviour are unavoidable with
a language like C. A very big part of this is separate compilation,
that is, compiling translation units separately from each other, so that
the compiler does not see all the ways that something is used when it is
compiling it. There only is UB if something is *used*.

> but "uninitialized argument" across a function call is one of the more
> understandable ones.

Allowing this essentially never allows generating better machine code,
so there are no real arguments for ever allowing it, other than just
inertia: uninitialised everything else is allowed just fine, and only
actually *using* such data is UB. Passing it around is not! That is
how everything used to work (with static data, automatic data, function
parameters, the lot).

But it now is clarified that passing data to a function as function
argument is a use of that data by itself, even if the function will not
even look at it ever.

> I personally was actually surprised compilers didn't warn for "you are
> using an uninitialized value" for a function call argument, because I
> mentally consider function call arguments to *be* a use of a value.

The function call is a use of all passed arguments.

> Except when the function is inlined, and then it's all different - the
> call itself goes away, and I *expect* the compiler to DTRT and not
> "use" the argument except when it's used inside the inlined function.
> Because hey, that's literally the whole point of inlining, and it
> makes the "static checking" problem go away at least for a compiler.

But UB is defined in terms of the abstract machine (like *all* of C), not
in terms of the generated machine code. Typically things will work fine
if they "become invisible" by inlining, but this does not make the
program a correct program ever. Sorry :-(


Segher
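
For illustration only, here is a minimal C sketch of the pattern this
thread is about: a caller that hands a local variable by value to a
function which never looks at it. The names demo_write_begin,
demo_write_end and demo_caller are hypothetical stand-ins and are not
the kernel's real write_begin/write_end signatures. The sketch assumes
the fix named in the patch subject, which is to initialise the variable
in the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel's real interfaces. */

/* May or may not store private state through fsdata. */
static int demo_write_begin(int keeps_state, void **fsdata)
{
	static int state = 42;

	if (keeps_state)
		*fsdata = &state;	/* only some implementations set it */
	return 0;
}

/* Receives fsdata by value and, like many implementations, ignores it. */
static int demo_write_end(void *fsdata)
{
	(void)fsdata;
	return 0;
}

/* Caller: the initialiser gives fsdata a definite value on every path. */
int demo_caller(int keeps_state)
{
	/* Without "= NULL", fsdata could still be indeterminate below. */
	void *fsdata = NULL;

	if (demo_write_begin(keeps_state, &fsdata) != 0)
		return -1;

	/* If the begin hook stored nothing, fsdata is still NULL. */
	assert(keeps_state || fsdata == NULL);

	/* The argument is passed by value even though the callee ignores it. */
	return demo_write_end(fsdata);
}
```

Calling demo_caller(0) exercises the path where the begin hook stores
nothing. With the initialiser in place the value passed on is NULL
rather than whatever happened to be on the stack, so the question of
whether the callee or an inlined copy of it reads the argument never
comes up.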