From: Marco Elver
Date: Sun, 10 Nov 2019 20:10:05 +0100
Subject: Re: KCSAN: data-race in __alloc_file / __alloc_file
To: Alan Stern
Cc: Linus Torvalds, Eric Dumazet, syzbot, linux-fsdevel,
    Linux Kernel Mailing List, syzkaller-bugs, Al Viro, Andrea Parri,
    "Paul E. McKenney", LKMM Maintainers -- Akira Yokosawa

On Sun, 10 Nov 2019 at 17:09, Alan Stern wrote:
>
> On Sat, 9 Nov 2019, Linus Torvalds wrote:
>
> > On Sat, Nov 9, 2019, 15:08 Alan Stern wrote:
> >
> > > On Fri, 8 Nov 2019, Linus Torvalds wrote:
> > > >
> > > > Two writes to normal memory are *not* idempotent if they write
> > > > different values. The ordering very much matters, and it's racy and a
> > > > tool should complain.
> > >
> > > What you have written strongly indicates that either you think the word
> > > "idempotent" means something different from its actual meaning or else
> > > you are misusing the word in a very confusing way.
> > >
> >
> > "Idempotence is the property of certain operations in mathematics and
> > computer science whereby they can be applied multiple times without
> > changing the result beyond the initial application. "
> >
> > This is (for example) commonly used when talking about nfs operations,
> > where you can re-send the same nfs operation, and it's ok (even if it has
> > side effects) because the server remembers that it already did the
> > operation. If it's already been done, nothing changes.
> >
> > It may not match your definition in some other area, but this is very much
> > the accepted meaning of the word in computer science and operating systems.
>
> Agreed. My point was that you were using the word in a way which did
> not match this definition.
>
> Never mind that. You did not respond to the question at the end of my
> previous email: Should the LKMM be changed so that two writes are not
> considered to race with each other if they store the same value?
>
> That change would take care of the original issue of this email thread,
> wouldn't it? And it would render WRITE_IDEMPOTENT unnecessary.
>
> Making that change would amount to formalizing your requirement that
> the compiler should not invent stores to shared variables. In C11 such
> invented stores are allowed. Given something like this:
>
>         <complex computation [...] require a register spill>
>         x = 1234;
>
> C11 allows the compiler to store an intermediate value in x rather than
> allocating a slot on the stack for the register spill. After all, x is
> going to be overwritten anyway, and if any other thread accessed x
> during the complex computation then it would race with the final store
> and so the behavior would be undefined in any case.
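
To make the quoted scenario concrete, here is a rough sketch of what an
invented store amounts to at the source level. It is hand-written C that
models the transformation, not actual compiler output;
complex_computation(), as_written() and with_invented_store() are
made-up names, and the arithmetic is arbitrary.

```c
/* Shared variable; the name x follows the quoted example. */
int x;

/* Hypothetical stand-in for the "complex computation": it never touches x. */
static int complex_computation(int a, int b)
{
	/* t is the intermediate value a compiler might need to spill. */
	int t = a * 31 + b;

	return t * t - a;
}

/* The code as the programmer wrote it: exactly one store to x. */
int as_written(int a, int b)
{
	int r = complex_computation(a, b);

	x = 1234;
	return r;
}

/*
 * Hand-written model of an invented store: x is borrowed as the spill
 * slot before the final store. A single thread cannot tell the two
 * variants apart, but a concurrent reader of x could observe t.
 */
int with_invented_store(int a, int b)
{
	int r;

	/* "Spill" of t into x: a store the source never asked for. */
	x = a * 31 + b;
	/* Reload of t from x. */
	r = x * x - a;
	/* The only store present in the original source. */
	x = 1234;
	return r;
}
```

Called from a single thread, both variants return the same value and
leave x equal to 1234. Only another thread accessing x in the meantime
could see the difference, and per the argument quoted above that access
would already race with the final store.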
>
> If you want to specifically forbid the compiler from doing this, it
> makes sense to change the memory model accordingly.
>
> For those used to thinking in terms of litmus tests, consider this one:
>
> C equivalent-writes
>
> {}
>
> P0(int *x)
> {
>         *x = 1;
> }
>
> P1(int *x)
> {
>         *x = 1;
> }
>
> exists (~x=1)
>
> Should the LKMM say that this litmus test contains a race?
>
> This suggests that we might also want to relax the notion of a write
> racing with a read, although in that case I'm not at all sure what the
> appropriate change to the memory model would be. Something along the
> lines of: If a write W races with a read R, but W stores the same value
> that R would have read if W were not present, then it's not really a
> race. But of course this is far too vague to be useful.

What if you add the following to the above litmus test:

P2(int *x)
{
        *x = 2;
}

How can a developer, using the LKMM as a reference, hope to prove their
code is free from data races without having to enumerate all possible
values a variable could contain (in addition to all possible
interleavings)?

I view introducing data value dependencies into a language memory
model, for the sake of allowing more programs, as a slippery slope, and
am not aware of any precedent where this worked out. The additional
complexity in the memory model would put a burden on developers and the
compiler that is unlikely to be outweighed by any real benefit (as you
pointed out, the compiler may even need to disable some
transformations).

From a practical point of view, if the LKMM departs further and further
from C11's memory model, how do we ensure all compilers do the right
thing?

My vote would go to explicit annotation, not only because it reduces
hidden complexity, but also because it makes the code more
understandable, for developers and tooling.

As an additional point, I find the original suggestion to add
WRITE_ONCE (or some other better named WRITE_) to be the least bad.
Consider somebody changing the code, changing the semantics and the
values written to "non_rcu". With a WRITE_ONCE, the developer would be
clear about the fact that the write can happen concurrently, and
ensure new code is written with the assumption that concurrent writes
can happen.

Thanks,
-- Marco
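
PS: As a rough illustration of the annotation argument, the sketch
below contrasts a plain store with a marked one. The WRITE_ONCE() here
is a simplified userspace stand-in (a volatile store), not the kernel's
definition; struct cred_like, both function names and the stored value 0
are hypothetical, and only the field name non_rcu comes from this
thread.

```c
/*
 * Simplified userspace stand-in for WRITE_ONCE(): a volatile store.
 * This is an assumption made for illustration, not the kernel's macro.
 * __typeof__ is a GCC/Clang extension.
 */
#define WRITE_ONCE(var, val) \
	(*(volatile __typeof__(var) *)&(var) = (val))

/* Hypothetical structure; only the field name is taken from the thread. */
struct cred_like {
	int non_rcu;
};

/* Plain store: nothing signals that other CPUs may write here too. */
void clear_non_rcu_plain(struct cred_like *c)
{
	c->non_rcu = 0;
}

/* Marked store: the concurrency is visible to developers and tooling. */
void clear_non_rcu_marked(struct cred_like *c)
{
	WRITE_ONCE(c->non_rcu, 0);
}
```

Both functions store the same value. The difference is that the marked
version documents the concurrent write at the place where somebody would
later change it.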