From: Marco Elver
Date: Tue, 22 Oct 2019 19:42:48 +0200
Subject: Re: [PATCH v2 1/8] kcsan: Add Kernel Concurrency Sanitizer infrastructure
References: <20191017141305.146193-1-elver@google.com>
 <20191017141305.146193-2-elver@google.com>
 <20191022154858.GA13700@redhat.com>
In-Reply-To: <20191022154858.GA13700@redhat.com>
To: Oleg Nesterov
Cc: LKMM Maintainers -- Akira Yokosawa, Alan Stern, Alexander Potapenko,
 Andrea Parri, Andrey Konovalov, Andy Lutomirski, Ard Biesheuvel,
 Arnd Bergmann, Boqun Feng, Borislav Petkov, Daniel Axtens, Daniel Lustig,
 Dave Hansen, David Howells, Dmitry Vyukov, "H. Peter Anvin", Ingo Molnar,
 Jade Alglave, Joel Fernandes, Jonathan Corbet, Josh Poimboeuf,
 Luc Maranget, Mark Rutland, Nicholas Piggin, "Paul E.
 McKenney", Peter Zijlstra, Thomas Gleixner, Will Deacon, kasan-dev,
 linux-arch, "open list:DOCUMENTATION", linux-efi@vger.kernel.org,
 Linux Kbuild mailing list, LKML, Linux Memory Management List,
 "the arch/x86 maintainers"

On Tue, 22 Oct 2019 at 17:49, Oleg Nesterov wrote:
>
> On 10/17, Marco Elver wrote:
> >
> > +	/*
> > +	 * Delay this thread, to increase probability of observing a racy
> > +	 * conflicting access.
> > +	 */
> > +	udelay(get_delay());
> > +
> > +	/*
> > +	 * Re-read value, and check if it is as expected; if not, we infer a
> > +	 * racy access.
> > +	 */
> > +	switch (size) {
> > +	case 1:
> > +		is_expected = expect_value._1 == READ_ONCE(*(const u8 *)ptr);
> > +		break;
> > +	case 2:
> > +		is_expected = expect_value._2 == READ_ONCE(*(const u16 *)ptr);
> > +		break;
> > +	case 4:
> > +		is_expected = expect_value._4 == READ_ONCE(*(const u32 *)ptr);
> > +		break;
> > +	case 8:
> > +		is_expected = expect_value._8 == READ_ONCE(*(const u64 *)ptr);
> > +		break;
> > +	default:
> > +		break; /* ignore; we do not diff the values */
> > +	}
> > +
> > +	/* Check if this access raced with another. */
> > +	if (!remove_watchpoint(watchpoint)) {
> > +		/*
> > +		 * No need to increment 'race' counter, as the racing thread
> > +		 * already did.
> > +		 */
> > +		kcsan_report(ptr, size, is_write, smp_processor_id(),
> > +			     kcsan_report_race_setup);
> > +	} else if (!is_expected) {
> > +		/* Inferring a race, since the value should not have changed. */
> > +		kcsan_counter_inc(kcsan_counter_races_unknown_origin);
> > +#ifdef CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN
> > +		kcsan_report(ptr, size, is_write, smp_processor_id(),
> > +			     kcsan_report_race_unknown_origin);
> > +#endif
> > +	}
>
> Not sure I understand this code...
>
> Just for example.
> Suppose that task->state = TASK_UNINTERRUPTIBLE, this task
> does __set_current_state(TASK_RUNNING), another CPU does wake_up_process(task)
> which does the same UNINTERRUPTIBLE -> RUNNING transition.
>
> Looks like, this is the "data race" according to kcsan?

Yes, they are "data races". They are probably not "race conditions"
though.

This is a fair distinction to make, and we never claimed to find "race
conditions" only -- race conditions are logic bugs that result in bad
state due to unexpected interleaving of threads. Data races are more
subtle, and become relevant at the programming language level.

In Documentation we summarize:

"Informally, two operations conflict if they access the same memory
location, and at least one of them is a write operation. In an
execution, two memory operations from different threads form a
data-race if they conflict, at least one of them is a *plain* access
(non-atomic), and they are unordered in the "happens-before" order
according to the LKMM."

KCSAN's goal is to find *data races* according to the LKMM. Some data
races are race conditions (usually the more interesting bugs) -- but
not *all* data races are race conditions. Data races that are not race
conditions are what is usually referred to as "benign", but they can
still become bugs on the wrong arch/compiler combination. Hence the
need to annotate these accesses with READ_ONCE or WRITE_ONCE, or to
use atomic_t:
- https://lwn.net/Articles/793253/
- https://lwn.net/Articles/799218/

> Hmm. even the "if (!(p->state & state))" check in try_to_wake_up() can trigger
> kcsan_report() ?

We blacklisted sched (KCSAN_SANITIZE := n in kernel/sched/Makefile),
so these data races won't actually be reported.

Thanks,
-- Marco

> Oleg.
>
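
To make the plain vs. marked distinction from the reply above concrete,
here is a minimal user-space sketch of the wakeup example. It is not
kernel code: READ_ONCE()/WRITE_ONCE() are simplified volatile-cast
stand-ins for the kernel macros, and struct task, set_state_plain(),
set_state_marked() and state_matches() are hypothetical names used only
for illustration. Both stores write the same value, so the logic is the
same either way; the difference is only whether a concurrent conflicting
access would count as a data race under the definition quoted above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified user-space stand-ins (assumption: the kernel's real
 * READ_ONCE()/WRITE_ONCE() definitions are more elaborate than this).
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

#define TASK_RUNNING         0x0000
#define TASK_UNINTERRUPTIBLE 0x0002

/* Hypothetical, cut-down task structure for illustration only. */
struct task {
	long state;
};

/*
 * Plain store: if another CPU reads or writes ->state concurrently and
 * nothing orders the two accesses, this forms a data race.
 */
static inline void set_state_plain(struct task *t, long state)
{
	assert(t != NULL);
	t->state = state;
}

/*
 * Marked store: the compiler may not tear, fuse or re-materialise it,
 * and a conflict with another marked access is not a data race.
 */
static inline void set_state_marked(struct task *t, long state)
{
	assert(t != NULL);
	WRITE_ONCE(t->state, state);
}

/*
 * Marked load, loosely modelled on the "if (!(p->state & state))"
 * check: returns nonzero if the task is in one of the states in @mask.
 */
static inline int state_matches(struct task *t, long mask)
{
	assert(t != NULL);
	return (READ_ONCE(t->state) & mask) != 0;
}
```

In this sketch, a waker would call state_matches() and then
set_state_marked(), while the task itself would use set_state_marked()
instead of a plain assignment. Whether a given kernel path should be
annotated this way is a per-case decision and not something this sketch
settles.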