Date: Wed, 20 Nov 2019 16:54:48 +0100
From: Marco Elver
To: Qian Cai
Cc: LKMM Maintainers -- Akira Yokosawa, Alan Stern, Alexander Potapenko,
    Andrea Parri, Andrey Konovalov, Andy Lutomirski, Ard Biesheuvel,
    Arnd Bergmann, Boqun Feng, Borislav Petkov, Daniel Axtens,
    Daniel Lustig, Dave Hansen, David Howells, Dmitry Vyukov,
    "H. Peter Anvin", Ingo Molnar, Jade Alglave, Joel Fernandes,
    Jonathan Corbet, Josh Poimboeuf, Luc Maranget, Mark Rutland,
    Nicholas Piggin, "Paul E. McKenney", Peter Zijlstra, Thomas Gleixner,
    Will Deacon, Eric Dumazet, kasan-dev, linux-arch,
    "open list:DOCUMENTATION", linux-efi@vger.kernel.org,
    Linux Kbuild mailing list, LKML, Linux Memory Management List,
    the arch/x86 maintainers
Subject: Re: [PATCH v4 00/10] Add Kernel Concurrency Sanitizer (KCSAN)
Message-ID: <20191120155448.GA21320@google.com>
References: <20191114180303.66955-1-elver@google.com>
    <1574194379.9585.10.camel@lca.pw>

On Tue, 19 Nov 2019, Marco Elver wrote:
> On Tue, 19 Nov 2019 at 21:13, Qian Cai wrote:
> >
> > On Thu, 2019-11-14 at 19:02 +0100, 'Marco Elver' via kasan-dev wrote:
> > > This is the patch-series for the Kernel Concurrency Sanitizer (KCSAN).
> > > KCSAN is a sampling watchpoint-based *data race detector*. More details
> > > are included in **Documentation/dev-tools/kcsan.rst**. This patch-series
> > > only enables KCSAN for x86, but we expect adding support for other
> > > architectures is relatively straightforward (we are aware of
> > > experimental ARM64 and POWER support).
> >
> > This does not allow the system to boot. Just hang forever at the end.
> >
> > https://cailca.github.io/files/dmesg.txt
> >
> > the config (dselect KASAN and select KCSAN with default options):
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
>
> Thanks! That config enables lots of other debug code. I could
> reproduce the hang. It's related to CONFIG_PROVE_LOCKING etc.
>
> The problem is definitely not the fact that kcsan_setup_watchpoint
> disables interrupts (tested by removing that code).
> Although lockdep still complains here, and looking at the code in
> kcsan/core.c, I just can't see how local_irq_restore cannot be called
> before returning (in the stacktrace you provided, there is no kcsan
> function), and interrupts should always be re-enabled. (Interrupts are
> only disabled during delay in kcsan_setup_watchpoint.)
>
> What I also notice is that this happens when the console starts
> getting spammed with data-race reports (presumably because some extra
> debug code has lots of data races according to KCSAN).
>
> My guess is that some of the extra debug logic enabled in that config
> is incompatible with KCSAN. However, so far I cannot tell where
> exactly the problem is. For now the work-around would be not using
> KCSAN with these extra debug options. I will investigate more, but
> nothing obviously wrong stands out..

It seems that because spinlock_debug.c contains data races, the console
gets spammed with reports. However, it's also possible to hit a
deadlock, e.g.:

  printk lock -> spinlock_debug -> KCSAN detects data race ->
  kcsan_print_report() -> printk lock -> deadlock

So the best thing is to fix the data races in spinlock_debug. I will
send a patch separately for you to test.

As for lockdep still reporting an inconsistency in IRQ flags tracing, I
cannot yet say what the problem is. It seems that lockdep IRQ flags
tracing may have an issue with KCSAN for a number of reasons: say the
lockdep and IRQ flags tracing code is instrumented and calls into KCSAN,
which disables/enables interrupts but, due to tracing, calls back into
lockdep code. In other words, there may be some recursion which corrupts
hardirqs_enabled.

Thanks,
-- Marco
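
For illustration only, the printk deadlock chain described above can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions, not kernel code: every name in it (model_printk, model_spinlock_debug, model_report, run_model) is hypothetical, the "lock" is a plain flag, and instead of spinning forever the model just counts each attempt to re-take the lock while it is already held.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
static bool console_locked;   /* models a non-recursive printk lock */
static int  self_deadlocks;   /* attempts to take the lock while already held */

static void model_printk(const char *msg, bool debug_code_is_racy);

/* Models the reporting step: printing a report needs the console. */
static void model_report(bool debug_code_is_racy)
{
    model_printk("data race report", debug_code_is_racy);
}

/* Models instrumented lock-debugging code running under the printk lock. */
static void model_spinlock_debug(bool debug_code_is_racy)
{
    if (debug_code_is_racy)
        model_report(debug_code_is_racy);   /* detector fires here */
}

static void model_printk(const char *msg, bool debug_code_is_racy)
{
    if (console_locked) {
        /* A real non-recursive lock would spin forever at this point. */
        self_deadlocks++;
        return;
    }
    console_locked = true;
    model_spinlock_debug(debug_code_is_racy);
    printf("%s\n", msg);
    console_locked = false;
}

/* Returns how many times the model would have deadlocked on itself. */
static int run_model(bool debug_code_is_racy)
{
    console_locked = false;
    self_deadlocks = 0;
    model_printk("hello", debug_code_is_racy);
    return self_deadlocks;
}
```

In this model, run_model(true) returns 1 (the report path re-enters the held lock once), while run_model(false) returns 0, which mirrors why removing the data races from the debug code breaks the cycle.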