Date: Tue, 29 Nov 2016 14:59:01 +0100 (CET)
From: Thomas Gleixner
To: Borislav Petkov
Cc: Peter Zijlstra, Steven Rostedt, Jiri Olsa, "Paul E. McKenney", linux-kernel@vger.kernel.org, Ingo Molnar, Josh Triplett, Andi Kleen, Jan Stancek
Subject: Re: [BUG] msr-trace.h:42 suspicious rcu_dereference_check() usage!
In-Reply-To: <20161129131649.hajagzcjfhn5cenp@pd.tnic>

On Tue, 29 Nov 2016, Borislav Petkov wrote:
> On Mon, Nov 21, 2016 at 05:06:54PM +0100, Borislav Petkov wrote:
> > IOW, what's the worst thing that can happen if we did this below?
> >
> > We basically get rid of the detection and switch the timer to broadcast
> > mode immediately on the halting CPU.
> >
> > amd_e400_idle() is behind an "if (cpu_has_bug(c, X86_BUG_AMD_APIC_C1E))"
> > check so it will run on the affected CPUs only...
> >
> > Thoughts?
>
> Actually, here's a better version. The E400 detection works only after
> ACPI has been enabled so we piggyback the end of acpi_init().
>
> We don't need the MSR read now - we do
>
> 	if (static_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
>
> on the idle path which is as fast as it gets.
>
> Any complaints about this before I go and test it everywhere?

The issue is that you obviously start with the assumption that the
machine has this bug. As a consequence the machine is brute forced into
tick broadcast mode, which cannot be reverted when you clear that
misfeature after ACPI init. So in case of !NOHZ and !HIGHRES the periodic
tick is forced into broadcast mode, which is not what you want.

As far as I understood the whole magic, this C1E misfeature only takes
effect _after_ ACPI has been initialized. So instead of setting the bug in
early boot and therefore forcing the broadcast nonsense, we should only
set it when ACPI has actually detected it.

Thanks,

	tglx
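
The ordering argument above can be sketched as a small userspace C model. This is only an illustration under stated assumptions, not kernel code and not the patch discussed in this thread: struct machine, force_broadcast(), boot_assume_bug() and boot_detect_after_acpi() are hypothetical names, and the one-way tick mode models the statement that broadcast mode "cannot be reverted".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum tick_mode { TICK_LOCAL, TICK_BROADCAST };

struct machine {
	bool acpi_enabled;	/* has ACPI init completed? */
	bool c1e_active;	/* does the firmware actually enable C1E? */
	bool bug_flag;		/* models X86_BUG_AMD_APIC_C1E being set */
	enum tick_mode tick;	/* one-way: never returns to TICK_LOCAL */
};

/* Forcing broadcast mode is modelled as irreversible. */
static void force_broadcast(struct machine *m)
{
	assert(m);
	m->tick = TICK_BROADCAST;
}

/*
 * Approach A: assume the bug in early boot, clear it after ACPI init
 * if C1E turns out not to be active.
 */
static void boot_assume_bug(struct machine *m)
{
	m->bug_flag = true;
	force_broadcast(m);		/* idle path sees the flag early */
	m->acpi_enabled = true;
	if (!m->c1e_active)
		m->bug_flag = false;	/* flag cleared, tick mode stays */
}

/* Approach B: set the bug only once ACPI has actually detected it. */
static void boot_detect_after_acpi(struct machine *m)
{
	m->acpi_enabled = true;
	if (m->c1e_active) {
		m->bug_flag = true;
		force_broadcast(m);
	}
}
```

In this model, an unaffected machine booted with boot_assume_bug() ends up with bug_flag cleared but the tick still in TICK_BROADCAST, while boot_detect_after_acpi() leaves it in TICK_LOCAL.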