Date: Tue, 9 May 2017 11:18:35 -0500
From: Josh Poimboeuf
To: "Paul E. McKenney"
Cc: Steven Rostedt, Petr Mladek, Jessica Yu, Jiri Kosina, Miroslav Benes,
	live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] livepatch/rcu: Warn when system consistency is broken in RCU code
Message-ID: <20170509161835.64ihfts7xuytaryp@treble>
References: <1493895316-19165-1-git-send-email-pmladek@suse.com>
	<1493895316-19165-3-git-send-email-pmladek@suse.com>
	<20170508165108.d3vd4h6ffa25bfui@treble>
	<20170508151322.76e8e9db@gandalf.local.home>
	<20170508194729.jjq7qrc7gkiq2s5v@treble>
	<20170508201558.GD3956@linux.vnet.ibm.com>
	<20170508204333.xc3isvr4riv26his@treble>
	<20170508210754.GE3956@linux.vnet.ibm.com>
	<20170508221609.roaeaidj7mpfozcq@treble>
	<20170508223600.GH3956@linux.vnet.ibm.com>
In-Reply-To: <20170508223600.GH3956@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
User-Agent: Mutt/1.6.0.1 (2016-04-01)

On Mon, May 08, 2017 at
03:36:00PM -0700, Paul E. McKenney wrote:
> On Mon, May 08, 2017 at 05:16:09PM -0500, Josh Poimboeuf wrote:
> > On Mon, May 08, 2017 at 02:07:54PM -0700, Paul E. McKenney wrote:
> > > This would be a problem if step 2's NMI hit rcu_irq_enter(),
> > > rcu_irq_exit(), and friends in just the wrong place.
> > >
> > > I would suggest that ftrace() do something like this...
> > >
> > > 	if (in_nmi())
> > > 		rcu_nmi_enter();
> > > 	else
> > > 		rcu_irq_enter();
> > >
> > > Except that, as Steven will quickly point out, this won't work at the
> > > very edges of the NMI, when NMI_MASK won't be set in preempt_count().
> > >
> > > Other thoughts?
> >
> > Ok. So I think the livepatch ftrace handler would need the in_nmi()
> > check, in case it's called early in the NMI.
> >
> > But on x86, rcu_nmi_enter() is also called in some non-NMI exception
> > cases, from ist_enter(). So it appears that the in_nmi() check wouldn't
> > be sufficient. We might instead need something like:
> >
> > 	if (in_nmi() || in_some_other_exception())
> > 		rcu_nmi_enter();
> > 	else
> > 		rcu_irq_enter();
> >
> > But unfortunately the in_some_other_exception() function doesn't
> > currently exist.
> >
> > So, one more question. Would it work if we just always called
> > rcu_nmi_enter()?
>
> I am a bit nervous about this. It would -at- -least- be necessary to have
> interrupts disabled throughout the entire time from the rcu_nmi_enter()
> through the matching rcu_nmi_exit(). And there might be other failure
> modes that I don't immediately see.

Ok, let's forget about that idea for now then :-)

> But do we really need this, given the in_nmi() check that Steven
> pointed out?

The in_nmi() check doesn't work for non-NMI exceptions. An exception
can come from anywhere, which is presumably why ist_enter() calls
rcu_nmi_enter(), even though it might not have been in NMI context. The
exception could, for example, happen while you're twiddling important
bits in rcu_irq_enter().
Or it could happen early in do_nmi(), before it had a chance to set
NMI_MASK or call rcu_nmi_enter(). In either case, in_nmi() would be
false, yet calling rcu_irq_enter() would be bad.

I think I have convinced myself that, as long as the user doesn't patch
ist_enter() or rcu_dynticks_eqs_enter(), it'll be fine. So the
following should be sufficient:

	if (in_nmi())
		rcu_nmi_enter(); /* in case we're called before nmi_enter() */
	else
		rcu_irq_enter_irqson();

	if (unlikely(!rcu_is_watching())) {
		klp_block_patch_removal = true;
		WARN_ON_ONCE(1); /* this presumably means */
	}

I think the alternative, calling rcu_irq_enter_disabled() beforehand,
isn't sufficient, because it only checks the rcu_dynticks_eqs_enter()
case. It doesn't check the IST exception ist_enter() case, before
rcu_nmi_enter() has been called.

-- 
Josh