From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754813AbbBERyi (ORCPT <rfc822;w@1wt.eu>);
	Thu, 5 Feb 2015 12:54:38 -0500
Received: from e31.co.us.ibm.com ([32.97.110.149]:54470 "EHLO
	e31.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751886AbbBERyh (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 5 Feb 2015 12:54:37 -0500
Date: Thu, 5 Feb 2015 09:54:31 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>,
        linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
        Arnd Bergmann <arnd@arndb.de>, Mark Rutland <mark.rutland@arm.com>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
        Marek Szyprowski <m.szyprowski@samsung.com>,
        Stephen Boyd <sboyd@codeaurora.org>,
        Catalin Marinas <catalin.marinas@arm.com>,
        Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH v2] ARM: Don't use complete() during __cpu_die
Message-ID: <20150205175431.GH5370@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <1423131270-24047-1-git-send-email-k.kozlowski@samsung.com>
 <20150205105035.GL8656@n2100.arm.linux.org.uk>
 <20150205142918.GA10634@linux.vnet.ibm.com>
 <20150205161100.GQ8656@n2100.arm.linux.org.uk>
 <20150205170228.GZ5370@linux.vnet.ibm.com>
 <20150205173440.GR8656@n2100.arm.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150205173440.GR8656@n2100.arm.linux.org.uk>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 15020517-8236-0000-0000-000009339FE4
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 05, 2015 at 05:34:40PM +0000, Russell King - ARM Linux wrote:
> On Thu, Feb 05, 2015 at 09:02:28AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 05, 2015 at 04:11:00PM +0000, Russell King - ARM Linux wrote:
> > > On Thu, Feb 05, 2015 at 06:29:18AM -0800, Paul E. McKenney wrote:
> > > > Works for me, assuming no hidden uses of RCU in the IPI code.  ;-)
> > > 
> > > Sigh... I kind'a new it wouldn't be this simple.  The gic code which
> > > actually raises the IPI takes a raw spinlock, so it's not going to be
> > > this simple - there's a small theoretical window where we have taken
> > > this lock, written the register to send the IPI, and then dropped the
> > > lock - the update to the lock to release it could get lost if the
> > > CPU power is quickly cut at that point.
> > > 
> > > Also, we _do_ need the second cache flush in place to ensure that the
> > > unlock is seen to other CPUs.
> > > 
> > > We could work around that by taking and releasing the lock in the IPI
> > > processing function... but this is starting to look less attractive
> > > as the lock is private to irq-gic.c.
> > > 
> > > Well, we're very close to 3.19, we're too close to be trying to sort
> > > this out, so I'm hoping that your changes which cause this RCU error
> > > are *not* going in during this merge window, because we seem to have
> > > something of a problem right now which needs more time to resolve.
> > 
> > Most likely into the 3.20 merge window.  But please keep in mind that
> > RCU is just the messenger here -- the current code will break if any
> > CPU for whatever reason takes more than a jiffy to get from its
> > _stop_machine() handler to the end of its last RCU read-side critical
> > section on its way out.  A jiffy may sound like a lot, but it is not
> > hard to exceed this limit, especially in virtualized environments.
> 
> What I'm saying is that we can't likely get a good fix prepared before
> the 3.20 merge window opens.

And I cannot count.  Or cannot type.  Or something.

I meant do say "Most likely into the 3.21 merge window."  I agree that
this stuff is not ready for next week's merge window.  For one thing,
there are similar issues with a number of other architectures as well.

> I don't term the set_bit/clear_bit solution a "good fix" because it is
> far too complex - I've not done a thorough review on it, but the idea
> of setting and clearing a couple of bits in unison, making sure that
> their state is set appropriately through multiple different code paths
> does not strike me as a provably correct replacement for this completion.
> The reason for that complexity is because there is no pre-notification
> to arch code that a CPU might be going down, so there's no way for the
> "CPU is dead" flag to be properly reset (which is why there's all the
> manipulation in lots of possible failure paths.)
> 
> The idea that we could reset it in the CPU up code doesn't fly - that
> would only work if we had one secondary CPU (which would guarantee a
> strict up/down/up ordering on it) but as soon as you have more than one
> CPU, that doesn't hold true.
> 
> We could hook into the CPU hotplug notifiers - which would be quite a
> lot of additional code to achieve the reset early enough in the hot
> unplug path, though it would probably be the most reliable solution to
> the wait-for-bit solution.
> 
> However, any of those solutions needs writing and thorough testing,
> which, if Linus opens the merge window on Sunday, isn't going to
> happen before hand (and we know Linus doesn't like extra development
> appearing which wasn't in -next prior to the merge window - he's taken
> snapshots of -next to check during the merge window in the past - so
> it's not something I'm going to be adding during that time, not even
> as a "fix" because we know about the problem right now, before the
> merge window.  To me, to treat this as a "fix" would be wilfully
> deceitful.)
> 
> I don't think the existing code is a big problem at the moment - it's
> been like this for about 10 years, and no one has ever reported an
> issue with it, although there have been changes over that time:
> 
> aa033810461ee56abbef6cef10aabd6b97f5caee
> ARM: smp: Drop RCU_NONIDLE usage in cpu_die()
> 
> 	This removed the RCU_NONIDLE() from the completion() call.
> 
> ff081e05bfba3461119cd280201d163b6858eda2
> ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()
> 
> 	This added the RCU_NONIDLE() to the completion() call.
> 
> 3c030beabf937b1d3b4ecaedfd1fb2f1e2aa0c70
> ARM: CPU hotplug: move cpu_killed completion to core code
> 
> 	This moved the completion code from Realview (and other ARM
> 	platforms) into core ARM code.
> 
> and 97a63ecff4bd06da5d8feb8c0394a4d020f2d34d
> [ARM SMP] Add CPU hotplug support for Realview MPcore
> 
> 	The earliest current introduction of CPU hotplug in 2005.

Agreed.  It does need to be fixed, but it makes sense to take a few
weeks and get a fix that is more likely to be correct.

							Thanx, Paul


From mboxrd@z Thu Jan  1 00:00:00 1970
From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
Date: Thu, 5 Feb 2015 09:54:31 -0800
Subject: [PATCH v2] ARM: Don't use complete() during __cpu_die
In-Reply-To: <20150205173440.GR8656@n2100.arm.linux.org.uk>
References: <1423131270-24047-1-git-send-email-k.kozlowski@samsung.com>
 <20150205105035.GL8656@n2100.arm.linux.org.uk>
 <20150205142918.GA10634@linux.vnet.ibm.com>
 <20150205161100.GQ8656@n2100.arm.linux.org.uk>
 <20150205170228.GZ5370@linux.vnet.ibm.com>
 <20150205173440.GR8656@n2100.arm.linux.org.uk>
Message-ID: <20150205175431.GH5370@linux.vnet.ibm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Thu, Feb 05, 2015 at 05:34:40PM +0000, Russell King - ARM Linux wrote:
> On Thu, Feb 05, 2015 at 09:02:28AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 05, 2015 at 04:11:00PM +0000, Russell King - ARM Linux wrote:
> > > On Thu, Feb 05, 2015 at 06:29:18AM -0800, Paul E. McKenney wrote:
> > > > Works for me, assuming no hidden uses of RCU in the IPI code.  ;-)
> > > 
> > > Sigh... I kind'a new it wouldn't be this simple.  The gic code which
> > > actually raises the IPI takes a raw spinlock, so it's not going to be
> > > this simple - there's a small theoretical window where we have taken
> > > this lock, written the register to send the IPI, and then dropped the
> > > lock - the update to the lock to release it could get lost if the
> > > CPU power is quickly cut at that point.
> > > 
> > > Also, we _do_ need the second cache flush in place to ensure that the
> > > unlock is seen to other CPUs.
> > > 
> > > We could work around that by taking and releasing the lock in the IPI
> > > processing function... but this is starting to look less attractive
> > > as the lock is private to irq-gic.c.
> > > 
> > > Well, we're very close to 3.19, we're too close to be trying to sort
> > > this out, so I'm hoping that your changes which cause this RCU error
> > > are *not* going in during this merge window, because we seem to have
> > > something of a problem right now which needs more time to resolve.
> > 
> > Most likely into the 3.20 merge window.  But please keep in mind that
> > RCU is just the messenger here -- the current code will break if any
> > CPU for whatever reason takes more than a jiffy to get from its
> > _stop_machine() handler to the end of its last RCU read-side critical
> > section on its way out.  A jiffy may sound like a lot, but it is not
> > hard to exceed this limit, especially in virtualized environments.
> 
> What I'm saying is that we can't likely get a good fix prepared before
> the 3.20 merge window opens.

And I cannot count.  Or cannot type.  Or something.

I meant do say "Most likely into the 3.21 merge window."  I agree that
this stuff is not ready for next week's merge window.  For one thing,
there are similar issues with a number of other architectures as well.

> I don't term the set_bit/clear_bit solution a "good fix" because it is
> far too complex - I've not done a thorough review on it, but the idea
> of setting and clearing a couple of bits in unison, making sure that
> their state is set appropriately through multiple different code paths
> does not strike me as a provably correct replacement for this completion.
> The reason for that complexity is because there is no pre-notification
> to arch code that a CPU might be going down, so there's no way for the
> "CPU is dead" flag to be properly reset (which is why there's all the
> manipulation in lots of possible failure paths.)
> 
> The idea that we could reset it in the CPU up code doesn't fly - that
> would only work if we had one secondary CPU (which would guarantee a
> strict up/down/up ordering on it) but as soon as you have more than one
> CPU, that doesn't hold true.
> 
> We could hook into the CPU hotplug notifiers - which would be quite a
> lot of additional code to achieve the reset early enough in the hot
> unplug path, though it would probably be the most reliable solution to
> the wait-for-bit solution.
> 
> However, any of those solutions needs writing and thorough testing,
> which, if Linus opens the merge window on Sunday, isn't going to
> happen before hand (and we know Linus doesn't like extra development
> appearing which wasn't in -next prior to the merge window - he's taken
> snapshots of -next to check during the merge window in the past - so
> it's not something I'm going to be adding during that time, not even
> as a "fix" because we know about the problem right now, before the
> merge window.  To me, to treat this as a "fix" would be wilfully
> deceitful.)
> 
> I don't think the existing code is a big problem at the moment - it's
> been like this for about 10 years, and no one has ever reported an
> issue with it, although there have been changes over that time:
> 
> aa033810461ee56abbef6cef10aabd6b97f5caee
> ARM: smp: Drop RCU_NONIDLE usage in cpu_die()
> 
> 	This removed the RCU_NONIDLE() from the completion() call.
> 
> ff081e05bfba3461119cd280201d163b6858eda2
> ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()
> 
> 	This added the RCU_NONIDLE() to the completion() call.
> 
> 3c030beabf937b1d3b4ecaedfd1fb2f1e2aa0c70
> ARM: CPU hotplug: move cpu_killed completion to core code
> 
> 	This moved the completion code from Realview (and other ARM
> 	platforms) into core ARM code.
> 
> and 97a63ecff4bd06da5d8feb8c0394a4d020f2d34d
> [ARM SMP] Add CPU hotplug support for Realview MPcore
> 
> 	The earliest current introduction of CPU hotplug in 2005.

Agreed.  It does need to be fixed, but it makes sense to take a few
weeks and get a fix that is more likely to be correct.

							Thanx, Paul