From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934182AbbBDPKy (ORCPT ); Wed, 4 Feb 2015 10:10:54 -0500
Received: from e32.co.us.ibm.com ([32.97.110.150]:46806 "EHLO
	e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934149AbbBDPKx (ORCPT ); Wed, 4 Feb 2015 10:10:53 -0500
Date: Wed, 4 Feb 2015 07:10:28 -0800
From: "Paul E. McKenney"
To: Krzysztof Kozlowski
Cc: Russell King - ARM Linux, Fengguang Wu, LKP,
	linux-kernel@vger.kernel.org, Bartlomiej Zolnierkiewicz,
	linux-arm-kernel@lists.infradead.org, Arnd Bergmann, Mark Rutland
Subject: Re: [rcu] [ INFO: suspicious RCU usage. ]
Message-ID: <20150204151028.GD5370@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20150201025922.GA16820@wfg-t540p.sh.intel.com>
	<1422957702.17540.1.camel@AMDC1943>
	<20150203162704.GR19109@linux.vnet.ibm.com>
	<1423049947.19547.6.camel@AMDC1943>
	<20150204130018.GG8656@n2100.arm.linux.org.uk>
	<20150204131420.GC5370@linux.vnet.ibm.com>
	<1423059387.24415.2.camel@AMDC1943>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1423059387.24415.2.camel@AMDC1943>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 15020415-0005-0000-0000-00000889F875
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 04, 2015 at 03:16:27PM +0100, Krzysztof Kozlowski wrote:
> On śro, 2015-02-04 at 05:14 -0800, Paul E. McKenney wrote:
> > On Wed, Feb 04, 2015 at 01:00:18PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Feb 04, 2015 at 12:39:07PM +0100, Krzysztof Kozlowski wrote:
> > > > +Cc some ARM people
> > >
> > > I wish that people would CC this list with problems seen on ARM.  I'm
> > > minded to just ignore this message because of this in the hope that by
> > > doing so, people will learn something...
> > >
> > > > > Another thing I could do would be to have an arch-specific Kconfig
> > > > > variable that made ARM responsible for informing RCU that the CPU
> > > > > was departing, which would allow a call to as follows to be placed
> > > > > immediately after the complete():
> > > > >
> > > > > 	rcu_cpu_notify(NULL, CPU_DYING_IDLE, (void *)(long)smp_processor_id());
> > > > >
> > > > > Note:  This absolutely requires that the rcu_cpu_notify() -always-
> > > > > be allowed to execute!!!  This will not work if there is -any- possibility
> > > > > of __cpu_die() powering off the outgoing CPU before the call to
> > > > > rcu_cpu_notify() returns.
> > >
> > > Exactly, so that's not going to be possible.  The completion at that
> > > point marks the point at which power _could_ be removed from the CPU
> > > going down.
> >
> > OK, sounds like a polling loop is required.
>
> I thought about using wait_on_bit() in __cpu_die() (the waiting thread)
> and clearing the bit on CPU being powered down. What do you think about
> such idea?

Hmmm...  It looks to me that wait_on_bit() calls out_of_line_wait_on_bit(),
which in turn calls __wait_on_bit(), which calls prepare_to_wait() and
finish_wait().  These are in the scheduler, but this is being called from
the CPU that remains online, so that should be OK.

But what do you invoke on the outgoing CPU?  Can you get away with
simply clearing the bit, or do you also have to do a wakeup?  It looks
to me like a wakeup is required, which would be illegal on the outgoing
CPU, which is at a point where it cannot legally invoke the scheduler.
Or am I missing something?

You know, this situation is giving me a bad case of nostalgia for the
old Sequent Symmetry and NUMA-Q hardware.  On those platforms, the
outgoing CPU could turn itself off, and thus didn't need to tell some
other CPU when it was ready to be turned off.
Seems to me that this self-turn-off
capability would be a great feature for future systems!

							Thanx, Paul
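
The polling-loop alternative raised in the thread can be illustrated
with a small userspace analogue.  This is only a sketch under stated
assumptions, not kernel code and not the fix that was eventually
adopted: a pthread stands in for the outgoing CPU, C11 atomics stand in
for the kernel's bit operations, and the names cpu_dead_flag,
outgoing_cpu(), wait_for_cpu_dead() and run_handshake() are all
hypothetical.  The point it tries to show is that the departing side
performs nothing but a store, so it never needs a wakeup (and hence
never enters the scheduler), while the surviving side polls.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: nonzero once the outgoing "CPU" is done. */
static atomic_int cpu_dead_flag;

/*
 * Stand-in for the outgoing CPU.  It only stores to the flag; it does
 * not wake anyone, which is the property the thread is asking for.
 */
static void *outgoing_cpu(void *arg)
{
	(void)arg;
	/* Any final teardown work would happen before this store. */
	atomic_store_explicit(&cpu_dead_flag, 1, memory_order_release);
	return NULL;
}

/*
 * Stand-in for the waiter on the surviving CPU.  It polls the flag,
 * yielding between polls, and gives up after max_polls attempts.
 * Returns true if the flag was observed set.
 */
static bool wait_for_cpu_dead(long max_polls)
{
	for (long i = 0; i < max_polls; i++) {
		if (atomic_load_explicit(&cpu_dead_flag, memory_order_acquire))
			return true;
		sched_yield();	/* legal here: the waiter stays online */
	}
	return false;
}

/* Hypothetical driver: start the "outgoing CPU" and poll for it. */
static bool run_handshake(void)
{
	pthread_t t;
	bool ok;

	atomic_store(&cpu_dead_flag, 0);
	if (pthread_create(&t, NULL, outgoing_cpu, NULL) != 0)
		return false;
	ok = wait_for_cpu_dead(100000000L);
	pthread_join(t, NULL);
	return ok;
}
```

The release store paired with the acquire load means that anything the
outgoing side wrote before setting the flag is visible to the waiter
once it sees the flag.  A wait_on_bit()-style sleep, by contrast, would
need the departing side to issue a wakeup, which is the step questioned
above.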