Date: Thu, 2 Feb 2012 17:19:28 -0800
Subject: Re: [linux-pm] [PATCH 0/3] coupled cpuidle state support
From: Colin Cross
To: Lorenzo Pieralisi
Cc: Vincent Guittot, Daniel Lezcano, Kevin Hilman, Len Brown,
 "linux-kernel@vger.kernel.org", Amit Kucheria,
 "linux-tegra@vger.kernel.org", "linux-pm@lists.linux-foundation.org",
 "linux-omap@vger.kernel.org", Arjan van de Ven,
 "linux-arm-kernel@lists.infradead.org"
In-Reply-To: <20120201180705.GA20936@e102568-lin.cambridge.arm.com>
References: <1324426147-16735-1-git-send-email-ccross@android.com>
 <4F1929E9.7070707@linaro.org>
 <20120201145934.GA20421@e102568-lin.cambridge.arm.com>
 <20120201180705.GA20936@e102568-lin.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 1, 2012 at 10:07 AM, Lorenzo Pieralisi wrote:
> On Wed, Feb 01, 2012 at 05:30:15PM +0000, Colin Cross wrote:
>> On Wed, Feb 1, 2012 at 6:59 AM, Lorenzo Pieralisi
>> wrote:
>> > On Wed, Feb 01, 2012 at 12:13:26PM +0000, Vincent Guittot wrote:
>> >
>> > [...]
>> >
>> >> >> In your patch, you put in safe state (WFI for most platforms) the
>> >> >> cpus that become idle and these cpus are woken up each time a new cpu
>> >> >> of the cluster becomes idle. Then, the cluster state is chosen and the
>> >> >> cpus enter the selected C-state.
>> >> >> On ux500, we are using another
>> >> >> behavior for synchronizing the cpus. The cpus are prepared to enter
>> >> >> the c-state that has been chosen by the governor and the last cpu
>> >> >> that enters idle chooses the final cluster state (according to cpus'
>> >> >> C-state). The main advantage of this solution is that you don't need
>> >> >> to wake other cpus to enter the C-state of a cluster. This can be
>> >> >> quite worthwhile when tasks mainly run on one cpu. Have you also thought
>> >> >> about such behavior when developing the coupled cpuidle driver? It
>> >> >> could be interesting to add such behavior.
>> >> >
>> >> > Waking up the cpus that are in the safe state is not done just to
>> >> > choose the target state, it's done to allow the cpus to take
>> >> > themselves to the target low power state. On ux500, are you saying
>> >> > you take the cpus directly from the safe state to a lower power state
>> >> > without ever going back to the active state? I once implemented Tegra
>> >>
>> >> yes it is
>> >
>> > But if there is a single power rail for the entire cluster, when a CPU
>> > is "prepared" for shutdown this means that you have to save the context and
>> > clean L1, maybe for nothing since if other CPUs are up and running the
>> > CPU going idle can just enter a simple standby wfi (clock-gated but power on).
>> >
>> > With Colin's approach, context is saved and L1 cleaned only when it is
>> > almost certain the cluster is powered off (so the CPUs).
>> >
>> > It is a trade-off, I am not saying one approach is better than the
>> > other; we just have to make sure that preparing the CPU for "possible" shutdown
>> > is better than sending IPIs to take CPUs out of wfi and synchronize
>> > them (this happens if and only if CPUs enter coupled C-states).
>> >
>> > As usual this will depend on use cases (and silicon implementations :) )
>> >
>> > It is definitely worth benchmarking them.
>> >
>>
>> I'm less worried about performance, and more worried about race
>> conditions. How do you deal with the following situation:
>> CPU0 goes to WFI, and saves its state
>> CPU1 goes idle, and selects a deep idle state that powers down CPU0
>> CPU1 saves its state, and is about to trigger the power down
>> CPU0 gets an interrupt, restores its state, and modifies state (maybe
>> takes a spinlock during boot)
>> CPU1 cuts the power to CPU0
>>
>> On OMAP4, the race is handled in hardware. When CPU1 tries to cut the
>> power to the blocks shared by CPU0 the hardware will ignore the
>> request if CPU0 is not in WFI. On Tegra2, there is no hardware
>> support and I had to handle it with a spinlock implemented in scratch
>> registers because CPU0 is out of coherency when it starts booting and
>> ldrex/strex don't work. I'm not convinced my implementation is
>> correct, and I'd be curious to see any other implementations.
>
> That's a problem you solved with coupled C-states (ie your example in
> the cover letter), where the primary waits for other CPUs to be reset
> before issuing the power down command, right? At that point in time
> secondaries cannot wake up (?) and if wfi (ie power down) aborts you just
> take the secondaries out of reset and restart executing simultaneously,
> correct? It mirrors the suspend behaviour, which is easier to deal with
> than completely random idle paths.

Yes, anything that supports hotplug and suspend should support coupled
cpuidle states fairly easily. The only thing required that is not
already used by hotplug/suspend is the ability to save and restore
context on cpu1, but most implementations end up doing that already.

> It is true that this should be managed by the PM HW; if HW is not
> capable of managing these situations things get nasty as you highlighted.

Yes - on some platforms, the HW is not designed to handle it. On
others, it is designed to, but due to HW bugs it cannot be used.
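
To make the quoted CPU0/CPU1 sequence concrete, here is a minimal
host-side model of it, not code from any platform. It replays the five
steps twice: once with no interlock, and once with a check in the style
described for OMAP4, where a power-down request is ignored unless the
target is in WFI. Every name in it (struct cluster, power_down_guarded,
replay_race, and so on) is hypothetical and exists only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical model of per-CPU power state; not taken from any kernel. */
enum cpu_state { CPU_RUNNING, CPU_WFI, CPU_OFF };

struct cluster {
	enum cpu_state cpu[NR_CPUS];
};

/* No interlock: power is cut whatever the target is doing. */
void power_down_unguarded(struct cluster *c, int target)
{
	assert(target >= 0 && target < NR_CPUS);
	c->cpu[target] = CPU_OFF;
}

/*
 * Interlock in the style described for OMAP4: the request is ignored
 * unless the target is in WFI. Returns true if power was cut.
 */
bool power_down_guarded(struct cluster *c, int target)
{
	assert(target >= 0 && target < NR_CPUS);
	if (c->cpu[target] != CPU_WFI)
		return false;
	c->cpu[target] = CPU_OFF;
	return true;
}

/*
 * Replays the sequence from the mail. Returns true if CPU0 lost power
 * while it was running, i.e. the race was hit.
 */
bool replay_race(bool guarded)
{
	struct cluster c = { { CPU_RUNNING, CPU_RUNNING } };
	enum cpu_state before;

	c.cpu[0] = CPU_WFI;     /* CPU0 goes to WFI and saves its state */
	/* CPU1 goes idle, selects the deep state, saves its own state */
	c.cpu[0] = CPU_RUNNING; /* CPU0 takes an interrupt and resumes */

	before = c.cpu[0];
	if (guarded)
		power_down_guarded(&c, 0);  /* request is ignored */
	else
		power_down_unguarded(&c, 0); /* CPU1 cuts the power */

	return before == CPU_RUNNING && c.cpu[0] == CPU_OFF;
}
```

In this model replay_race(false) reports the race and replay_race(true)
does not. It only models the check at the moment power is cut; it says
nothing about how a software lock would have to be built on a platform
without that check.
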
> And it is also true ldrex/strex on cacheable memory might not be available in
> those early warm-boot stages. I came up with a locking algorithm on
> strongly ordered memory to deal with that, but I am still not sure it is
> something we really really need.

I did the same, but with device memory.

> I will test coupled C-state code ASAP, and come back with feedback.
>
> Thanks,
> Lorenzo
>
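
The thread does not say which locking algorithm either implementation
uses. As an illustration of what a two-CPU lock can look like when
exclusive load/store instructions are unavailable, below is a sketch of
Peterson's algorithm, which needs only ordinary loads and stores. It is
an assumption chosen for the example, not the Tegra2 or ARM code. In
this host-side model, C11 sequentially consistent atomic loads and
stores stand in for ordered accesses to scratch registers or
strongly ordered/device memory; the barriers and memory attributes a
real warm-boot path would need are not addressed. All names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-CPU lock; CPU ids are 0 and 1. */
struct peterson {
	atomic_int flag[2]; /* flag[i] != 0: CPU i wants the lock */
	atomic_int turn;    /* which CPU is given priority on contention */
};

void peterson_init(struct peterson *l)
{
	atomic_init(&l->flag[0], 0);
	atomic_init(&l->flag[1], 0);
	atomic_init(&l->turn, 0);
}

/* Announce interest, then give priority to the other CPU. */
static void announce(struct peterson *l, int me)
{
	atomic_store(&l->flag[me], 1);
	atomic_store(&l->turn, 1 - me);
}

/* True while CPU `me` must not enter the critical section. */
static bool must_wait(struct peterson *l, int me)
{
	int other = 1 - me;

	return atomic_load(&l->flag[other]) &&
	       atomic_load(&l->turn) == other;
}

/* Spin until the lock is held by CPU `me`. */
void peterson_lock(struct peterson *l, int me)
{
	assert(me == 0 || me == 1);
	announce(l, me);
	while (must_wait(l, me))
		; /* busy-wait using plain loads only */
}

/* Single attempt: back off and return false if the other CPU wins. */
bool peterson_trylock(struct peterson *l, int me)
{
	assert(me == 0 || me == 1);
	announce(l, me);
	if (must_wait(l, me)) {
		atomic_store(&l->flag[me], 0);
		return false;
	}
	return true;
}

void peterson_unlock(struct peterson *l, int me)
{
	assert(me == 0 || me == 1);
	atomic_store(&l->flag[me], 0);
}
```

The algorithm only works for exactly two contenders and depends on the
stores to flag and turn being observed in program order, which is why
the choice of memory type matters on real hardware.
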