From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754480AbbGFQmJ (ORCPT ); Mon, 6 Jul 2015 12:42:09 -0400 Received: from muru.com ([72.249.23.125]:33452 "EHLO muru.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751263AbbGFQmF (ORCPT ); Mon, 6 Jul 2015 12:42:05 -0400 Date: Mon, 6 Jul 2015 09:42:00 -0700 From: Tony Lindgren To: Thomas Gleixner Cc: linux-kernel@vger.kernel.org, linux-omap@vger.kernel.org, Felipe Balbi , John Stultz , Nishanth Menon , Yingjoe Chen , "Rafael J. Wysocki" , Peter Zijlstra Subject: Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle Message-ID: <20150706164200.GN10705@atomide.com> References: <1436166725-18353-1-git-send-email-tony@atomide.com> <20150706152201.GJ10705@atomide.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Thomas Gleixner [150706 08:48]: > On Mon, 6 Jul 2015, Tony Lindgren wrote: > > * Thomas Gleixner [150706 07:20]: > > > On Mon, 6 Jul 2015, Tony Lindgren wrote: > > The timekeeping accuracy issue certainly needs some thinking, and > > also the resolution between the clocksources can be different.. In the > > test case I have the slow timer is always on and of a lower resolution > > than the ARM global timer being used during runtime. > > > > Got some handy timer test in mind you want me to run to provide data > > on the accuracy? > > John Stultz might have something. > > > > > +/** > > > > + * clocksource_pm_enter - change to a persistent clocksource before idle > > > > + * > > > > + * Changes system to use a persistent clocksource for idle. Intended to > > > > + * be called from CPUidle from the last active CPU. > > > > + */ > > > > +int clocksource_pm_enter(void) > > > > +{ > > > > + bool oneshot = tick_oneshot_mode_active(); > > > > + struct clocksource *best; > > > > + > > > > + if (WARN_ONCE(!mutex_trylock(&clocksource_mutex), > > > > + "Unable to get clocksource_mutex")) > > > > + return -EINTR; > > > > > > This trylock serves which purpose? > > > > Well we don't want to start changing clocksource if something is > > running like you pointed out. > > Well yes, but .... > > > > I really cannot see how this is proper serialized. > > > > We need to allow this only from the last cpu before hitting idle. > > And I cannot see anything which does so. > > cpu0 cpu1 > is_idle > go_idle() > clocksource_pm_enter() > lock(cs_mutex); > wakeup() > clocksource_pm_exit() > trylock fails .... > > ... > unlock(cs_mutex); > > --> Crap! OK you're right, this only works with cpuidle and using drivers/cpuidle/coupled.c. > > > > @@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock) > > > > > > > > if (tk->tkr_mono.clock == clock) > > > > return 0; > > > > - stop_machine(change_clocksource, clock, NULL); > > > > + > > > > + /* > > > > + * We may want to toggle between a fast and a persistent > > > > + * clocksource from CPUidle on the last active CPU and can't > > > > + * use stop_machine at that point. > > > > + */ > > > > + if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) && > > > > > > Can you please explain how this code gets called from an offline cpu? > > > > Last cpu getting idled.. > > That does not make any sense at all. How is idle related to the online > mask? An idle cpu is still online. Oops yeah that's a bogus test, cpu off != offlined. > > > > + !rcu_is_watching()) > > > > > > So pick some random combination of conditions and define that it is > > > correct, right? How on earth does !rcu_watching() tell that this is > > > the last running cpu. > > > > We have called rcu_idle_enter() from cpuidle_idle_call(). Do you have > > some better test in mind when the last cpu is about hit idle? > > The cpuidle code should know that. And if it does not know, it better > should keep track of that information and based on it provide the > proper serialization, so the call into the timekeeping code is not a > subject to guess work and race conditions. OK I agree. Based on your comments this clearly needs to be limited to cpuidle. And thanks for your comments. Regards, Tony