All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	"Raju P . L . S . S . S . N" <rplsssn@codeaurora.org>,
	Stephen Boyd <sboyd@kernel.org>, Tony Lindgren <tony@atomide.com>,
	Kevin Hilman <khilman@kernel.org>,
	Lina Iyer <ilina@codeaurora.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lina Iyer <lina.iyer@linaro.org>
Subject: Re: [PATCH v10 03/27] timer: Export next wakeup time of a CPU
Date: Wed, 16 Jan 2019 11:59:40 +0100	[thread overview]
Message-ID: <CAJZ5v0iY7j5SThQhtq5xP+g5yNiHQ+cMWxq-tdVh3t5wJ_s3Tw@mail.gmail.com> (raw)
In-Reply-To: <CAPDyKFqQQnMKFNpGyi_dkP_Rm8qo-1WjPfVkp2unOxyXRa2wbA@mail.gmail.com>

On Wed, Jan 16, 2019 at 8:58 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 11 Jan 2019 at 12:07, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Thursday, November 29, 2018 6:46:36 PM CET Ulf Hansson wrote:
> > > From: Lina Iyer <lina.iyer@linaro.org>
> > >
> > > Knowing the sleep duration of CPUs, is known to be needed while selecting
> > > the most energy efficient idle state for a CPU or a group of CPUs.
> > >
> > > However, to be able to compute the sleep duration, we need to know at what
> > > time the next expected wakeup is for the CPU. Therefore, let's export this
> > > information via a new function, tick_nohz_get_next_wakeup(). Following
> > > changes make use of it.
> > >
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > > Cc: Lina Iyer <ilina@codeaurora.org>
> > > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
> > > Co-developed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >
> > > Changes in v10:
> > >       - Updated function header of tick_nohz_get_next_wakeup().
> > >
> > > ---
> > >  include/linux/tick.h     |  8 ++++++++
> > >  kernel/time/tick-sched.c | 13 +++++++++++++
> > >  2 files changed, 21 insertions(+)
> > >
> > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > index 55388ab45fd4..e48f6b26b425 100644
> > > --- a/include/linux/tick.h
> > > +++ b/include/linux/tick.h
> > > @@ -125,6 +125,7 @@ extern bool tick_nohz_idle_got_tick(void);
> > >  extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
> > >  extern unsigned long tick_nohz_get_idle_calls(void);
> > >  extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
> > > +extern ktime_t tick_nohz_get_next_wakeup(int cpu);
> > >  extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
> > >  extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
> > >
> > > @@ -151,6 +152,13 @@ static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
> > >       *delta_next = TICK_NSEC;
> > >       return *delta_next;
> > >  }
> > > +
> > > +static inline ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     /* Next wake up is the tick period, assume it starts now */
> > > +     return ktime_add(ktime_get(), TICK_NSEC);
> > > +}
> > > +
> > >  static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
> > >  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 69e673b88474..7a9166506503 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -1089,6 +1089,19 @@ unsigned long tick_nohz_get_idle_calls(void)
> > >       return ts->idle_calls;
> > >  }
> > >
> > > +/**
> > > + * tick_nohz_get_next_wakeup - return the next wake up of the CPU
> > > + * @cpu: the particular CPU to get next wake up for
> > > + *
> > > + * Called for idle CPUs only.
> > > + */
> > > +ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     struct clock_event_device *dev = per_cpu(tick_cpu_device.evtdev, cpu);
> > > +
> > > +     return dev->next_event;
> > > +}
> > > +
> > >  static void tick_nohz_account_idle_ticks(struct tick_sched *ts)
> > >  {
> > >  #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> > >
> >
> > Well, I have concerns regarding this one.
> >
> > I don't believe it is valid to call this new function for non-idle CPUs and
> > the kerneldoc kind of says so, but the next patch doesn't actually prevent
> > it from being called for a non-idle CPU (at the time it is called in there
> > the target CPU may not be idle any more AFAICS).
>
> You are correct, but let me clarify things.
>
> We are calling this new API from the new genpd governor, which may
> have a cpumask indicating there is more than one CPU attached to its
> PM domain+sub-PM domains. In other words, we may call the API for
> another CPU than the one we are executing on.
>
> When the new genpd governor is called, all CPUs in the cpumask of the
> genpd in question, are already runtime suspended and will remain so
> throughout the decisions made by the governor.
>
> However, because of the race condition, which needs to be manged by
> the genpd backend driver and its corresponding FW, one of the CPU in
> the genpd cpumask could potentially wake up from idle when the genpd
> governor runs. However, as a part of exiting from idle, that CPU needs
> to wait for the call to pm_runtime_get_sync() to return before
> completing the exit patch of idle. This also means waiting for the
> genpd governor to finish.

OK, so the CPU spins on a spin lock inside of the idle loop with interrupts off.

> The point is, no matter what decision the governor takes under these
> circumstances, the genpd backend driver and its FW must manage this
> race condition during the last man standing. For PSCI OSI mode, it
> means that if a cluster idle state is suggested by Linux during these
> circumstances, it must be prevented and aborted.

I would suggest putting a comment to explain that somewhere as it is
not really obvious.

> >
> > In principle, the cpuidle core can store this value, say in struct
> > cpuidle_device of the given CPU, and expose a helper to access it from
> > genpd, but that would be extra overhead totally unnecessary on everthing
> > that doesn't use genpd for cpuidle.
> >
> > So maybe the driver could store it in its ->enter callback?  After all,
> > the driver knows that genpd is going to be used later.
>
> This would work, but it wouldn't really change much when it comes to
> the race condition described above.

No, it wouldn't make the race go away.

> Of course it would turn the code
> into being more cpuidle specific, which seems reasonable to me.
>
> Anyway, if I understand your suggestion, in principle it means
> changing $subject patch in such way that the API should not take "int
> cpu" as an in-parameter, but instead only use __this_cpu() to read out
> the next event for current idle CPU.

Yes.

> Additionally, we need another new cpuidle API, which genpd can call to
> retrieve a new per CPU "next event data" stored by the cpuidle driver
> from its ->enter() callback. Is this a correct interpretation of your
> suggestion?

Yes, it is.

Generally, something like "cpuidle, give me the wakeup time of this
CPU".  And it may very well give you 0 if the CPU has woken up
already. :-)

WARNING: multiple messages have this Message-ID (diff)
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	"Raju P . L . S . S . S . N" <rplsssn@codeaurora.org>,
	Stephen Boyd <sboyd@kernel.org>, Tony Lindgren <tony@atomide.com>,
	Kevin Hilman <khilman@kernel.org>,
	Lina Iyer <ilina@codeaurora.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lina Iyer <lina.iyer@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH v10 03/27] timer: Export next wakeup time of a CPU
Date: Wed, 16 Jan 2019 11:59:40 +0100	[thread overview]
Message-ID: <CAJZ5v0iY7j5SThQhtq5xP+g5yNiHQ+cMWxq-tdVh3t5wJ_s3Tw@mail.gmail.com> (raw)
In-Reply-To: <CAPDyKFqQQnMKFNpGyi_dkP_Rm8qo-1WjPfVkp2unOxyXRa2wbA@mail.gmail.com>

On Wed, Jan 16, 2019 at 8:58 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 11 Jan 2019 at 12:07, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Thursday, November 29, 2018 6:46:36 PM CET Ulf Hansson wrote:
> > > From: Lina Iyer <lina.iyer@linaro.org>
> > >
> > > Knowing the sleep duration of CPUs, is known to be needed while selecting
> > > the most energy efficient idle state for a CPU or a group of CPUs.
> > >
> > > However, to be able to compute the sleep duration, we need to know at what
> > > time the next expected wakeup is for the CPU. Therefore, let's export this
> > > information via a new function, tick_nohz_get_next_wakeup(). Following
> > > changes make use of it.
> > >
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > > Cc: Lina Iyer <ilina@codeaurora.org>
> > > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
> > > Co-developed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >
> > > Changes in v10:
> > >       - Updated function header of tick_nohz_get_next_wakeup().
> > >
> > > ---
> > >  include/linux/tick.h     |  8 ++++++++
> > >  kernel/time/tick-sched.c | 13 +++++++++++++
> > >  2 files changed, 21 insertions(+)
> > >
> > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > index 55388ab45fd4..e48f6b26b425 100644
> > > --- a/include/linux/tick.h
> > > +++ b/include/linux/tick.h
> > > @@ -125,6 +125,7 @@ extern bool tick_nohz_idle_got_tick(void);
> > >  extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
> > >  extern unsigned long tick_nohz_get_idle_calls(void);
> > >  extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
> > > +extern ktime_t tick_nohz_get_next_wakeup(int cpu);
> > >  extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
> > >  extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
> > >
> > > @@ -151,6 +152,13 @@ static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
> > >       *delta_next = TICK_NSEC;
> > >       return *delta_next;
> > >  }
> > > +
> > > +static inline ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     /* Next wake up is the tick period, assume it starts now */
> > > +     return ktime_add(ktime_get(), TICK_NSEC);
> > > +}
> > > +
> > >  static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
> > >  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 69e673b88474..7a9166506503 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -1089,6 +1089,19 @@ unsigned long tick_nohz_get_idle_calls(void)
> > >       return ts->idle_calls;
> > >  }
> > >
> > > +/**
> > > + * tick_nohz_get_next_wakeup - return the next wake up of the CPU
> > > + * @cpu: the particular CPU to get next wake up for
> > > + *
> > > + * Called for idle CPUs only.
> > > + */
> > > +ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     struct clock_event_device *dev = per_cpu(tick_cpu_device.evtdev, cpu);
> > > +
> > > +     return dev->next_event;
> > > +}
> > > +
> > >  static void tick_nohz_account_idle_ticks(struct tick_sched *ts)
> > >  {
> > >  #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> > >
> >
> > Well, I have concerns regarding this one.
> >
> > I don't believe it is valid to call this new function for non-idle CPUs and
> > the kerneldoc kind of says so, but the next patch doesn't actually prevent
> > it from being called for a non-idle CPU (at the time it is called in there
> > the target CPU may not be idle any more AFAICS).
>
> You are correct, but let me clarify things.
>
> We are calling this new API from the new genpd governor, which may
> have a cpumask indicating there is more than one CPU attached to its
> PM domain+sub-PM domains. In other words, we may call the API for
> another CPU than the one we are executing on.
>
> When the new genpd governor is called, all CPUs in the cpumask of the
> genpd in question, are already runtime suspended and will remain so
> throughout the decisions made by the governor.
>
> However, because of the race condition, which needs to be manged by
> the genpd backend driver and its corresponding FW, one of the CPU in
> the genpd cpumask could potentially wake up from idle when the genpd
> governor runs. However, as a part of exiting from idle, that CPU needs
> to wait for the call to pm_runtime_get_sync() to return before
> completing the exit patch of idle. This also means waiting for the
> genpd governor to finish.

OK, so the CPU spins on a spin lock inside of the idle loop with interrupts off.

> The point is, no matter what decision the governor takes under these
> circumstances, the genpd backend driver and its FW must manage this
> race condition during the last man standing. For PSCI OSI mode, it
> means that if a cluster idle state is suggested by Linux during these
> circumstances, it must be prevented and aborted.

I would suggest putting a comment to explain that somewhere as it is
not really obvious.

> >
> > In principle, the cpuidle core can store this value, say in struct
> > cpuidle_device of the given CPU, and expose a helper to access it from
> > genpd, but that would be extra overhead totally unnecessary on everthing
> > that doesn't use genpd for cpuidle.
> >
> > So maybe the driver could store it in its ->enter callback?  After all,
> > the driver knows that genpd is going to be used later.
>
> This would work, but it wouldn't really change much when it comes to
> the race condition described above.

No, it wouldn't make the race go away.

> Of course it would turn the code
> into being more cpuidle specific, which seems reasonable to me.
>
> Anyway, if I understand your suggestion, in principle it means
> changing $subject patch in such way that the API should not take "int
> cpu" as an in-parameter, but instead only use __this_cpu() to read out
> the next event for current idle CPU.

Yes.

> Additionally, we need another new cpuidle API, which genpd can call to
> retrieve a new per CPU "next event data" stored by the cpuidle driver
> from its ->enter() callback. Is this a correct interpretation of your
> suggestion?

Yes, it is.

Generally, something like "cpuidle, give me the wakeup time of this
CPU".  And it may very well give you 0 if the CPU has woken up
already. :-)

WARNING: multiple messages have this Message-ID (diff)
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Linux PM <linux-pm@vger.kernel.org>,
	Stephen Boyd <sboyd@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Kevin Hilman <khilman@kernel.org>,
	Lina Iyer <ilina@codeaurora.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Tony Lindgren <tony@atomide.com>,
	Lina Iyer <lina.iyer@linaro.org>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Raju P . L . S . S . S . N" <rplsssn@codeaurora.org>,
	Ingo Molnar <mingo@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v10 03/27] timer: Export next wakeup time of a CPU
Date: Wed, 16 Jan 2019 11:59:40 +0100	[thread overview]
Message-ID: <CAJZ5v0iY7j5SThQhtq5xP+g5yNiHQ+cMWxq-tdVh3t5wJ_s3Tw@mail.gmail.com> (raw)
In-Reply-To: <CAPDyKFqQQnMKFNpGyi_dkP_Rm8qo-1WjPfVkp2unOxyXRa2wbA@mail.gmail.com>

On Wed, Jan 16, 2019 at 8:58 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 11 Jan 2019 at 12:07, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Thursday, November 29, 2018 6:46:36 PM CET Ulf Hansson wrote:
> > > From: Lina Iyer <lina.iyer@linaro.org>
> > >
> > > Knowing the sleep duration of CPUs, is known to be needed while selecting
> > > the most energy efficient idle state for a CPU or a group of CPUs.
> > >
> > > However, to be able to compute the sleep duration, we need to know at what
> > > time the next expected wakeup is for the CPU. Therefore, let's export this
> > > information via a new function, tick_nohz_get_next_wakeup(). Following
> > > changes make use of it.
> > >
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > > Cc: Lina Iyer <ilina@codeaurora.org>
> > > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > > Cc: Ingo Molnar <mingo@kernel.org>
> > > Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
> > > Co-developed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >
> > > Changes in v10:
> > >       - Updated function header of tick_nohz_get_next_wakeup().
> > >
> > > ---
> > >  include/linux/tick.h     |  8 ++++++++
> > >  kernel/time/tick-sched.c | 13 +++++++++++++
> > >  2 files changed, 21 insertions(+)
> > >
> > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > index 55388ab45fd4..e48f6b26b425 100644
> > > --- a/include/linux/tick.h
> > > +++ b/include/linux/tick.h
> > > @@ -125,6 +125,7 @@ extern bool tick_nohz_idle_got_tick(void);
> > >  extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
> > >  extern unsigned long tick_nohz_get_idle_calls(void);
> > >  extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
> > > +extern ktime_t tick_nohz_get_next_wakeup(int cpu);
> > >  extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
> > >  extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
> > >
> > > @@ -151,6 +152,13 @@ static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
> > >       *delta_next = TICK_NSEC;
> > >       return *delta_next;
> > >  }
> > > +
> > > +static inline ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     /* Next wake up is the tick period, assume it starts now */
> > > +     return ktime_add(ktime_get(), TICK_NSEC);
> > > +}
> > > +
> > >  static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
> > >  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 69e673b88474..7a9166506503 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -1089,6 +1089,19 @@ unsigned long tick_nohz_get_idle_calls(void)
> > >       return ts->idle_calls;
> > >  }
> > >
> > > +/**
> > > + * tick_nohz_get_next_wakeup - return the next wake up of the CPU
> > > + * @cpu: the particular CPU to get next wake up for
> > > + *
> > > + * Called for idle CPUs only.
> > > + */
> > > +ktime_t tick_nohz_get_next_wakeup(int cpu)
> > > +{
> > > +     struct clock_event_device *dev = per_cpu(tick_cpu_device.evtdev, cpu);
> > > +
> > > +     return dev->next_event;
> > > +}
> > > +
> > >  static void tick_nohz_account_idle_ticks(struct tick_sched *ts)
> > >  {
> > >  #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> > >
> >
> > Well, I have concerns regarding this one.
> >
> > I don't believe it is valid to call this new function for non-idle CPUs and
> > the kerneldoc kind of says so, but the next patch doesn't actually prevent
> > it from being called for a non-idle CPU (at the time it is called in there
> > the target CPU may not be idle any more AFAICS).
>
> You are correct, but let me clarify things.
>
> We are calling this new API from the new genpd governor, which may
> have a cpumask indicating there is more than one CPU attached to its
> PM domain+sub-PM domains. In other words, we may call the API for
> another CPU than the one we are executing on.
>
> When the new genpd governor is called, all CPUs in the cpumask of the
> genpd in question, are already runtime suspended and will remain so
> throughout the decisions made by the governor.
>
> However, because of the race condition, which needs to be manged by
> the genpd backend driver and its corresponding FW, one of the CPU in
> the genpd cpumask could potentially wake up from idle when the genpd
> governor runs. However, as a part of exiting from idle, that CPU needs
> to wait for the call to pm_runtime_get_sync() to return before
> completing the exit patch of idle. This also means waiting for the
> genpd governor to finish.

OK, so the CPU spins on a spin lock inside of the idle loop with interrupts off.

> The point is, no matter what decision the governor takes under these
> circumstances, the genpd backend driver and its FW must manage this
> race condition during the last man standing. For PSCI OSI mode, it
> means that if a cluster idle state is suggested by Linux during these
> circumstances, it must be prevented and aborted.

I would suggest putting a comment to explain that somewhere as it is
not really obvious.

> >
> > In principle, the cpuidle core can store this value, say in struct
> > cpuidle_device of the given CPU, and expose a helper to access it from
> > genpd, but that would be extra overhead totally unnecessary on everthing
> > that doesn't use genpd for cpuidle.
> >
> > So maybe the driver could store it in its ->enter callback?  After all,
> > the driver knows that genpd is going to be used later.
>
> This would work, but it wouldn't really change much when it comes to
> the race condition described above.

No, it wouldn't make the race go away.

> Of course it would turn the code
> into being more cpuidle specific, which seems reasonable to me.
>
> Anyway, if I understand your suggestion, in principle it means
> changing $subject patch in such way that the API should not take "int
> cpu" as an in-parameter, but instead only use __this_cpu() to read out
> the next event for current idle CPU.

Yes.

> Additionally, we need another new cpuidle API, which genpd can call to
> retrieve a new per CPU "next event data" stored by the cpuidle driver
> from its ->enter() callback. Is this a correct interpretation of your
> suggestion?

Yes, it is.

Generally, something like "cpuidle, give me the wakeup time of this
CPU".  And it may very well give you 0 if the CPU has woken up
already. :-)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-01-16 10:59 UTC|newest]

Thread overview: 157+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-29 17:46 [PATCH v10 00/27] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM) Ulf Hansson
2018-11-29 17:46 ` Ulf Hansson
2018-11-29 17:46 ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 01/27] PM / Domains: Add generic data pointer to genpd_power_state struct Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-18 10:39   ` Daniel Lezcano
2018-12-18 10:39     ` Daniel Lezcano
2018-12-18 11:53     ` Ulf Hansson
2018-12-18 11:53       ` Ulf Hansson
2019-01-11 10:52       ` Rafael J. Wysocki
2019-01-11 10:52         ` Rafael J. Wysocki
2018-11-29 17:46 ` [PATCH v10 02/27] PM / Domains: Add support for CPU devices to genpd Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19  9:53   ` Daniel Lezcano
2018-12-19  9:53     ` Daniel Lezcano
2018-12-19 10:02     ` Ulf Hansson
2018-12-19 10:02       ` Ulf Hansson
2019-01-11 10:54       ` Rafael J. Wysocki
2019-01-11 10:54         ` Rafael J. Wysocki
2018-11-29 17:46 ` [PATCH v10 03/27] timer: Export next wakeup time of a CPU Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2019-01-11 11:06   ` Rafael J. Wysocki
2019-01-11 11:06     ` Rafael J. Wysocki
2019-01-11 11:06     ` Rafael J. Wysocki
2019-01-16  7:57     ` Ulf Hansson
2019-01-16  7:57       ` Ulf Hansson
2019-01-16  7:57       ` Ulf Hansson
2019-01-16 10:59       ` Rafael J. Wysocki [this message]
2019-01-16 10:59         ` Rafael J. Wysocki
2019-01-16 10:59         ` Rafael J. Wysocki
2019-01-16 12:00         ` Ulf Hansson
2019-01-16 12:00           ` Ulf Hansson
2019-01-16 12:00           ` Ulf Hansson
2019-01-25 10:04           ` Ulf Hansson
2019-01-25 10:04             ` Ulf Hansson
2019-01-25 10:04             ` Ulf Hansson
2019-01-25 10:18             ` Rafael J. Wysocki
2019-01-25 10:18               ` Rafael J. Wysocki
2019-01-25 10:18               ` Rafael J. Wysocki
2018-11-29 17:46 ` [PATCH v10 04/27] PM / Domains: Add genpd governor for CPUs Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19  9:54   ` Daniel Lezcano
2018-12-19  9:54     ` Daniel Lezcano
2018-12-19 10:09     ` Ulf Hansson
2018-12-19 10:09       ` Ulf Hansson
2018-12-19 10:09       ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 05/27] dt: psci: Update DT bindings to support hierarchical PSCI states Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 06/27] of: base: Add of_get_cpu_state_node() to get idle states for a CPU node Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19 11:05   ` Daniel Lezcano
2018-12-19 11:05     ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 07/27] cpuidle: dt: Support hierarchical CPU idle states Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19 11:20   ` Daniel Lezcano
2018-12-19 11:20     ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 08/27] ARM/ARM64: cpuidle: Let back-end init ops take the driver as input Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 09/27] drivers: firmware: psci: Move psci to separate directory Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 10/27] MAINTAINERS: Update files for PSCI Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 11/27] drivers: firmware: psci: Split psci_dt_cpu_init_idle() Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 12/27] drivers: firmware: psci: Simplify state node parsing Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 13/27] drivers: firmware: psci: Support hierarchical CPU idle states Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19 12:11   ` Daniel Lezcano
2018-12-19 12:11     ` Daniel Lezcano
2018-12-19 12:53     ` Ulf Hansson
2018-12-19 12:53       ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 14/27] drivers: firmware: psci: Simplify error path of psci_dt_init() Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-19 12:08   ` Daniel Lezcano
2018-12-19 12:08     ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 15/27] drivers: firmware: psci: Announce support for OS initiated suspend mode Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-20 13:11   ` Daniel Lezcano
2018-12-20 13:11     ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 16/27] drivers: firmware: psci: Prepare to use " Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-20 14:08   ` Daniel Lezcano
2018-12-20 14:08     ` Daniel Lezcano
2018-12-20 15:41     ` Ulf Hansson
2018-12-20 15:41       ` Ulf Hansson
2018-12-20 17:16       ` Daniel Lezcano
2018-12-20 17:16         ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 17/27] drivers: firmware: psci: Prepare to support PM domains Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-20 14:19   ` Daniel Lezcano
2018-12-20 14:19     ` Daniel Lezcano
2018-12-20 15:49     ` Ulf Hansson
2018-12-20 15:49       ` Ulf Hansson
2018-12-20 18:06       ` Daniel Lezcano
2018-12-20 18:06         ` Daniel Lezcano
2018-12-20 21:37         ` Ulf Hansson
2018-12-20 21:37           ` Ulf Hansson
2018-12-21  7:15           ` Daniel Lezcano
2018-12-21  7:15             ` Daniel Lezcano
2018-11-29 17:46 ` [PATCH v10 18/27] drivers: firmware: psci: Add support for PM domains using genpd Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-03 16:37   ` Lina Iyer
2018-12-03 16:37     ` Lina Iyer
2018-12-03 20:03     ` Ulf Hansson
2018-12-03 20:03       ` Ulf Hansson
2018-12-20 14:35   ` Daniel Lezcano
2018-12-20 14:35     ` Daniel Lezcano
2018-12-20 21:09     ` Ulf Hansson
2018-12-20 21:09       ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 19/27] drivers: firmware: psci: Add hierarchical domain idle states converter Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 20/27] drivers: firmware: psci: Introduce psci_dt_topology_init() Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 21/27] drivers: firmware: psci: Add a helper to attach a CPU to its PM domain Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-12-04 18:45   ` Lina Iyer
2018-12-04 18:45     ` Lina Iyer
2018-12-06  9:15     ` Ulf Hansson
2018-12-06  9:15       ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 22/27] drivers: firmware: psci: Attach the CPU's device " Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 23/27] drivers: firmware: psci: Manage runtime PM in the idle path for CPUs Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 24/27] drivers: firmware: psci: Support CPU hotplug for the hierarchical model Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 22:31   ` Lina Iyer
2018-11-29 22:31     ` Lina Iyer
2018-11-30  8:25     ` Ulf Hansson
2018-11-30  8:25       ` Ulf Hansson
2018-11-30 20:57       ` Lina Iyer
2018-11-30 20:57         ` Lina Iyer
2018-12-19 11:17   ` Lorenzo Pieralisi
2018-12-19 11:17     ` Lorenzo Pieralisi
2018-12-19 11:47     ` Ulf Hansson
2018-12-19 11:47       ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 25/27] arm64: kernel: Respect the hierarchical CPU topology in DT for PSCI Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:46 ` [PATCH v10 26/27] arm64: dts: Convert to the hierarchical CPU topology layout for MSM8916 Ulf Hansson
2018-11-29 17:46   ` Ulf Hansson
2018-11-29 17:47 ` [PATCH v10 27/27] arm64: dts: hikey: Convert to the hierarchical CPU topology layout Ulf Hansson
2018-11-29 17:47   ` Ulf Hansson
2018-12-17 16:12 ` [PATCH v10 00/27] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM) Ulf Hansson
2018-12-17 16:12   ` Ulf Hansson
2019-01-11 11:08   ` Rafael J. Wysocki
2019-01-11 11:08     ` Rafael J. Wysocki
2019-01-03 12:06 ` Sudeep Holla
2019-01-03 12:06   ` Sudeep Holla
2019-01-03 12:06   ` Sudeep Holla
2019-01-16  9:10   ` Ulf Hansson
2019-01-16  9:10     ` Ulf Hansson
2019-01-17 17:44     ` Sudeep Holla
2019-01-17 17:44       ` Sudeep Holla
2019-01-17 17:44       ` Sudeep Holla
2019-01-18 11:56       ` Ulf Hansson
2019-01-18 11:56         ` Ulf Hansson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJZ5v0iY7j5SThQhtq5xP+g5yNiHQ+cMWxq-tdVh3t5wJ_s3Tw@mail.gmail.com \
    --to=rafael@kernel.org \
    --cc=Lorenzo.Pieralisi@arm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=geert+renesas@glider.be \
    --cc=ilina@codeaurora.org \
    --cc=khilman@kernel.org \
    --cc=lina.iyer@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=rjw@rjwysocki.net \
    --cc=rplsssn@codeaurora.org \
    --cc=sboyd@kernel.org \
    --cc=sudeep.holla@arm.com \
    --cc=tony@atomide.com \
    --cc=ulf.hansson@linaro.org \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.