All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hector Martin <marcan@marcan.st>
To: Sudeep Holla <sudeep.holla@arm.com>
Cc: Kazuki <kazukih0205@gmail.com>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Sven Peter <sven@svenpeter.dev>, Len Brown <len.brown@intel.com>,
	Pavel Machek <pavel@ucw.cz>
Subject: Re: s2idle breaks on machines without cpuidle support
Date: Thu, 9 Feb 2023 01:45:18 +0900	[thread overview]
Message-ID: <e069e779-3943-0f9a-6592-5b096a2a8076@marcan.st> (raw)
In-Reply-To: <20230208161805.2dlx66oxphl25p3c@bogus>

On 09/02/2023 01.18, Sudeep Holla wrote:
> On Thu, Feb 09, 2023 at 12:42:17AM +0900, Hector Martin wrote:
>> On 08/02/2023 19.35, Sudeep Holla wrote:
>>> On Wed, Feb 08, 2023 at 04:48:18AM +0900, Kazuki wrote:
>>>> On Mon, Feb 06, 2023 at 10:12:39AM +0000, Sudeep Holla wrote:
>>>>>
>>>>> What do you mean by break ? More details on the observation would be helpful.
>>>> For example, CLOCK_MONOTONIC doesn't stop even after suspend since
>>>> these chain of commands don't get called.
>>>>
>>>> call_cpuidle_s2idle->cpuidle_enter_s2idle->enter_s2idle_proper->tick_freeze->sched_clock_suspend (Function that pauses CLOCK_MONOTONIC)
>>>>
>>>> Which in turn causes programs like systemd to crash since it doesn't
>>>> expect this.
>>>
>>> Yes expected IIUC. The per-cpu timers and counters continue to tick in
>>> WFI and hence CLOCK_MONOTONIC can't stop.
>>
>> The hardware counters would also keep ticking in "real" s2idle (with
>> hypothetical PSCI idle support) and often in full suspend. There is a
>> flag for this (CLOCK_SOURCE_SUSPEND_NONSTOP) that is set by default for
>> ARM. So this isn't why CLOCK_MONOTONIC isn't stopping; there is
>> machinery to make the kernel's view of time stop anyway, it's just not
>> being invoked here.
>>
> 
> Indeed, and I assumed s2idle was designed with those requirements but I
> think I may be wrong especially looking at few points you have raised
> provide my understanding is aligned with yours.
> 
>> This is somewhat orthogonal to the issue of PSCI. These machines can't
>> physically support PSCI and KVM at the same time (they do not have EL3),
>> so PSCI is not an option. We will be starting a conversation about how
>> to provide something "like" PSCI over some other sort of transport to
>> solve this soon. So that will "fix" this issue once it's all implemented.
>>
> 
> All the best for the efforts.
> 
>> However, these machines aren't the only ones without PSCI (search for
>> "spin-table" in arch/arm64/boot/dts, this isn't new and these aren't the
>> first).
> 
> Yes I am aware of it and if you see arch/arm64/kernel/smp_spin_table.c
> we don't support CPU hotplug or suspend for such a system.

We certainly support s2idle, except it's kind of broken as stated. Try
it, it works :-)

I didn't do anything special to enable s2idle on Apple platforms other
than make sure random drivers weren't broken and there was at least one
driver capable of triggering a wakeup. I just compile with
CONFIG_SUSPEND and s2idle works. Except for the part where
CLOCK_MONOTONIC keeps running. So generic kernels on spin_table
platforms ought to expose (broken) s2idle by default already.

>> It seems broken that Linux currently implements s2idle in such a
>> way that it violates the userspace clock behavior contract on systems
>> without a cpuidle driver (and does so silently, to make it worse).
> 
> Just to check if I understand this correctly, are you referring to:
> cpuidle_idle_call()->default_idle_call() if cpuidle_not_available()
> And the problem is it idles there in wfi but CLOCK_MONOTONIC isn't
> stopping as expected by the userspace ? If so, fair enough. If not, I
> may be missing to understand something.

Right. I'm not too certain on the details of exactly what suspend
machinery is running/supposed to, because this CLOCK_MONOTONIC issue was
a surprise to me when it came up. From my point of view s2idle "just
worked", it's only now that this has come up that we're realizing it's
winding up in a very different codepath to what would happen were
cpuidle/PSCI available. This was all silent from the user POV (it all
looks like it suspends/resumes normally as far as I can tell).

>> So that should be fixed regardless of whether we end up coming up with a
>> PSCI alternative or not for these platforms.
> 
> If above understanding is correct, I agree. But not sure what was the
> motivation behind the current behaviour.
> 
>> There's no fundamental reason why s2idle can't work properly with plain WFI.
>>
> 
> Fair enough. I hadn't thought much of it before as most of the platforms
> I have seen or used have at-least one deeper than WFI state these days.
> On arm32, this was common but each platform managed suspend_set_ops
> on its own and probably can do the same s2idle_set_ops.

Yeah, we do have one deeper idle state (and we should figure out how to
implement a PSCI alternative to enable it soon, since in particular for
certain SoCs plain WFI is quite a power hog since it keeps all the core
clusters powered up and at least partially clocked). But since we don't
have that yet, we've been using WFI-only s2idle so users have *some*
suspend ability.

- Hector

WARNING: multiple messages have this Message-ID (diff)
From: Hector Martin <marcan@marcan.st>
To: Sudeep Holla <sudeep.holla@arm.com>
Cc: Kazuki <kazukih0205@gmail.com>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Sven Peter <sven@svenpeter.dev>, Len Brown <len.brown@intel.com>,
	Pavel Machek <pavel@ucw.cz>
Subject: Re: s2idle breaks on machines without cpuidle support
Date: Thu, 9 Feb 2023 01:45:18 +0900	[thread overview]
Message-ID: <e069e779-3943-0f9a-6592-5b096a2a8076@marcan.st> (raw)
In-Reply-To: <20230208161805.2dlx66oxphl25p3c@bogus>

On 09/02/2023 01.18, Sudeep Holla wrote:
> On Thu, Feb 09, 2023 at 12:42:17AM +0900, Hector Martin wrote:
>> On 08/02/2023 19.35, Sudeep Holla wrote:
>>> On Wed, Feb 08, 2023 at 04:48:18AM +0900, Kazuki wrote:
>>>> On Mon, Feb 06, 2023 at 10:12:39AM +0000, Sudeep Holla wrote:
>>>>>
>>>>> What do you mean by break ? More details on the observation would be helpful.
>>>> For example, CLOCK_MONOTONIC doesn't stop even after suspend since
>>>> these chain of commands don't get called.
>>>>
>>>> call_cpuidle_s2idle->cpuidle_enter_s2idle->enter_s2idle_proper->tick_freeze->sched_clock_suspend (Function that pauses CLOCK_MONOTONIC)
>>>>
>>>> Which in turn causes programs like systemd to crash since it doesn't
>>>> expect this.
>>>
>>> Yes expected IIUC. The per-cpu timers and counters continue to tick in
>>> WFI and hence CLOCK_MONOTONIC can't stop.
>>
>> The hardware counters would also keep ticking in "real" s2idle (with
>> hypothetical PSCI idle support) and often in full suspend. There is a
>> flag for this (CLOCK_SOURCE_SUSPEND_NONSTOP) that is set by default for
>> ARM. So this isn't why CLOCK_MONOTONIC isn't stopping; there is
>> machinery to make the kernel's view of time stop anyway, it's just not
>> being invoked here.
>>
> 
> Indeed, and I assumed s2idle was designed with those requirements but I
> think I may be wrong especially looking at few points you have raised
> provide my understanding is aligned with yours.
> 
>> This is somewhat orthogonal to the issue of PSCI. These machines can't
>> physically support PSCI and KVM at the same time (they do not have EL3),
>> so PSCI is not an option. We will be starting a conversation about how
>> to provide something "like" PSCI over some other sort of transport to
>> solve this soon. So that will "fix" this issue once it's all implemented.
>>
> 
> All the best for the efforts.
> 
>> However, these machines aren't the only ones without PSCI (search for
>> "spin-table" in arch/arm64/boot/dts, this isn't new and these aren't the
>> first).
> 
> Yes I am aware of it and if you see arch/arm64/kernel/smp_spin_table.c
> we don't support CPU hotplug or suspend for such a system.

We certainly support s2idle, except it's kind of broken as stated. Try
it, it works :-)

I didn't do anything special to enable s2idle on Apple platforms other
than make sure random drivers weren't broken and there was at least one
driver capable of triggering a wakeup. I just compile with
CONFIG_SUSPEND and s2idle works. Except for the part where
CLOCK_MONOTONIC keeps running. So generic kernels on spin_table
platforms ought to expose (broken) s2idle by default already.

>> It seems broken that Linux currently implements s2idle in such a
>> way that it violates the userspace clock behavior contract on systems
>> without a cpuidle driver (and does so silently, to make it worse).
> 
> Just to check if I understand this correctly, are you referring to:
> cpuidle_idle_call()->default_idle_call() if cpuidle_not_available()
> And the problem is it idles there in wfi but CLOCK_MONOTONIC isn't
> stopping as expected by the userspace ? If so, fair enough. If not, I
> may be missing to understand something.

Right. I'm not too certain on the details of exactly what suspend
machinery is running/supposed to, because this CLOCK_MONOTONIC issue was
a surprise to me when it came up. From my point of view s2idle "just
worked", it's only now that this has come up that we're realizing it's
winding up in a very different codepath to what would happen were
cpuidle/PSCI available. This was all silent from the user POV (it all
looks like it suspends/resumes normally as far as I can tell).

>> So that should be fixed regardless of whether we end up coming up with a
>> PSCI alternative or not for these platforms.
> 
> If above understanding is correct, I agree. But not sure what was the
> motivation behind the current behaviour.
> 
>> There's no fundamental reason why s2idle can't work properly with plain WFI.
>>
> 
> Fair enough. I hadn't thought much of it before as most of the platforms
> I have seen or used have at-least one deeper than WFI state these days.
> On arm32, this was common but each platform managed suspend_set_ops
> on its own and probably can do the same s2idle_set_ops.

Yeah, we do have one deeper idle state (and we should figure out how to
implement a PSCI alternative to enable it soon, since in particular for
certain SoCs plain WFI is quite a power hog since it keeps all the core
clusters powered up and at least partially clocked). But since we don't
have that yet, we've been using WFI-only s2idle so users have *some*
suspend ability.

- Hector

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-02-08 16:45 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-04 15:27 s2idle breaks on machines without cpuidle support Kazuki
2023-02-04 15:27 ` Kazuki
2023-02-06 10:12 ` Sudeep Holla
2023-02-06 10:12   ` Sudeep Holla
2023-02-07 19:48   ` Kazuki
2023-02-07 19:48     ` Kazuki
2023-02-08 10:35     ` Sudeep Holla
2023-02-08 10:35       ` Sudeep Holla
2023-02-08 11:20       ` Kazuki
2023-02-08 11:20         ` Kazuki
2023-02-08 14:16         ` Sudeep Holla
2023-02-08 14:16           ` Sudeep Holla
2023-02-08 14:43           ` Kazuki
2023-02-08 14:43             ` Kazuki
2023-02-08 15:03             ` Sudeep Holla
2023-02-08 15:03               ` Sudeep Holla
2023-02-08 15:19               ` Kazuki
2023-02-08 15:19                 ` Kazuki
2023-02-08 15:34                 ` Sudeep Holla
2023-02-08 15:34                   ` Sudeep Holla
2023-02-08 15:42                   ` Kazuki
2023-02-08 15:42                     ` Kazuki
2023-02-08 14:52           ` Kazuki
2023-02-08 14:52             ` Kazuki
2023-02-08 15:42       ` Hector Martin
2023-02-08 15:42         ` Hector Martin
2023-02-08 16:18         ` Sudeep Holla
2023-02-08 16:18           ` Sudeep Holla
2023-02-08 16:45           ` Hector Martin [this message]
2023-02-08 16:45             ` Hector Martin
2023-09-07 19:11             ` Florian Fainelli
2023-09-07 19:11               ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e069e779-3943-0f9a-6592-5b096a2a8076@marcan.st \
    --to=marcan@marcan.st \
    --cc=daniel.lezcano@linaro.org \
    --cc=kazukih0205@gmail.com \
    --cc=len.brown@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    --cc=sudeep.holla@arm.com \
    --cc=sven@svenpeter.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.