* Xen(arm64)  hang on suspend/resume
@ 2023-09-25 11:15 Jonas Blixt
  2023-09-28  8:30 ` Julien Grall
  0 siblings, 1 reply; 2+ messages in thread
From: Jonas Blixt @ 2023-09-25 11:15 UTC (permalink / raw)
  To: xen-devel


Hello,

I've encountered strange behavior with Xen on arm64 with regard to suspend/resume.

My setup:
Version: Xen 4.13.1
Target: NXP imx8x SoC

We also use a set of patches from Aggios (https://xen-devel.narkive.com/yGps0HKG/rfc-v2-xen-arm-suspend-to-ram-support-in-xen-for-arm)

Occasionally Xen gets stuck on resume. We know that the lower levels wake up and that Xen starts to resume, because Xen's debug console is available. In this state dom0 does not resume and both pCPUs sit in their idle loops. If we then issue debug console commands that schedule tasklets (such as 'h' for the help menu), dom0 wakes up and continues. Debug functions that run in the IRQ handler do not have the same effect.

This is what the run queue looks like:

sched_smt_power_savings: disabled
NOW=490275382125
Online Cpus: 0-1
Cpupool 0:
Cpus: 0-1
Scheduler: SMP Credit Scheduler rev2 (credit2)
Active queues: 1
    default-weight     = 256
Runqueue 0:
    ncpus              = 2
    cpus               = 0-1
    max_weight         = 256
    pick_bias          = 1
    instload           = 1
    aveload            = 282294 (~107%)
    idlers: 0
    tickled: 0
    fully idle cores: 0
Domain info:
    Domain: 0 w 256 c 0 v 2
      1: [0.0] flags=2 cpu=1 credit=984625 [w=256] load=208781 (~79%)
      2: [0.1] flags=0 cpu=1 credit=9742375 [w=256] load=25693 (~9%)
    Domain: 1 w 256 c 0 v 1
      3: [1.0] flags=0 cpu=1 credit=10447250 [w=256] load=7835 (~2%)
Runqueue 0:
CPU[00] runq=0, sibling={0}, core={0}
CPU[01] runq=0, sibling={1}, core={1}
    run: [0.0] flags=2 cpu=1 credit=-1543000 [w=256] load=208781 (~79%)
RUNQ:
      0: [0.1] flags=0 cpu=1 credit=10275375 [w=256] load=25279 (~9%)
[t: display multi-cpu clock info]
Synced stime skew: max=125ns avg=125ns samples=1 current=125ns
Synced cycles skew: max=1 avg=1 samples=1 current=1
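
To compare several such dumps taken across suspend/resume cycles, the per-vCPU lines can be pulled into structured records. The following is a minimal sketch, assuming the line layout shown in the dump above and assuming that the scheduler prints `flags` in hexadecimal. The name `parse_vcpu_line` is hypothetical and not part of Xen.

```python
import re

# Matches credit2 per-vCPU lines as they appear in the dump, for example:
#   "1: [0.0] flags=2 cpu=1 credit=984625 [w=256] load=208781 (~79%)"
#   "run: [0.0] flags=2 cpu=1 credit=-1543000 [w=256] load=208781 (~79%)"
VCPU_LINE = re.compile(
    r"\[(?P<dom>\d+)\.(?P<vcpu>\d+)\]\s+"
    r"flags=(?P<flags>[0-9a-fA-F]+)\s+"
    r"cpu=(?P<cpu>\d+)\s+"
    r"credit=(?P<credit>-?\d+)\s+"
    r"\[w=(?P<weight>\d+)\]"
)


def parse_vcpu_line(line):
    """Return a dict for one per-vCPU dump line, or None if it does not match."""
    m = VCPU_LINE.search(line)
    if m is None:
        return None
    return {
        "dom": int(m["dom"]),
        "vcpu": int(m["vcpu"]),
        # Assumption: the flags field is printed in hex.
        "flags": int(m["flags"], 16),
        "cpu": int(m["cpu"]),
        "credit": int(m["credit"]),
        "weight": int(m["weight"]),
    }
```

Feeding every line of a captured console log through `parse_vcpu_line` and dropping the `None` results gives one record per vCPU entry, which makes it easier to spot differences between a good and a stuck resume.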

I would be grateful for any hints on how to debug this.

Best Regards
Jonas




* Re: Xen(arm64) hang on suspend/resume
  2023-09-25 11:15 Xen(arm64) hang on suspend/resume Jonas Blixt
@ 2023-09-28  8:30 ` Julien Grall
  0 siblings, 0 replies; 2+ messages in thread
From: Julien Grall @ 2023-09-28  8:30 UTC (permalink / raw)
  To: Jonas Blixt, xen-devel, Bertrand Marquis, Stefano Stabellini,
	George Dunlap



On 25/09/2023 12:15, Jonas Blixt wrote:
> Hello,

Hi,

> 
> I've encountered strange behavior with Xen on arm64 with regard to suspend/resume.
> 
> My setup:
> Version: Xen 4.13.1

Xen 4.13 was released in 2019, and 4.13.1 is not even the latest point 
release for that branch (that is 4.13.5). For new development, I would 
strongly recommend using the latest stable release (4.17), if not staging.

But you should at least use the latest point release (4.13.5). Note 
that this tree is no longer supported by the community at all, so it 
may contain (security) bugs.

> Target: NXP imx8x SoC
> 
> We also use a set of patches from Aggios (https://xen-devel.narkive.com/yGps0HKG/rfc-v2-xen-arm-suspend-to-ram-support-in-xen-for-arm)

A new version of the series was sent in 2022 [1]. It is based on 
a more recent Xen (4.16). I would suggest giving it a try and checking 
whether it helps.

Note that this series is still in development and has not yet been 
accepted by the community. So if there are any bugs, I would 
recommend contacting the original author.


> Occasionally Xen gets stuck on resume. We know that the lower levels wake up and that Xen starts to resume, because Xen's debug console is available. In this state dom0 does not resume and both pCPUs sit in their idle loops. If we then issue debug console commands that schedule tasklets (such as 'h' for the help menu), dom0 wakes up and continues. Debug functions that run in the IRQ handler do not have the same effect.

It sounds like the dom0 vCPUs were not unblocked. That said, it is 
unclear why 'h' would help. Would you be able to print the field 
'pause_flags' for each vCPU?

You could use the key 'q' to dump all the domain information. Hopefully, 
this doesn't have a side effect. If it does, then I would suggest adding 
some printk() calls at boot.
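
Once a 'pause_flags' value has been captured, it can be decoded offline. The following is a minimal sketch, assuming the value is printed in hexadecimal and assuming the bit positions of the _VPF_* definitions in xen/include/xen/sched.h. The table should be checked against the tree actually in use, and the name `decode_pause_flags` is hypothetical.

```python
# Assumed bit positions, taken from the _VPF_* definitions in
# xen/include/xen/sched.h; verify against your own tree before relying on them.
VPF_BITS = {
    0: "blocked",         # vCPU is blocked waiting for an event
    1: "down",            # vCPU is offline
    2: "blocked_in_xen",  # blocked awaiting an event to be consumed by Xen
    3: "migrating",       # affinity changed, migrating to a new CPU
    4: "mem_paging",
    5: "mem_access",
    6: "mem_sharing",
    7: "in_reset",
    8: "parked",
}


def decode_pause_flags(value):
    """Return the names of all bits set in a pause_flags value."""
    names = []
    bit = 0
    while value >> bit:
        if value & (1 << bit):
            # Bits missing from the table are reported rather than dropped.
            names.append(VPF_BITS.get(bit, f"unknown_bit_{bit}"))
        bit += 1
    return names
```

For example, `decode_pause_flags(int("5", 16))` lists the names for bits 0 and 2. A dom0 vCPU that still reports the blocked bit while the system is stuck would be consistent with the vCPUs not having been unblocked on resume.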

Cheers,

[1] https://lore.kernel.org/cover.1665128335.git.mykyta_poturai@epam.com

-- 
Julien Grall

