* [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
@ 2017-12-22 17:41 Mirela Simonovic
  2018-01-11  0:55 ` Stefano Stabellini
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Mirela Simonovic @ 2017-12-22 17:41 UTC
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, julien.grall

This document contains our design specification for "suspend to RAM"
support for ARM in Xen. It covers the basic suspend to RAM mechanism
based on the ARM PSCI standard, which would allow individual guests and
Xen itself to suspend/resume.

We would appreciate your feedback.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
v2:
-Improved specification according to comments
-Added more implementation details
-Incremented revision number to 1.1
---
 docs/misc/arm/suspend-to-ram.txt | 266 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 266 insertions(+)
 create mode 100644 docs/misc/arm/suspend-to-ram.txt

diff --git a/docs/misc/arm/suspend-to-ram.txt b/docs/misc/arm/suspend-to-ram.txt
new file mode 100644
index 0000000000..6e8f10d1ce
--- /dev/null
+++ b/docs/misc/arm/suspend-to-ram.txt
@@ -0,0 +1,266 @@
+% Suspend to RAM Support in Xen for ARM
+% Revision 1.1
+
+========
+Overview
+========
+
+Suspend to RAM (in the following text 'suspend') for ARM in Xen should be
+coordinated using the ARM PSCI standard [1].
+
+Ideally, EL1/2 should suspend in the following order:
+1) Unprivileged guests (DomUs) suspend
+2) Privileged guest (Dom0) suspends
+3) Xen suspends
+
+However, suspending unprivileged guests is not mandatory for suspending
+Dom0 and Xen. System suspend initiated by Dom0 (step 2) is considered to be an
+ultimate decision to suspend the physical machine. Suspending of Xen (step 3)
+is triggered whenever Dom0 completes suspend. Xen suspend leads to the full
+suspend of EL2.
+
+If an unprivileged guest is not suspended at the moment when Dom0 initiates
+its own suspend, the guest will be paused on Xen's suspend and unpaused on
+Xen's resume. That way, a guest which doesn't have power management support
+cannot prevent the physical system from suspending when the decision to suspend
+is made by privileged software (Dom0).
+
+Each guest in the system is responsible for suspending the devices it owns.
+If a guest does not suspend a device, the device will continue to operate as
+it is configured at the moment when the system suspends. If a device triggers
+an interrupt while the physical system is suspended, the system will resume.
+
+It is recommended for an unprivileged guest to participate in power management
+in the following scenario:
+Assume an unprivileged guest owns a device which will trigger an interrupt at
+some point. This interrupt will wake up the system. If such behavior is not
+wanted, coordination between Dom0 and the guest is required in order to inform
+the guest about the intended physical system suspend. Then, the guest will have
+a chance to suspend the device or to abort the suspend request.
+
+Since this proposal is focused on implementing PSCI-based suspend mechanisms in
+Xen, communication with or among the guests is not covered by this document.
+The order of suspending the guests is assumed to be guaranteed by the software
+running in EL1.
+
+This document covers the following:
+1) Mechanism for suspending/resuming a guest:
+	1.1) Suspend is initiated by the guest
+	1.2) Resume is initiated by a device interrupt
+2) Mechanism for pausing/unpausing running guests when Dom0 suspends
+3) Mechanism for suspending/resuming Xen when Dom0 completes suspend
+4) Resuming from any state on a wake-up event (device interrupt):
+	4.1) Resume DomU on wake-up event when Dom0 is still running
+	4.2) Resume DomU on wake-up event when Xen is suspended
+	4.3) Resume Dom0 on wake-up event
+
+The mechanisms enumerated above will allow different kinds of policies and
+coordination among guests to be implemented in EL1. That is out of the scope of
+this document.
+
+-----------------
+Suspending Guests
+-----------------
+
+Suspend procedure for a guest consists of the following:
+1) Suspending devices
+2) Suspending non-boot CPUs (based on hotplug/PSCI)
+3) System suspend, performed by the boot CPU
+
+Each guest should suspend the devices it owns just like it would when running
+without Xen.
+
+Guests should suspend their non-boot vCPUs using the hotplug mechanism.
+Virtual CPUs should be put offline using the already implemented PSCI vCPU_OFF
+call (prefix 'v' is added to distinguish PSCI calls made by guests to Xen, which
+affect virtual machines; as opposed to PSCI calls made by Xen to the EL3, which
+can affect power state of the physical machine).
+
+After suspending its non-boot vCPUs a guest should finalize the suspend by
+making the vSYSTEM_SUSPEND PSCI call. The resume address is specified by the
+guest via the vSYSTEM_SUSPEND entry_point_address argument. The vSYSTEM_SUSPEND
+call is currently not implemented in Xen.
+
+It is expected that a guest leaves enabled all interrupts that should wake it
+up. Other interrupts should be disabled by the guest prior to calling
+vSYSTEM_SUSPEND.
+
+After an unprivileged guest suspends, Xen will not suspend. Xen would suspend
+only after the Dom0 completes the system suspend.
+
+--------------
+Suspending Xen
+--------------
+
+Xen should start suspending itself upon receiving the vSYSTEM_SUSPEND call
+from the last running guest (Dom0). At that moment all physical CPUs are still
+online (taking offline a vCPU or suspending a VM does not affect physical CPUs).
+Xen shall now put offline the non-boot pCPUs by making the CPU_OFF PSCI call
+to EL3. The CPU_OFF PSCI function is currently not implemented in Xen.
+
+After putting offline the non-boot cores Xen must save the context and finalize
+suspend by invoking SYSTEM_SUSPEND PSCI call, which is passed to EL3.
+The resume point of Xen is specified by the entry_point_address argument of the
+SYSTEM_SUSPEND call. The SYSTEM_SUSPEND function and context saving are not
+implemented in Xen for ARM today.
+
+------------
+Resuming Xen
+------------
+
+Xen must be resumed prior to any software running in EL1. Starting from the
+resume point, Xen should restore the context and resume Dom0. Dom0 shall always
+be resumed whenever Xen resumes.
+
+---------------
+Resuming Guests
+---------------
+
+Resume of the privileged guest (Dom0) always follows the Xen resume.
+
+An unprivileged guest shall resume once a device it owns triggers a wake-up
+interrupt, regardless of whether Xen was suspended when the wake-up interrupt
+was triggered. If Xen was suspended, it is assumed that Dom0 will be running
+before the DomU guest starts to resume. The synchronization mechanism to
+enforce the assumed condition is TBD.
+
+If the ARM's GIC was powered down after the ARM subsystem suspended, it is
+assumed that Xen needs to restore the GIC interface for a VM prior to handing
+over control to the guest. However, the guest should restore its own context
+upon entering the resume point, just like it would when running without Xen.
+
+===============
+Implementation
+===============
+
+--------
+Overview
+--------
+
+In order to enable the suspend/resume of VMs and Xen itself, the following PSCI
+calls have to be implemented and integrated in Xen:
+1) vSYSTEM_SUSPEND
+2) CPU_OFF
+3) SYSTEM_SUSPEND
+
+In addition, the following have to be implemented:
+* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
+* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
+	* Disable watchdog on suspend, enable it on resume
+	* Pause domains on suspend, unpause them on resume
+	* Disable non-boot pCPUs on suspend, enable them on resume
+	* Save/restore of GIC configuration
+	* Suspend/resume timer
+	* Save/restore of EL2 context
+	* Implement resume entry point in Xen, including MMU configuration
+
+Implementation details are provided in the sections below. Function names and
+paths used below are consistent within the document but may not always match the
+names used in future implementation. Existing functions and paths are named as
+in Xen source tree.
+
+-------------------------------------
+Suspend/Resume Implementation Details
+-------------------------------------
+
+PSCI Implementation and Integration
+-----------------------------------
+vSYSTEM_SUSPEND
+---------------
+vSYSTEM_SUSPEND shall be implemented in
+* do_psci_system_suspend() in arch/arm/vpsci.c
+* Code independent from PSCI interface will be added in arch/arm/suspend.c
+
+The implementation shall include the following steps:
+* Suspend the current (calling) vCPU. Consists of 2 major steps:
+1) Reset context of vCPU and save entry point into PC and context ID into X0
+(entry point and context ID are provided via vSYSTEM_SUSPEND arguments)
+2) Block vCPU to ensure that it is not scheduled until it is unblocked by an
+interrupt.
+In step 1) above, the context is reset in order to prepare the vCPU for resume,
+i.e. to save vCPU context that matches reset values as expected by software on
+resume. This doesn't hold for PC and X0, since the PC contains resume entry
+point and X0 contains context ID, as defined by PSCI.
+* If the hardware domain made the call, trigger Xen suspend, i.e.
+  call machine_suspend() which will be implemented in arch/arm/suspend.c
+  (similar to how machine_restart() is implemented in arch/arm/shutdown.c)
+
+The function do_psci_system_suspend() shall be called from
+* do_trap_psci() in arch/arm/traps.c
+
+CPU_OFF (physical CPUs)
+-----------------------
+The CPU_OFF function shall be implemented in
+* call_psci_cpu_off() in arch/arm/psci.c
+
+The implementation shall consist just of making the SMC call to EL3.
+
+This function needs to be called when Xen generic code disables a non-boot CPU.
+When a CPU is disabled it will loop forever in a while loop (the stop_cpu()
+function, which is already implemented in xen/arch/arm/smpboot.c). The call to
+call_psci_cpu_off() shall be made before the CPU enters the infinite loop.
+
+SYSTEM_SUSPEND (physical)
+-------------------------
+The SYSTEM_SUSPEND function shall be implemented in
+* call_psci_system_suspend() in arch/arm/psci.c
+
+The implementation shall consist just of making the SMC call to EL3. The
+entry_point_address argument of the SMC call needs to be an ARM architecture
+resume address, which shall be implemented, e.g. as hyp_resume() in
+arch/arm/arm64/entry.S. The call_psci_system_suspend() function does not return.
+On the resume, the execution flow continues from hyp_resume.
+
+The function needs to be called from machine_suspend() to finalize the suspend
+procedure.
+
+------------------
+Additional Changes
+------------------
+
+Suspend Flow
+------------
+The suspend procedure shall be implemented in
+* machine_suspend() in arch/arm/suspend.c
+
+The implementation shall include the following steps:
+* Move the execution to boot pCPU
+* Set the system_state variable to SYS_STATE_suspend
+* Disable watchdog
+* Freeze domains by calling domain_pause() for each domain
+* Disable non-boot CPUs by calling disable_nonboot_cpus()
+* Disable interrupts
+* Suspend timer
+* Save GIC context. Shall be implemented in arch/arm/gic.c,
+  include/asm-arm/gic.h and arch/arm/gic-v2.c (only GICv2 will be supported).
+* Save CPU context. This shall be implemented in assembly, in hyp_suspend()
+  in arch/arm/arm64/entry.S. The context consists of callee-saved general
+  purpose registers, as well as a few system registers. The register context
+  shall be saved in a statically allocated structure.
+* Finalize the suspend by calling call_psci_system_suspend()
+
+Resume Flow
+------------
+The resume entry point shall be implemented in
+* hyp_resume() in arch/arm/arm64/entry.S
+The very beginning of the resume procedure has to be implemented in assembly.
+It shall contain the following:
+* Enable the MMU so that the structure containing CPU context which was saved on
+suspend can be accessed
+* Restore CPU context (to match the values saved on suspend) and return into C
+* Set the system_state variable to SYS_STATE_resume
+* Restore GIC context
+* Resume timer
+* Enable interrupts
+* Enable non-boot CPUs by calling enable_nonboot_cpus()
+* Thaw domains by calling domain_unpause() for each domain
+* Enable watchdog
+* Set the system_state variable to SYS_STATE_active
+* Resume Dom0
+
+==========
+References
+==========
+
+[1] Power State Coordination Interface (ARM):
+http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
-- 
2.13.0
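
For illustration, the vSYSTEM_SUSPEND handling described in the design
document above (reset the vCPU context, load the entry point into PC and the
context ID into X0, block the vCPU, and suspend Xen only when the caller is
the hardware domain) can be modelled in plain C. This is a minimal user-space
sketch, not Xen code: struct vcpu_model, model_system_suspend() and
model_wake() are hypothetical names, and zeroing the registers merely stands
in for the real architectural reset values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a vCPU; not Xen's struct vcpu. */
struct vcpu_model {
    uint64_t x[31];          /* general purpose registers X0..X30 */
    uint64_t pc;             /* program counter */
    bool blocked;            /* true while waiting for a wake-up interrupt */
    bool is_hardware_domain; /* true if the vCPU belongs to Dom0 */
};

/*
 * Model of the vSYSTEM_SUSPEND steps from the document:
 *  1) reset the vCPU context, then set PC = entry point and X0 = context ID;
 *  2) block the vCPU until an interrupt unblocks it.
 * Returns true when the caller is the hardware domain, i.e. when Xen itself
 * should go on to suspend (machine_suspend() in the document).
 */
bool model_system_suspend(struct vcpu_model *v,
                          uint64_t entry_point, uint64_t context_id)
{
    assert(v != NULL);

    /* Assumption: all-zero registers stand in for the reset values. */
    memset(v->x, 0, sizeof(v->x));
    v->pc = entry_point;
    v->x[0] = context_id;
    v->blocked = true;

    return v->is_hardware_domain;
}

/* Model of a wake-up interrupt: the vCPU becomes runnable again. */
void model_wake(struct vcpu_model *v)
{
    assert(v != NULL);
    v->blocked = false;
}
```

In this model a woken vCPU starts executing at the entry point with the
context ID in X0, which is what the PSCI caller expects on resume.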


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2017-12-22 17:41 [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM Mirela Simonovic
@ 2018-01-11  0:55 ` Stefano Stabellini
  2018-01-11 14:00 ` Julien Grall
  2018-03-26  9:51 ` Peng Fan
  2 siblings, 0 replies; 15+ messages in thread
From: Stefano Stabellini @ 2018-01-11  0:55 UTC
  To: Mirela Simonovic; +Cc: edgar.iglesias, sstabellini, julien.grall, xen-devel

On Fri, 22 Dec 2017, Mirela Simonovic wrote:
> This document contains our design specification for "suspend to RAM"
> support for ARM in Xen. It covers the basic suspend to RAM mechanism
> based on ARM PSCI standard, that would allow individual guests and
> Xen itself to suspend/resume.
> 
> We would appreciate your feedback.
> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>

Sounds good to me

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2017-12-22 17:41 [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM Mirela Simonovic
  2018-01-11  0:55 ` Stefano Stabellini
@ 2018-01-11 14:00 ` Julien Grall
  2018-01-23 11:52   ` Oleksandr Tyshchenko
  2018-01-24 17:55   ` Mirela Simonovic
  2018-03-26  9:51 ` Peng Fan
  2 siblings, 2 replies; 15+ messages in thread
From: Julien Grall @ 2018-01-11 14:00 UTC
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini

Hi Mirela,

Thank you for sending the design document. The general design looks
good to me. I have some comments below, but they are more related to the 
implementation of CPU on/off in Xen.

On 22/12/17 17:41, Mirela Simonovic wrote:

[...]

> +---------------
> +Resuming Guests
> +---------------
> +
> +Resume of the privileged guest (Dom0) is always following the Xen resume.
> +
> +An unprivileged guest shall resume once a device it owns triggers a wake-up
> +interrupt, regardless of whether Xen was suspended when the wake-up interrupt
> +was triggered. If Xen was suspended, it is assumed that Dom0 will be running
> +before the DomU guest starts to resume. The synchronization mechanism to
> +enforce the assumed condition is TBD.

Given that all but the boot CPU will be offlined, does the wake-up
interrupt always need to target the boot CPU?

> +
> +If the ARM's GIC was powered down after the ARM subsystem suspended, it is
> +assumed that Xen needs to restore the GIC interface for a VM prior to handing
> +over control to the guest. However, the guest should restore its own context
> +upon entering the resume point, just like it would when running without Xen.
> +
> +===============
> +Implementation
> +===============

[...]

> +CPU_OFF (physical CPUs)
> +-----------------------
> +The CPU_OFF function shall be implemented in
> +* call_psci_cpu_off() in arch/arm/psci.c
> +
> +The implementation shall consist just of making the SMC call to EL3.
> +
> +This function needs to be called when Xen generic code disables a non-boot CPU.
> +When a CPU is disabled it will loop forever in while loop (stop_cpu() function
> +which is already implemented in xen/arch/arm/smpboot.c). Call to
> +call_psci_cpu_off() shall be made before the CPU enters infinite loop.

While the code is present, we never offline a physical CPU at the moment
except when shutting down the platform. So I am not fully convinced that
stop_cpu() is properly implemented.

For instance, you likely need to migrate interrupts that were assigned to
the physical CPU (either guest ones or Xen ones). Though Xen ones might be
less of a concern because I think they are always assigned to CPU0 at the
moment.
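
The ordering concern can be illustrated with a small user-space model in C:
interrupts are re-routed away from the CPU first, and only then is CPU_OFF
issued. This is a sketch under stated assumptions, not existing Xen code.
The irq_model table, the model_smc_t conduit and all model_* functions are
hypothetical; only the CPU_OFF function ID and the DENIED return code are
taken from the PSCI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSCI_0_2_FN32_CPU_OFF 0x84000002u /* CPU_OFF function ID (PSCI spec) */
#define PSCI_DENIED           (-3)        /* PSCI "DENIED" return code */
#define MODEL_NR_IRQS         8

/* Hypothetical SMC conduit; on hardware this would trap to EL3. */
typedef int32_t (*model_smc_t)(uint32_t function_id);

/* Hypothetical routing table: irq_target[i] is the pCPU that IRQ i targets. */
struct irq_model {
    unsigned int irq_target[MODEL_NR_IRQS];
};

/* Last function ID seen by the stub firmware below. */
uint32_t model_last_function_id;

/* Stub firmware that always refuses, so the model can run in user space. */
int32_t model_smc_denied(uint32_t function_id)
{
    model_last_function_id = function_id;
    return PSCI_DENIED;
}

/* Re-route every IRQ that targets 'cpu' to 'fallback'; returns the count. */
unsigned int model_migrate_irqs(struct irq_model *m, unsigned int cpu,
                                unsigned int fallback)
{
    unsigned int moved = 0;

    assert(m != NULL);
    for (size_t i = 0; i < MODEL_NR_IRQS; i++) {
        if (m->irq_target[i] == cpu) {
            m->irq_target[i] = fallback;
            moved++;
        }
    }
    return moved;
}

/*
 * Offline path: move interrupts away first, then ask firmware to power the
 * CPU down. CPU_OFF does not return on success, so any value returned here
 * is an error and the caller would fall back to the existing spin loop.
 */
int32_t model_cpu_off(struct irq_model *m, unsigned int cpu, model_smc_t smc)
{
    model_migrate_irqs(m, cpu, 0 /* assumption: CPU0 is the boot CPU */);
    return smc(PSCI_0_2_FN32_CPU_OFF);
}
```

The model deliberately ignores PPIs and per-CPU allocations, which would
need their own teardown step.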

Furthermore, PPI handlers are not removed. The same goes for any allocated
memory (you may lose the reference to it because the percpu area for that CPU
will get freed). I believe this would get us into trouble when the CPU is
back online.

I may have missed other bits, so I would highly recommend going through
the boot code to see what could go wrong.

[..]

> +Resume Flow
> +------------
> +The resume entry point shall be implemented in
> +* hyp_resume() in arch/arm/arm64/entry.S
> +The very beginning of the resume procedure has to be implemented in assembly.
> +It shall contain the following:
> +* Enable the MMU so that the structure containing CPU context which was saved on
> +suspend can be accessed
> +* Restore CPU context (to match the values saved on suspend) and return into C
> +* Set the system_state variable to SYS_STATE_resume
> +* Restore GIC context
> +* Resume timer
> +* Enable interrupts
> +* Enable non-boot CPUs by calling enable_nonboot_cpus()

You would have to be careful when re-enabling the non-boot CPUs.
start_secondary is implemented based on the assumption that it will only be
called during Xen boot. Some of the code may be part of __init (see
cpu_up_send_sgi) or should not be called as it is after boot (e.g.
check_local_cpu_errata).

Another issue I have in mind is the way VTCR_EL2 is set today (see
setup_virt_paging). It is done at boot time, so if you online a CPU
afterwards, VTCR_EL2 will not be set correctly.
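
One possible way to address this, sketched here as a user-space C model, is
to remember the value computed at boot and re-apply it whenever a CPU is
brought online later. This is an assumption about a possible approach, not
a description of existing Xen code; the model_* names and the per-CPU array
standing in for the register are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU copy of a register such as VTCR_EL2. */
uint64_t model_vtcr[MODEL_NR_CPUS];

/* Value computed once at boot and remembered for later CPU bring-ups. */
uint64_t model_vtcr_boot_value;
bool model_vtcr_boot_value_valid;

/* Boot-time setup: compute the value once and program every CPU. */
void model_setup_virt_paging(uint64_t value)
{
    model_vtcr_boot_value = value;
    model_vtcr_boot_value_valid = true;
    for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_vtcr[cpu] = value;
}

/* Powering a CPU off loses its register state (modelled as zero). */
void model_cpu_offline(unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    model_vtcr[cpu] = 0;
}

/* Bring-up after boot: re-program the register from the remembered value. */
void model_cpu_online(unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    assert(model_vtcr_boot_value_valid);
    model_vtcr[cpu] = model_vtcr_boot_value;
}
```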

I probably have missed other bits. I am happy to provide more insights here.

Cheers,

-- 
Julien Grall

* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-11 14:00 ` Julien Grall
@ 2018-01-23 11:52   ` Oleksandr Tyshchenko
  2018-01-23 11:58     ` Edgar E. Iglesias
  2018-01-24 17:55   ` Mirela Simonovic
  1 sibling, 1 reply; 15+ messages in thread
From: Oleksandr Tyshchenko @ 2018-01-23 11:52 UTC
  To: Mirela Simonovic
  Cc: Edgar E . Iglesias, Julien Grall, Stefano Stabellini, Xen Devel

Hi Mirela,

Just some remarks regarding the scope of work:

+In addition, the following have to be implemented:
+* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
+* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
+       * Disable watchdog on suspend, enable it on resume
+       * Pause domains on suspend, unpause them on resume
+       * Disable non-boot pCPUs on suspend, enable them on resume
+       * Save/restore of GIC configuration
+       * Suspend/resume timer
+       * Save/restore of EL2 context
+       * Implement resume entry point in Xen, including MMU configuration

I think that saving/restoring IOMMU registers/context should be
implemented as well.
In other words, each involved platform device driver in Xen on ARM
(IOMMU-XX, UART-XX, etc) should have suspend/resume callbacks
implemented.

-- 
Regards,

Oleksandr Tyshchenko


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-23 11:52   ` Oleksandr Tyshchenko
@ 2018-01-23 11:58     ` Edgar E. Iglesias
  2018-01-24 18:04       ` Mirela Simonovic
  0 siblings, 1 reply; 15+ messages in thread
From: Edgar E. Iglesias @ 2018-01-23 11:58 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Julien Grall, Stefano Stabellini, Mirela Simonovic, Xen Devel

On Tue, Jan 23, 2018 at 01:52:50PM +0200, Oleksandr Tyshchenko wrote:
> Hi Mirela,
> 
> Just some remarks regarding the scope of work:
> 
> +In addition, the following have to be implemented:
> +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
> +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
> +       * Disable wathdog on suspend, enable it on resume
> +       * Pause domains on suspend, unpause them on resume
> +       * Disable non-boot pCPUs on suspend, enable them on resume
> +       * Save/restore of GIC configuration
> +       * Suspend/resume timer
> +       * Save/restore of EL2 context
> +       * Implement resume entry point in Xen, including MMU configuration
> 
> I think that saving/restoring IOMMU registers/context should be
> implemented as well.

Yes, good point.
Mirela, I think that in the ZU+ case the IOMMU actually gets powered down
with the FPD.


> In other words, each involved platform device driver in Xen on ARM
> (IOMMU-XX, UART-XX, etc) should have suspend/resume callbacks
> implemented.

Yes agreed.

Best regards,
Edgar


> 
> -- 
> Regards,
> 
> Oleksandr Tyshchenko


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-11 14:00 ` Julien Grall
  2018-01-23 11:52   ` Oleksandr Tyshchenko
@ 2018-01-24 17:55   ` Mirela Simonovic
  2018-01-26 16:08     ` Julien Grall
  1 sibling, 1 reply; 15+ messages in thread
From: Mirela Simonovic @ 2018-01-24 17:55 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: edgar.iglesias, sstabellini

Hi Julien, Stefano,


Thank you very much for the feedback!


On 01/11/2018 03:00 PM, Julien Grall wrote:
> Hi Mirela,
>
> Thank you for the sending the design document. The general design 
> looks good to me. I have some comments below, but they are more 
> related to the implementation of CPU on/off in Xen.
>
> On 22/12/17 17:41, Mirela Simonovic wrote:
>
> [...]
>
>> +---------------
>> +Resuming Guests
>> +---------------
>> +
>> +Resume of the privileged guest (Dom0) is always following the Xen 
>> resume.
>> +
>> +An unprivileged guest shall resume once a device it owns triggers a 
>> wake-up
>> +interrupt, regardless of whether Xen was suspended when the wake-up 
>> interrupt
>> +was triggered. If Xen was suspended, it is assumed that Dom0 will be 
>> running
>> +before the DomU guest starts to resume. The synchronization 
>> mechanism to
>> +enforce the assumed condition is TBD.
>
> Given that all but the non-boot CPU will be offlined. Does the wake-up 
> interrupt always need to target the non-boot CPU?

The wake-up interrupt needs to target the boot pCPU, and the resume
sequence has to start from the boot pCPU.

>
>> +
>> +If the ARM's GIC was powered down after the ARM subsystem suspended, 
>> it is
>> +assumed that Xen needs to restore the GIC interface for a VM prior 
>> to handing
>> +over control to the guest. However, the guest should restore its own 
>> context
>> +upon entering the resume point, just like it would when running 
>> without Xen.
>> +
>> +===============
>> +Implementation
>> +===============
>
> [...]
>
>> +CPU_OFF (physical CPUs)
>> +-----------------------
>> +The CPU_OFF function shall be implemented in
>> +* call_psci_cpu_off() in arch/arm/psci.c
>> +
>> +The implementation shall consist just of making the SMC call to EL3.
>> +
>> +This function needs to be called when Xen generic code disables a 
>> non-boot CPU.
>> +When a CPU is disabled it will loop forever in while loop 
>> (stop_cpu() function
>> +which is already implemented in xen/arch/arm/smpboot.c). Call to
>> +call_psci_cpu_off() shall be made before the CPU enters infinite loop.
>
> While the code is present, we never offline physical CPU at the moment 
> except when shutting down the place. So I am not fully convinced that 
> stop_cpu() is properly implemented.

stop_cpu() is called in the shutdown scenario, but not from the same
place as it would be called in the suspend scenario.
In the suspend scenario, the boot CPU performs the suspend procedure (to
be implemented) and, as one of the steps, disables the non-boot CPUs by
calling the existing disable_nonboot_cpus() function (the x86 suspend
flow does the same).
disable_nonboot_cpus() triggers each non-boot CPU to execute stop_cpu()
for itself. In this respect, I believe stop_cpu() only needs to be
extended to call PSCI CPU_OFF in order to power down the calling CPU.
Consequently, non-boot CPUs will also be powered down in the shutdown
scenario, but this is beneficial and comes for free with the suspend
support.

However, you're right - more needs to be done elsewhere.
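
To make the proposed stop_cpu() change concrete, here is a minimal
sketch in C. It is not Xen code: the SMC is modelled as a function
pointer so the flow can be exercised outside the hypervisor, and every
name except the PSCI CPU_OFF function ID and the DENIED return code
(both taken from the PSCI specification) is hypothetical.

```c
#include <stdint.h>

#define PSCI_FN_CPU_OFF  0x84000002U  /* CPU_OFF function ID (PSCI spec) */
#define PSCI_DENIED      (-3)         /* PSCI "DENIED" return code */

/* Hypothetical stand-in for the SMC instruction that traps to EL3. */
typedef int32_t (*smc_fn_t)(uint32_t function_id);

/*
 * Sketch of call_psci_cpu_off(): on success the firmware powers the
 * calling CPU down and the call never returns; a return value therefore
 * always means the request was refused.
 */
static int32_t call_psci_cpu_off(smc_fn_t smc)
{
    return smc(PSCI_FN_CPU_OFF);
}

/*
 * Sketch of the tail of stop_cpu(): ask firmware to power the CPU off
 * first, and fall back to the existing wait loop only if that failed.
 */
static void stop_cpu_tail(smc_fn_t smc, void (*wait_forever)(void))
{
    call_psci_cpu_off(smc);
    wait_forever();  /* reached only when CPU_OFF was refused */
}

/* Stubs for illustration only: record the request and refuse it. */
static uint32_t last_function_id;
static int spin_calls;

static int32_t fake_smc_denied(uint32_t function_id)
{
    last_function_id = function_id;
    return PSCI_DENIED;
}

static void fake_wait_forever(void)
{
    spin_calls++;  /* a real implementation would loop in wfi here */
}
```

Keeping the wait loop as a fallback means the shutdown path behaves as
it does today on firmware that refuses CPU_OFF.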

>
> For instance, you likely need to migrate interrupts that was assigned 
> to the physical CPU (either guest one or Xen one). Though Xen ones 
> might be less a concern because I think they are always assigned to 
> CPU0 at the moment.

I would very much appreciate more information on this. These kinds of
scenarios can easily be overlooked, and I'm not that experienced with
pinning and its side effects.
Let's assume a vCPU is pinned to the non-boot CPU#1. When the guest
enables an interrupt (the interrupt is targeted to the vCPU), would Xen
target the physical interrupt to the GIC CPU interface of pCPU#1, pCPU#0
or all pCPUs?

>
> Furthermore, PPI handlers are not removed. Same for any memory 
> allocated (you may loose reference to it because percpu area for that 
> CPU will get freed). I believe get into trouble when the CPU is back 
> online?

Yes, I needed to add a few fixes to the existing code to enable a pCPU
to come back online. I'll submit an RFC soon.

Thanks,
Mirela

>
> I may have miss other bits, so I would highly recommend to go through 
> the boot code and see what could go wrong.
>
> [..]
>
>> +Resume Flow
>> +------------
>> +The resume entry point shall be implemented in
>> +* hyp_resume() in arch/arm/arm64/entry.S
>> +The very beginning of the resume procedure has to be implemented in 
>> assembly.
>> +It shall contain the following:
>> +* Enable the MMU so that the structure containing CPU context which 
>> was saved on
>> +suspend can be accessed
>> +* Restore CPU context (to match the values saved on suspend) and 
>> return into C
>> +* Set the system_state variable to SYS_STATE_resume
>> +* Restore GIC context
>> +* Resume timer
>> +* Enable interrupts
>> +* Enable non-boot CPUs by calling enable_nonboot_cpus()
>
> You would have to be careful on re-enabling the non-CPU. 
> start_secondary is implemented based on the assumption that it will 
> only be called during Xen boot. Some of the code may be part of __init 
> (see cpu_up_send_sgi) or should not be called as it is after boot (e.g 
> check_local_cpu_errata).
>
> Another I have in mind is the way VTCR_EL2 is set today (see 
> setup_virt_paging). It is done at boot time, so if you online a CPU 
> afterwards, VTCR_EL2 will not be set correctly.

Was there any reason to configure VTCR_EL2 after all CPUs become online?

I fixed this as follows: in start_xen(), the boot CPU calls 
setup_virt_paging() prior to enabling non-boot CPUs. setup_virt_paging() 
configures VTCR_EL2 only for the boot CPU.
Non-boot CPUs call setup_virt_paging_one() later, from 
start_secondary(). Also, only the boot CPU calculates how to configure 
VTCR_EL2; non-boot CPUs rely on the calculated value.

>
> I probably have missed other bits. I am happy to provide more insights 
> here.
>
> Cheers,
>



* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-23 11:58     ` Edgar E. Iglesias
@ 2018-01-24 18:04       ` Mirela Simonovic
  2018-01-25 14:15         ` Edgar E. Iglesias
  0 siblings, 1 reply; 15+ messages in thread
From: Mirela Simonovic @ 2018-01-24 18:04 UTC (permalink / raw)
  To: Edgar E. Iglesias, Oleksandr Tyshchenko
  Cc: Julien Grall, Stefano Stabellini, Xen Devel

Hi Oleksandr, Edgar,


Thanks, you're right.


On 01/23/2018 12:58 PM, Edgar E. Iglesias wrote:
> On Tue, Jan 23, 2018 at 01:52:50PM +0200, Oleksandr Tyshchenko wrote:
>> Hi Mirela,
>>
>> Just some remarks regarding the scope of work:
>>
>> +In addition, the following have to be implemented:
>> +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
>> +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
>> +       * Disable wathdog on suspend, enable it on resume
>> +       * Pause domains on suspend, unpause them on resume
>> +       * Disable non-boot pCPUs on suspend, enable them on resume
>> +       * Save/restore of GIC configuration
>> +       * Suspend/resume timer
>> +       * Save/restore of EL2 context
>> +       * Implement resume entry point in Xen, including MMU configuration
>>
>> I think that saving/restoring IOMMU registers/context should be
>> implemented as well.
> Yes, good point.
> Mirela, I think that in the ZU+ case the IOMMU actually gets powered down
> with the FPD.

Yes, it is in FPD.
>
>
>> In other words, each involved platform device driver in Xen on ARM
>> (IOMMU-XX, UART-XX, etc) should have suspend/resume callbacks
>> implemented.

Yes, the callbacks should be platform specific, but not every platform
has to implement all of them. E.g. on ZU+ the UART is in a different
power domain from the APU/Xen. Xen can suspend and its power domain can
go down even if the UART is not suspended. However, suspending the UART
even in this case may be beneficial from a power perspective. We should
definitely provide the option to implement the callback.

AFAIU, the following devices have to be suspended:
1. timer
2. IOMMU
3. UART
4. GIC
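
As a rough illustration of optional per-device callbacks, here is a
minimal sketch in C. It is not an existing Xen interface: the structure,
the function names and the device order are all hypothetical. A NULL
callback means the platform chose not to implement it.

```c
#include <stddef.h>

/* Hypothetical per-device power management hooks; either may be NULL. */
struct pm_dev_ops {
    const char *name;
    int (*suspend)(void);
    int (*resume)(void);
};

/*
 * Suspend devices in table order. If one fails, resume the devices that
 * were already suspended (in reverse order) and report the error.
 */
static int devices_suspend(const struct pm_dev_ops *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int rc;

        if (!devs[i].suspend)
            continue;  /* nothing to do for this device */

        rc = devs[i].suspend();
        if (rc) {
            while (i-- > 0)
                if (devs[i].resume)
                    devs[i].resume();
            return rc;
        }
    }
    return 0;
}

/* Resume devices in the reverse of the suspend order. */
static void devices_resume(const struct pm_dev_ops *devs, size_t n)
{
    while (n-- > 0)
        if (devs[n].resume)
            devs[n].resume();
}

/* Stub callbacks for illustration only. */
static int suspended;  /* number of devices currently suspended */
static int ok_suspend(void) { suspended++; return 0; }
static int ok_resume(void) { suspended--; return 0; }
static int failing_suspend(void) { return -1; }

static const struct pm_dev_ops example_devs[] = {
    { "timer", ok_suspend, ok_resume },
    { "uart",  NULL,       NULL      },  /* callbacks not implemented */
    { "gic",   ok_suspend, ok_resume },
};
```

The unwinding on failure is one possible design choice; it leaves the
system running if any device refuses to suspend.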

Please let me know if I missed something. I'll update this in the next
version of the design spec. Thank you very much for the good feedback!

Cheers,
Mirela
> Yes agreed.
>
> Best regards,
> Edgar
>
>
>> -- 
>> Regards,
>>
>> Oleksandr Tyshchenko



* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-24 18:04       ` Mirela Simonovic
@ 2018-01-25 14:15         ` Edgar E. Iglesias
  2018-01-26 15:37           ` Julien Grall
  0 siblings, 1 reply; 15+ messages in thread
From: Edgar E. Iglesias @ 2018-01-25 14:15 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Oleksandr Tyshchenko, Julien Grall, Stefano Stabellini, Xen Devel

On Wed, Jan 24, 2018 at 07:04:35PM +0100, Mirela Simonovic wrote:
> Hi Oleksandr, Edgar,
> 
> 
> Thanks, you're right.
> 
> 
> On 01/23/2018 12:58 PM, Edgar E. Iglesias wrote:
> >On Tue, Jan 23, 2018 at 01:52:50PM +0200, Oleksandr Tyshchenko wrote:
> >>Hi Mirela,
> >>
> >>Just some remarks regarding the scope of work:
> >>
> >>+In addition, the following have to be implemented:
> >>+* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
> >>+* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
> >>+       * Disable wathdog on suspend, enable it on resume
> >>+       * Pause domains on suspend, unpause them on resume
> >>+       * Disable non-boot pCPUs on suspend, enable them on resume
> >>+       * Save/restore of GIC configuration
> >>+       * Suspend/resume timer
> >>+       * Save/restore of EL2 context
> >>+       * Implement resume entry point in Xen, including MMU configuration
> >>
> >>I think that saving/restoring IOMMU registers/context should be
> >>implemented as well.
> >Yes, good point.
> >Mirela, I think that in the ZU+ case the IOMMU actually gets powered down
> >with the FPD.
> 
> Yes, it is in FPD.

Having said that, it may still be useful from a patch review perspective
to add things incrementally. Perhaps the IOMMU suspend support could
come in a follow-up patch series, if others agree.

Best regards,
Edgar



> >
> >
> >>In other words, each involved platform device driver in Xen on ARM
> >>(IOMMU-XX, UART-XX, etc) should have suspend/resume callbacks
> >>implemented.
> 
> Yes, callback should be platform specific but not each platform has to have
> all callbacks implemented. E.g. on ZU+ UART is in differrent power domain
> compared to APU/Xen. Xen can suspend and its power domain can go down even
> if UART is not suspended. However, suspending UART even in this case may be
> beneficial from power perspective. We should definitely provide the option
> to implement the callback.
> 
> AFAIU, the following devices has to be suspended:
> 1. timer
> 2. IOMMU
> 3. UART
> 4. GIC
> 
> Please let me know if I missed something. I'll update this in next design
> spec version. Thank you very much for good feedback!
> 
> Cheers,
> Mirela
> >Yes agreed.
> >
> >Best regards,
> >Edgar
> >
> >
> >>-- 
> >>Regards,
> >>
> >>Oleksandr Tyshchenko
> 


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-25 14:15         ` Edgar E. Iglesias
@ 2018-01-26 15:37           ` Julien Grall
  0 siblings, 0 replies; 15+ messages in thread
From: Julien Grall @ 2018-01-26 15:37 UTC (permalink / raw)
  To: Edgar E. Iglesias, Mirela Simonovic
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Xen Devel

Hi Edgar,

On 25/01/18 14:15, Edgar E. Iglesias wrote:
> On Wed, Jan 24, 2018 at 07:04:35PM +0100, Mirela Simonovic wrote:
>> Hi Oleksandr, Edgar,
>>
>>
>> Thanks, you're right.
>>
>>
>> On 01/23/2018 12:58 PM, Edgar E. Iglesias wrote:
>>> On Tue, Jan 23, 2018 at 01:52:50PM +0200, Oleksandr Tyshchenko wrote:
>>>> Hi Mirela,
>>>>
>>>> Just some remarks regarding the scope of work:
>>>>
>>>> +In addition, the following have to be implemented:
>>>> +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
>>>> +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
>>>> +       * Disable wathdog on suspend, enable it on resume
>>>> +       * Pause domains on suspend, unpause them on resume
>>>> +       * Disable non-boot pCPUs on suspend, enable them on resume
>>>> +       * Save/restore of GIC configuration
>>>> +       * Suspend/resume timer
>>>> +       * Save/restore of EL2 context
>>>> +       * Implement resume entry point in Xen, including MMU configuration
>>>>
>>>> I think that saving/restoring IOMMU registers/context should be
>>>> implemented as well.
>>> Yes, good point.
>>> Mirela, I think that in the ZU+ case the IOMMU actually gets powered down
>>> with the FPD.
>>
>> Yes, it is in FPD.
> 
> Having said that it may still be useful from a patch review perspective
> to incrementally add things. Perhaps the IOMMU suspending support could
> come in follow-up patch series if others agree.

+1 :). I suspect the suspend/resume patch set will be quite big. So 
anything that can help the review (e.g. splitting the patch series, moving 
some parts out of the initial work...) would be greatly appreciated.

Cheers,

-- 
Julien Grall


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-24 17:55   ` Mirela Simonovic
@ 2018-01-26 16:08     ` Julien Grall
  2018-04-17 12:13       ` Mirela Simonovic
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2018-01-26 16:08 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini



On 24/01/18 17:55, Mirela Simonovic wrote:
> Hi Julien, Stefano,

Hi Mirela,

> 
> Thank you very much for the feedback!
> 
> 
> On 01/11/2018 03:00 PM, Julien Grall wrote:
>> Hi Mirela,
>>
>> Thank you for the sending the design document. The general design 
>> looks good to me. I have some comments below, but they are more 
>> related to the implementation of CPU on/off in Xen.
>>
>> On 22/12/17 17:41, Mirela Simonovic wrote:
>>
>> [...]
>>
>>> +---------------
>>> +Resuming Guests
>>> +---------------
>>> +
>>> +Resume of the privileged guest (Dom0) is always following the Xen 
>>> resume.
>>> +
>>> +An unprivileged guest shall resume once a device it owns triggers a 
>>> wake-up
>>> +interrupt, regardless of whether Xen was suspended when the wake-up 
>>> interrupt
>>> +was triggered. If Xen was suspended, it is assumed that Dom0 will be 
>>> running
>>> +before the DomU guest starts to resume. The synchronization 
>>> mechanism to
>>> +enforce the assumed condition is TBD.
>>
>> Given that all but the non-boot CPU will be offlined. Does the wake-up 
>> interrupt always need to target the non-boot CPU?
> 
> Wake-up interrupt needs to be targeted to the boot pCPU, and the resume 
> sequence has to start from the boot pCPU.

I assume that wake-up interrupts could belong to a guest.
In that case, the wake-up interrupts will need to be moved to the boot 
pCPU on suspend.
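
A minimal sketch in C of what moving the wake-up interrupts could look
like. The routing table and all names are hypothetical; real code would
also have to reprogram the GIC, and interrupts that are not wake-up
sources would still need handling when their pCPU goes offline.

```c
#include <stdbool.h>
#include <stddef.h>

#define BOOT_PCPU 0U  /* assumption: the boot pCPU is CPU0 */

/* Hypothetical software view of where a physical IRQ is routed. */
struct irq_route {
    unsigned int irq;
    unsigned int target_pcpu;
    bool wakeup;  /* left enabled by its owner as a wake-up source */
};

/*
 * Retarget every wake-up interrupt that points at a non-boot pCPU.
 * Returns the number of interrupts that were moved.
 */
static size_t retarget_wakeup_irqs(struct irq_route *routes, size_t n)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        if (routes[i].wakeup && routes[i].target_pcpu != BOOT_PCPU) {
            routes[i].target_pcpu = BOOT_PCPU;
            moved++;
        }
    }
    return moved;
}
```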

[...]

>>
>> For instance, you likely need to migrate interrupts that was assigned 
>> to the physical CPU (either guest one or Xen one). Though Xen ones 
>> might be less a concern because I think they are always assigned to 
>> CPU0 at the moment.
> 
> I would very appreciate more information on this. These kind of 
> scenarios can be easily overlooked and I'm not that much experienced 
> with pinning and its side effects.
> Lets assume a vCPU is pinned to the non-boot CPU#1. When the guest 
> enables an interrupt (interrupt is targeted to the vCPU), would Xen 
> target physical interrupt to the GIC CPU interface of pCPU#1 or pCPU#0 
> or all pCPUs?

In your example, the interrupts will target pCPU#1 only.

> 
>>
>> Furthermore, PPI handlers are not removed. Same for any memory 
>> allocated (you may loose reference to it because percpu area for that 
>> CPU will get freed). I believe get into trouble when the CPU is back 
>> online?
> 
> Yes, I needed to add few fixes into existing code to enable pCPU to come 
> back online. I'll submit RFC soon.

Thank you!

[...]

>>
>> I may have miss other bits, so I would highly recommend to go through 
>> the boot code and see what could go wrong.
>>
>> [..]
>>
>>> +Resume Flow
>>> +------------
>>> +The resume entry point shall be implemented in
>>> +* hyp_resume() in arch/arm/arm64/entry.S
>>> +The very beginning of the resume procedure has to be implemented in 
>>> assembly.
>>> +It shall contain the following:
>>> +* Enable the MMU so that the structure containing CPU context which 
>>> was saved on
>>> +suspend can be accessed
>>> +* Restore CPU context (to match the values saved on suspend) and 
>>> return into C
>>> +* Set the system_state variable to SYS_STATE_resume
>>> +* Restore GIC context
>>> +* Resume timer
>>> +* Enable interrupts
>>> +* Enable non-boot CPUs by calling enable_nonboot_cpus()
>>
>> You would have to be careful on re-enabling the non-CPU. 
>> start_secondary is implemented based on the assumption that it will 
>> only be called during Xen boot. Some of the code may be part of __init 
>> (see cpu_up_send_sgi) or should not be called as it is after boot (e.g 
>> check_local_cpu_errata).
>>
>> Another I have in mind is the way VTCR_EL2 is set today (see 
>> setup_virt_paging). It is done at boot time, so if you online a CPU 
>> afterwards, VTCR_EL2 will not be set correctly.
> 
> Was there any reason to configure VTCR_EL2 after all CPUs become online?
> 
> I fixed this as follows: in start_xen(), the boot CPU calls 
> setup_virt_paging() prior to enabling non-boot CPUs. setup_virt_paging() 
> configures VTCR_EL2 only for the boot CPU.
> Non-boot CPUs call setup_virt_paging_one() later, from 
> start_secondary(). Also, only the boot CPU performs the calculation for 
> how to configure VTCR_EL2, non-boot CPUs rely on the calculated value.

This would not be correct. Imagine a platform with heterogeneous
processors (such as big.LITTLE): each processor may have a different set
of "features" (e.g. the maximum IPA size supported). You want Xen to use
a common set of "features" that works on all CPUs.

To give an example, your boot CPU may support maximum 48-bit IPA while 
all the other CPUs would support maximum 40-bit IPA. If Xen decides to 
use maximum 48-bit IPA, then page-tables would not work on other CPUs.

In order to make the decision, you need to wait for all CPUs to come up
and look at their ID registers. Once the decision is made, you can
configure VTCR_EL2 correctly.

This is why setup_virt_paging() is called after all the CPUs have booted.

Obviously, this does not work for CPUs brought up afterwards (e.g resume 
case). For those CPUs we should call setup_virt_paging_one directly and 
check that the value chosen by Xen can be handled by the processor. If 
not, this CPU should be parked.

I think you can use (system_state > SYS_STATE_active) to differentiate 
CPUs brought up during Xen boot from those brought up afterwards.
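
To illustrate the idea, here is a minimal sketch in C. It is not the Xen
implementation: the per-CPU IPA sizes are passed in as plain numbers
rather than read from the ID registers, and both function names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Boot-time decision: once all CPUs are up, pick the largest IPA size
 * (in bits) that every CPU supports, i.e. the minimum over all CPUs.
 * The caller must pass at least one CPU.
 */
static unsigned int common_ipa_bits(const unsigned int *cpu_ipa_bits,
                                    size_t nr_cpus)
{
    unsigned int common = cpu_ipa_bits[0];

    for (size_t i = 1; i < nr_cpus; i++)
        if (cpu_ipa_bits[i] < common)
            common = cpu_ipa_bits[i];

    return common;
}

/*
 * Late bring-up (e.g. resume): the value was chosen at boot and cannot
 * change, so a CPU may only join if it supports at least that value.
 * Otherwise it should be parked.
 */
static bool late_cpu_can_join(unsigned int cpu_ipa_bits,
                              unsigned int chosen_ipa_bits)
{
    return cpu_ipa_bits >= chosen_ipa_bits;
}
```

With the example above (a boot CPU supporting 48-bit IPA and the other
CPUs supporting 40-bit), the boot-time decision would be 40 bits, and
any CPU brought up later would have to support at least 40 bits.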

Note that this is probably the only place where platforms with
heterogeneous processors are properly supported in Xen. In the future,
we should do that for all "features" and park CPUs brought up after boot
if they do not support the ones used by Xen. Maybe by having a framework
very similar to Linux's (see arch/arm64/kernel/cpufeature.c).

Cheers,

-- 
Julien Grall


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2017-12-22 17:41 [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM Mirela Simonovic
  2018-01-11  0:55 ` Stefano Stabellini
  2018-01-11 14:00 ` Julien Grall
@ 2018-03-26  9:51 ` Peng Fan
  2018-03-26 11:42   ` Edgar E. Iglesias
  2 siblings, 1 reply; 15+ messages in thread
From: Peng Fan @ 2018-03-26  9:51 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, julien.grall

Hi Mirela,

Good to know that you are working on suspend/resume support. We are currently
also trying to support this on i.MX8. Do you have any open source code
available that supports suspend to RAM?

> +
> +Suspend to RAM (in the following text 'suspend') for ARM in Xen should
> +be coordinated using ARM PSCI standard [1].
> +
> +Ideally, EL1/2 should suspend in the following order:
> +1) Unprivileged guests (DomUs) suspend
> +2) Privileged guest (Dom0) suspends
> +3) Xen suspends
> +
> +However, suspending unprivileged guests is not mandatory for suspending
> +Dom0 and Xen. System suspend initiated by Dom0 (step 2) is considered
> +to be an ultimate decision to suspend the physical machine. Suspending
> +of Xen (step 3) is triggered whenever Dom0 completes suspend. Xen
> +suspend leads to the full suspend of EL2.
> +
> +If an unprivileged guest is not suspended at the moment when Dom0
> +initiates its own suspend, the guest will be paused on Xen's suspend
> +and unpaused on Xen's resume. That way, a guest which doesn't have
> +power management support cannot prevent the physical system from
> +suspending when the decision to suspend is made by privileged software
> (Dom0).
> +
> +Each guest in the system is responsible for suspending the devices it owns.
> +If a guest does not suspend a device, the device will continue to
> +operate as it is configured at the moment when the system suspends. If
> +a device triggers an interrupt while the physical system is suspended, the
> system will resume.
> +
> +It is recommended for an unprivileged guest to participate in power
> +management in the following scenario:
> +Assume unprivileged guest owns a device which will trigger interrupt at
> +some point. This interrupt will wake-up the system. If such a behavior
> +is not wanted, coordination between Dom0 and the guest is required in
> +order to inform the guest about the intended physical system suspend.
> +Then, the guest will have a chance to suspend the device or respond to the
> request in an abort fashion.
> +
> +Since this proposal is focused on implementing PSCI-based suspend
> +mechanisms in Xen, communication with or among the guests is not covered by
> this document.
> +The order of suspending the guests is assumed to be guaranteed by the
> +software running in EL1.
> +
> +This document covers the following:
> +1) Mechanism for suspending/resuming a guest:
> +	1.1) Suspend is initiated by the guest
> +	1.2) Resume is initiated by a device interrupt
> +2) Mechanism for pausing/unpausing running guests when Dom0 suspends

Will this take care of devices passed through to a DomU?

Thanks,
Peng.

> +3) Mechanism for suspending/resuming Xen when Dom0 completes suspend
> +4) Resuming from any state on a wake-up event (device interrupt):
> +	4.1) Resume DomU on wake-up event when Dom0 is still running
> +	4.2) Resume DomU on wake-up event when Xen is suspended
> +	4.3) Resume Dom0 on wake-up event
> +
> +Mechanisms enumerated above will allow different kind of policies and
> +coordination among guests to be implemented in EL1. That is out of the
> +scope of this document.
> +
> +-----------------
> +Suspending Guests
> +-----------------
> +
> +Suspend procedure for a guest consists of the following:
> +1) Suspending devices
> +2) Suspending non-boot CPUs (based on hotplug/PSCI)
> +3) System suspend, performed by the boot CPU
> +
> +Each guest should suspend the devices it owns just like it would when
> +running without Xen.
> +
> +Guests should suspend their non-boot vCPUs using the hotplug mechanism.
> +Virtual CPUs should be put offline using the already implemented PSCI
> +vCPU_OFF call (prefix 'v' is added to distinguish PSCI calls made by
> +guests to Xen, which affect virtual machines; as opposed to PSCI calls
> +made by Xen to the EL3, which can affect power state of the physical
> machine).
> +
> +After suspending its non-boot vCPUs a guest should finalize the suspend
> +by making the vSYSTEM_SUSPEND PSCI call. The resume address is
> +specified by the guest via the vSYSTEM_SUSPEND entry_point_address
> +argument. The vSYSTEM_SUSPEND call is currently not implemented in Xen.
> +
> +It is expected that a guest leaves enabled all interrupts that should
> +wake it up. Other interrupts should be disabled by the guest prior to
> +calling vSYSTEM_SUSPEND.
> +
> +After an unprivileged guest suspends, Xen will not suspend. Xen would
> +suspend only after the Dom0 completes the system suspend.
> +
> +--------------
> +Suspending Xen
> +--------------
> +
> +Xen should start suspending itself upon receiving the vSYSTEM_SUSPEND
> +call from the last running guest (Dom0). At that moment all physical
> +CPUs are still online (taking offline a vCPU or suspending a VM does not affect
> physical CPUs).
> +Xen shall now put offline the non-boot pCPUs by making the CPU_OFF PSCI
> +call to EL3. The CPU_OFF PSCI function is currently not implemented in Xen.
> +
> +After putting offline the non-boot cores Xen must save the context and
> +finalize suspend by invoking SYSTEM_SUSPEND PSCI call, which is passed to EL3.
> +The resume point of Xen is specified by the entry_point_address
> +argument of the SYSTEM_SUSPEND call. The SYSTEM_SUSPEND function and
> +context saving is not implemented in Xen for ARM today.
> +
> +------------
> +Resuming Xen
> +------------
> +
> +Xen must be resumed prior to any software running in EL1. Starting from
> +the resume point, Xen should restore the context and resume Dom0. Dom0
> +shall always be resumed whenever Xen resumes.
> +
> +---------------
> +Resuming Guests
> +---------------
> +
> +Resume of the privileged guest (Dom0) is always following the Xen resume.
> +
> +An unprivileged guest shall resume once a device it owns triggers a
> +wake-up interrupt, regardless of whether Xen was suspended when the
> +wake-up interrupt was triggered. If Xen was suspended, it is assumed
> +that Dom0 will be running before the DomU guest starts to resume. The
> +synchronization mechanism to enforce the assumed condition is TBD.
> +
> +If the ARM's GIC was powered down after the ARM subsystem suspended, it
> +is assumed that Xen needs to restore the GIC interface for a VM prior
> +to handing over control to the guest. However, the guest should restore
> +its own context upon entering the resume point, just like it would when
> running without Xen.
> +
> +===============
> +Implementation
> +===============
> +
> +--------
> +Overview
> +--------
> +
> +In order to enable the suspend/resume of VMs and Xen itself, the
> +following PSCI calls have to be implemented and integrated in Xen:
> +1) vSYSTEM_SUSPEND
> +2) CPU_OFF
> +3) SYSTEM_SUSPEND
> +
> +In addition, the following have to be implemented:
> +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
> +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0),
> including:
> +	* Disable wathdog on suspend, enable it on resume
> +	* Pause domains on suspend, unpause them on resume
> +	* Disable non-boot pCPUs on suspend, enable them on resume
> +	* Save/restore of GIC configuration
> +	* Suspend/resume timer
> +	* Save/restore of EL2 context
> +	* Implement resume entry point in Xen, including MMU configuration
> +
> +Implementation details are provided in the sections below. Function
> +names and paths used below are consistent within the document but may
> +not always match the names used in future implementation. Existing
> +functions and paths are named as in Xen source tree.
> +
> +-------------------------------------
> +Suspend/Resume Implementation Details
> +-------------------------------------
> +
> +PSCI Implementation and Integration
> +-----------------------------------
> +vSYSTEM_SUSPEND
> +---------------
> +vSYSTEM_SUSPEND shall be implemented in
> +* do_psci_system_suspend() in arch/arm/vpsci.c
> +* Code independent from PSCI interface will be added in
> +arch/arm/suspend.c
> +
> +The implementation shall include the following steps:
> +* Suspend the current (calling) vCPU. Consists of 2 major steps:
> +1) Reset context of vCPU and save entry point into PC and context ID
> +into X0 (entry point and context ID are provided via vSYSTEM_SUSPEND
> +arguments)
> +2) Block vCPU to ensure that it is not scheduled until it is unblocked
> +by an interrupt.
> +In step 1) above, the context is reset in order to prepare the vCPU for
> +resume, i.e. to save vCPU context that matches reset values as expected
> +by software on resume. This doesn't hold for PC and X0, since the PC
> +contains resume entry point and X0 contains context ID, as defined by PSCI.
> +* If the hardware domain made the call, trigger Xen suspend, i.e. call
> +  machine_suspend(), which will be implemented in arch/arm/suspend.c
> +  (similar to how machine_restart() is implemented in arch/arm/shutdown.c)
> +
> +The function do_psci_system_suspend() shall be called from
> +* do_trap_psci() in arch/arm/traps.c
> +
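
As an illustration only, the two vCPU steps quoted above could be sketched in C roughly as follows. The names sketch_vcpu, sketch_vcpu_suspend and sketch_vcpu_wake are hypothetical and do not correspond to Xen's real vCPU structures. The only parts taken from the text are that the context is reset, the PC receives the entry point, X0 receives the context ID, and the vCPU stays blocked until an interrupt arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified vCPU state; NOT Xen's struct vcpu. */
struct sketch_vcpu {
    uint64_t x[31];  /* general purpose registers X0..X30 */
    uint64_t pc;     /* address executed when the vCPU next runs */
    bool blocked;    /* true while the vCPU must not be scheduled */
};

/*
 * Step 1: reset the context, then load the two values defined by PSCI
 *         (entry point into PC, context ID into X0).
 * Step 2: block the vCPU so it is not scheduled until an interrupt.
 */
static void sketch_vcpu_suspend(struct sketch_vcpu *v,
                                uint64_t entry_point, uint64_t context_id)
{
    memset(v, 0, sizeof(*v));   /* assumption: reset values are all zero */
    v->pc = entry_point;
    v->x[0] = context_id;
    v->blocked = true;
}

/* A wake-up interrupt only unblocks; the prepared context is untouched. */
static void sketch_vcpu_wake(struct sketch_vcpu *v)
{
    v->blocked = false;
}
```

With this shape the resume path needs no extra bookkeeping: once the vCPU is unblocked, it simply starts executing at the entry point the guest supplied.
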
> +CPU_OFF (physical CPUs)
> +-----------------------
> +The CPU_OFF function shall be implemented in
> +* call_psci_cpu_off() in arch/arm/psci.c
> +
> +The implementation shall consist just of making the SMC call to EL3.
> +
> +This function needs to be called when Xen generic code disables a non-boot CPU.
> +When a CPU is disabled it will loop forever in a while loop (the stop_cpu()
> +function, which is already implemented in xen/arch/arm/smpboot.c). The call to
> +call_psci_cpu_off() shall be made before the CPU enters the infinite loop.
> +
> +SYSTEM_SUSPEND (physical)
> +-------------------------
> +The SYSTEM_SUSPEND function shall be implemented in
> +* call_psci_system_suspend() in arch/arm/psci.c
> +
> +The implementation shall consist just of making the SMC call to EL3.
> +The entry_point_address argument of the SMC call needs to be an ARM
> +architecture resume address, which shall be implemented, e.g. as
> +hyp_resume() in arch/arm/arm64/entry.S. The call_psci_system_suspend()
> +function does not return.
> +On resume, the execution flow continues from hyp_resume().
> +
> +The function needs to be called from machine_suspend() to finalize the
> +suspend procedure.
> +
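
A rough sketch of how thin the two physical PSCI wrappers described above could be. The function IDs are the ones defined by the PSCI specification (CPU_OFF is 0x84000002, the SMC64 variant of SYSTEM_SUSPEND is 0xC400000E). Everything else is an assumption made for illustration: the SMC conduit is passed in as a function pointer so the sketch can run outside a hypervisor, the return type is simplified to int32_t, and the names sketch_psci_*, sketch_smc_fn and sketch_fake_smc are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Function IDs as defined by the PSCI specification. */
#define SKETCH_PSCI_FN32_CPU_OFF        UINT32_C(0x84000002)
#define SKETCH_PSCI_FN64_SYSTEM_SUSPEND UINT32_C(0xC400000E)

/* Hypothetical conduit: a real hypervisor would issue the SMC instruction. */
typedef int32_t (*sketch_smc_fn)(uint32_t fid, uint64_t arg1, uint64_t arg2);

/* CPU_OFF takes no arguments; it only returns if the firmware refuses. */
static int32_t sketch_psci_cpu_off(sketch_smc_fn smc)
{
    return smc(SKETCH_PSCI_FN32_CPU_OFF, 0, 0);
}

/* SYSTEM_SUSPEND passes the resume address and an opaque context ID. */
static int32_t sketch_psci_system_suspend(sketch_smc_fn smc,
                                          uint64_t entry_point,
                                          uint64_t context_id)
{
    return smc(SKETCH_PSCI_FN64_SYSTEM_SUSPEND, entry_point, context_id);
}

/* Test double that records the last request instead of trapping to EL3. */
struct sketch_smc_record {
    uint32_t fid;
    uint64_t arg1;
    uint64_t arg2;
};

static struct sketch_smc_record sketch_last_smc;

static int32_t sketch_fake_smc(uint32_t fid, uint64_t arg1, uint64_t arg2)
{
    sketch_last_smc.fid = fid;
    sketch_last_smc.arg1 = arg1;
    sketch_last_smc.arg2 = arg2;
    return 0; /* real firmware would not return here on success */
}
```

Passing the conduit explicitly is only a convenience for exercising the wrappers; in the proposal the wrappers would make the SMC call directly.
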
> +------------------
> +Additional Changes
> +------------------
> +
> +Suspend Flow
> +------------
> +The suspend procedure shall be implemented in
> +* machine_suspend() in arch/arm/suspend.c
> +
> +The implementation shall include the following steps:
> +* Move the execution to boot pCPU
> +* Set the system_state variable to SYS_STATE_suspend
> +* Disable watchdog
> +* Freeze domains by calling domain_pause() for each domain
> +* Disable non-boot CPUs by calling disable_nonboot_cpus()
> +* Disable interrupts
> +* Suspend timer
> +* Save GIC context. Shall be implemented in arch/arm/gic.c,
> +  include/asm-arm/gic.h and arch/arm/gic-v2.c (only GICv2 will be supported).
> +* Save CPU context. This shall be implemented in assembly, in hyp_suspend()
> +  in arch/arm/arm64/entry.S. The context consists of callee-saved general
> +  purpose registers, as well as a few system registers. Context of registers
> +  shall be saved in a statically allocated structure.
> +* Finalize the suspend by calling call_psci_system_suspend()
> +
> +Resume Flow
> +-----------
> +The resume entry point shall be implemented in
> +* hyp_resume() in arch/arm/arm64/entry.S
> +
> +The very beginning of the resume procedure has to be implemented in assembly.
> +It shall contain the following:
> +* Enable the MMU so that the structure containing CPU context which was
> +  saved on suspend can be accessed
> +* Restore CPU context (to match the values saved on suspend) and return
> +  into C
> +* Set the system_state variable to SYS_STATE_resume
> +* Restore GIC context
> +* Resume timer
> +* Enable interrupts
> +* Enable non-boot CPUs by calling enable_nonboot_cpus()
> +* Thaw domains by calling domain_unpause() for each domain
> +* Enable watchdog
> +* Set the system_state variable to SYS_STATE_active
> +* Resume Dom0
> +
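
One way to read the two flows quoted above is that the C-level resume steps undo the suspend steps in reverse order. The sketch below models only that pairing. It deliberately leaves out the system_state updates, the assembly context save/restore, the PSCI call and the Dom0 resume, and every identifier in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One suspend action and the resume action that undoes it. */
struct sketch_step {
    const char *on_suspend;
    const char *on_resume;
};

/* Listed in suspend order, as in the quoted suspend flow. */
static const struct sketch_step sketch_steps[] = {
    { "disable watchdog",      "enable watchdog" },
    { "pause domains",         "unpause domains" },
    { "disable non-boot CPUs", "enable non-boot CPUs" },
    { "disable interrupts",    "enable interrupts" },
    { "suspend timer",         "resume timer" },
    { "save GIC context",      "restore GIC context" },
};

#define SKETCH_NR_STEPS (sizeof(sketch_steps) / sizeof(sketch_steps[0]))

/* Write the suspend actions, first to last, into out[]; return the count. */
static size_t sketch_suspend_order(const char *out[], size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < SKETCH_NR_STEPS && n < max; i++)
        out[n++] = sketch_steps[i].on_suspend;
    return n;
}

/* Write the resume actions by walking the same table backwards. */
static size_t sketch_resume_order(const char *out[], size_t max)
{
    size_t n = 0;

    for (size_t i = SKETCH_NR_STEPS; i > 0 && n < max; i--)
        out[n++] = sketch_steps[i - 1].on_resume;
    return n;
}
```

Keeping both directions in one table makes it harder for the two lists to drift apart when a step is added later.
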
> +==========
> +References
> +==========
> +
> +[1] Power State Coordination Interface (ARM):
> +http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
> --
> 2.13.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-03-26  9:51 ` Peng Fan
@ 2018-03-26 11:42   ` Edgar E. Iglesias
  2018-04-12  2:26     ` Peng Fan
  0 siblings, 1 reply; 15+ messages in thread
From: Edgar E. Iglesias @ 2018-03-26 11:42 UTC (permalink / raw)
  To: Peng Fan; +Cc: sstabellini, Mirela Simonovic, julien.grall, xen-devel

On Mon, Mar 26, 2018 at 09:51:40AM +0000, Peng Fan wrote:
> Hi Mirela,
> 
> Good to know that you are working on suspend/resume support. We are currently
> trying to support this on i.MX8 as well. Do you have any open-source code
> available that supports suspend to RAM?
> 
> > +
> > +Suspend to RAM (in the following text 'suspend') for ARM in Xen should
> > +be coordinated using ARM PSCI standard [1].
> > +
> > +Ideally, EL1/2 should suspend in the following order:
> > +1) Unprivileged guests (DomUs) suspend
> > +2) Privileged guest (Dom0) suspends
> > +3) Xen suspends
> > +
> > +However, suspending unprivileged guests is not mandatory for suspending
> > +Dom0 and Xen. System suspend initiated by Dom0 (step 2) is considered
> > +to be an ultimate decision to suspend the physical machine. Suspending
> > +of Xen (step 3) is triggered whenever Dom0 completes suspend. Xen
> > +suspend leads to the full suspend of EL2.
> > +
> > +If an unprivileged guest is not suspended at the moment when Dom0
> > +initiates its own suspend, the guest will be paused on Xen's suspend
> > +and unpaused on Xen's resume. That way, a guest which doesn't have
> > +power management support cannot prevent the physical system from
> > +suspending when the decision to suspend is made by privileged software
> > +(Dom0).
> > +
> > +Each guest in the system is responsible for suspending the devices it owns.
> > +If a guest does not suspend a device, the device will continue to
> > +operate as it is configured at the moment when the system suspends. If
> > +a device triggers an interrupt while the physical system is suspended, the
> > +system will resume.
> > +
> > +It is recommended for an unprivileged guest to participate in power
> > +management in the following scenario:
> > +Assume an unprivileged guest owns a device which will trigger an interrupt
> > +at some point. This interrupt will wake up the system. If such behavior
> > +is not wanted, coordination between Dom0 and the guest is required in
> > +order to inform the guest about the intended physical system suspend.
> > +Then, the guest will have a chance to suspend the device or respond to the
> > +request with an abort.
> > +
> > +Since this proposal is focused on implementing PSCI-based suspend
> > +mechanisms in Xen, communication with or among the guests is not covered by
> > +this document.
> > +The order of suspending the guests is assumed to be guaranteed by the
> > +software running in EL1.
> > +
> > +This document covers the following:
> > +1) Mechanism for suspending/resuming a guest:
> > +	1.1) Suspend is initiated by the guest
> > +	1.2) Resume is initiated by a device interrupt
> > +2) Mechanism for pausing/unpausing running guests when Dom0 suspends
> 
> Will this take care of devices passed through to DomU?

Hi Peng,

The ZynqMP uses the EEMI Firmware interface to do power-management.
https://www.xilinx.com/support/documentation/user_guides/ug1200-eemi-api.pdf

In our case, we've implemented an EEMI mediator in Xen that traps EEMI
requests from domUs and makes sure that the guest owns the device it
is trying to operate on.
https://github.com/Xilinx/xen/blob/xilinx/stable-4.9/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c

So domU will first issue the usual EEMI calls, as it would in a non-virtualized
case, to suspend all its devices. Once that has happened, the guest will issue
PSCI calls to suspend the VM. Mirela, please chime in if I missed something.

The EEMI mediator has been posted to the ML but is currently sitting in our
tree waiting for us to go through the upstreaming effort.
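
To make the ownership check mentioned above a bit more concrete, here is a minimal sketch of the idea. It is not the code from the Xilinx tree: the node table, the domain IDs and all identifiers are made up for illustration, and a real mediator would derive ownership from the device assignment done at domain creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_NODES 4
#define SKETCH_NO_OWNER UINT16_C(0xFFFF)

/* Hypothetical table: which domain owns each power-management node. */
static const uint16_t sketch_node_owner[SKETCH_NR_NODES] = {
    0,               /* node 0 belongs to Dom0 */
    1,               /* nodes 1 and 2 are passed through to domain 1 */
    1,
    SKETCH_NO_OWNER, /* node 3 is not assigned to any guest */
};

/*
 * Return true if a trapped request from 'domid' for 'node' may be
 * forwarded to the firmware, false if it has to be rejected.
 */
static bool sketch_mediator_allows(uint16_t domid, uint32_t node)
{
    if (node >= SKETCH_NR_NODES)
        return false;
    return sketch_node_owner[node] == domid;
}
```

The point is only that the check happens in Xen before the request reaches the firmware, so a guest cannot power down a device that belongs to someone else.
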

Cheers,
Edgar




> 
> Thanks,
> Peng.
> 
> > +3) Mechanism for suspending/resuming Xen when Dom0 completes suspend
> > +4) Resuming from any state on a wake-up event (device interrupt):
> > +	4.1) Resume DomU on wake-up event when Dom0 is still running
> > +	4.2) Resume DomU on wake-up event when Xen is suspended
> > +	4.3) Resume Dom0 on wake-up event
> > +
> > +Mechanisms enumerated above will allow different kinds of policies and
> > +coordination among guests to be implemented in EL1. That is out of the
> > +scope of this document.
> > +
> > +-----------------
> > +Suspending Guests
> > +-----------------
> > +
> > +Suspend procedure for a guest consists of the following:
> > +1) Suspending devices
> > +2) Suspending non-boot CPUs (based on hotplug/PSCI)
> > +3) System suspend, performed by the boot CPU
> > +
> > +Each guest should suspend the devices it owns just like it would when
> > +running without Xen.
> > +
> > +Guests should suspend their non-boot vCPUs using the hotplug mechanism.
> > +Virtual CPUs should be put offline using the already implemented PSCI
> > +vCPU_OFF call (prefix 'v' is added to distinguish PSCI calls made by
> > +guests to Xen, which affect virtual machines; as opposed to PSCI calls
> > +made by Xen to the EL3, which can affect power state of the physical
> > +machine).
> > +
> > +After suspending its non-boot vCPUs a guest should finalize the suspend
> > +by making the vSYSTEM_SUSPEND PSCI call. The resume address is
> > +specified by the guest via the vSYSTEM_SUSPEND entry_point_address
> > +argument. The vSYSTEM_SUSPEND call is currently not implemented in Xen.
> > +
> > +It is expected that a guest leaves enabled all interrupts that should
> > +wake it up. Other interrupts should be disabled by the guest prior to
> > +calling vSYSTEM_SUSPEND.
> > +
> > +After an unprivileged guest suspends, Xen will not suspend. Xen would
> > +suspend only after the Dom0 completes the system suspend.
> > +
> > +--------------
> > +Suspending Xen
> > +--------------
> > +
> > +Xen should start suspending itself upon receiving the vSYSTEM_SUSPEND
> > +call from the last running guest (Dom0). At that moment all physical
> > +CPUs are still online (taking offline a vCPU or suspending a VM does not affect
> > +physical CPUs).
> > +Xen shall now put offline the non-boot pCPUs by making the CPU_OFF PSCI
> > +call to EL3. The CPU_OFF PSCI function is currently not implemented in Xen.
> > +
> > +After putting offline the non-boot cores Xen must save the context and
> > +finalize suspend by invoking SYSTEM_SUSPEND PSCI call, which is passed to EL3.
> > +The resume point of Xen is specified by the entry_point_address
> > +argument of the SYSTEM_SUSPEND call. The SYSTEM_SUSPEND function and
> > +context saving is not implemented in Xen for ARM today.
> > +
> > +------------
> > +Resuming Xen
> > +------------
> > +
> > +Xen must be resumed prior to any software running in EL1. Starting from
> > +the resume point, Xen should restore the context and resume Dom0. Dom0
> > +shall always be resumed whenever Xen resumes.
> > +
> > +---------------
> > +Resuming Guests
> > +---------------
> > +
> > +Resume of the privileged guest (Dom0) always follows the Xen resume.
> > +
> > +An unprivileged guest shall resume once a device it owns triggers a
> > +wake-up interrupt, regardless of whether Xen was suspended when the
> > +wake-up interrupt was triggered. If Xen was suspended, it is assumed
> > +that Dom0 will be running before the DomU guest starts to resume. The
> > +synchronization mechanism to enforce the assumed condition is TBD.
> > +
> > +If the ARM's GIC was powered down after the ARM subsystem suspended, it
> > +is assumed that Xen needs to restore the GIC interface for a VM prior
> > +to handing over control to the guest. However, the guest should restore
> > +its own context upon entering the resume point, just like it would when
> > +running without Xen.
> > +
> > +===============
> > +Implementation
> > +===============
> > +
> > +--------
> > +Overview
> > +--------
> > +
> > +In order to enable the suspend/resume of VMs and Xen itself, the
> > +following PSCI calls have to be implemented and integrated in Xen:
> > +1) vSYSTEM_SUSPEND
> > +2) CPU_OFF
> > +3) SYSTEM_SUSPEND
> > +
> > +In addition, the following have to be implemented:
> > +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
> > +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
> > +	* Disable watchdog on suspend, enable it on resume
> > +	* Pause domains on suspend, unpause them on resume
> > +	* Disable non-boot pCPUs on suspend, enable them on resume
> > +	* Save/restore of GIC configuration
> > +	* Suspend/resume timer
> > +	* Save/restore of EL2 context
> > +	* Implement resume entry point in Xen, including MMU configuration
> > +
> > +Implementation details are provided in the sections below. Function
> > +names and paths used below are consistent within the document but may
> > +not always match the names used in future implementation. Existing
> > +functions and paths are named as in Xen source tree.
> > +
> > +-------------------------------------
> > +Suspend/Resume Implementation Details
> > +-------------------------------------
> > +
> > +PSCI Implementation and Integration
> > +-----------------------------------
> > +vSYSTEM_SUSPEND
> > +---------------
> > +vSYSTEM_SUSPEND shall be implemented in
> > +* do_psci_system_suspend() in arch/arm/vpsci.c
> > +* Code independent from PSCI interface will be added in
> > +arch/arm/suspend.c
> > +
> > +The implementation shall include the following steps:
> > +* Suspend the current (calling) vCPU. Consists of 2 major steps:
> > +1) Reset context of vCPU and save entry point into PC and context ID
> > +into X0 (entry point and context ID are provided via vSYSTEM_SUSPEND
> > +arguments)
> > +2) Block vCPU to ensure that it is not scheduled until it is unblocked
> > +by an interrupt.
> > +In step 1) above, the context is reset in order to prepare the vCPU for
> > +resume, i.e. to save vCPU context that matches reset values as expected
> > +by software on resume. This doesn't hold for PC and X0, since the PC
> > +contains resume entry point and X0 contains context ID, as defined by PSCI.
> > +* If the hardware domain made the call, trigger Xen suspend, i.e. call
> > +  machine_suspend(), which will be implemented in arch/arm/suspend.c
> > +  (similar to how machine_restart() is implemented in arch/arm/shutdown.c)
> > +
> > +The function do_psci_system_suspend() shall be called from
> > +* do_trap_psci() in arch/arm/traps.c
> > +
> > +CPU_OFF (physical CPUs)
> > +-----------------------
> > +The CPU_OFF function shall be implemented in
> > +* call_psci_cpu_off() in arch/arm/psci.c
> > +
> > +The implementation shall consist just of making the SMC call to EL3.
> > +
> > +This function needs to be called when Xen generic code disables a non-boot CPU.
> > +When a CPU is disabled it will loop forever in a while loop (the stop_cpu()
> > +function, which is already implemented in xen/arch/arm/smpboot.c). The call to
> > +call_psci_cpu_off() shall be made before the CPU enters the infinite loop.
> > +
> > +SYSTEM_SUSPEND (physical)
> > +-------------------------
> > +The SYSTEM_SUSPEND function shall be implemented in
> > +* call_psci_system_suspend() in arch/arm/psci.c
> > +
> > +The implementation shall consist just of making the SMC call to EL3.
> > +The entry_point_address argument of the SMC call needs to be an ARM
> > +architecture resume address, which shall be implemented, e.g. as
> > +hyp_resume() in arch/arm/arm64/entry.S. The call_psci_system_suspend()
> > +function does not return.
> > +On resume, the execution flow continues from hyp_resume().
> > +
> > +The function needs to be called from machine_suspend() to finalize the
> > +suspend procedure.
> > +
> > +------------------
> > +Additional Changes
> > +------------------
> > +
> > +Suspend Flow
> > +------------
> > +The suspend procedure shall be implemented in
> > +* machine_suspend() in arch/arm/suspend.c
> > +
> > +The implementation shall include the following steps:
> > +* Move the execution to boot pCPU
> > +* Set the system_state variable to SYS_STATE_suspend
> > +* Disable watchdog
> > +* Freeze domains by calling domain_pause() for each domain
> > +* Disable non-boot CPUs by calling disable_nonboot_cpus()
> > +* Disable interrupts
> > +* Suspend timer
> > +* Save GIC context. Shall be implemented in arch/arm/gic.c,
> > +  include/asm-arm/gic.h and arch/arm/gic-v2.c (only GICv2 will be supported).
> > +* Save CPU context. This shall be implemented in assembly, in hyp_suspend()
> > +  in arch/arm/arm64/entry.S. The context consists of callee-saved general
> > +  purpose registers, as well as a few system registers. Context of registers
> > +  shall be saved in a statically allocated structure.
> > +* Finalize the suspend by calling call_psci_system_suspend()
> > +
> > +Resume Flow
> > +-----------
> > +The resume entry point shall be implemented in
> > +* hyp_resume() in arch/arm/arm64/entry.S
> > +
> > +The very beginning of the resume procedure has to be implemented in assembly.
> > +It shall contain the following:
> > +* Enable the MMU so that the structure containing CPU context which was
> > +  saved on suspend can be accessed
> > +* Restore CPU context (to match the values saved on suspend) and return
> > +  into C
> > +* Set the system_state variable to SYS_STATE_resume
> > +* Restore GIC context
> > +* Resume timer
> > +* Enable interrupts
> > +* Enable non-boot CPUs by calling enable_nonboot_cpus()
> > +* Thaw domains by calling domain_unpause() for each domain
> > +* Enable watchdog
> > +* Set the system_state variable to SYS_STATE_active
> > +* Resume Dom0
> > +
> > +==========
> > +References
> > +==========
> > +
> > +[1] Power State Coordination Interface (ARM):
> > +http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
> > --
> > 2.13.0


* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-03-26 11:42   ` Edgar E. Iglesias
@ 2018-04-12  2:26     ` Peng Fan
  2018-04-12 14:13       ` Mirela Simonovic
  0 siblings, 1 reply; 15+ messages in thread
From: Peng Fan @ 2018-04-12  2:26 UTC (permalink / raw)
  To: Edgar E. Iglesias; +Cc: sstabellini, Mirela Simonovic, julien.grall, xen-devel

Hi Edgar,

> -----Original Message-----
> From: Edgar E. Iglesias [mailto:edgar.iglesias@xilinx.com]
> Sent: March 26, 2018 19:43
> To: Peng Fan <peng.fan@nxp.com>
> Cc: Mirela Simonovic <mirela.simonovic@aggios.com>; xen-devel@lists.xen.org;
> sstabellini@kernel.org; julien.grall@linaro.org
> Subject: Re: [Xen-devel] [RFC v2] xen/arm: Suspend to RAM Support in Xen for
> ARM
> 
> On Mon, Mar 26, 2018 at 09:51:40AM +0000, Peng Fan wrote:
> > Hi Mirela,
> >
> > Good to know that you are working on suspend/resume support. We are
> > currently trying to support this on i.MX8 as well. Do you have any
> > open-source code available that supports suspend to RAM?
> >
> > > +
> > > +Suspend to RAM (in the following text 'suspend') for ARM in Xen
> > > +should be coordinated using ARM PSCI standard [1].
> > > +
> > > +Ideally, EL1/2 should suspend in the following order:
> > > +1) Unprivileged guests (DomUs) suspend
> > > +2) Privileged guest (Dom0) suspends
> > > +3) Xen suspends
> > > +
> > > +However, suspending unprivileged guests is not mandatory for
> > > +suspending
> > > +Dom0 and Xen. System suspend initiated by Dom0 (step 2) is
> > > +considered to be an ultimate decision to suspend the physical
> > > +machine. Suspending of Xen (step 3) is triggered whenever Dom0
> > > +completes suspend. Xen suspend leads to the full suspend of EL2.
> > > +
> > > +If an unprivileged guest is not suspended at the moment when Dom0
> > > +initiates its own suspend, the guest will be paused on Xen's
> > > +suspend and unpaused on Xen's resume. That way, a guest which
> > > +doesn't have power management support cannot prevent the physical
> > > +system from suspending when the decision to suspend is made by
> > > +privileged software
> > > +(Dom0).
> > > +
> > > +Each guest in the system is responsible for suspending the devices it owns.
> > > +If a guest does not suspend a device, the device will continue to
> > > +operate as it is configured at the moment when the system suspends.
> > > +If a device triggers an interrupt while the physical system is
> > > +suspended, the
> > > +system will resume.
> > > +
> > > +It is recommended for an unprivileged guest to participate in power
> > > +management in the following scenario:
> > > +Assume an unprivileged guest owns a device which will trigger an
> > > +interrupt at some point. This interrupt will wake up the system. If
> > > +such behavior is not wanted, coordination between Dom0 and the
> > > +guest is required in order to inform the guest about the intended physical
> > > +system suspend.
> > > +Then, the guest will have a chance to suspend the device or respond
> > > +to the request with an abort.
> > > +
> > > +Since this proposal is focused on implementing PSCI-based suspend
> > > +mechanisms in Xen, communication with or among the guests is not
> > > +covered by
> > > +this document.
> > > +The order of suspending the guests is assumed to be guaranteed by
> > > +the software running in EL1.
> > > +
> > > +This document covers the following:
> > > +1) Mechanism for suspending/resuming a guest:
> > > +	1.1) Suspend is initiated by the guest
> > > +	1.2) Resume is initiated by a device interrupt
> > > +2) Mechanism for pausing/unpausing running guests when Dom0
> > > +suspends
> >
> > Will this take care of devices passed through to DomU?
> 

Thanks for your reply, and sorry for the late response.

> Hi Peng,
> 
> The ZynqMP uses the EEMI Firmware interface to do power-management.
> https://www.xilinx.com/support/documentation/user_guides/ug1200-eemi-api.pdf

Yes. I see. 

> 
> In our case, we've implemented an EEMI mediator in Xen that traps EEMI
> requests from domUs and makes sure that the guest owns the device it is trying
> to operate on.
> https://github.com/Xilinx/xen/blob/xilinx/stable-4.9/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> 
> So domU will first issue the usual EEMI calls, as it would in a non-virtualized
> case, to suspend all its devices. Once that has happened, the guest will issue
> PSCI calls to suspend the VM. Mirela, please chime in if I missed something.
> 
> The EEMI mediator has been posted to the ML but is currently sitting in our tree
> waiting for us to go through the upstreaming effort.

So if Dom0 and DomU are both running Linux, would the sequence be to run
"echo mem > /sys/power/state" in DomU to suspend DomU, and then run the same
command in Dom0 to suspend Dom0?

Thanks,
Peng.

> 
> Cheers,
> Edgar
> 
> 
> 
> 
> >
> > Thanks,
> > Peng.
> >
> > > +3) Mechanism for suspending/resuming Xen when Dom0 completes
> > > +suspend
> > > +4) Resuming from any state on a wake-up event (device interrupt):
> > > +	4.1) Resume DomU on wake-up event when Dom0 is still running
> > > +	4.2) Resume DomU on wake-up event when Xen is suspended
> > > +	4.3) Resume Dom0 on wake-up event
> > > +
> > > +Mechanisms enumerated above will allow different kinds of policies
> > > +and coordination among guests to be implemented in EL1. That is out
> > > +of the scope of this document.
> > > +
> > > +-----------------
> > > +Suspending Guests
> > > +-----------------
> > > +
> > > +Suspend procedure for a guest consists of the following:
> > > +1) Suspending devices
> > > +2) Suspending non-boot CPUs (based on hotplug/PSCI)
> > > +3) System suspend, performed by the boot CPU
> > > +
> > > +Each guest should suspend the devices it owns just like it would
> > > +when running without Xen.
> > > +
> > > +Guests should suspend their non-boot vCPUs using the hotplug mechanism.
> > > +Virtual CPUs should be put offline using the already implemented
> > > +PSCI vCPU_OFF call (prefix 'v' is added to distinguish PSCI calls
> > > +made by guests to Xen, which affect virtual machines; as opposed to
> > > +PSCI calls made by Xen to the EL3, which can affect power state of
> > > +the physical
> > > +machine).
> > > +
> > > +After suspending its non-boot vCPUs a guest should finalize the
> > > +suspend by making the vSYSTEM_SUSPEND PSCI call. The resume address
> > > +is specified by the guest via the vSYSTEM_SUSPEND
> > > +entry_point_address argument. The vSYSTEM_SUSPEND call is currently
> > > +not implemented in Xen.
> > > +
> > > +It is expected that a guest leaves enabled all interrupts that
> > > +should wake it up. Other interrupts should be disabled by the guest
> > > +prior to calling vSYSTEM_SUSPEND.
> > > +
> > > +After an unprivileged guest suspends, Xen will not suspend. Xen
> > > +would suspend only after the Dom0 completes the system suspend.
> > > +
> > > +--------------
> > > +Suspending Xen
> > > +--------------
> > > +
> > > +Xen should start suspending itself upon receiving the
> > > +vSYSTEM_SUSPEND call from the last running guest (Dom0). At that
> > > +moment all physical CPUs are still online (taking offline a vCPU or
> > > +suspending a VM does not affect
> > > +physical CPUs).
> > > +Xen shall now put offline the non-boot pCPUs by making the CPU_OFF
> > > +PSCI call to EL3. The CPU_OFF PSCI function is currently not implemented in
> > > +Xen.
> > > +
> > > +After putting offline the non-boot cores Xen must save the context
> > > +and finalize suspend by invoking SYSTEM_SUSPEND PSCI call, which is
> > > +passed to EL3.
> > > +The resume point of Xen is specified by the entry_point_address
> > > +argument of the SYSTEM_SUSPEND call. The SYSTEM_SUSPEND function
> > > +and context saving is not implemented in Xen for ARM today.
> > > +
> > > +------------
> > > +Resuming Xen
> > > +------------
> > > +
> > > +Xen must be resumed prior to any software running in EL1. Starting
> > > +from the resume point, Xen should restore the context and resume
> > > +Dom0. Dom0 shall always be resumed whenever Xen resumes.
> > > +
> > > +---------------
> > > +Resuming Guests
> > > +---------------
> > > +
> > > +Resume of the privileged guest (Dom0) always follows the Xen resume.
> > > +
> > > +An unprivileged guest shall resume once a device it owns triggers a
> > > +wake-up interrupt, regardless of whether Xen was suspended when the
> > > +wake-up interrupt was triggered. If Xen was suspended, it is
> > > +assumed that Dom0 will be running before the DomU guest starts to
> > > +resume. The synchronization mechanism to enforce the assumed condition
> > > +is TBD.
> > > +
> > > +If the ARM's GIC was powered down after the ARM subsystem
> > > +suspended, it is assumed that Xen needs to restore the GIC
> > > +interface for a VM prior to handing over control to the guest.
> > > +However, the guest should restore its own context upon entering the
> > > +resume point, just like it would when
> > > +running without Xen.
> > > +
> > > +===============
> > > +Implementation
> > > +===============
> > > +
> > > +--------
> > > +Overview
> > > +--------
> > > +
> > > +In order to enable the suspend/resume of VMs and Xen itself, the
> > > +following PSCI calls have to be implemented and integrated in Xen:
> > > +1) vSYSTEM_SUSPEND
> > > +2) CPU_OFF
> > > +3) SYSTEM_SUSPEND
> > > +
> > > +In addition, the following have to be implemented:
> > > +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
> > > +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0), including:
> > > +	* Disable watchdog on suspend, enable it on resume
> > > +	* Pause domains on suspend, unpause them on resume
> > > +	* Disable non-boot pCPUs on suspend, enable them on resume
> > > +	* Save/restore of GIC configuration
> > > +	* Suspend/resume timer
> > > +	* Save/restore of EL2 context
> > > +	* Implement resume entry point in Xen, including MMU configuration
> > > +
> > > +Implementation details are provided in the sections below. Function
> > > +names and paths used below are consistent within the document but
> > > +may not always match the names used in future implementation.
> > > +Existing functions and paths are named as in Xen source tree.
> > > +
> > > +-------------------------------------
> > > +Suspend/Resume Implementation Details
> > > +-------------------------------------
> > > +
> > > +PSCI Implementation and Integration
> > > +-----------------------------------
> > > +vSYSTEM_SUSPEND
> > > +---------------
> > > +vSYSTEM_SUSPEND shall be implemented in
> > > +* do_psci_system_suspend() in arch/arm/vpsci.c
> > > +* Code independent from PSCI interface will be added in
> > > +arch/arm/suspend.c
> > > +
> > > +The implementation shall include the following steps:
> > > +* Suspend the current (calling) vCPU. Consists of 2 major steps:
> > > +1) Reset context of vCPU and save entry point into PC and context
> > > +ID into X0 (entry point and context ID are provided via
> > > +vSYSTEM_SUSPEND
> > > +arguments)
> > > +2) Block vCPU to ensure that it is not scheduled until it is
> > > +unblocked by an interrupt.
> > > +In step 1) above, the context is reset in order to prepare the vCPU
> > > +for resume, i.e. to save vCPU context that matches reset values as
> > > +expected by software on resume. This doesn't hold for PC and X0,
> > > +since the PC contains resume entry point and X0 contains context ID, as
> > > +defined by PSCI.
> > > +* If the hardware domain made the call, trigger Xen suspend, i.e. call
> > > +  machine_suspend(), which will be implemented in arch/arm/suspend.c
> > > +  (similar to how machine_restart() is implemented in arch/arm/shutdown.c)
> > > +
> > > +The function do_psci_system_suspend() shall be called from
> > > +* do_trap_psci() in arch/arm/traps.c
> > > +
> > > +CPU_OFF (physical CPUs)
> > > +-----------------------
> > > +The CPU_OFF function shall be implemented in
> > > +* call_psci_cpu_off() in arch/arm/psci.c
> > > +
> > > +The implementation shall consist just of making the SMC call to EL3.
> > > +
> > > +This function needs to be called when Xen generic code disables a
> > > +non-boot CPU.
> > > +When a CPU is disabled it will loop forever in while loop
> > > +(stop_cpu() function which is already implemented in
> > > +xen/arch/arm/smpboot.c). Call to call_psci_cpu_off() shall be made
> > > +before the CPU enters infinite loop.
> > > +
> > > +SYSTEM_SUSPEND (physical)
> > > +-------------------------
> > > +The SYSTEM_SUSPEND function shall be implemented in
> > > +* call_psci_system_suspend() in arch/arm/psci.c
> > > +
> > > +The implementation shall consist just of making the SMC call to EL3.
> > > +The entry_point_address argument of the SMC call needs to be an ARM
> > > +architecture resume address, which shall be implemented, e.g. as
> > > +hyp_resume() in arch/arm/arm64/entry.S. The call_psci_system_suspend()
> > > +function does not return.
> > > +On the resume, the execution flow continues from hyp_resume.
> > > +
> > > +The function needs to be called from machine_suspend() to finalize
> > > +the suspend procedure.
> > > +
> > > +------------------
> > > +Additional Changes
> > > +------------------
> > > +
> > > +Suspend Flow
> > > +------------
> > > +The suspend procedure shall be implemented in
> > > +* machine_suspend() in arch/arm/suspend.c
> > > +
> > > +The implementation shall include the following steps:
> > > +* Move the execution to boot pCPU
> > > +* Set the system_state variable to SYS_STATE_suspend
> > > +* Disable watchdog
> > > +* Freeze domains by calling domain_pause() for each domain
> > > +* Disable non-boot CPUs by calling disable_nonboot_cpus()
> > > +* Disable interrupts
> > > +* Suspend timer
> > > +* Save GIC context. Shall be implemented in arch/arm/gic.c,
> > > +  include/asm-arm/gic.h and arch/arm/gic-v2.c (only GICv2 will be
> > > +  supported).
> > > +* Save CPU context. This shall be implemented in assembly, in
> > > +  hyp_suspend() in arch/arm/arm64/entry.S. The context consists of
> > > +  callee-saved general purpose registers, as well as few system
> > > +  registers. Context of registers shall be saved in a statically
> > > +  allocated structure.
> > > +* Finalize the suspend by calling call_psci_system_suspend()
> > > +
> > > +Resume Flow
> > > +------------
> > > +The resume entry point shall be implemented in
> > > +* hyp_resume() in arch/arm/arm64/entry.S
> > > +The very beginning of the resume procedure has to be implemented in
> > > +assembly.
> > > +It shall contain the following:
> > > +* Enable the MMU so that the structure containing CPU context which
> > > +was saved on suspend can be accessed
> > > +* Restore CPU context (to match the values saved on suspend) and
> > > +return into C
> > > +* Set the system_state variable to SYS_STATE_resume
> > > +* Restore GIC context
> > > +* Resume timer
> > > +* Enable interrupts
> > > +* Enable non-boot CPUs by calling enable_nonboot_cpus()
> > > +* Thaw domains by calling domain_unpause() for each domain
> > > +* Enable watchdog
> > > +* Set the system_state variable to SYS_STATE_active
> > > +* Resume Dom0
> > > +
> > > +==========
> > > +References
> > > +==========
> > > +
> > > +[1] Power State Coordination Interface (ARM):
> > > +http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
> > > --
> > > 2.13.0
> > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xenproject.org
> > > https://lists.xenproject.org/mailman/listinfo/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-04-12  2:26     ` Peng Fan
@ 2018-04-12 14:13       ` Mirela Simonovic
  0 siblings, 0 replies; 15+ messages in thread
From: Mirela Simonovic @ 2018-04-12 14:13 UTC (permalink / raw)
  To: Peng Fan; +Cc: Edgar E. Iglesias, sstabellini, julien.grall, xen-devel

Hi Peng,

Sorry for the late response; this email got buried and I only just noticed it.

On Thu, Apr 12, 2018 at 4:26 AM, Peng Fan <peng.fan@nxp.com> wrote:
> Hi Edgar,
>
>> -----Original Message-----
>> From: Edgar E. Iglesias [mailto:edgar.iglesias@xilinx.com]
>> Sent: 26 March 2018 19:43
>> To: Peng Fan <peng.fan@nxp.com>
>> Cc: Mirela Simonovic <mirela.simonovic@aggios.com>; xen-devel@lists.xen.org;
>> sstabellini@kernel.org; julien.grall@linaro.org
>> Subject: Re: [Xen-devel] [RFC v2] xen/arm: Suspend to RAM Support in Xen for
>> ARM
>>
>> On Mon, Mar 26, 2018 at 09:51:40AM +0000, Peng Fan wrote:
>> > Hi Mirela,
>> >
>> > Good to know that you are working on suspend/resume support. Currently we
>> > are also trying to support this on i.MX8; I just wonder whether you have
>> > any open source code available that supports suspend to RAM?
>> >

We don't have suspend to RAM support available in open source yet.
However, we are starting the upstreaming effort (so far only the first
series, covering CPU hotplug, has been submitted).

>> > > +
>> > > +Suspend to RAM (in the following text 'suspend') for ARM in Xen
>> > > +should be coordinated using ARM PSCI standard [1].
>> > > +
>> > > +Ideally, EL1/2 should suspend in the following order:
>> > > +1) Unprivileged guests (DomUs) suspend
>> > > +2) Privileged guest (Dom0) suspends
>> > > +3) Xen suspends
>> > > +
>> > > +However, suspending unprivileged guests is not mandatory for
>> > > +suspending Dom0 and Xen. System suspend initiated by Dom0 (step 2)
>> > > +is considered to be an ultimate decision to suspend the physical
>> > > +machine. Suspending of Xen (step 3) is triggered whenever Dom0
>> > > +completes suspend. Xen suspend leads to the full suspend of EL2.
>> > > +
>> > > +If an unprivileged guest is not suspended at the moment when Dom0
>> > > +initiates its own suspend, the guest will be paused on Xen's
>> > > +suspend and unpaused on Xen's resume. That way, a guest which
>> > > +doesn't have power management support cannot prevent the physical
>> > > +system from suspending when the decision to suspend is made by
>> > > +privileged software (Dom0).
>> > > +
>> > > +Each guest in the system is responsible for suspending the devices
>> > > +it owns. If a guest does not suspend a device, the device will
>> > > +continue to operate as it is configured at the moment when the
>> > > +system suspends. If a device triggers an interrupt while the
>> > > +physical system is suspended, the system will resume.
>> > > +
>> > > +It is recommended for an unprivileged guest to participate in power
>> > > +management in the following scenario:
>> > > +Assume unprivileged guest owns a device which will trigger
>> > > +interrupt at some point. This interrupt will wake-up the system. If
>> > > +such a behavior is not wanted, coordination between Dom0 and the
>> > > +guest is required in order to inform the guest about the intended
>> > > +physical system suspend.
>> > > +Then, the guest will have a chance to suspend the device or respond
>> > > +to the request in an abort fashion.
>> > > +
>> > > +Since this proposal is focused on implementing PSCI-based suspend
>> > > +mechanisms in Xen, communication with or among the guests is not
>> > > +covered by this document.
>> > > +The order of suspending the guests is assumed to be guaranteed by
>> > > +the software running in EL1.
>> > > +
>> > > +This document covers the following:
>> > > +1) Mechanism for suspending/resuming a guest:
>> > > + 1.1) Suspend is initiated by the guest
>> > > + 1.2) Resume is initiated by a device interrupt
>> > > +2) Mechanism for pausing/unpausing running guests when Dom0
>> > > +suspends
>> >
>> > Will this take care of passthroughed devices for DomU?
>>
>
> Thanks for your reply, and sorry for the late response.
>
>> Hi Peng,
>>
>> The ZynqMP uses the EEMI Firmware interface to do power-management.
>> https://www.xilinx.com/support/documentation/user_guides/ug1200-eemi-api.pdf
>
> Yes. I see.
>
>>
>> In our case, we've implemented an EEMI mediator in Xen that traps EEMI
>> requests from domU's and makes sure that the guest owns the device it is trying
>> to operate on.
>> https://github.com/Xilinx/xen/blob/xilinx/stable-4.9/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
>>
>> So domU will first issue the usual EEMI calls as it would in a non-virtualized case
>> to suspend all its devices. Once that has happened, the guest will issue PSCI calls
>> to suspend the VM. Mirela, please chime in if I missed something.
>>
>> The EEMI mediator has been posted to the ML but is currently sitting in our tree
>> waiting for us to go through the upstreaming effort.
>
> So if Dom0 and DomU are both running Linux, one would first run
> "echo mem >/sys/power/state" in DomU to suspend DomU, and then
> "echo mem >/sys/power/state" in Dom0 to suspend Dom0?

That is correct. Xen suspend will be triggered when Dom0 completes suspend.
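
As a rough illustration of that trigger, here is a minimal standalone C
model of the vSYSTEM_SUSPEND handling described in the document. All types
and function names in it (vcpu_model, domain_model, machine_suspend_model,
vsystem_suspend_model) are made up for this email and are not the actual Xen
code; it only shows that every guest's calling vCPU is reset and blocked,
while only a call from the hardware domain triggers Xen suspend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for Xen's vCPU and domain state. */
struct vcpu_model {
    uint64_t pc;      /* address the vCPU resumes from */
    uint64_t x0;      /* first argument register */
    bool blocked;     /* true while waiting for a wake-up interrupt */
};

struct domain_model {
    bool is_hardware_domain;      /* true for Dom0 */
    struct vcpu_model boot_vcpu;  /* the vCPU that calls vSYSTEM_SUSPEND */
};

/* Counts how many times the (modelled) Xen suspend was triggered. */
static int machine_suspend_calls;

static void machine_suspend_model(void)
{
    machine_suspend_calls++;
}

/*
 * Model of the steps listed for vSYSTEM_SUSPEND:
 * 1) reset the vCPU context, keeping only the entry point (PC) and the
 *    context ID (X0) passed as arguments,
 * 2) block the vCPU until a wake-up interrupt arrives,
 * 3) if the caller is the hardware domain, trigger Xen suspend.
 */
static void vsystem_suspend_model(struct domain_model *d,
                                  uint64_t entry_point, uint64_t context_id)
{
    struct vcpu_model *v = &d->boot_vcpu;

    v->pc = entry_point;
    v->x0 = context_id;
    v->blocked = true;

    if (d->is_hardware_domain)
        machine_suspend_model();
}
```

In this model a DomU call leaves machine_suspend_calls untouched, and only
the Dom0 call increments it.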

>
> Thanks,
> Peng.
>
>>
>> Cheers,
>> Edgar
>>
>>
>>
>>
>> >
>> > Thanks,
>> > Peng.
>> >
>> > > +3) Mechanism for suspending/resuming Xen when Dom0 completes
>> > > +suspend
>> > > +4) Resuming from any state on a wake-up event (device interrupt):
>> > > + 4.1) Resume DomU on wake-up event when Dom0 is still running
>> > > + 4.2) Resume DomU on wake-up event when Xen is suspended
>> > > + 4.3) Resume Dom0 on wake-up event
>> > > +
>> > > +Mechanisms enumerated above will allow different kind of policies
>> > > +and coordination among guests to be implemented in EL1. That is out
>> > > +of the scope of this document.
>> > > +
>> > > +-----------------
>> > > +Suspending Guests
>> > > +-----------------
>> > > +
>> > > +Suspend procedure for a guest consists of the following:
>> > > +1) Suspending devices
>> > > +2) Suspending non-boot CPUs (based on hotplug/PSCI)
>> > > +3) System suspend, performed by the boot CPU
>> > > +
>> > > +Each guest should suspend the devices it owns just like it would
>> > > +when running without Xen.
>> > > +
>> > > +Guests should suspend their non-boot vCPUs using the hotplug mechanism.
>> > > +Virtual CPUs should be put offline using the already implemented
>> > > +PSCI vCPU_OFF call (prefix 'v' is added to distinguish PSCI calls
>> > > +made by guests to Xen, which affect virtual machines; as opposed to
>> > > +PSCI calls made by Xen to the EL3, which can affect power state of
>> > > +the physical machine).
>> > > +
>> > > +After suspending its non-boot vCPUs a guest should finalize the
>> > > +suspend by making the vSYSTEM_SUSPEND PSCI call. The resume address
>> > > +is specified by the guest via the vSYSTEM_SUSPEND
>> > > +entry_point_address argument. The vSYSTEM_SUSPEND call is currently
>> > > +not implemented in Xen.
>> > > +
>> > > +It is expected that a guest leaves enabled all interrupts that
>> > > +should wake it up. Other interrupts should be disabled by the guest
>> > > +prior to calling vSYSTEM_SUSPEND.
>> > > +
>> > > +After an unprivileged guest suspends, Xen will not suspend. Xen
>> > > +would suspend only after the Dom0 completes the system suspend.
>> > > +
>> > > +--------------
>> > > +Suspending Xen
>> > > +--------------
>> > > +
>> > > +Xen should start suspending itself upon receiving the
>> > > +vSYSTEM_SUSPEND call from the last running guest (Dom0). At that
>> > > +moment all physical CPUs are still online (taking offline a vCPU or
>> > > +suspending a VM does not affect physical CPUs).
>> > > +Xen shall now put offline the non-boot pCPUs by making the CPU_OFF
>> > > +PSCI call to EL3. The CPU_OFF PSCI function is currently not
>> > > +implemented in Xen.
>> > > +
>> > > +After putting offline the non-boot cores Xen must save the context
>> > > +and finalize suspend by invoking SYSTEM_SUSPEND PSCI call, which is
>> > > +passed to EL3.
>> > > +The resume point of Xen is specified by the entry_point_address
>> > > +argument of the SYSTEM_SUSPEND call. The SYSTEM_SUSPEND function
>> > > +and context saving is not implemented in Xen for ARM today.
>> > > +
>> > > +------------
>> > > +Resuming Xen
>> > > +------------
>> > > +
>> > > +Xen must be resumed prior to any software running in EL1. Starting
>> > > +from the resume point, Xen should restore the context and resume
>> > > +Dom0. Dom0 shall always be resumed whenever Xen resumes.
>> > > +
>> > > +---------------
>> > > +Resuming Guests
>> > > +---------------
>> > > +
>> > > +Resume of the privileged guest (Dom0) is always following the Xen resume.
>> > > +
>> > > +An unprivileged guest shall resume once a device it owns triggers a
>> > > +wake-up interrupt, regardless of whether Xen was suspended when the
>> > > +wake-up interrupt was triggered. If Xen was suspended, it is
>> > > +assumed that Dom0 will be running before the DomU guest starts to
>> > > +resume. The synchronization mechanism to enforce the assumed
>> > > +condition is TBD.
>> > > +
>> > > +If the ARM's GIC was powered down after the ARM subsystem
>> > > +suspended, it is assumed that Xen needs to restore the GIC
>> > > +interface for a VM prior to handing over control to the guest.
>> > > +However, the guest should restore its own context upon entering the
>> > > +resume point, just like it would when running without Xen.
>> > > +
>> > > +===============
>> > > +Implementation
>> > > +===============
>> > > +
>> > > +--------
>> > > +Overview
>> > > +--------
>> > > +
>> > > +In order to enable the suspend/resume of VMs and Xen itself, the
>> > > +following PSCI calls have to be implemented and integrated in Xen:
>> > > +1) vSYSTEM_SUSPEND
>> > > +2) CPU_OFF
>> > > +3) SYSTEM_SUSPEND
>> > > +
>> > > +In addition, the following have to be implemented:
>> > > +* Suspend/resume vCPU (triggered by vSYSTEM_SUSPEND call)
>> > > +* Suspend/resume Xen (triggered by vSYSTEM_SUSPEND called by Dom0),
>> > > +  including:
>> > > + * Disable watchdog on suspend, enable it on resume
>> > > + * Pause domains on suspend, unpause them on resume
>> > > + * Disable non-boot pCPUs on suspend, enable them on resume
>> > > + * Save/restore of GIC configuration
>> > > + * Suspend/resume timer
>> > > + * Save/restore of EL2 context
>> > > + * Implement resume entry point in Xen, including MMU configuration
>> > > +
>> > > +Implementation details are provided in the sections below. Function
>> > > +names and paths used below are consistent within the document but
>> > > +may not always match the names used in future implementation.
>> > > +Existing functions and paths are named as in Xen source tree.
>> > > +
>> > > +-------------------------------------
>> > > +Suspend/Resume Implementation Details
>> > > +-------------------------------------
>> > > +
>> > > +PSCI Implementation and Integration
>> > > +-----------------------------------
>> > > +vSYSTEM_SUSPEND
>> > > +---------------
>> > > +vSYSTEM_SUSPEND shall be implemented in
>> > > +* do_psci_system_suspend() in arch/arm/vpsci.c
>> > > +* Code independent from PSCI interface will be added in
>> > > +arch/arm/suspend.c
>> > > +
>> > > +The implementation shall include the following steps:
>> > > +* Suspend the current (calling) vCPU. Consists of 2 major steps:
>> > > +1) Reset context of vCPU and save entry point into PC and context
>> > > +ID into X0 (entry point and context ID are provided via
>> > > +vSYSTEM_SUSPEND arguments)
>> > > +2) Block vCPU to ensure that it is not scheduled until it is
>> > > +unblocked by an interrupt.
>> > > +In step 1) above, the context is reset in order to prepare the vCPU
>> > > +for resume, i.e. to save vCPU context that matches reset values as
>> > > +expected by software on resume. This doesn't hold for PC and X0,
>> > > +since the PC contains resume entry point and X0 contains context ID,
>> > > +as defined by PSCI.
>> > > +* If the hardware domain made the call, trigger Xen suspend, i.e.
>> > > +  call machine_suspend() which will be implemented in
>> > > +  arch/arm/suspend.c (similar to how machine_restart() is
>> > > +  implemented in arch/arm/shutdown.c)
>> > > +
>> > > +The function do_psci_system_suspend() shall be called from
>> > > +* do_trap_psci() in arch/arm/traps.c
>> > > +
>> > > +CPU_OFF (physical CPUs)
>> > > +-----------------------
>> > > +The CPU_OFF function shall be implemented in
>> > > +* call_psci_cpu_off() in arch/arm/psci.c
>> > > +
>> > > +The implementation shall consist just of making the SMC call to EL3.
>> > > +
>> > > +This function needs to be called when Xen generic code disables a
>> > > +non-boot CPU.
>> > > +When a CPU is disabled it will loop forever in while loop
>> > > +(stop_cpu() function which is already implemented in
>> > > +xen/arch/arm/smpboot.c). Call to call_psci_cpu_off() shall be made
>> > > +before the CPU enters infinite loop.
>> > > +
>> > > +SYSTEM_SUSPEND (physical)
>> > > +-------------------------
>> > > +The SYSTEM_SUSPEND function shall be implemented in
>> > > +* call_psci_system_suspend() in arch/arm/psci.c
>> > > +
>> > > +The implementation shall consist just of making the SMC call to EL3.
>> > > +The entry_point_address argument of the SMC call needs to be an ARM
>> > > +architecture resume address, which shall be implemented, e.g. as
>> > > +hyp_resume() in arch/arm/arm64/entry.S. The call_psci_system_suspend()
>> > > +function does not return.
>> > > +On the resume, the execution flow continues from hyp_resume.
>> > > +
>> > > +The function needs to be called from machine_suspend() to finalize
>> > > +the suspend procedure.
>> > > +
>> > > +------------------
>> > > +Additional Changes
>> > > +------------------
>> > > +
>> > > +Suspend Flow
>> > > +------------
>> > > +The suspend procedure shall be implemented in
>> > > +* machine_suspend() in arch/arm/suspend.c
>> > > +
>> > > +The implementation shall include the following steps:
>> > > +* Move the execution to boot pCPU
>> > > +* Set the system_state variable to SYS_STATE_suspend
>> > > +* Disable watchdog
>> > > +* Freeze domains by calling domain_pause() for each domain
>> > > +* Disable non-boot CPUs by calling disable_nonboot_cpus()
>> > > +* Disable interrupts
>> > > +* Suspend timer
>> > > +* Save GIC context. Shall be implemented in arch/arm/gic.c,
>> > > +  include/asm-arm/gic.h and arch/arm/gic-v2.c (only GICv2 will be
>> > > +  supported).
>> > > +* Save CPU context. This shall be implemented in assembly, in
>> > > +  hyp_suspend() in arch/arm/arm64/entry.S. The context consists of
>> > > +  callee-saved general purpose registers, as well as few system
>> > > +  registers. Context of registers shall be saved in a statically
>> > > +  allocated structure.
>> > > +* Finalize the suspend by calling call_psci_system_suspend()
>> > > +
>> > > +Resume Flow
>> > > +------------
>> > > +The resume entry point shall be implemented in
>> > > +* hyp_resume() in arch/arm/arm64/entry.S
>> > > +The very beginning of the resume procedure has to be implemented in
>> > > +assembly.
>> > > +It shall contain the following:
>> > > +* Enable the MMU so that the structure containing CPU context which
>> > > +was saved on suspend can be accessed
>> > > +* Restore CPU context (to match the values saved on suspend) and
>> > > +return into C
>> > > +* Set the system_state variable to SYS_STATE_resume
>> > > +* Restore GIC context
>> > > +* Resume timer
>> > > +* Enable interrupts
>> > > +* Enable non-boot CPUs by calling enable_nonboot_cpus()
>> > > +* Thaw domains by calling domain_unpause() for each domain
>> > > +* Enable watchdog
>> > > +* Set the system_state variable to SYS_STATE_active
>> > > +* Resume Dom0
>> > > +
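
The ordering in the two lists above (resume undoes the suspend steps in
reverse) can be sketched with a small standalone C model. The enum and the
two helper functions are hypothetical names used only for this illustration,
not Xen code; steps that have no mirror image (moving to the boot pCPU, the
system_state updates, the PSCI call itself, enabling the MMU, resuming Dom0)
are deliberately left out.

```c
/* Suspend steps that have a matching "undo" step on resume, in suspend order. */
enum pm_step {
    STEP_WATCHDOG,      /* disable on suspend, enable on resume */
    STEP_DOMAINS,       /* pause on suspend, unpause on resume */
    STEP_NONBOOT_CPUS,  /* disable on suspend, enable on resume */
    STEP_INTERRUPTS,    /* disable on suspend, enable on resume */
    STEP_TIMER,         /* suspend on suspend, resume on resume */
    STEP_GIC,           /* save on suspend, restore on resume */
    STEP_CPU_CONTEXT,   /* save on suspend, restore on resume */
    STEP_COUNT
};

/* Fill 'order' with the steps in the order they run on suspend. */
static void suspend_order(enum pm_step order[STEP_COUNT])
{
    for (int i = 0; i < STEP_COUNT; i++)
        order[i] = (enum pm_step)i;
}

/* Fill 'order' with the steps in the order they run on resume (reversed). */
static void resume_order(enum pm_step order[STEP_COUNT])
{
    for (int i = 0; i < STEP_COUNT; i++)
        order[i] = (enum pm_step)(STEP_COUNT - 1 - i);
}
```

Keeping the two paths as mirror images means each step on resume can rely on
exactly the state that its counterpart left behind on suspend.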
>> > > +==========
>> > > +References
>> > > +==========
>> > > +
>> > > +[1] Power State Coordination Interface (ARM):
>> > > +http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
>> > > --
>> > > 2.13.0
>> > >
>> > >
>> > > _______________________________________________
>> > > Xen-devel mailing list
>> > > Xen-devel@lists.xenproject.org
>> > > https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
  2018-01-26 16:08     ` Julien Grall
@ 2018-04-17 12:13       ` Mirela Simonovic
  0 siblings, 0 replies; 15+ messages in thread
From: Mirela Simonovic @ 2018-04-17 12:13 UTC (permalink / raw)
  To: Julien Grall; +Cc: Edgar E. Iglesias, Stefano Stabellini, Xen Devel

Hi Julien,


On Fri, Jan 26, 2018 at 5:08 PM, Julien Grall <julien.grall@linaro.org> wrote:
>
>
> On 24/01/18 17:55, Mirela Simonovic wrote:
>>
>> Hi Julien, Stefano,
>
>
> Hi Mirela,
>
>>
>> Thank you very much for the feedback!
>>
>>
>> On 01/11/2018 03:00 PM, Julien Grall wrote:
>>>
>>> Hi Mirela,
>>>
>>> Thank you for sending the design document. The general design looks
>>> good to me. I have some comments below, but they are more related to the
>>> implementation of CPU on/off in Xen.
>>>
>>> On 22/12/17 17:41, Mirela Simonovic wrote:
>>>
>>> [...]
>>>
>>>> +---------------
>>>> +Resuming Guests
>>>> +---------------
>>>> +
>>>> +Resume of the privileged guest (Dom0) is always following the Xen
>>>> resume.
>>>> +
>>>> +An unprivileged guest shall resume once a device it owns triggers a
>>>> wake-up
>>>> +interrupt, regardless of whether Xen was suspended when the wake-up
>>>> interrupt
>>>> +was triggered. If Xen was suspended, it is assumed that Dom0 will be
>>>> running
>>>> +before the DomU guest starts to resume. The synchronization mechanism
>>>> to
>>>> +enforce the assumed condition is TBD.
>>>
>>>
>>> Given that all but the boot CPU will be offlined, does the wake-up
>>> interrupt always need to target the boot CPU?
>>
>>
>> Wake-up interrupt needs to be targeted to the boot pCPU, and the resume
>> sequence has to start from the boot pCPU.
>
>
> I assume that wake-up interrupts could belong to a guest.
> In that case, the wake-up interrupts will need to be moved to the boot pCPU
> on suspend.
>
> [...]
>
>>>
>>> For instance, you likely need to migrate interrupts that were assigned to
>>> the physical CPU (either guest ones or Xen ones). Though the Xen ones might
>>> be less of a concern because I think they are always assigned to CPU0 at the
>>> moment.
>>
>>
>> I would very much appreciate more information on this. These kinds of
>> scenarios can be easily overlooked and I'm not that experienced with
>> pinning and its side effects.
>> Let's assume a vCPU is pinned to the non-boot CPU#1. When the guest enables
>> an interrupt (the interrupt is targeted to the vCPU), would Xen target the
>> physical interrupt to the GIC CPU interface of pCPU#1, pCPU#0, or all pCPUs?
>
>
> In your example, the interrupts will target pCPU#1 only.
>
>>
>>>
>>> Furthermore, PPI handlers are not removed. Same for any memory allocated
>>> (you may lose the reference to it because the percpu area for that CPU will
>>> get freed). I believe you will get into trouble when the CPU is back online?
>>
>>
>> Yes, I needed to add a few fixes to the existing code to enable a pCPU to
>> come back online. I'll submit an RFC soon.
>
>
> Thank you!
>
> [...]
>
>>>
>>> I may have missed other bits, so I would highly recommend going through the
>>> boot code to see what could go wrong.
>>>
>>> [..]
>>>
>>>> +Resume Flow
>>>> +------------
>>>> +The resume entry point shall be implemented in
>>>> +* hyp_resume() in arch/arm/arm64/entry.S
>>>> +The very beginning of the resume procedure has to be implemented in
>>>> assembly.
>>>> +It shall contain the following:
>>>> +* Enable the MMU so that the structure containing CPU context which was
>>>> saved on
>>>> +suspend can be accessed
>>>> +* Restore CPU context (to match the values saved on suspend) and return
>>>> into C
>>>> +* Set the system_state variable to SYS_STATE_resume
>>>> +* Restore GIC context
>>>> +* Resume timer
>>>> +* Enable interrupts
>>>> +* Enable non-boot CPUs by calling enable_nonboot_cpus()
>>>
>>>
>>> You would have to be careful when re-enabling the non-boot CPUs.
>>> start_secondary is implemented based on the assumption that it will only be
>>> called during Xen boot. Some of the code may be part of __init (see
>>> cpu_up_send_sgi) or should not be called as it is after boot (e.g.
>>> check_local_cpu_errata).
>>>
>>> Another thing I have in mind is the way VTCR_EL2 is set today (see
>>> setup_virt_paging). It is done at boot time, so if you online a CPU
>>> afterwards, VTCR_EL2 will not be set correctly.
>>
>>
>> Was there any reason to configure VTCR_EL2 after all CPUs become online?
>>
>> I fixed this as follows: in start_xen(), the boot CPU calls
>> setup_virt_paging() prior to enabling non-boot CPUs. setup_virt_paging()
>> configures VTCR_EL2 only for the boot CPU.
>> Non-boot CPUs call setup_virt_paging_one() later, from start_secondary().
>> Also, only the boot CPU performs the calculation for how to configure
>> VTCR_EL2, non-boot CPUs rely on the calculated value.
>
> This would not be correct. Imagine a platform with heterogeneous processors
> (such as big.LITTLE): each processor may have a different set of "features"
> (e.g. max IPA size supported). You want Xen to use a common set of "features"
> that would work on all CPUs.
>
> To give an example, your boot CPU may support a maximum of 48-bit IPA while
> all the other CPUs support a maximum of 40-bit IPA. If Xen decides to use
> 48-bit IPA, then the page-tables would not work on the other CPUs.
>
> In order to make the decision, you need to wait for all CPUs to come up and
> look at their ID registers. Once the decision is made, you can configure
> VTCR_EL2 correctly.
>
> This is why setup_virt_paging() is called after all the CPUs have booted.
>
> Obviously, this does not work for CPUs brought up afterwards (e.g resume
> case). For those CPUs we should call setup_virt_paging_one directly and
> check that the value chosen by Xen can be handled by the processor. If not,
> this CPU should be parked.
>

Could you please clarify "this does not work for CPUs brought up
afterwards (e.g resume case)"?
Each CPU that is brought up on resume was already brought up for the
first time at some point before the suspend was triggered.
That first bring-up is the place where the CPU could be parked if the
config value chosen by Xen cannot be handled by that CPU.
I don't see what could possibly change in the suspend/resume path that
would require this check to be performed again. Each CPU that has
suspended is compliant with the VTCR_EL2 chosen by Xen.
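
For reference, this is how I understand the check you are proposing,
written as a minimal standalone C model. The names (cpu_model,
choose_ipa_bits, cpu_bringup_check) are made up for this email and are not
existing Xen code, and modelling a CPU's "features" as a single maximum IPA
size is a simplifying assumption based on your 48-bit/40-bit example.

```c
#include <stdbool.h>

/* Hypothetical per-CPU description used only in this model. */
struct cpu_model {
    unsigned int max_ipa_bits;  /* largest IPA size this CPU supports */
    bool parked;                /* true if the CPU must not be used */
};

/*
 * Boot-time decision: pick an IPA size every CPU seen at boot can handle,
 * i.e. the minimum of the per-CPU maximums. Assumes nr >= 1.
 */
static unsigned int choose_ipa_bits(const struct cpu_model *cpus,
                                    unsigned int nr)
{
    unsigned int bits = cpus[0].max_ipa_bits;

    for (unsigned int i = 1; i < nr; i++)
        if (cpus[i].max_ipa_bits < bits)
            bits = cpus[i].max_ipa_bits;

    return bits;
}

/*
 * Check for a CPU brought up after the decision was made: park it if it
 * cannot handle the chosen value. Returns true if the CPU may be used.
 */
static bool cpu_bringup_check(struct cpu_model *cpu, unsigned int chosen_bits)
{
    cpu->parked = cpu->max_ipa_bits < chosen_bits;
    return !cpu->parked;
}
```

In this model, a CPU that passed the check at its first bring-up passes it
again on resume as long as the chosen value does not change, which is the
basis of my question.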

Please let me know if I misunderstood something.

Thanks,
Mirela

> I think you can use (system_state > SYS_STATE_active) to differentiate
> between CPUs brought up during Xen boot and the ones brought up afterwards.
>
> Note that this is probably the only place where platforms with heterogeneous
> processors are properly supported on Xen. In the future, we should do that
> for all "features" and park CPUs brought up after boot if they do not support
> the ones used by Xen. Maybe by having a framework very similar to Linux (see
> arch/arm64/kernel/cpufeature.c).
>
> Cheers,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-04-17 12:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-22 17:41 [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM Mirela Simonovic
2018-01-11  0:55 ` Stefano Stabellini
2018-01-11 14:00 ` Julien Grall
2018-01-23 11:52   ` Oleksandr Tyshchenko
2018-01-23 11:58     ` Edgar E. Iglesias
2018-01-24 18:04       ` Mirela Simonovic
2018-01-25 14:15         ` Edgar E. Iglesias
2018-01-26 15:37           ` Julien Grall
2018-01-24 17:55   ` Mirela Simonovic
2018-01-26 16:08     ` Julien Grall
2018-04-17 12:13       ` Mirela Simonovic
2018-03-26  9:51 ` Peng Fan
2018-03-26 11:42   ` Edgar E. Iglesias
2018-04-12  2:26     ` Peng Fan
2018-04-12 14:13       ` Mirela Simonovic
