xen-devel.lists.xenproject.org archive mirror
* questions of vm save/restore on arm64
@ 2016-05-27 10:08 Chenxiao Zhao
  2016-05-30 11:40 ` Stefano Stabellini
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-05-27 10:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Stefano Stabellini


Hi,

My board is a HiKey, which has an octa-core ARM Cortex-A53. I have applied
the patches in [1] to try VM save/restore on ARM.
These patches did not originally work on arm64, so I made some changes
based on patch set [2].

What I have found so far:

1. If I run 'xl save -p guest memState' to leave the guest in the suspended
state and then run 'xl unpause guest', the guest resumes successfully, so I
suppose suspend/resume works fine in the guest.

2. If I run 'xl restore -p memState' to restore the guest and use xenctx to
dump all the vcpu registers, all the registers are identical to the state
at save time. After I run 'xl unpause guest', I get no error but cannot
connect to the console. After the restore, the guest's PC is in a function
called user_disable_single_step(), which is called by
single_step_handler().

My questions are:

1. How can I debug the guest during the restore process? Are there any
tools available?
2. From my understanding, restore does not work because some state is
missing from the save. For example, cpu_save knows the domain is 64-bit,
but cpu_load always thinks it is a 32-bit domain, so I have hard-coded the
domain type to DOMAIN_64BIT. Am I correct?
3. How can I dump all of the VM's state? I only found that xenctx can dump
the vcpu registers.

I have attached my patch and log below.

Looking forward to your feedback.
Thanks

xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1024     8     r-----      11.7
root@linaro-alip:~# xl create guest.cfg
Parsing config from guest.cfg
[   39.238723] xen-blkback: ring-ref 8, event-channel 3, protocol 1 (arm-abi) persistent grants

root@linaro-alip:~# xl save -p guest memState
Saving to memState new xl format (info 0x3/0x0/931)
xc: info: Saving domain 1, type ARM
(XEN) HVM1 save: VCPU
(XEN) HVM1 save: A15_TIMER
(XEN) HVM1 save: GICV2_GICD
(XEN) HVM1 save: GICV2_GICC
(XEN) HVM1 save: GICV3
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 1
PC:       ffffffc0000ab028
LR:       ffffffc00050458c
ELR_EL1:  ffffffc000086b34
CPSR:     200001c5
SPSR_EL1: 60000145
SP_EL0:   0000007ff6f2a850
SP_EL1:   ffffffc0140a7ca0

 x0: 0000000000000001    x1: 00000000deadbeef    x2: 0000000000000002
 x3: 0000000000000002    x4: 0000000000000004    x5: 0000000000000000
 x6: 000000000000001b    x7: 0000000000000001    x8: 000000618e589e00
 x9: 0000000000000000   x10: 0000000000000000   x11: 0000000000000000
x12: 00000000000001a3   x13: 000000001911a7d9   x14: 0000000000002ee0
x15: 0000000000000005   x16: 00000000deadbeef   x17: 0000000000000001
x18: 0000000000000007   x19: 0000000000000000   x20: ffffffc014163d58
x21: ffffffc014163cd8   x22: 0000000000000001   x23: 0000000000000140
x24: ffffffc000d5bb18   x25: ffffffc014163cd8   x26: 0000000000000000
x27: 0000000000000000   x28: 0000000000000000   x29: ffffffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: 00000032b5193519
TTBR0: 002d000054876000
TTBR1: 0000000040dcf000
root@linaro-alip:~# xl destroy guest
(XEN) mm.c:1265:d0v1 gnttab_mark_dirty not implemented yet
root@linaro-alip:~# xl restore -p memState
Loading new save file memState (new xl fmt info 0x3/0x0/931)
 Savefile contains xl domain config in JSON format
Parsing config from <saved>
xc: info: (XEN) HVM2 restore: VCPU 0
Found ARM domain from Xen 4.7
xc: info: Restoring domain
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: A15_TIMER 0
(XEN) HVM2 restore: GICV2_GICD 0
(XEN) HVM2 restore: GICV2_GICC 0
(XEN) GICH_LRs (vcpu 0) mask=0
(XEN)    VCPU_LR[0]=0
(XEN)    VCPU_LR[1]=0
(XEN)    VCPU_LR[2]=0
(XEN)    VCPU_LR[3]=0
xc: info: Restore successful
xc: info: XenStore: mfn 0x39001, dom 0, evt 1
xc: info: Console: mfn 0x39000, dom 0, evt 2
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 2
PC:       ffffffc0000ab028
LR:       ffffffc00050458c
ELR_EL1:  ffffffc000086b34
CPSR:     200001c5
SPSR_EL1: 60000145
SP_EL0:   0000007ff6f2a850
SP_EL1:   ffffffc0140a7ca0

 x0: 0000000000000000    x1: 00000000deadbeef    x2: 0000000000000002
 x3: 0000000000000002    x4: 0000000000000004    x5: 0000000000000000
 x6: 000000000000001b    x7: 0000000000000001    x8: 000000618e589e00
 x9: 0000000000000000   x10: 0000000000000000   x11: 0000000000000000
x12: 00000000000001a3   x13: 000000001911a7d9   x14: 0000000000002ee0
x15: 0000000000000005   x16: 00000000deadbeef   x17: 0000000000000001
x18: 0000000000000007   x19: 0000000000000000   x20: ffffffc014163d58
x21: ffffffc014163cd8   x22: 0000000000000001   x23: 0000000000000140
x24: ffffffc000d5bb18   x25: ffffffc014163cd8   x26: 0000000000000000
x27: 0000000000000000   x28: 0000000000000000   x29: ffffffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: 00000000b5193519
TTBR0: 002d000054876000
TTBR1: 0000000040dcf000
root@linaro-alip:~# xl unpause guest
root@linaro-alip:~# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1024     8     r-----      22.2
guest                                        2     0     1     r-----       4.8
root@linaro-alip:~# /usr/lib/xen/bin/xenctx -a 2
PC:       ffffffc000084a00
LR:       ffffffc00050458c
ELR_EL1:  ffffffc000084a00
CPSR:     000003c5
SPSR_EL1: 000003c5
SP_EL0:   0000007ff6f2a850
SP_EL1:   ffffffc0140a7ca0

 x0: 0000000000000000    x1: 00000000deadbeef    x2: 0000000000000002
 x3: 0000000000000002    x4: 0000000000000004    x5: 0000000000000000
 x6: 000000000000001b    x7: 0000000000000001    x8: 000000618e589e00
 x9: 0000000000000000   x10: 0000000000000000   x11: 0000000000000000
x12: 00000000000001a3   x13: 000000001911a7d9   x14: 0000000000002ee0
x15: 0000000000000005   x16: 00000000deadbeef   x17: 0000000000000001
x18: 0000000000000007   x19: 0000000000000000   x20: ffffffc014163d58
x21: ffffffc014163cd8   x22: 0000000000000001   x23: 0000000000000140
x24: ffffffc000d5bb18   x25: ffffffc014163cd8   x26: 0000000000000000
x27: 0000000000000000   x28: 0000000000000000   x29: ffffffc0140a7ca0

SCTLR: 34d5d91d
TTBCR: 00000000b5193519
TTBR0: 002d000054876000
TTBR1: 0000000040dcf000
root@linaro-alip:~# xl console guest
xenconsole: Could not read tty from store: Success
root@linaro-alip:~#
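
A note on the dumps above: although the message describes the registers as
identical, xenctx prints TTBCR as 00000032b5193519 before the save and as
00000000b5193519 after the restore. The thread does not discuss this. One
explanation that fits the numbers, offered here as an assumption rather than
a confirmed diagnosis, is that the 64-bit value passes through a 32-bit
field somewhere on the save/restore path. The sketch below uses a
hypothetical helper, roundtrip_through_u32(), to show that such a narrowing
reproduces the restored value exactly. It also decodes the IPS field, which
sits in bits [34:32] of TCR_EL1 (assuming that is the register xenctx labels
TTBCR on arm64).

```c
#include <stdint.h>

/* Values copied from the two xenctx dumps in this message. */
#define TCR_AT_SAVE     0x00000032b5193519ULL
#define TCR_AT_RESTORE  0x00000000b5193519ULL

/*
 * Hypothetical model of a save record that keeps the register in a
 * 32-bit field. It stands in for whatever narrows the value; it is not
 * taken from the Xen sources.
 */
static uint64_t roundtrip_through_u32(uint64_t tcr)
{
    uint32_t narrow = (uint32_t)tcr;   /* upper 32 bits are dropped here */
    return (uint64_t)narrow;
}

/* TCR_EL1.IPS (intermediate physical address size) is bits [34:32]. */
static unsigned int tcr_ips(uint64_t tcr)
{
    return (unsigned int)((tcr >> 32) & 0x7);
}
```

With these values the round trip changes IPS from 2 (40-bit) to 0 (32-bit),
so the upper half of the register is worth checking in the save record.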


diff --git a/xen/arch/arm/hvm.c b/xen/arch/arm/hvm.c
index aee3353..411bab4 100644
--- a/xen/arch/arm/hvm.c
+++ b/xen/arch/arm/hvm.c
@@ -120,7 +120,8 @@ static int cpu_save(struct domain *d, hvm_domain_context_t *h)
         ctxt.dfar = v->arch.dfar;
         ctxt.dfsr = v->arch.dfsr;
 #else
-        /* XXX 64-bit */
+       ctxt.far = v->arch.far;
+       ctxt.esr = v->arch.esr;
 #endif

 #ifdef CONFIG_ARM_32
@@ -187,6 +188,9 @@ static int cpu_load(struct domain *d, hvm_domain_context_t *h)
     if ( hvm_load_entry(VCPU, h, &ctxt) != 0 )
         return -EINVAL;

+#ifdef CONFIG_ARM_64
+    v->arch.type = DOMAIN_64BIT;
+#endif
     v->arch.sctlr = ctxt.sctlr;
     v->arch.ttbr0 = ctxt.ttbr0;
     v->arch.ttbr1 = ctxt.ttbr1;
@@ -199,7 +203,8 @@ static int cpu_load(struct domain *d, hvm_domain_context_t *h)
     v->arch.dfar = ctxt.dfar;
     v->arch.dfsr = ctxt.dfsr;
 #else
-    /* XXX 64-bit */
+    v->arch.far = ctxt.far;
+    v->arch.esr = ctxt.esr;
 #endif

 #ifdef CONFIG_ARM_32
diff --git a/xen/include/public/arch-arm/hvm/save.h b/xen/include/public/arch-arm/hvm/save.h
index db916b1..89e6e89 100644
--- a/xen/include/public/arch-arm/hvm/save.h
+++ b/xen/include/public/arch-arm/hvm/save.h
@@ -46,8 +46,12 @@ DECLARE_HVM_SAVE_TYPE(HEADER, 1, struct hvm_save_header);

 struct hvm_hw_cpu
 {
+#ifdef CONFIG_ARM_32
     uint64_t vfp[34]; /* Vector floating pointer */
     /* VFP v3 state is 34x64 bit, VFP v4 is not yet supported */
+#else
+    uint64_t vfp[66];
+#endif

     /* Guest core registers */
     struct vcpu_guest_core_regs core_regs;
@@ -60,6 +64,9 @@ struct hvm_hw_cpu
     uint32_t dacr;
     uint64_t par;

+    uint64_t far;
+    uint64_t esr;
+
     uint64_t mair0, mair1;
     uint64_t tpidr_el0;
     uint64_t tpidr_el1;



[1] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01053.html
[2] http://lists.xen.org/archives/html/xen-devel/2014-04/msg01544.html

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: questions of vm save/restore on arm64
  2016-05-27 10:08 questions of vm save/restore on arm64 Chenxiao Zhao
@ 2016-05-30 11:40 ` Stefano Stabellini
  2016-06-01  0:28   ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Stefano Stabellini @ 2016-05-30 11:40 UTC (permalink / raw)
  To: Chenxiao Zhao; +Cc: julien.grall, Stefano Stabellini, xen-devel

On Fri, 27 May 2016, Chenxiao Zhao wrote:
> Hi, 
> 
> My board is a HiKey, which has an octa-core ARM Cortex-A53. I have applied the patches in [1] to try VM save/restore on ARM.
> These patches did not originally work on arm64, so I made some changes based on patch set [2].

Hello Chenxiao,

thanks for your interest in Xen on ARM save/restore.


> What I have found so far:
> 
> 1. If I run 'xl save -p guest memState' to leave the guest in the suspended state and then run 'xl unpause guest',
>     the guest resumes successfully, so I suppose suspend/resume works fine in the guest.
> 
> 2. If I run 'xl restore -p memState' to restore the guest and use xenctx to dump all the vcpu registers,
>     all the registers are identical to the state at save time. After I run 'xl unpause guest', I get no error but cannot connect to the console.
> After the restore, the guest's PC is in a function called user_disable_single_step(), which is called by single_step_handler().
> 
> My questions are:
> 
> 1. How can I debug the guest during the restore process? Are there any tools available?

Nothing special. You can use ctrl-AAA on the console to switch to the
hypervisor console and see the state of the guest. You can also add some
debug printks; if the console doesn't work you can use
dom0_write_console in Linux to get messages out of your guest (you need
to compile Xen with debug=y for that to work).


> 2. From my understanding, restore does not work because some state is missing from the save.
>  For example, cpu_save knows the domain is 64-bit, but cpu_load always thinks it is a 32-bit domain, so I have hard-coded the domain type to
> DOMAIN_64BIT.
> Am I correct?

If Xen thinks the domain is 32-bit at restore, it must be a bug.


> 3. How can I dump all of the VM's state? I only found that xenctx can dump the vcpu registers.

You can use the hypervisor console via the ctrl-aaa menu.



* Re: questions of vm save/restore on arm64
  2016-05-30 11:40 ` Stefano Stabellini
@ 2016-06-01  0:28   ` Chenxiao Zhao
  2016-06-02 12:29     ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-01  0:28 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: julien.grall, xen-devel



On 5/30/2016 4:40 AM, Stefano Stabellini wrote:
> On Fri, 27 May 2016, Chenxiao Zhao wrote:
>> Hi,
>>
>> My board is a HiKey, which has an octa-core ARM Cortex-A53. I have applied the patches in [1] to try VM save/restore on ARM.
>> These patches did not originally work on arm64, so I made some changes based on patch set [2].
>
> Hello Chenxiao,
>
> thanks for your interest in Xen on ARM save/restore.

Hi Stefano,

Thanks for your advice.

One possible cause of the restore failure is that Xen always fails in
p2m_lookup for the guest domain.

I called dump_p2m_lookup in p2m_lookup() and got the following output:

(XEN) dom1 IPA 0x0000000039001000
(XEN) P2M @ 0000000801e7ce80 mfn:0x79f3a
(XEN) Using concatenated root table 0
(XEN) 1ST[0x0] = 0x0040000079f3c77f
(XEN) 2ND[0x1c8] = 0x0000000000000000
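
The indices in this output can be reproduced with a little arithmetic. The
sketch below assumes a 4K translation granule with 9 index bits per level;
the helper names are local to this example and are not Xen's own macros. It
shows why IPA 0x39001000 lands in first-level slot 0x0 and second-level slot
0x1c8, and that an entry of zero has its valid bit (bit 0) clear, which is
why the walk stops at the second level.

```c
#include <stdint.h>

#define PAGE_SHIFT  12          /* 4K pages (assumed) */
#define LEVEL_BITS  9           /* 512 entries per table */
#define LEVEL_MASK  0x1ffULL

/* Index into the first-level table: IPA bits [38:30]. */
static unsigned int first_index(uint64_t ipa)
{
    return (unsigned int)((ipa >> (PAGE_SHIFT + 2 * LEVEL_BITS)) & LEVEL_MASK);
}

/* Index into the second-level table: IPA bits [29:21]. */
static unsigned int second_index(uint64_t ipa)
{
    return (unsigned int)((ipa >> (PAGE_SHIFT + LEVEL_BITS)) & LEVEL_MASK);
}

/* Index into the third-level table: IPA bits [20:12]. */
static unsigned int third_index(uint64_t ipa)
{
    return (unsigned int)((ipa >> PAGE_SHIFT) & LEVEL_MASK);
}

/* Bit 0 of a translation table entry is the valid bit. */
static int entry_is_valid(uint64_t entry)
{
    return (int)(entry & 1);
}
```

Applied to the dump, the first-level entry 0x0040000079f3c77f is valid, while
the second-level entry is zero, so nothing is mapped in the 2MB range that
contains the looked-up address.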

My questions are:

1. Who is responsible for restoring the p2m table, Xen or the guest kernel?
2. After restore, the VM always gets zero memory space, but no error is
reported during the restore. Is the memory requested by the guest kernel,
or should it be allocated earlier by the hypervisor?


Name                          ID   Mem VCPUs      State   Time(s)
Domain-0                       0  1024     8     r-----      15.0
guest                          1     0     1     --p---       0.0




* Re: questions of vm save/restore on arm64
  2016-06-01  0:28   ` Chenxiao Zhao
@ 2016-06-02 12:29     ` Julien Grall
  2016-06-03 17:05       ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Julien Grall @ 2016-06-02 12:29 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

Hello,

On 01/06/16 01:28, Chenxiao Zhao wrote:
>
>
> On 5/30/2016 4:40 AM, Stefano Stabellini wrote:
>> On Fri, 27 May 2016, Chenxiao Zhao wrote:
>>> Hi,
>>>
>>> My board is a HiKey, which has an octa-core ARM Cortex-A53. I have
>>> applied the patches in [1] to try VM save/restore on ARM.
>>> These patches did not originally work on arm64, so I made some
>>> changes based on patch set [2].
>>
>> Hello Chenxiao,
>>
>> thanks for your interest in Xen on ARM save/restore.
>
> Hi Stefano,
>
> Thanks for your advice.
>
> One possible cause of the restore failure is that Xen always fails in
> p2m_lookup for the guest domain.

Who is calling p2m_lookup?

>
> I called dump_p2m_lookup in p2m_lookup() and got the following output:
>
> (XEN) dom1 IPA 0x0000000039001000

Looking at the memory layout (see include/public/arch-arm.h) 0x39001000 
is part of the magic region. It contains pages for the console, 
xenstore, memaccess (see the list in tools/libxc/xc_dom_arm.c).

0x39001000 is the base address of the xenstore page.
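
The address arithmetic behind this can be sketched as follows. The constant
values are assumptions written from memory of the headers cited above
(GUEST_MAGIC_BASE from include/public/arch-arm.h, the page offsets from
tools/libxc/xc_dom_arm.c) and should be checked against the tree in use. The
console and xenstore results do agree with the "Console: mfn 0x39000" and
"XenStore: mfn 0x39001" lines in the restore log earlier in the thread.

```c
#include <stdint.h>

#define PAGE_SHIFT        12
#define GUEST_MAGIC_BASE  0x39000000ULL   /* assumed start of the magic region */

/* Assumed page offsets inside the magic region. */
enum magic_page {
    CONSOLE_PFN_OFFSET   = 0,
    XENSTORE_PFN_OFFSET  = 1,
    MEMACCESS_PFN_OFFSET = 2,
};

/* Guest frame number of a magic page. */
static uint64_t magic_pfn(enum magic_page page)
{
    return (GUEST_MAGIC_BASE >> PAGE_SHIFT) + (uint64_t)page;
}

/* Guest physical (IPA) base address of a magic page. */
static uint64_t magic_addr(enum magic_page page)
{
    return magic_pfn(page) << PAGE_SHIFT;
}
```

Under these assumptions, magic_addr(XENSTORE_PFN_OFFSET) is exactly the IPA
in the dump_p2m_lookup output.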

> (XEN) P2M @ 0000000801e7ce80 mfn:0x79f3a
> (XEN) Using concatenated root table 0
> (XEN) 1ST[0x0] = 0x0040000079f3c77f
> (XEN) 2ND[0x1c8] = 0x0000000000000000
>
> My question is:
>
> 1. Who is responsible for restoring the p2m table, Xen or the guest kernel?

AFAIK, the toolstack restores most of the memory.

> 2. After restore, the VM always gets zero memory space, but there is no

What do you mean by "zero memory space"?

> error reported during the restore. Is the memory requested by the
> guest kernel, or should it be allocated earlier by the hypervisor?

Bear in mind that the patch series you are working on is an RFC and that 
ARM64 was not supported. There might be some issues in the save/restore path.

For instance, xc_clear_domain_page (tools/libxc/xc_sr_restore_arm.c) may 
return an error if the memory is not mapped. But the caller does not 
check the return value.
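
As a generic illustration of the pattern described here (not the actual
libxc code), the sketch below uses a hypothetical stand-in,
clear_page_stub(), that fails when a page is not mapped, and a caller that
propagates the error instead of ignoring it. All names and signatures are
invented for the example.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a page-clearing helper: returns -EINVAL when
 * the requested page is out of range or not mapped, 0 otherwise.
 */
static int clear_page_stub(const int *mapped, size_t nr_pages, size_t pfn)
{
    if (pfn >= nr_pages || !mapped[pfn])
        return -EINVAL;
    return 0;
}

/*
 * Caller that checks every return value and stops at the first failure,
 * so an unmapped page becomes a visible restore error.
 */
static int clear_pages_checked(const int *mapped, size_t nr_pages,
                               const size_t *pfns, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int rc = clear_page_stub(mapped, nr_pages, pfns[i]);

        if (rc != 0)
            return rc;
    }
    return 0;
}
```

A restore path written this way would report the failure to the user rather
than printing "Restore successful" with pages missing.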

Regards,

-- 
Julien Grall


* Re: questions of vm save/restore on arm64
  2016-06-03 17:05       ` Chenxiao Zhao
@ 2016-06-03 10:16         ` Julien Grall
  2016-06-04  1:32           ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Julien Grall @ 2016-06-03 10:16 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

Hello,

On 03/06/16 18:05, Chenxiao Zhao wrote:
> I finally found out that the problem is that the toolstack did not get
> the correct p2m_size while sending all pages on save (it was always zero).
> After I fixed that, the guest could be restored, but the guest kernel hit
> handle_mm_fault().
>
> Where do you think I should investigate: the guest kernel hibernation
> restore or Xen?

The hibernation support for ARM64 has only been merged recently in the 
kernel. Which kernel are you using?

Also, what are the modifications you have made to support Xen 
suspend/resume for ARM64?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-04  1:32           ` Chenxiao Zhao
@ 2016-06-03 11:02             ` Julien Grall
  2016-06-04  2:37               ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Julien Grall @ 2016-06-03 11:02 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

Hello,

First thing, the time in the mail headers seems to be wrong. Maybe 
because of a wrong timezone?

I got 04/06/16 02:32, but it is still the 3rd in my timezone.

On 04/06/16 02:32, Chenxiao Zhao wrote:
>
>
> On 6/3/2016 3:16 AM, Julien Grall wrote:
>> Hello,
>>
>> On 03/06/16 18:05, Chenxiao Zhao wrote:
>>> I finally found out that the problem is that the toolstack did not get
>>> corret p2m_size while sending all pages on save(always be zero). After I
>>> fixed that, the guest could be restored but guest kernel caught
>>> handle_mm_fault().
>>>
>>> where do you think I'm going to investigate, guest kernel hibernation
>>> restore or xen?
>>
>> The hibernation support for ARM64 has only been merged recently in the
>> kernel. Which kernel are you using?
>
> Hi Julien,
>
> I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
>
> https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
>
> I also applied following patches to make the kernel support hibernation.

This looks like the wrong way to do it, as this series may require some
patches which have been upstreamed beforehand.

Upstream Linux seems to support the hikey board [1]. Any reason not to
use it?

> [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
> [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
>
>>
>> Also, what are the modifications you have made to support Xen
>> suspend/resume for ARM64?
>
> I believe I have posted my modifications on xen in the first mail of
> this thread.

I mean in Linux. The patch from Ian Campbell does not have any kind of 
support for ARM64.

For instance, arch/arm/xen/suspend.c needs to be built for ARM64. So I am
wondering whether your kernel supports hibernation...

>
>  From my understanding, a kernel hibernation will cause kernel to save
> memories to disk(swap partition). But on guest save progress, the
> hibernation for domU does not make the guest save memories to disk. it's
> more like suspend all processes in guest, and memors actually depends on
> xen toolstack to save the pages to file. Am I correct?

You are using an older tree with a patch series based on a newer tree.

So I would recommend that you move to a newer tree. If that is not possible,
please test that hibernation works on bare metal.

Regards,

[1] https://lists.96boards.org/pipermail/dev/2016-May/000933.html

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-04  2:37               ` Chenxiao Zhao
@ 2016-06-03 12:33                 ` Julien Grall
  2016-06-06 11:58                 ` Stefano Stabellini
  1 sibling, 0 replies; 16+ messages in thread
From: Julien Grall @ 2016-06-03 12:33 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

Hello,

On 04/06/16 03:37, Chenxiao Zhao wrote:
>
>
> On 6/3/2016 4:02 AM, Julien Grall wrote:
>> Hello,
>>
>> First thing, the time in the mail headers seems to be wrong. Maybe
>> because of a wrong timezone?
>>
>> I got: 04/06/16 02:32 however we are still the 3rd in my timezone.
>>
>> On 04/06/16 02:32, Chenxiao Zhao wrote:
>>>
>>>
>>> On 6/3/2016 3:16 AM, Julien Grall wrote:
>>>> Hello,
>>>>
>>>> On 03/06/16 18:05, Chenxiao Zhao wrote:
>>>>> I finally found out that the problem is that the toolstack did not get
>>>>> corret p2m_size while sending all pages on save(always be zero).
>>>>> After I
>>>>> fixed that, the guest could be restored but guest kernel caught
>>>>> handle_mm_fault().
>>>>>
>>>>> where do you think I'm going to investigate, guest kernel hibernation
>>>>> restore or xen?
>>>>
>>>> The hibernation support for ARM64 has only been merged recently in the
>>>> kernel. Which kernel are you using?
>>>
>>> Hi Julien,
>>>
>>> I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
>>>
>>> https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
>>>
>>> I also applied following patches to make the kernel support hibernation.
>>
>> This looks the wrong way to do it as this series may requires some
>> patches which have been upstreamed before hand.
>>
>> Linux upstream seems support to the hikey board [1]. Any reason to not
>> using it?
>
> I tried a newer version of kernel 4.4, but got no luck to start dom0
> with xen. so I decide to stay in 4.1 for now.

The current upstream is 4.7-rc1 not 4.4.

However, the kernel for the guest does not require any support for your
board, so you can use upstream (i.e. linus/master).

[...]

>>
>>>
>>>  From my understanding, a kernel hibernation will cause kernel to save
>>> memories to disk(swap partition). But on guest save progress, the
>>> hibernation for domU does not make the guest save memories to disk. it's
>>> more like suspend all processes in guest, and memors actually depends on
>>> xen toolstack to save the pages to file. Am I correct?
>>
>> You are using an older tree with a patch series based on a newer tree.
>>
>> So I would recommend you to move to a newer tree. If it is not possible,
>> please test that hibernation works on baremetal.
>
> I think the suspend/resume in guest is working, cause I can use
> pause/unpause command in toolstack to suspend/resume guest without
> problem. I can also see the suspend/resume kernel messages from guest's
> console. The only problem is it's can not resume from restore.

The pause/unpause commands do not require any kind of cooperation from
the kernel. They are only requests to the hypervisor to put the vCPUs to
sleep or to wake them up.

You can look at the implementation of libxl_domain_{,un}pause.

> One thing that confused me is that the kernel's hibernation means the
> guest kernel will save the memory state to disk and power off VM at
> last. The guest will also take care of the memory restore itself. But I
> do not see the save/restore on xen works that way. So my question is why
> it requires hibernation (aka. suspend to disk) instead of the real
> suspend (aka. suspend to RAM and standby)?

I am not an expert in Xen suspend/resume. However, looking at the code,
Xen has a specific suspend path (see drivers/xen/manage.c). I guess this
code requires features which are only present when CONFIG_HIBERNATION is
selected.

In any case, please use upstream Linux for the development in the guest. 
If there is still a bug, then we know that it is not because you are 
using a 4.5 based patch series in a 4.1 kernel.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-02 12:29     ` Julien Grall
@ 2016-06-03 17:05       ` Chenxiao Zhao
  2016-06-03 10:16         ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-03 17:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel



On 6/2/2016 5:29 AM, Julien Grall wrote:
> Hello,
>
> On 01/06/16 01:28, Chenxiao Zhao wrote:
>>
>>
>> On 5/30/2016 4:40 AM, Stefano Stabellini wrote:
>>> On Fri, 27 May 2016, Chenxiao Zhao wrote:
>>>> Hi,
>>>>
>>>> My board is Hikey on which have octa-core of arm cortex-a53. I have
>>>> applied patches [1] to try vm save/restore on arm.
>>>> These patches originally do not working on arm64. I have made some
>>>> changes based on patch set [2].
>>>
>>> Hello Chenxiao,
>>>
>>> thanks for your interest in Xen on ARM save/restore.
>>
>> Hi Stefano,
>>
>> Thanks for your advice.
>>
>> I found a possible reason that cause the restore failure is that xen
>> always failed on p2m_lookup for guest domain.
>
> Who is calling p2m_lookup?

It is called by handle_hvm_params (tools/libxc/xc_sr_restore_arm.c) while
restoring HVM_PARAM_STORE_PFN.

>
>>
>> I called dump_p2m_lookup in p2m_look() and get the output like below:
>>
>> (XEN) dom1 IPA 0x0000000039001000
>
> Looking at the memory layout (see include/public/arch-arm.h) 0x39001000
> is part of the magic region. It contains pages for the console,
> xenstore, memaccess (see the list in tools/libxc/xc_dom_arm.c).

Yes, I also noticed that.

>
> 0x39001000 is the base address of the xenstore page.
>
>> (XEN) P2M @ 0000000801e7ce80 mfn:0x79f3a
>> (XEN) Using concatenated root table 0
>> (XEN) 1ST[0x0] = 0x0040000079f3c77f
>> (XEN) 2ND[0x1c8] = 0x0000000000000000
>>
>> My question is:
>>
>> 1. who is responsible for restoring p2m table, Xen or guest kernel?
>
> AFAIK, the toolstack is restoring the most of the memory.
>
>> 2. After restore, the vm always get zero memory space, but there is no
>
> What do you mean by "zero memory space"?

I mean xl does not assign any memory to the restored VM.

Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1024     8     r-----      76.3
guest                                        2     0     1     r-----       1.9


>
>> error reported on the restore progress. Does the memory requested by the
>> guest kernel or should be allocated early by hypervisor?
>
> Bear in mind that the patch series you are working on is an RFC and the
> ARM64 was not supported. There might be some issue in the save/restore
> path.
>
> For instance xc_clear_domain_page (tools/libxc/xc_sr_restore_arm.c) main
> return an error if the memory is not mapped. But the caller does not
> check the return value.
>
> Regards,

I finally found out that the problem is that the toolstack did not get the
correct p2m_size when sending all pages on save (it was always zero). After
I fixed that, the guest could be restored, but the guest kernel hits
handle_mm_fault().
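
To illustrate why the save appeared to succeed, here is a toy Python model.
It is not the libxc code: send_all_pages, checked_send_all_pages and the
send_page callback are hypothetical names used only for this sketch. A loop
bounded by p2m_size sends nothing when the size is zero and still reports
success, which is why an explicit check helps.

```python
def send_all_pages(p2m_size, send_page):
    """Toy save loop: call send_page(pfn) for every pfn below p2m_size.

    Returns the number of pages sent. With p2m_size == 0 the loop body
    never runs, so the function "succeeds" without sending anything.
    """
    sent = 0
    for pfn in range(p2m_size):
        send_page(pfn)
        sent += 1
    return sent


def checked_send_all_pages(p2m_size, send_page):
    """Same loop, but refuse to write an image that contains no pages."""
    if p2m_size <= 0:
        raise ValueError("p2m_size is zero; nothing would be saved")
    return send_all_pages(p2m_size, send_page)


if __name__ == "__main__":
    pages = []
    print(send_all_pages(0, pages.append), pages)
    print(send_all_pages(3, pages.append), pages)
```

In this model the unchecked version silently produces an empty image, which
is consistent with the restored guest showing 0 in the Mem column.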

Where do you think I should investigate next: the guest kernel's
hibernation restore path, or Xen?

Best regards.

>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-03 10:16         ` Julien Grall
@ 2016-06-04  1:32           ` Chenxiao Zhao
  2016-06-03 11:02             ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-04  1:32 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel



On 6/3/2016 3:16 AM, Julien Grall wrote:
> Hello,
>
> On 03/06/16 18:05, Chenxiao Zhao wrote:
>> I finally found out that the problem is that the toolstack did not get
>> corret p2m_size while sending all pages on save(always be zero). After I
>> fixed that, the guest could be restored but guest kernel caught
>> handle_mm_fault().
>>
>> where do you think I'm going to investigate, guest kernel hibernation
>> restore or xen?
>
> The hibernation support for ARM64 has only been merged recently in the
> kernel. Which kernel are you using?

Hi Julien,

I'm using a Linaro port of Linux kernel 4.1 for hikey from this link.

https://github.com/96boards/linux/tree/android-hikey-linaro-4.1

I also applied the following patches to make the kernel support hibernation.

[1] http://www.spinics.net/lists/arm-kernel/msg477769.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html

>
> Also, what are the modifications you have made to support Xen
> suspend/resume for ARM64?

I believe I posted my modifications to Xen in the first mail of
this thread.

From my understanding, kernel hibernation causes the kernel to save
memory to disk (the swap partition). But during guest save, the
hibernation path for domU does not make the guest save memory to disk. It
is more like suspending all processes in the guest, and the memory is
actually saved to a file by the Xen toolstack. Am I correct?

Looking forward to your advice.

Thanks.

>
> Regards,
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-03 11:02             ` Julien Grall
@ 2016-06-04  2:37               ` Chenxiao Zhao
  2016-06-03 12:33                 ` Julien Grall
  2016-06-06 11:58                 ` Stefano Stabellini
  0 siblings, 2 replies; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-04  2:37 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel



On 6/3/2016 4:02 AM, Julien Grall wrote:
> Hello,
>
> First thing, the time in the mail headers seems to be wrong. Maybe
> because of a wrong timezone?
>
> I got: 04/06/16 02:32 however we are still the 3rd in my timezone.
>
> On 04/06/16 02:32, Chenxiao Zhao wrote:
>>
>>
>> On 6/3/2016 3:16 AM, Julien Grall wrote:
>>> Hello,
>>>
>>> On 03/06/16 18:05, Chenxiao Zhao wrote:
>>>> I finally found out that the problem is that the toolstack did not get
>>>> corret p2m_size while sending all pages on save(always be zero).
>>>> After I
>>>> fixed that, the guest could be restored but guest kernel caught
>>>> handle_mm_fault().
>>>>
>>>> where do you think I'm going to investigate, guest kernel hibernation
>>>> restore or xen?
>>>
>>> The hibernation support for ARM64 has only been merged recently in the
>>> kernel. Which kernel are you using?
>>
>> Hi Julien,
>>
>> I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
>>
>> https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
>>
>> I also applied following patches to make the kernel support hibernation.
>
> This looks the wrong way to do it as this series may requires some
> patches which have been upstreamed before hand.
>
> Linux upstream seems support to the hikey board [1]. Any reason to not
> using it?

I tried a newer kernel, version 4.4, but had no luck starting dom0
with Xen, so I decided to stay on 4.1 for now.

>
>> [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
>> [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
>>
>>>
>>> Also, what are the modifications you have made to support Xen
>>> suspend/resume for ARM64?
>>
>> I believe I have posted my modifications on xen in the first mail of
>> this thread.
>
> I mean in Linux. The patch from Ian Campbell does not have any kind of
> support for ARM64.
>
> For instance arch/arm/xen/suspend.c needs to be built for ARM64. So I am
> wondering if your kernel has support of hibernation...

Oh, yes, I almost forgot: I added this file to arch/arm64/xen/Makefile so
that it builds for arm64.
>
>>
>>  From my understanding, a kernel hibernation will cause kernel to save
>> memories to disk(swap partition). But on guest save progress, the
>> hibernation for domU does not make the guest save memories to disk. it's
>> more like suspend all processes in guest, and memors actually depends on
>> xen toolstack to save the pages to file. Am I correct?
>
> You are using an older tree with a patch series based on a newer tree.
>
> So I would recommend you to move to a newer tree. If it is not possible,
> please test that hibernation works on baremetal.

I think suspend/resume in the guest is working, because I can use the
pause/unpause commands in the toolstack to suspend/resume the guest without
problems. I can also see the suspend/resume kernel messages on the guest's
console. The only problem is that it cannot resume after a restore.

One thing that confuses me is that kernel hibernation means the guest
kernel saves the memory state to disk and finally powers off the VM. The
guest also takes care of restoring the memory itself. But I do not see
save/restore on Xen working that way. So my question is: why does it
require hibernation (i.e. suspend to disk) instead of a real suspend
(i.e. suspend to RAM and standby)?


>
> Regards,
>
> [1] https://lists.96boards.org/pipermail/dev/2016-May/000933.html
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-04  2:37               ` Chenxiao Zhao
  2016-06-03 12:33                 ` Julien Grall
@ 2016-06-06 11:58                 ` Stefano Stabellini
  2016-06-07  1:17                   ` Chenxiao Zhao
  1 sibling, 1 reply; 16+ messages in thread
From: Stefano Stabellini @ 2016-06-06 11:58 UTC (permalink / raw)
  To: Chenxiao Zhao; +Cc: Julien Grall, Stefano Stabellini, xen-devel

On Fri, 3 Jun 2016, Chenxiao Zhao wrote:
> On 6/3/2016 4:02 AM, Julien Grall wrote:
> > Hello,
> > 
> > First thing, the time in the mail headers seems to be wrong. Maybe
> > because of a wrong timezone?
> > 
> > I got: 04/06/16 02:32 however we are still the 3rd in my timezone.
> > 
> > On 04/06/16 02:32, Chenxiao Zhao wrote:
> > > 
> > > 
> > > On 6/3/2016 3:16 AM, Julien Grall wrote:
> > > > Hello,
> > > > 
> > > > On 03/06/16 18:05, Chenxiao Zhao wrote:
> > > > > I finally found out that the problem is that the toolstack did not get
> > > > > corret p2m_size while sending all pages on save(always be zero).
> > > > > After I
> > > > > fixed that, the guest could be restored but guest kernel caught
> > > > > handle_mm_fault().
> > > > > 
> > > > > where do you think I'm going to investigate, guest kernel hibernation
> > > > > restore or xen?
> > > > 
> > > > The hibernation support for ARM64 has only been merged recently in the
> > > > kernel. Which kernel are you using?
> > > 
> > > Hi Julien,
> > > 
> > > I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
> > > 
> > > https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
> > > 
> > > I also applied following patches to make the kernel support hibernation.
> > 
> > This looks the wrong way to do it as this series may requires some
> > patches which have been upstreamed before hand.
> > 
> > Linux upstream seems support to the hikey board [1]. Any reason to not
> > using it?
> 
> I tried a newer version of kernel 4.4, but got no luck to start dom0 with xen.
> so I decide to stay in 4.1 for now.
> 
> > 
> > > [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
> > > [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
> > > 
> > > > 
> > > > Also, what are the modifications you have made to support Xen
> > > > suspend/resume for ARM64?
> > > 
> > > I believe I have posted my modifications on xen in the first mail of
> > > this thread.
> > 
> > I mean in Linux. The patch from Ian Campbell does not have any kind of
> > support for ARM64.
> > 
> > For instance arch/arm/xen/suspend.c needs to be built for ARM64. So I am
> > wondering if your kernel has support of hibernation...
> 
> Oh, yes, I most forgot I added this file in arch/arm64/xen/Makefile to let it
> build for arm64.
> > 
> > > 
> > >  From my understanding, a kernel hibernation will cause kernel to save
> > > memories to disk(swap partition). But on guest save progress, the
> > > hibernation for domU does not make the guest save memories to disk. it's
> > > more like suspend all processes in guest, and memors actually depends on
> > > xen toolstack to save the pages to file. Am I correct?
> > 
> > You are using an older tree with a patch series based on a newer tree.
> > 
> > So I would recommend you to move to a newer tree. If it is not possible,
> > please test that hibernation works on baremetal.
> 
> I think the suspend/resume in guest is working, cause I can use pause/unpause
> command in toolstack to suspend/resume guest without problem. I can also see
> the suspend/resume kernel messages from guest's console. The only problem is
> it's can not resume from restore.

But can you still connect to the guest after resume, maybe over the network?
If you cannot, then something is likely wrong.


> One thing that confused me is that the kernel's hibernation means the guest
> kernel will save the memory state to disk and power off VM at last. The guest
> will also take care of the memory restore itself. But I do not see the
> save/restore on xen works that way. So my question is why it requires
> hibernation (aka. suspend to disk) instead of the real suspend (aka. suspend
> to RAM and standby)?

Xen suspend/resume has nothing to do with guest suspend to RAM or guest
hibernation.

Xen suspend/resume is a way for the hypervisor to save to file the
entire state of the VM, including RAM and the state of any devices.
Guest suspend to RAM and guest hibernation are two guest driven
technologies to save the state of the operating system to RAM or to
disk. The only link between Xen suspend and guest suspend is that when
Xen issues a domain suspend, it notifies the guest of it so that it can
ease the process.  The code in Linux to support Xen suspend/resume is:

drivers/xen/manage.c:do_suspend

and makes use of some of the Linux internal hooks provided for
hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
better integration with the rest of Linux: hibernation is NOT what is
happening.

I hope that this clarifies things a bit, I realize that it is confusing.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-06 11:58                 ` Stefano Stabellini
@ 2016-06-07  1:17                   ` Chenxiao Zhao
  2016-06-12  9:46                     ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-07  1:17 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Julien Grall, xen-devel



On 6/6/2016 7:58 PM, Stefano Stabellini wrote:
> On Fri, 3 Jun 2016, Chenxiao Zhao wrote:
>> On 6/3/2016 4:02 AM, Julien Grall wrote:
>>> Hello,
>>>
>>> First thing, the time in the mail headers seems to be wrong. Maybe
>>> because of a wrong timezone?
>>>
>>> I got: 04/06/16 02:32 however we are still the 3rd in my timezone.
>>>
>>> On 04/06/16 02:32, Chenxiao Zhao wrote:
>>>>
>>>>
>>>> On 6/3/2016 3:16 AM, Julien Grall wrote:
>>>>> Hello,
>>>>>
>>>>> On 03/06/16 18:05, Chenxiao Zhao wrote:
>>>>>> I finally found out that the problem is that the toolstack did not get
>>>>>> corret p2m_size while sending all pages on save(always be zero).
>>>>>> After I
>>>>>> fixed that, the guest could be restored but guest kernel caught
>>>>>> handle_mm_fault().
>>>>>>
>>>>>> where do you think I'm going to investigate, guest kernel hibernation
>>>>>> restore or xen?
>>>>>
>>>>> The hibernation support for ARM64 has only been merged recently in the
>>>>> kernel. Which kernel are you using?
>>>>
>>>> Hi Julien,
>>>>
>>>> I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
>>>>
>>>> https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
>>>>
>>>> I also applied following patches to make the kernel support hibernation.
>>>
>>> This looks the wrong way to do it as this series may requires some
>>> patches which have been upstreamed before hand.
>>>
>>> Linux upstream seems support to the hikey board [1]. Any reason to not
>>> using it?
>>
>> I tried a newer version of kernel 4.4, but got no luck to start dom0 with xen.
>> so I decide to stay in 4.1 for now.
>>
>>>
>>>> [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
>>>> [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
>>>>
>>>>>
>>>>> Also, what are the modifications you have made to support Xen
>>>>> suspend/resume for ARM64?
>>>>
>>>> I believe I have posted my modifications on xen in the first mail of
>>>> this thread.
>>>
>>> I mean in Linux. The patch from Ian Campbell does not have any kind of
>>> support for ARM64.
>>>
>>> For instance arch/arm/xen/suspend.c needs to be built for ARM64. So I am
>>> wondering if your kernel has support of hibernation...
>>
>> Oh, yes, I most forgot I added this file in arch/arm64/xen/Makefile to let it
>> build for arm64.
>>>
>>>>
>>>>  From my understanding, a kernel hibernation will cause kernel to save
>>>> memories to disk(swap partition). But on guest save progress, the
>>>> hibernation for domU does not make the guest save memories to disk. it's
>>>> more like suspend all processes in guest, and memors actually depends on
>>>> xen toolstack to save the pages to file. Am I correct?
>>>
>>> You are using an older tree with a patch series based on a newer tree.
>>>
>>> So I would recommend you to move to a newer tree. If it is not possible,
>>> please test that hibernation works on baremetal.
>>
>> I think the suspend/resume in guest is working, cause I can use pause/unpause
>> command in toolstack to suspend/resume guest without problem. I can also see
>> the suspend/resume kernel messages from guest's console. The only problem is
>> it's can not resume from restore.
>
> But can you still connect to the guest after resume, maybe over the network?
> If you cannot, then something is likely wrong.

Hi Stefano,

I can connect to the guest from the Xen console after resume. It responds
to the Return key, but I cannot run any other commands, e.g. ls or ps. I
think the guest is not 'fully' restored.

>
>
>> One thing that confused me is that the kernel's hibernation means the guest
>> kernel will save the memory state to disk and power off VM at last. The guest
>> will also take care of the memory restore itself. But I do not see the
>> save/restore on xen works that way. So my question is why it requires
>> hibernation (aka. suspend to disk) instead of the real suspend (aka. suspend
>> to RAM and standby)?
>
> Xen suspend/resume has nothing to do with guest suspend to RAM or guest
> hibernation.
>
> Xen suspend/resume is a way for the hypervisor to save to file the
> entire state of the VM, including RAM and the state of any devices.
> Guest suspend to RAM and guest hibernation are two guest driven
> technologies to save the state of the operating system to RAM or to
> disk. The only link between Xen suspend and guest suspend is that when
> Xen issues a domain suspend, it notifies the guest of it so that it can
> ease the process.  The code in Linux to support Xen suspend/resume is:
>
> drivers/xen/manage.c:do_suspend
>
> and makes use of some of the Linux internal hooks provided for
> hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
> better integration with the rest of Linux: hibernation is NOT what is
> happening.
>
> I hope that this clarifies things a bit, I realize that it is confusing.
>

Thanks for your explanation. It is clear and matches my understanding
from the code. I think the problem might be caused by an incompatibility
between the ARM p2m and the Xen save/restore mechanism. I'll try a
core dump and compare the memory after save and after restore. I suppose
these two dumps should be identical, but there are already pages that
differ. I'll let you know if I make some progress.
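
The comparison I have in mind is roughly the following Python sketch. It
assumes the guest memory from each dump has already been extracted into a
flat byte image (a real core dump has headers that would need to be
stripped first), and that pages are 4K. differing_pages is a hypothetical
helper name, not an existing tool.

```python
PAGE_SIZE = 4096  # assumed 4K guest pages


def differing_pages(image_a, image_b, page_size=PAGE_SIZE):
    """Return the indices of pages whose contents differ.

    image_a and image_b are bytes objects holding flat memory images.
    If one image is shorter, its missing pages count as different.
    """
    longest = max(len(image_a), len(image_b))
    n_pages = (longest + page_size - 1) // page_size
    diffs = []
    for index in range(n_pages):
        start = index * page_size
        end = start + page_size
        if image_a[start:end] != image_b[start:end]:
            diffs.append(index)
    return diffs


if __name__ == "__main__":
    before = bytes(3 * PAGE_SIZE)
    after = bytearray(before)
    after[PAGE_SIZE + 5] = 0xFF  # corrupt one byte in page 1
    print(differing_pages(before, bytes(after)))
```

The page indices it reports can then be mapped back to guest physical
addresses to see which regions were not restored correctly.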

Regards.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: questions of vm save/restore on arm64
  2016-06-07  1:17                   ` Chenxiao Zhao
@ 2016-06-12  9:46                     ` Chenxiao Zhao
  2016-06-12 15:31                       ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-12  9:46 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Julien Grall, xen-devel



On 6/7/2016 9:17 AM, Chenxiao Zhao wrote:
>
>
> On 6/6/2016 7:58 PM, Stefano Stabellini wrote:
>> On Fri, 3 Jun 2016, Chenxiao Zhao wrote:
>>> On 6/3/2016 4:02 AM, Julien Grall wrote:
>>>> Hello,
>>>>
>>>> First thing, the time in the mail headers seems to be wrong. Maybe
>>>> because of a wrong timezone?
>>>>
>>>> I got: 04/06/16 02:32 however we are still the 3rd in my timezone.
>>>>
>>>> On 04/06/16 02:32, Chenxiao Zhao wrote:
>>>>>
>>>>>
>>>>> On 6/3/2016 3:16 AM, Julien Grall wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 03/06/16 18:05, Chenxiao Zhao wrote:
>>>>>>> I finally found out that the problem is that the toolstack did
>>>>>>> not get
>>>>>>> corret p2m_size while sending all pages on save(always be zero).
>>>>>>> After I
>>>>>>> fixed that, the guest could be restored but guest kernel caught
>>>>>>> handle_mm_fault().
>>>>>>>
>>>>>>> where do you think I'm going to investigate, guest kernel
>>>>>>> hibernation
>>>>>>> restore or xen?
>>>>>>
>>>>>> The hibernation support for ARM64 has only been merged recently in
>>>>>> the
>>>>>> kernel. Which kernel are you using?
>>>>>
>>>>> Hi Julien,
>>>>>
>>>>> I'm using a linaro ported Linux kernel 4.1 for hikey from this link.
>>>>>
>>>>> https://github.com/96boards/linux/tree/android-hikey-linaro-4.1
>>>>>
>>>>> I also applied following patches to make the kernel support
>>>>> hibernation.
>>>>
>>>> This looks the wrong way to do it as this series may requires some
>>>> patches which have been upstreamed before hand.
>>>>
>>>> Linux upstream seems support to the hikey board [1]. Any reason to not
>>>> using it?
>>>
>>> I tried a newer version of kernel 4.4, but got no luck to start dom0
>>> with xen.
>>> so I decide to stay in 4.1 for now.
>>>
>>>>
>>>>> [1] http://www.spinics.net/lists/arm-kernel/msg477769.html
>>>>> [2] http://lists.xen.org/archives/html/xen-devel/2015-12/msg01068.html
>>>>>
>>>>>>
>>>>>> Also, what are the modifications you have made to support Xen
>>>>>> suspend/resume for ARM64?
>>>>>
>>>>> I believe I have posted my modifications on xen in the first mail of
>>>>> this thread.
>>>>
>>>> I mean in Linux. The patch from Ian Campbell does not have any kind of
>>>> support for ARM64.
>>>>
>>>> For instance arch/arm/xen/suspend.c needs to be built for ARM64. So
>>>> I am
>>>> wondering if your kernel has support of hibernation...
>>>
>>> Oh, yes, I most forgot I added this file in arch/arm64/xen/Makefile
>>> to let it
>>> build for arm64.
>>>>
>>>>>
>>>>>  From my understanding, a kernel hibernation will cause kernel to save
>>>>> memories to disk(swap partition). But on guest save progress, the
>>>>> hibernation for domU does not make the guest save memories to disk.
>>>>> it's
>>>>> more like suspend all processes in guest, and memors actually
>>>>> depends on
>>>>> xen toolstack to save the pages to file. Am I correct?
>>>>
>>>> You are using an older tree with a patch series based on a newer tree.
>>>>
>>>> So I would recommend you to move to a newer tree. If it is not
>>>> possible,
>>>> please test that hibernation works on baremetal.
>>>
>>> I think the suspend/resume in guest is working, cause I can use
>>> pause/unpause
>>> command in toolstack to suspend/resume guest without problem. I can
>>> also see
>>> the suspend/resume kernel messages from guest's console. The only
>>> problem is
>>> it's can not resume from restore.
>>
>> But can you still connect to the guest after resume, maybe over the
>> network?
>> If you cannot, then something is likely wrong.
>
> Hi Stefano,
>
> I can connect to the guest from the Xen console after resume. It responds
> to the 'return' key, but I cannot run any other commands, e.g. ls or ps. I
> think the guest is not 'fully' restored.
>
>>
>>
>>> One thing that confuses me is that kernel hibernation means the guest
>>> kernel saves the memory state to disk and finally powers off the VM. The
>>> guest also takes care of restoring the memory itself. But I do not see
>>> save/restore on Xen working that way. So my question is: why does it
>>> require hibernation (aka suspend to disk) instead of real suspend (aka
>>> suspend to RAM and standby)?
>>
>> Xen suspend/resume has nothing to do with guest suspend to RAM or guest
>> hibernation.
>>
>> Xen suspend/resume is a way for the hypervisor to save to file the
>> entire state of the VM, including RAM and the state of any devices.
>> Guest suspend to RAM and guest hibernation are two guest driven
>> technologies to save the state of the operating system to RAM or to
>> disk. The only link between Xen suspend and guest suspend is that when
>> Xen issues a domain suspend, it notifies the guest of it so that it can
>> ease the process.  The code in Linux to support Xen suspend/resume is:
>>
>> drivers/xen/manage.c:do_suspend
>>
>> and makes use of some of the Linux internal hooks provided for
>> hibernations (see CONFIG_HIBERNATE_CALLBACKS). But that's just for
>> better integration with the rest of Linux: hibernation is NOT what is
>> happening.
>>
>> I hope that this clarifies things a bit, I realize that it is confusing.
>>
>
> Thanks for your explanation. It is clear and matches my understanding of
> the code. I think the problem might be caused by an incompatibility
> between the ARM p2m and the Xen save/restore mechanism. I'll try a core
> dump and compare the memory after save and after restore. I expected the
> two dumps to be identical, but some pages already differ.
> I'll let you know if I make progress.
>
> Regards.

Hi all,

I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
restored VM does not work properly.

Any ideas?

Best Regards.


* Re: questions of vm save/restore on arm64
  2016-06-12  9:46                     ` Chenxiao Zhao
@ 2016-06-12 15:31                       ` Julien Grall
  2016-06-13  0:55                         ` Chenxiao Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Julien Grall @ 2016-06-12 15:31 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

On 12/06/2016 10:46, Chenxiao Zhao wrote:
> Hi all,

Hello,

> I finally got save/restore working on arm64, but it only works when I
> assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
> restored VM does not work properly.

Can you describe what you mean by "does not work properly"? What are the 
symptoms?

Also, I would start by debugging with 2 vCPUs and then increasing the 
number step by step.

Regards,

-- 
Julien Grall


* Re: questions of vm save/restore on arm64
  2016-06-12 15:31                       ` Julien Grall
@ 2016-06-13  0:55                         ` Chenxiao Zhao
  2016-06-13  9:59                           ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: Chenxiao Zhao @ 2016-06-13  0:55 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel



On 6/12/2016 11:31 PM, Julien Grall wrote:
> On 12/06/2016 10:46, Chenxiao Zhao wrote:
>> Hi all,
>
> Hello,
>
>> I finally got save/restore working on arm64, but it only works when I
>> assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
>> restored VM does not work properly.
>
> Can you describe what you mean by "does not work properly"? What are the
> symptoms?

After restoring a VM with more than one vCPU, the VM stays in the "b"
(blocked) state.

I'm running CentOS in the guest; the console log after restore is below.

[   32.530490] Xen: initializing cpu0
[   32.530490] xen:grant_table: Grant tables using version 1 layout
[   32.531034] PM: noirq restore of devices complete after 0.382 msecs
[   32.531382] PM: early restore of devices complete after 0.300 msecs
[   32.531430] Xen: initializing cpu1
[   32.569028] PM: restore of devices complete after 24.663 msecs
[   32.569304] Restarting tasks ...
[   32.569903] systemd-journal[800]: undefined instruction: pc=0000007fa37dd4c8
[   32.569975] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.571530] done.
[   32.571631] systemd[1]: undefined instruction: pc=0000007f8a9ea4c8
[   32.571650] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.573527] auditd[1365]: undefined instruction: pc=0000007f8aca24c8
[   32.573553] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636573] systemd-cgroups[2210]: undefined instruction: pc=0000007f99ad14c8
[   32.636633] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636726] audit: *NO* daemon at audit_pid=1365
[   32.636741] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
[   32.636755] audit: auditd disappeared
[   32.638545] systemd-logind[1387]: undefined instruction: pc=0000007f86e5b4c8
[   32.638594] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.638673] audit: type=1701 audit(68.167:214): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:systemd_logind_t:s0 pid=1387 comm="systemd-logind" exe="/usr/lib/systemd/systemd-logind" sig=4
[   32.647972] systemd-cgroups[2211]: undefined instruction: pc=0000007fa7f414c8
[   32.648017] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.648087] audit: type=1701 audit(68.177:215): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=2211 comm="systemd-cgroups" exe="/usr/lib/systemd/systemd-cgroups-agent" sig=4
[   61.401838] do_undefinstr: 8 callbacks suppressed
[   61.401882] crond[1550]: undefined instruction: pc=0000007f8d15d4c8
[   61.401903] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   61.402077] audit: type=1701 audit(96.947:218): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 pid=1550 comm="crond" exe="/usr/sbin/crond" sig=4
[   61.407024] dbus-daemon[1390]: undefined instruction: pc=0000007f87fae4c8
[   61.407064] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   61.407212] audit: type=1701 audit(96.947:219): auid=4294967295 uid=81 gid=81 ses=4294967295 subj=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 pid=1390 comm="dbus-daemon" exe="/usr/bin/dbus-daemon" sig=4
[   61.408311] systemd-cgroups-agent[2214]: Failed to process message [type=error sender=org.freedesktop.DBus path=n/a interface=n/a member=n/a signature=s]: Connection timed out
[   61.416815] systemd-cgroups-agent[2216]: Failed to get D-Bus connection: Connection refused
[   61.421499] systemd-cgroups-agent[2215]: Failed to get D-Bus connection: Connection refused
[   61.429413] systemd-cgroups-agent[2217]: Failed to get D-Bus connection: Connection refused
[   61.434301] systemd-cgroups-agent[2218]: Failed to get D-Bus connection: Connection refused
[   61.435016] systemd-cgroups-agent[2219]: Failed to get D-Bus connection: Connection refused

[  110.095570] audit: type=1701 audit(145.637:220): auid=0 uid=0 gid=0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=2189 comm="bash" exe="/usr/bin/bash" sig=4
[  110.098120] audit: type=1104 audit(145.637:221): pid=1602 uid=0 auid=0 ses=1 subj=system_u:system_r:local_login_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_securetty,pam_unix acct="root" exe="/usr/bin/login" hostname=? addr=? terminal=hvc0 res=success'
[  110.102730] audit: type=1106 audit(145.637:222): pid=1602 uid=0 auid=0 ses=1 subj=system_u:system_r:local_login_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=? acct="root" exe="/usr/bin/login" hostname=? addr=? terminal=hvc0 res=failed'
[  110.112341] systemd-cgroups-agent[2220]: Failed to get D-Bus connection: Connection refused

>
> Also, I would start by debugging with 2 vCPUs and then increasing the
> number step by step.

I see the same issue whenever I restore a VM with more than one vCPU. The
guest reports "undefined instruction" at a PC that varies depending on
the save point.

Can you advise how I should start debugging this issue?

Thanks.

>
> Regards,
>


* Re: questions of vm save/restore on arm64
  2016-06-13  0:55                         ` Chenxiao Zhao
@ 2016-06-13  9:59                           ` Julien Grall
  0 siblings, 0 replies; 16+ messages in thread
From: Julien Grall @ 2016-06-13  9:59 UTC (permalink / raw)
  To: Chenxiao Zhao, Stefano Stabellini; +Cc: xen-devel

Hello,

On 13/06/16 01:55, Chenxiao Zhao wrote:
>
>
> On 6/12/2016 11:31 PM, Julien Grall wrote:
>> On 12/06/2016 10:46, Chenxiao Zhao wrote:
>>> I finally got save/restore working on arm64, but it only works when I
>>> assign a single vCPU to the VM. If I set vcpus=4 in the config file, the
>>> restored VM does not work properly.
>>
>> Can you describe what you mean by "does not work properly"? What are the
>> symptoms?
>
> After restoring a VM with more than one vCPU, the VM stays in the "b"
> (blocked) state.

This happens if all the vCPUs of the guest are waiting on an event. For
instance, if the guest is executing the WFI instruction, the vCPU will
be blocked until an interrupt arrives.

I would not worry about this.

> [   32.530490] Xen: initializing cpu0
> [   32.530490] xen:grant_table: Grant tables using version 1 layout
> [   32.531034] PM: noirq restore of devices complete after 0.382 msecs
> [   32.531382] PM: early restore of devices complete after 0.300 msecs
> [   32.531430] Xen: initializing cpu1
> [   32.569028] PM: restore of devices complete after 24.663 msecs
> [   32.569304] Restarting tasks ...
> [   32.569903] systemd-journal[800]: undefined instruction: pc=0000007fa37dd4c8
> [   32.569975] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
> [   32.571530] done.
> [   32.571631] systemd[1]: undefined instruction: pc=0000007f8a9ea4c8
> [   32.571650] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
> [   32.573527] auditd[1365]: undefined instruction: pc=0000007f8aca24c8
> [   32.573553] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
> [   32.636573] systemd-cgroups[2210]: undefined instruction: pc=0000007f99ad14c8
> [   32.636633] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
> [   32.636726] audit: *NO* daemon at audit_pid=1365
> [   32.636741] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
> [   32.636755] audit: auditd disappeared
> [   32.638545] systemd-logind[1387]: undefined instruction: pc=0000007f86e5b4c8
> [   32.638594] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)

[...]

> [   32.638673] audit: type=1701 audit(68.167:214): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:systemd_logind_t:s0 pid=1387 comm="systemd-logind" exe="/usr/lib/systemd/systemd-logind" sig=4
> [   32.647972] systemd-cgroups[2211]: undefined instruction: pc=0000007fa7f414c8
> [   32.648017] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
> [   32.648087] audit: type=1701 audit(68.177:215): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=2211 comm="systemd-cgroups" exe="/usr/lib/systemd/systemd-cgroups-agent" sig=4
> [   61.401838] do_undefinstr: 8 callbacks suppressed
> [   61.401882] crond[1550]: undefined instruction: pc=0000007f8d15d4c8
> [   61.401903] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)

[...]

>>
>> Also, I would start by debugging with 2 vCPUs and then increasing the
>> number step by step.
>
> I see the same issue whenever I restore a VM with more than one vCPU. The
> guest reports "undefined instruction" at a PC that varies depending on
> the save point.

My point was that it is easier to debug with 2 vCPUs than 4 vCPUs. There 
is less concurrency involved.

The PC is the program counter of the application, which might be fully 
randomized.

>
> Can you advise how I should start debugging this issue?

The undefined instruction is always the same in your log (d53be04f).
This is the encoding for "mrs     x15, cntvct_el0". This register is
only accessible at EL0 if CNTKCTL_EL1.EL0VCTEN is set.
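
As a cross-check of the decoding above, here is a minimal sketch that splits
the faulting word from the log into the MRS system-register fields. The helper
name decode_mrs is made up for illustration; the field layout and the
CNTVCT_EL0 encoding (op0=3, op1=3, CRn=14, CRm=0, op2=2) follow the Arm
architecture manual.

```python
def decode_mrs(insn):
    """Split an AArch64 MRS instruction word into its fields.

    Returns None if the word is not an MRS (system register read).
    decode_mrs is a hypothetical helper, not part of any Xen or Linux tool.
    """
    # MRS has bits [31:20] == 0xD53 (L=1, and bit 20 set).
    if (insn >> 20) & 0xFFF != 0xD53:
        return None
    return {
        "op0": (insn >> 19) & 0x3,   # bits [20:19]
        "op1": (insn >> 16) & 0x7,   # bits [18:16]
        "crn": (insn >> 12) & 0xF,   # bits [15:12]
        "crm": (insn >> 8) & 0xF,    # bits [11:8]
        "op2": (insn >> 5) & 0x7,    # bits [7:5]
        "rt": insn & 0x1F,           # bits [4:0], destination register
    }


# Architectural encoding of CNTVCT_EL0 (virtual counter).
CNTVCT_EL0 = {"op0": 3, "op1": 3, "crn": 14, "crm": 0, "op2": 2}

fields = decode_mrs(0xD53BE04F)
is_cntvct_read = all(fields[k] == v for k, v in CNTVCT_EL0.items())
print(fields, is_cntvct_read)
```

The other words on the "Code:" line (for example d50339bf) are barriers and
loads, so the helper returns None for them.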

I guess that this register has not been saved/restored correctly.
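
One way to test that guess is to compare the per-vCPU CNTKCTL_EL1 value
before save and after restore. The sketch below only shows the bit check; the
register values are invented for illustration, and how you obtain them (for
example by adding a debug print in the hypervisor) is left open. The bit
positions (EL0PCTEN = bit 0, EL0VCTEN = bit 1) are from the Arm architecture
manual.

```python
EL0PCTEN = 1 << 0  # EL0 may read the physical counter (CNTPCT_EL0)
EL0VCTEN = 1 << 1  # EL0 may read the virtual counter (CNTVCT_EL0)


def el0_can_read_cntvct(cntkctl_el1):
    """True if EL0 reads of CNTVCT_EL0 are allowed for this register value."""
    return bool(cntkctl_el1 & EL0VCTEN)


def vcpus_with_lost_access(saved, restored):
    """Return indices of vCPUs that could read CNTVCT_EL0 before save
    but cannot after restore."""
    return [
        i
        for i, (before, after) in enumerate(zip(saved, restored))
        if el0_can_read_cntvct(before) and not el0_can_read_cntvct(after)
    ]


# Hypothetical values: vCPU0 keeps its setting, vCPU1 comes back zeroed.
saved_values = [0x2, 0x2]
restored_values = [0x2, 0x0]
print(vcpus_with_lost_access(saved_values, restored_values))
```

If a secondary vCPU shows up in that list, userspace on that vCPU would trap
on "mrs x15, cntvct_el0", which would match the symptoms in the log.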

Regards,

-- 
Julien Grall


Thread overview: 16+ messages
2016-05-27 10:08 questions of vm save/restore on arm64 Chenxiao Zhao
2016-05-30 11:40 ` Stefano Stabellini
2016-06-01  0:28   ` Chenxiao Zhao
2016-06-02 12:29     ` Julien Grall
2016-06-03 17:05       ` Chenxiao Zhao
2016-06-03 10:16         ` Julien Grall
2016-06-04  1:32           ` Chenxiao Zhao
2016-06-03 11:02             ` Julien Grall
2016-06-04  2:37               ` Chenxiao Zhao
2016-06-03 12:33                 ` Julien Grall
2016-06-06 11:58                 ` Stefano Stabellini
2016-06-07  1:17                   ` Chenxiao Zhao
2016-06-12  9:46                     ` Chenxiao Zhao
2016-06-12 15:31                       ` Julien Grall
2016-06-13  0:55                         ` Chenxiao Zhao
2016-06-13  9:59                           ` Julien Grall
