All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG
@ 2017-09-10 14:37 Mark Cave-Ayland
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t Mark Cave-Ayland
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 14:37 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, aik, lvivier

This is a rebase of my outstanding patches to fix migration for the Mac
g3beige and mac99 machines when using TCG. Some discussion on previous
versions of these patches can be found at:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00544.html
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01114.html
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html

With this patchset applied, I've been able to repeatedly execute
savevm/loadvm commands from the console when installing Darwin on g3beige
and whilst using MacOS 9.2 on mac99 with no apparent ill effects.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>


Mark Cave-Ayland (4):
  ppc: change CPUPPCState access_type from int to uint8_t
  ppc: add CPU IRQ state to PPC VMStateDescription
  ppc: add CPU access_type into the migration stream
  ppc: ensure we update the decrementer value during migration

 target/ppc/cpu.h     |    4 ++--
 target/ppc/machine.c |   14 +++++++++++---
 2 files changed, 13 insertions(+), 5 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t
  2017-09-10 14:37 [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG Mark Cave-Ayland
@ 2017-09-10 14:37 ` Mark Cave-Ayland
  2017-09-10 16:30   ` Laurent Vivier
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription Mark Cave-Ayland
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 14:37 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, aik, lvivier

This change was suggested by Alexey in advance of a subsequent commit which
adds access_type into vmstate_ppc_cpu.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/cpu.h     |    4 ++--
 target/ppc/machine.c |    4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 12f0949..59d1656 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1010,8 +1010,8 @@ struct CPUPPCState {
     /* Next instruction pointer */
     target_ulong nip;
 
-    int access_type; /* when a memory exception occurs, the access
-                        type is stored here */
+    uint8_t access_type; /* when a memory exception occurs, the access
+                            type is stored here */
 
     CPU_COMMON
 
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index e36b710..e59049f 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -19,6 +19,7 @@ static int cpu_load_old(QEMUFile *f, void *opaque, int version_id)
     target_ulong sdr1;
     uint32_t fpscr;
     target_ulong xer;
+    int access_type;
 
     for (i = 0; i < 32; i++)
         qemu_get_betls(f, &env->gpr[i]);
@@ -46,7 +47,8 @@ static int cpu_load_old(QEMUFile *f, void *opaque, int version_id)
     }
     qemu_get_be32s(f, &fpscr);
     env->fpscr = fpscr;
-    qemu_get_sbe32s(f, &env->access_type);
+    qemu_get_sbe32s(f, &access_type);
+    env->access_type = (uint8_t)access_type;
 #if defined(TARGET_PPC64)
     qemu_get_betls(f, &env->spr[SPR_ASR]);
     qemu_get_sbe32s(f, &env->slb_nr);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-10 14:37 [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG Mark Cave-Ayland
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t Mark Cave-Ayland
@ 2017-09-10 14:37 ` Mark Cave-Ayland
  2017-09-11  7:50   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream Mark Cave-Ayland
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration Mark Cave-Ayland
  3 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 14:37 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, aik, lvivier

Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
appears to drop the internal CPU IRQ state from the migration stream. Whilst
testing migration on g3beige/mac99 machines, test images would randomly fail to
resume unless a key was pressed on the VGA console.

Further investigation suggests that internal CPU IRQ state isn't being
preserved and so interrupts asserted at the time of migration are lost. Adding
the pending_interrupts and irq_input_state fields back into the migration
stream appears to fix the problem here during local tests.

As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
the additional fields.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/machine.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index e59049f..8fec1a4 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
 
 const VMStateDescription vmstate_ppc_cpu = {
     .name = "cpu",
-    .version_id = 5,
+    .version_id = 6,
     .minimum_version_id = 5,
     .minimum_version_id_old = 4,
     .load_state_old = cpu_load_old,
@@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
         VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
         /* FIXME: access_type? */
 
+        /* Interrupt state */
+        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
+        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
+
         /* Sanity checking */
         VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
         VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-10 14:37 [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG Mark Cave-Ayland
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t Mark Cave-Ayland
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription Mark Cave-Ayland
@ 2017-09-10 14:37 ` Mark Cave-Ayland
  2017-09-11 10:57   ` David Gibson
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration Mark Cave-Ayland
  3 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 14:37 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, aik, lvivier

This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
in the migration stream.

Note: the vmstate_ppc version number has already been bumped by the previous
patch in this series.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/machine.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 8fec1a4..10b3c41 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -676,7 +676,7 @@ const VMStateDescription vmstate_ppc_cpu = {
 
         /* Internal state */
         VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
-        /* FIXME: access_type? */
+        VMSTATE_UINT8_V(env.access_type, PowerPCCPU, 6),
 
         /* Interrupt state */
         VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-10 14:37 [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG Mark Cave-Ayland
                   ` (2 preceding siblings ...)
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream Mark Cave-Ayland
@ 2017-09-10 14:37 ` Mark Cave-Ayland
  2017-09-13  7:12   ` David Gibson
  3 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 14:37 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc, david, aik, lvivier

During local testing with TCG, intermittent errors were found when trying to
migrate Darwin OS images.

The underlying cause was that Darwin resets the decrementer value to fairly
small values on each interrupt. cpu_ppc_set_tb_clk() sets the default value
of the decrementer to 0xffffffff during initialisation which typically
corresponds to several seconds. Hence when restoring the image, the guest
would effectively "lose" decrementer interrupts during this time causing
confusion in the guest.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/machine.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 10b3c41..a16a856 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -176,6 +176,7 @@ static void cpu_pre_save(void *opaque)
     env->spr[SPR_CFAR] = env->cfar;
 #endif
     env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
+    env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
 
     for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
         env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
@@ -280,6 +281,7 @@ static int cpu_post_load(void *opaque, int version_id)
     env->cfar = env->spr[SPR_CFAR];
 #endif
     env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
+    cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
 
     for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
         env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t Mark Cave-Ayland
@ 2017-09-10 16:30   ` Laurent Vivier
  2017-09-10 18:00     ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: Laurent Vivier @ 2017-09-10 16:30 UTC (permalink / raw)
  To: Mark Cave-Ayland, qemu-devel, qemu-ppc, david, aik

On 10/09/2017 16:37, Mark Cave-Ayland wrote:
> This change was suggested by Alexey in advance of a subsequent commit which
> adds access_type into vmstate_ppc_cpu.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/cpu.h     |    4 ++--
>  target/ppc/machine.c |    4 +++-
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 12f0949..59d1656 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1010,8 +1010,8 @@ struct CPUPPCState {
>      /* Next instruction pointer */
>      target_ulong nip;
>  
> -    int access_type; /* when a memory exception occurs, the access
> -                        type is stored here */
> +    uint8_t access_type; /* when a memory exception occurs, the access
> +                            type is stored here */

I think this breaks TCG as we have:

target/ppc/translate.c:

     82 void ppc_translate_init(void)
...
    191 
    192     cpu_access_type = tcg_global_mem_new_i32(cpu_env,
    193                                              offsetof(CPUPPCState, access_type), "access_type");
    194 
    195     done_init = 1;
    196 }

it expects an int32_t (or int).

Thanks,
Laurent

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t
  2017-09-10 16:30   ` Laurent Vivier
@ 2017-09-10 18:00     ` Mark Cave-Ayland
  0 siblings, 0 replies; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-10 18:00 UTC (permalink / raw)
  To: Laurent Vivier, qemu-devel, qemu-ppc, david, aik

On 10/09/17 17:30, Laurent Vivier wrote:

> On 10/09/2017 16:37, Mark Cave-Ayland wrote:
>> This change was suggested by Alexey in advance of a subsequent commit which
>> adds access_type into vmstate_ppc_cpu.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/cpu.h     |    4 ++--
>>  target/ppc/machine.c |    4 +++-
>>  2 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
>> index 12f0949..59d1656 100644
>> --- a/target/ppc/cpu.h
>> +++ b/target/ppc/cpu.h
>> @@ -1010,8 +1010,8 @@ struct CPUPPCState {
>>      /* Next instruction pointer */
>>      target_ulong nip;
>>  
>> -    int access_type; /* when a memory exception occurs, the access
>> -                        type is stored here */
>> +    uint8_t access_type; /* when a memory exception occurs, the access
>> +                            type is stored here */
> 
> I think this breaks TCG as we have:
> 
> target/ppc/translate.c:
> 
>      82 void ppc_translate_init(void)
> ...
>     191 
>     192     cpu_access_type = tcg_global_mem_new_i32(cpu_env,
>     193                                              offsetof(CPUPPCState, access_type), "access_type");
>     194 
>     195     done_init = 1;
>     196 }
> 
> it expects an int32_t (or int).

Indeed, yes. I'm really surprised this didn't break compilation or
anything at runtime...

Having a further look I can't see any implementations for
tcg_global_mem_new_u8() or tcg_gen_movi_u8() so changing this isn't a
straightforward type swap.

Alexey, do you still think this is required?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription Mark Cave-Ayland
@ 2017-09-11  7:50   ` Greg Kurz
  2017-09-11  9:30     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 33+ messages in thread
From: Greg Kurz @ 2017-09-11  7:50 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, david, aik, lvivier

[-- Attachment #1: Type: text/plain, Size: 2086 bytes --]

On Sun, 10 Sep 2017 15:37:33 +0100
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:

> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> testing migration on g3beige/mac99 machines, test images would randomly fail to
> resume unless a key was pressed on the VGA console.
> 
> Further investigation suggests that internal CPU IRQ state isn't being
> preserved and so interrupts asserted at the time of migration are lost. Adding
> the pending_interrupts and irq_input_state fields back into the migration
> stream appears to fix the problem here during local tests.
> 
> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> the additional fields.
> 

And so this unconditionally breaks backward migration... what about adding
a subsection for this ?

> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/machine.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index e59049f..8fec1a4 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
>  
>  const VMStateDescription vmstate_ppc_cpu = {
>      .name = "cpu",
> -    .version_id = 5,
> +    .version_id = 6,
>      .minimum_version_id = 5,
>      .minimum_version_id_old = 4,
>      .load_state_old = cpu_load_old,
> @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
>          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
>          /* FIXME: access_type? */
>  
> +        /* Interrupt state */
> +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> +
>          /* Sanity checking */
>          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
>          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11  7:50   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2017-09-11  9:30     ` Dr. David Alan Gilbert
  2017-09-11 10:48       ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-11  9:30 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Mark Cave-Ayland, aik, lvivier, qemu-ppc, qemu-devel, david

* Greg Kurz (groug@kaod.org) wrote:
> On Sun, 10 Sep 2017 15:37:33 +0100
> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> 
> > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > resume unless a key was pressed on the VGA console.
> > 
> > Further investigation suggests that internal CPU IRQ state isn't being
> > preserved and so interrupts asserted at the time of migration are lost. Adding
> > the pending_interrupts and irq_input_state fields back into the migration
> > stream appears to fix the problem here during local tests.
> > 
> > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > the additional fields.
> > 
> 
> And so this unconditionally breaks backward migration... what about adding
> a subsection for this ?

and wiring it to a flag on the machine type so that older machine types
don't send it.

Dave

> > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > ---
> >  target/ppc/machine.c |    6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > index e59049f..8fec1a4 100644
> > --- a/target/ppc/machine.c
> > +++ b/target/ppc/machine.c
> > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> >  
> >  const VMStateDescription vmstate_ppc_cpu = {
> >      .name = "cpu",
> > -    .version_id = 5,
> > +    .version_id = 6,
> >      .minimum_version_id = 5,
> >      .minimum_version_id_old = 4,
> >      .load_state_old = cpu_load_old,
> > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> >          /* FIXME: access_type? */
> >  
> > +        /* Interrupt state */
> > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > +
> >          /* Sanity checking */
> >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11  9:30     ` Dr. David Alan Gilbert
@ 2017-09-11 10:48       ` David Gibson
  2017-09-11 16:46         ` Mark Cave-Ayland
  2017-09-12 16:21         ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 33+ messages in thread
From: David Gibson @ 2017-09-11 10:48 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Greg Kurz, Mark Cave-Ayland, aik, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3257 bytes --]

On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> * Greg Kurz (groug@kaod.org) wrote:
> > On Sun, 10 Sep 2017 15:37:33 +0100
> > Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > 
> > > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > > resume unless a key was pressed on the VGA console.
> > > 
> > > Further investigation suggests that internal CPU IRQ state isn't being
> > > preserved and so interrupts asserted at the time of migration are lost. Adding
> > > the pending_interrupts and irq_input_state fields back into the migration
> > > stream appears to fix the problem here during local tests.
> > > 
> > > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > > the additional fields.
> > > 
> > 
> > And so this unconditionally breaks backward migration... what about adding
> > a subsection for this ?
> 
> and wiring it to a flag on the machine type so that older machine types
> don't send it.

Right, a subsection is certainly necessary to avoid breaking backwards
migration.

But apart from that I want to understand better exactly why this is
necessary.  What's the state that's being lost, and is it really not
recoverable from anywhere else.

The other thing that concerns me is how we're encoding the
information.  These are essentially internal fields, not reflecting
something with an architected encoding - adding those to the migration
stream is often a bad idea - it inhibits our ability to rework
internal encodings.
> 
> Dave
> 
> > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > > ---
> > >  target/ppc/machine.c |    6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > > index e59049f..8fec1a4 100644
> > > --- a/target/ppc/machine.c
> > > +++ b/target/ppc/machine.c
> > > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> > >  
> > >  const VMStateDescription vmstate_ppc_cpu = {
> > >      .name = "cpu",
> > > -    .version_id = 5,
> > > +    .version_id = 6,
> > >      .minimum_version_id = 5,
> > >      .minimum_version_id_old = 4,
> > >      .load_state_old = cpu_load_old,
> > > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> > >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> > >          /* FIXME: access_type? */
> > >  
> > > +        /* Interrupt state */
> > > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > > +
> > >          /* Sanity checking */
> > >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> > >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> > 
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream Mark Cave-Ayland
@ 2017-09-11 10:57   ` David Gibson
  2017-09-11 16:52     ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: David Gibson @ 2017-09-11 10:57 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, aik, lvivier

[-- Attachment #1: Type: text/plain, Size: 1596 bytes --]

On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
> in the migration stream.

That is not, on its own, sufficient reason.

> Note: the vmstate_ppc version number has already been bumped by the previous
> patch in this series.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

As with 2/4 it breaks backwards migration.

But more, I really disklike the idea of migrating this.  It's internal
state for one, and it's also essentially transitory state.  Can we
avoid putting it in the otherwise persistent structure at all? Can we
derive the state from elsewhere?  Can we prevent migration from
occurring in the small windows where this data is live?

> ---
>  target/ppc/machine.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index 8fec1a4..10b3c41 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -676,7 +676,7 @@ const VMStateDescription vmstate_ppc_cpu = {
>  
>          /* Internal state */
>          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> -        /* FIXME: access_type? */
> +        VMSTATE_UINT8_V(env.access_type, PowerPCCPU, 6),
>  
>          /* Interrupt state */
>          VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11 10:48       ` David Gibson
@ 2017-09-11 16:46         ` Mark Cave-Ayland
  2017-09-11 17:19           ` Dr. David Alan Gilbert
  2017-09-12 16:21         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-11 16:46 UTC (permalink / raw)
  To: David Gibson, Dr. David Alan Gilbert
  Cc: lvivier, aik, Greg Kurz, qemu-devel, qemu-ppc

On 11/09/17 11:48, David Gibson wrote:

> On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
>> * Greg Kurz (groug@kaod.org) wrote:
>>> On Sun, 10 Sep 2017 15:37:33 +0100
>>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>>>
>>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
>>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
>>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
>>>> resume unless a key was pressed on the VGA console.
>>>>
>>>> Further investigation suggests that internal CPU IRQ state isn't being
>>>> preserved and so interrupts asserted at the time of migration are lost. Adding
>>>> the pending_interrupts and irq_input_state fields back into the migration
>>>> stream appears to fix the problem here during local tests.
>>>>
>>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
>>>> the additional fields.
>>>>
>>>
>>> And so this unconditionally breaks backward migration... what about adding
>>> a subsection for this ?
>>
>> and wiring it to a flag on the machine type so that older machine types
>> don't send it.
> 
> Right, a subsection is certainly necessary to avoid breaking backwards
> migration.

The suggestion of using the VMSTATE_*_V macros with an increased version
number came from Alexey's original review of the patch many months ago
which is why I did it that way.

Out of curiosity though, what is the criteria for supporting backwards
migration? Obviously forward migration is supported as-is in this
manner, so what determines if a patch needs to be backwards compatible
and how far?

> But apart from that I want to understand better exactly why this is
> necessary.  What's the state that's being lost, and is it really not
> recoverable from anywhere else.

The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
TCG and repeatedly pausing, executing "savevm foo", then quitting and
continuing with "-loadvm foo" during the installation phase. About 1 in
10 times the installer hangs after the loadvm until I press a key, at
which point it carries on as normal.

I then proceeded to going backwards through the git history until I
found out that it was the removal of the pending_interrupts,
irq_input_state and access_type fields during the conversion to
VMStateDescription commit a90db15 which seemed to cause the problem.

> The other thing that concerns me is how we're encoding the
> information.  These are essentially internal fields, not reflecting
> something with an architected encoding - adding those to the migration
> stream is often a bad idea - it inhibits our ability to rework
> internal encodings.

I'm not sure how this should be managed, however there was a similar
issue with excp_prefix (which was also removed in a90db15) that was
fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
see how could work with env.pending_interrupts and env.irq_input_state
though.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-11 10:57   ` David Gibson
@ 2017-09-11 16:52     ` Mark Cave-Ayland
  2017-09-13  7:19       ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-11 16:52 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, lvivier, qemu-ppc, qemu-devel

On 11/09/17 11:57, David Gibson wrote:

> On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
>> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
>> in the migration stream.
> 
> That is not, on its own, sufficient reason.
> 
>> Note: the vmstate_ppc version number has already been bumped by the previous
>> patch in this series.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> 
> As with 2/4 it breaks backwards migration.
> 
> But more, I really disklike the idea of migrating this.  It's internal
> state for one, and it's also essentially transitory state.  Can we
> avoid putting it in the otherwise persistent structure at all? Can we
> derive the state from elsewhere?  Can we prevent migration from
> occurring in the small windows where this data is live?

>From what I can see references to access_type are scattered throughout
mmu_helper.c although I'm not necessarily familiar enough with PPC to
know whether this is something that can be derived elsewhere instead.
And once again it was something that was removed by a90db15.

When pausing a VM, does execution stop at the end of the current TB
rather than immediately? If so, perhaps someone could confirm that
guarantee is good enough for access_type?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11 16:46         ` Mark Cave-Ayland
@ 2017-09-11 17:19           ` Dr. David Alan Gilbert
  2017-09-13  7:03             ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-11 17:19 UTC (permalink / raw)
  To: Mark Cave-Ayland
  Cc: David Gibson, lvivier, aik, Greg Kurz, qemu-devel, qemu-ppc

* Mark Cave-Ayland (mark.cave-ayland@ilande.co.uk) wrote:
> On 11/09/17 11:48, David Gibson wrote:
> 
> > On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> >> * Greg Kurz (groug@kaod.org) wrote:
> >>> On Sun, 10 Sep 2017 15:37:33 +0100
> >>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> >>>
> >>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> >>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> >>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
> >>>> resume unless a key was pressed on the VGA console.
> >>>>
> >>>> Further investigation suggests that internal CPU IRQ state isn't being
> >>>> preserved and so interrupts asserted at the time of migration are lost. Adding
> >>>> the pending_interrupts and irq_input_state fields back into the migration
> >>>> stream appears to fix the problem here during local tests.
> >>>>
> >>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> >>>> the additional fields.
> >>>>
> >>>
> >>> And so this unconditionally breaks backward migration... what about adding
> >>> a subsection for this ?
> >>
> >> and wiring it to a flag on the machine type so that older machine types
> >> don't send it.
> > 
> > Right, a subsection is certainly necessary to avoid breaking backwards
> > migration.
> 
> The suggestion of using the VMSTATE_*_V macros with an increased version
> number came from Alexey's original review of the patch many months ago
> which is why I did it that way.
> 
> Out of curiosity though, what is the criteria for supporting backwards
> migration? Obviously forward migration is supported as-is in this
> manner, so what determines if a patch needs to be backwards compatible
> and how far?

It's a bit fuzzy.  Downstream we do backwards migration between various
versions - and those versions are pretty arbitrarily chosen.
Generally I prefer if we don't break that upstream either, although
it isn't tested much.

If it was in code that was specific to your g3beige I wouldn't mind;
but for ppc in general then if it breaks the server migration it'll
be a pain we'd have to then fix.  Best to keep it working upstream.

But it's fairly easy to put new fields in a subsection and tie it
to a property;  that makes it easy to switch it on/off in machine
types.

> > But apart from that I want to understand better exactly why this is
> > necessary.  What's the state that's being lost, and is it really not
> > recoverable from anywhere else.
> 
> The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
> TCG and repeatedly pausing, executing "savevm foo", then quitting and
> continuing with "-loadvm foo" during the installation phase. About 1 in
> 10 times the installer hangs after the loadvm until I press a key, at
> which point it carries on as normal.
> 
> I then proceeded to going backwards through the git history until I
> found out that it was the removal of the pending_interrupts,
> irq_input_state and access_type fields during the conversion to
> VMStateDescription commit a90db15 which seemed to cause the problem.
> 
> > The other thing that concerns me is how we're encoding the
> > information.  These are essentially internal fields, not reflecting
> > something with an architected encoding - adding those to the migration
> > stream is often a bad idea - it inhibits our ability to rework
> > internal encodings.
> 
> I'm not sure how this should be managed, however there was a similar
> issue with excp_prefix (which was also removed in a90db15) that was
> fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
> see how could work with env.pending_interrupts and env.irq_input_state
> though.

Without knowing anything about this hardware... generally the migration
stream should reflect the real state of the system rather than internal
implementation detail, that way if you change the implementation you
don't need to fudge the state.  Having said that, there's generally
some internal state that's perhaps not immediately obvious from specs
until you try and implement it.

Dave

> ATB,
> 
> Mark.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11 10:48       ` David Gibson
  2017-09-11 16:46         ` Mark Cave-Ayland
@ 2017-09-12 16:21         ` Dr. David Alan Gilbert
  2017-09-12 16:41           ` Mark Cave-Ayland
  1 sibling, 1 reply; 33+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-12 16:21 UTC (permalink / raw)
  To: David Gibson
  Cc: Greg Kurz, Mark Cave-Ayland, aik, lvivier, qemu-ppc, qemu-devel

* David Gibson (david@gibson.dropbear.id.au) wrote:
> On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> > * Greg Kurz (groug@kaod.org) wrote:
> > > On Sun, 10 Sep 2017 15:37:33 +0100
> > > Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > > 
> > > > Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > > > appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > > > testing migration on g3beige/mac99 machines, test images would randomly fail to
> > > > resume unless a key was pressed on the VGA console.
> > > > 
> > > > Further investigation suggests that internal CPU IRQ state isn't being
> > > > preserved and so interrupts asserted at the time of migration are lost. Adding
> > > > the pending_interrupts and irq_input_state fields back into the migration
> > > > stream appears to fix the problem here during local tests.
> > > > 
> > > > As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > > > the additional fields.
> > > > 
> > > 
> > > And so this unconditionally breaks backward migration... what about adding
> > > a subsection for this ?
> > 
> > and wiring it to a flag on the machine type so that older machine types
> > don't send it.
> 
> Right, a subsection is certainly necessary to avoid breaking backwards
> migration.
> 
> But apart from that I want to understand better exactly why this is
> necessary.  What's the state that's being lost, and is it really not
> recoverable from anywhere else.
> 
> The other thing that concerns me is how we're encoding the
> information.  These are essentially internal fields, not reflecting
> something with an architected encoding - adding those to the migration
> stream is often a bad idea - it inhibits our ability to rework
> internal encodings.

Yes, agreed, where possible the contents of the stream should reflect
'real' state that's actually being modelled and be as independent of
the implementation as possible.

Dave

> > 
> > Dave
> > 
> > > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > > > ---
> > > >  target/ppc/machine.c |    6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > > > index e59049f..8fec1a4 100644
> > > > --- a/target/ppc/machine.c
> > > > +++ b/target/ppc/machine.c
> > > > @@ -647,7 +647,7 @@ static const VMStateDescription vmstate_compat = {
> > > >  
> > > >  const VMStateDescription vmstate_ppc_cpu = {
> > > >      .name = "cpu",
> > > > -    .version_id = 5,
> > > > +    .version_id = 6,
> > > >      .minimum_version_id = 5,
> > > >      .minimum_version_id_old = 4,
> > > >      .load_state_old = cpu_load_old,
> > > > @@ -678,6 +678,10 @@ const VMStateDescription vmstate_ppc_cpu = {
> > > >          VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
> > > >          /* FIXME: access_type? */
> > > >  
> > > > +        /* Interrupt state */
> > > > +        VMSTATE_UINT32_V(env.pending_interrupts, PowerPCCPU, 6),
> > > > +        VMSTATE_UINT32_V(env.irq_input_state, PowerPCCPU, 6),
> > > > +
> > > >          /* Sanity checking */
> > > >          VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, cpu_pre_2_8_migration),
> > > >          VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, cpu_pre_2_8_migration),
> > > 
> > 
> > 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-12 16:21         ` Dr. David Alan Gilbert
@ 2017-09-12 16:41           ` Mark Cave-Ayland
  2017-09-12 16:46             ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-12 16:41 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, David Gibson
  Cc: lvivier, aik, Greg Kurz, qemu-devel, qemu-ppc

On 12/09/17 17:21, Dr. David Alan Gilbert wrote:

>> Right, a subsection is certainly necessary to avoid breaking backwards
>> migration.
>>
>> But apart from that I want to understand better exactly why this is
>> necessary.  What's the state that's being lost, and is it really not
>> recoverable from anywhere else.
>>
>> The other thing that concerns me is how we're encoding the
>> information.  These are essentially internal fields, not reflecting
>> something with an architected encoding - adding those to the migration
>> stream is often a bad idea - it inhibits our ability to rework
>> internal encodings.
> 
> Yes, agreed, where possible the contents of the stream should reflect
> 'real' state that's actually being modelled and be as independent of
> the implementation as possible.

Oh sure. However I should re-iterate that I'm not trying to add new
fields into the migration stream, merely reinstate the ones that were
dropped without warning in commit a90db15 by Alexey since without them
TCG state doesn't restore correctly in my local tests.

The commit message mentions that prior to the conversion some CPU state
was missing but it doesn't mention anything about dropping existing
fields as part of the conversion process so I suspect that this was an
accidental side-effect.

I can definitely look at re-implementing the patch using a subsection in
order to preserve backwards compatibility though.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-12 16:41           ` Mark Cave-Ayland
@ 2017-09-12 16:46             ` Mark Cave-Ayland
  2017-09-13  2:23               ` Alexey Kardashevskiy
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-12 16:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, David Gibson
  Cc: lvivier, aik, Greg Kurz, qemu-devel, qemu-ppc

On 12/09/17 17:41, Mark Cave-Ayland wrote:

> The commit message mentions that prior to the conversion some CPU state
> was missing but it doesn't mention anything about dropping existing
> fields as part of the conversion process so I suspect that this was an
> accidental side-effect.

Actually I've clicked send a little too early here since re-reading the
last paragraph of a90db15 I can see the inference here: "Exactly what
needs to be saved in what configurations has been more carefully
examined, too".

Alexey - do you recall from your analysis why these fields were no
longer deemed necessary, and how your TCG tests were configured?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-12 16:46             ` Mark Cave-Ayland
@ 2017-09-13  2:23               ` Alexey Kardashevskiy
  2017-09-13  6:02                 ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Alexey Kardashevskiy @ 2017-09-13  2:23 UTC (permalink / raw)
  To: Mark Cave-Ayland, Dr. David Alan Gilbert, David Gibson
  Cc: lvivier, Greg Kurz, qemu-devel, qemu-ppc

On 13/09/17 02:46, Mark Cave-Ayland wrote:
> On 12/09/17 17:41, Mark Cave-Ayland wrote:
> 
>> The commit message mentions that prior to the conversion some CPU state
>> was missing but it doesn't mention anything about dropping existing
>> fields as part of the conversion process so I suspect that this was an
>> accidental side-effect.
> 
> Actually I've clicked send a little too early here since re-reading the
> last paragraph of a90db15 I can see the inference here: "Exactly what
> needs to be saved in what configurations has been more carefully
> examined, too".
> 
> Alexey - do you recall from your analysis why these fields were no
> longer deemed necessary, and how your TCG tests were configured?

I most certainly did not do analysis (my bad. sorry) - I took the patch
from David as he left the team, fixed to compile and pushed away. I am also
very suspicions we did not try migrating TCG or anything but pseries. My
guest that things did not break (if they did not which I am not sure about,
for the TCG case) because the interrupt controller (XICS) or the
pseries-guest took care of resending an interrupt which does not seem to be
the case for mac99.


-- 
Alexey

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-13  2:23               ` Alexey Kardashevskiy
@ 2017-09-13  6:02                 ` David Gibson
  2017-09-13 16:44                   ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: David Gibson @ 2017-09-13  6:02 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Mark Cave-Ayland, Dr. David Alan Gilbert, lvivier, Greg Kurz,
	qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2043 bytes --]

On Wed, Sep 13, 2017 at 12:23:13PM +1000, Alexey Kardashevskiy wrote:
> On 13/09/17 02:46, Mark Cave-Ayland wrote:
> > On 12/09/17 17:41, Mark Cave-Ayland wrote:
> > 
> >> The commit message mentions that prior to the conversion some CPU state
> >> was missing but it doesn't mention anything about dropping existing
> >> fields as part of the conversion process so I suspect that this was an
> >> accidental side-effect.
> > 
> > Actually I've clicked send a little too early here since re-reading the
> > last paragraph of a90db15 I can see the inference here: "Exactly what
> > needs to be saved in what configurations has been more carefully
> > examined, too".
> > 
> > Alexey - do you recall from your analysis why these fields were no
> > longer deemed necessary, and how your TCG tests were configured?
> 
> I most certainly did not do analysis (my bad. sorry) - I took the patch
> from David as he left the team, fixed to compile and pushed away. I am also
> very suspicions we did not try migrating TCG or anything but pseries. My
> guest that things did not break (if they did not which I am not sure about,
> for the TCG case) because the interrupt controller (XICS) or the
> pseries-guest took care of resending an interrupt which does not seem to be
> the case for mac99.

Right, that's probably true.  The main point, though, is that these
fields were dropped a *long* time ago, when migration was barely
working to begin with.  In particular I'm pretty sure most of the
non-pseries platforms were already pretty broken for migration
(amongst other things).

Polishing the mac platforms up to working again, including migration,
is a reasonable goal.  But it can't be at the expense of pseries,
which is already working, used in production, and much better tested
than mac99 or g3beige ever were.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-11 17:19           ` Dr. David Alan Gilbert
@ 2017-09-13  7:03             ` David Gibson
  0 siblings, 0 replies; 33+ messages in thread
From: David Gibson @ 2017-09-13  7:03 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Mark Cave-Ayland, lvivier, aik, Greg Kurz, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 5315 bytes --]

On Mon, Sep 11, 2017 at 06:19:54PM +0100, Dr. David Alan Gilbert wrote:
> * Mark Cave-Ayland (mark.cave-ayland@ilande.co.uk) wrote:
> > On 11/09/17 11:48, David Gibson wrote:
> > 
> > > On Mon, Sep 11, 2017 at 10:30:33AM +0100, Dr. David Alan Gilbert wrote:
> > >> * Greg Kurz (groug@kaod.org) wrote:
> > >>> On Sun, 10 Sep 2017 15:37:33 +0100
> > >>> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> > >>>
> > >>>> Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
> > >>>> appears to drop the internal CPU IRQ state from the migration stream. Whilst
> > >>>> testing migration on g3beige/mac99 machines, test images would randomly fail to
> > >>>> resume unless a key was pressed on the VGA console.
> > >>>>
> > >>>> Further investigation suggests that internal CPU IRQ state isn't being
> > >>>> preserved and so interrupts asserted at the time of migration are lost. Adding
> > >>>> the pending_interrupts and irq_input_state fields back into the migration
> > >>>> stream appears to fix the problem here during local tests.
> > >>>>
> > >>>> As part of this commit we bump the vmstate_ppc version from 5 to 6 to handle
> > >>>> the additional fields.
> > >>>>
> > >>>
> > >>> And so this unconditionally breaks backward migration... what about adding
> > >>> a subsection for this ?
> > >>
> > >> and wiring it to a flag on the machine type so that older machine types
> > >> don't send it.
> > > 
> > > Right, a subsection is certainly necessary to avoid breaking backwards
> > > migration.
> > 
> > The suggestion of using the VMSTATE_*_V macros with an increased version
> > number came from Alexey's original review of the patch many months ago
> > which is why I did it that way.
> > 
> > Out of curiosity though, what is the criteria for supporting backwards
> > migration? Obviously forward migration is supported as-is in this
> > manner, so what determines if a patch needs to be backwards compatible
> > and how far?
> 
> It's a bit fuzzy.  Downstream we do backwards migration between various
> versions - and those versions are pretty arbitrarily chosen.
> Generally I prefer if we don't break that upstream either, although
> it isn't tested much.

Right.  In this case not breaking backwards migration upstream is
essentially a favour to downstream, since unbreaking it downstream
once its broken upstream is a real PITA.

> If it was in code that was specific to your g3beige I wouldn't mind;
> but for ppc in general then if it breaks the server migration it'll
> be a pain we'd have to then fix.  Best to keep it working upstream.

Right.

> But it's fairly easy to put new fields in a subsection and tie it
> to a property;  that makes it easy to switch it on/off in machine
> types.

In this case I'm not sure we even need a property - I think we could
migrate it only when it's non-zero.  That shold only break it in cases
where actually it would already break.

> > > But apart from that I want to understand better exactly why this is
> > > necessary.  What's the state that's being lost, and is it really not
> > > recoverable from anywhere else.
> > 
> > The test case I have is installing Darwin PPC 6.02 with qemu-system-ppc
> > TCG and repeatedly pausing, executing "savevm foo", then quitting and
> > continuing with "-loadvm foo" during the installation phase. About 1 in
> > 10 times the installer hangs after the loadvm until I press a key, at
> > which point it carries on as normal.
> > 
> > I then proceeded to going backwards through the git history until I
> > found out that it was the removal of the pending_interrupts,
> > irq_input_state and access_type fields during the conversion to
> > VMStateDescription commit a90db15 which seemed to cause the problem.
> > 
> > > The other thing that concerns me is how we're encoding the
> > > information.  These are essentially internal fields, not reflecting
> > > something with an architected encoding - adding those to the migration
> > > stream is often a bad idea - it inhibits our ability to rework
> > > internal encodings.
> > 
> > I'm not sure how this should be managed, however there was a similar
> > issue with excp_prefix (which was also removed in a90db15) that was
> > fixed in 2360b6e by calling a helper in cpu_post_load(). I can't easily
> > see how could work with env.pending_interrupts and env.irq_input_state
> > though.
> 
> Without knowing anything about this hardware... generally the migration
> stream should reflect the real state of the system rather than internal
> implementation detail, that way if you change the implementation you
> don't need to fudge the state.  Having said that, there's generally
> some internal state that's perhaps not immediately obvious from specs
> until you try and implement it.

Right.  I'm not really sure how to handle this yet.  The CPU irq
numbers are pretty much arbitrarily assigned, I don't think much
thought has gone into them.  And if its going to become part of the
migration ABI, some thought needs to be put into it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-10 14:37 ` [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration Mark Cave-Ayland
@ 2017-09-13  7:12   ` David Gibson
  2017-09-13 17:11     ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: David Gibson @ 2017-09-13  7:12 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: qemu-devel, qemu-ppc, aik, lvivier

[-- Attachment #1: Type: text/plain, Size: 2594 bytes --]

On Sun, Sep 10, 2017 at 03:37:35PM +0100, Mark Cave-Ayland wrote:
> During local testing with TCG, intermittent errors were found when trying to
> migrate Darwin OS images.
> 
> The underlying cause was that Darwin resets the decrementer value to fairly
> small values on each interrupt. cpu_ppc_set_tb_clk() sets the default value
> of the decrementer to 0xffffffff during initialisation which typically
> corresponds to several seconds. Hence when restoring the image, the guest
> would effectively "lose" decrementer interrupts during this time causing
> confusion in the guest.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

This is subtly incorrect.  It sets the DECR on load to exactly the
value that was saved.  That effectively means that the DECR is frozen
for the migration downtime, which probably isn't what we want.

Instead we need to save the DECR as an offset from the timebase, and
restore it relative to the (downtime adjusted) timebase on the
destination.

The complication with that is that the timebase is generally migrated
at the machine level, not the cpu level: the timebase is generally
synchronized between cpus in the machine, and migrating it at the
individual cpu could break that.  Which means we probably need a
machine level hook to handle the decrementer too, even though it
logically *is* per-cpu, because otherwise we'll be trying to restore
it before the timebase is restored.

> ---
>  target/ppc/machine.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index 10b3c41..a16a856 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -176,6 +176,7 @@ static void cpu_pre_save(void *opaque)
>      env->spr[SPR_CFAR] = env->cfar;
>  #endif
>      env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
> +    env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
>  
>      for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
>          env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
> @@ -280,6 +281,7 @@ static int cpu_post_load(void *opaque, int version_id)
>      env->cfar = env->spr[SPR_CFAR];
>  #endif
>      env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
> +    cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
>  
>      for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
>          env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-11 16:52     ` Mark Cave-Ayland
@ 2017-09-13  7:19       ` David Gibson
  2017-09-13 17:17         ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: David Gibson @ 2017-09-13  7:19 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: aik, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2660 bytes --]

On Mon, Sep 11, 2017 at 05:52:10PM +0100, Mark Cave-Ayland wrote:
> On 11/09/17 11:57, David Gibson wrote:
> 
> > On Sun, Sep 10, 2017 at 03:37:34PM +0100, Mark Cave-Ayland wrote:
> >> This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
> >> in the migration stream.
> > 
> > That is not, on its own, sufficient reason.
> > 
> >> Note: the vmstate_ppc version number has already been bumped by the previous
> >> patch in this series.
> >>
> >> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> > 
> > As with 2/4 it breaks backwards migration.
> > 
> > But more, I really disklike the idea of migrating this.  It's internal
> > state for one, and it's also essentially transitory state.  Can we
> > avoid putting it in the otherwise persistent structure at all? Can we
> > derive the state from elsewhere?  Can we prevent migration from
> > occurring in the small windows where this data is live?
> 
> >From what I can see references to access_type are scattered throughout
> mmu_helper.c although I'm not necessarily familiar enough with PPC to
> know whether this is something that can be derived elsewhere instead.
> And once again it was something that was removed by a90db15.

Right, but the migration code prior to a90db15 was a complete mess.
It definitely included a number of things it didn't need to and
shouldn't as well as being missing other things that were needed.
It's not a good model.  And although it might have more-or-less worked
for certain machines like the ones you're reviving here, it was never
properly tested

> When pausing a VM, does execution stop at the end of the current TB
> rather than immediately? If so, perhaps someone could confirm that
> guarantee is good enough for access_type?

I'm pretty sure it has to; we'd have to come up out of an individual
TB in order to get to the main loop where we check the "please pause"
flag.  I'm not sure if that helps us here though - I *think* access
type is about carrying information from the point where we trigger an
exception to the point where we actually start processing the
exception.

This code is really ugly and I've never understood it well :(. It's
always seemed bogus to me that we have an essentially global variable
to carry information over that small gap, though.  Unfortunately it's
unlikely that I'd be able to dive into this and work out if it's
really needed any time soon.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-13  6:02                 ` David Gibson
@ 2017-09-13 16:44                   ` Mark Cave-Ayland
  2017-09-13 17:13                     ` Greg Kurz
  2017-09-14  3:30                     ` David Gibson
  0 siblings, 2 replies; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-13 16:44 UTC (permalink / raw)
  To: David Gibson, Alexey Kardashevskiy
  Cc: lvivier, qemu-devel, Dr. David Alan Gilbert, Greg Kurz, qemu-ppc

On 13/09/17 07:02, David Gibson wrote:

>>> Alexey - do you recall from your analysis why these fields were no
>>> longer deemed necessary, and how your TCG tests were configured?
>>
>> I most certainly did not do analysis (my bad. sorry) - I took the patch
>> from David as he left the team, fixed to compile and pushed away. I am also
>> very suspicions we did not try migrating TCG or anything but pseries. My
>> guest that things did not break (if they did not which I am not sure about,
>> for the TCG case) because the interrupt controller (XICS) or the
>> pseries-guest took care of resending an interrupt which does not seem to be
>> the case for mac99.
> 
> Right, that's probably true.  The main point, though, is that these
> fields were dropped a *long* time ago, when migration was barely
> working to begin with.  In particular I'm pretty sure most of the
> non-pseries platforms were already pretty broken for migration
> (amongst other things).
> 
> Polishing the mac platforms up to working again, including migration,
> is a reasonable goal.  But it can't be at the expense of pseries,
> which is already working, used in production, and much better tested
> than mac99 or g3beige ever were.

Oh I completely agree since I'm well aware pseries likely has more users
than the Mac machines - my question was directed more about why we
support backwards migration.

I spent several hours yesterday poking my Darwin test case with trying
the different combinations of pending_interrupts, irq_input_state and
access_type and could easily provoke migration failures unless all 3 of
the fields were present so a practical test shows they are still
required for TCG migration. I think ppc_set_irq()'s use of the interrupt
fields in hw/ppc/ppc.c and the subsequent reference to pending
interrupts in target/ppc may explain why I see freezes/hangs until a key
is pressed in many cases.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-13  7:12   ` David Gibson
@ 2017-09-13 17:11     ` Mark Cave-Ayland
  2017-09-13 17:58       ` Laurent Vivier
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-13 17:11 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, lvivier, qemu-ppc, qemu-devel

On 13/09/17 08:12, David Gibson wrote:

> This is subtly incorrect.  It sets the DECR on load to exactly the
> value that was saved.  That effectively means that the DECR is frozen
> for the migration downtime, which probably isn't what we want.
> 
> Instead we need to save the DECR as an offset from the timebase, and
> restore it relative to the (downtime adjusted) timebase on the
> destination.
> 
> The complication with that is that the timebase is generally migrated
> at the machine level, not the cpu level: the timebase is generally
> synchronized between cpus in the machine, and migrating it at the
> individual cpu could break that.  Which means we probably need a
> machine level hook to handle the decrementer too, even though it
> logically *is* per-cpu, because otherwise we'll be trying to restore
> it before the timebase is restored.

I know that we discussed this in-depth last year, however I was working
along the lines that Laurent's patch fixed this along the lines of our
previous discussion:
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00338.html (and
indeed Laurent's analysis at
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html).

However looking again at the this patch in the context you mentioned
above, I'm starting to wonder if the right solution now is for the
machine init function for g3beige/mac99 to do the same as spapr, e.g.
add cpu_ppc_clock_vm_state_change() as a vm_change_state_handler and
then add VMSTATE_PPC_TIMEBASE_V from the machine PPCTimeBase into my new
subsection?

Laurent, do you think that your state change handler would work
correctly in this way?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-13 16:44                   ` Mark Cave-Ayland
@ 2017-09-13 17:13                     ` Greg Kurz
  2017-09-14  3:48                       ` David Gibson
  2017-09-14  3:30                     ` David Gibson
  1 sibling, 1 reply; 33+ messages in thread
From: Greg Kurz @ 2017-09-13 17:13 UTC (permalink / raw)
  To: Mark Cave-Ayland
  Cc: David Gibson, Alexey Kardashevskiy, lvivier, qemu-devel,
	Dr. David Alan Gilbert, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2589 bytes --]

On Wed, 13 Sep 2017 17:44:54 +0100
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:

> On 13/09/17 07:02, David Gibson wrote:
> 
> >>> Alexey - do you recall from your analysis why these fields were no
> >>> longer deemed necessary, and how your TCG tests were configured?  
> >>
> >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> >> from David as he left the team, fixed to compile and pushed away. I am also
> >> very suspicions we did not try migrating TCG or anything but pseries. My
> >> guest that things did not break (if they did not which I am not sure about,
> >> for the TCG case) because the interrupt controller (XICS) or the
> >> pseries-guest took care of resending an interrupt which does not seem to be
> >> the case for mac99.  
> > 
> > Right, that's probably true.  The main point, though, is that these
> > fields were dropped a *long* time ago, when migration was barely
> > working to begin with.  In particular I'm pretty sure most of the
> > non-pseries platforms were already pretty broken for migration
> > (amongst other things).
> > 
> > Polishing the mac platforms up to working again, including migration,
> > is a reasonable goal.  But it can't be at the expense of pseries,
> > which is already working, used in production, and much better tested
> > than mac99 or g3beige ever were.  
> 
> Oh I completely agree since I'm well aware pseries likely has more users
> than the Mac machines - my question was directed more about why we
> support backwards migration.
> 

Downstream support backward migration because end users/customers ask for it
for maximum flexibility when it comes to move workloads around different systems
with different QEMU versions. This is fairly usual in data centers with many
systems.

As others already said, breaking things upstream may turn downstream work
into a nightmare (and FWIW, most of the people working on ppc are also
involved in downstream work).

Cheers,

--
Greg

> I spent several hours yesterday poking my Darwin test case with trying
> the different combinations of pending_interrupts, irq_input_state and
> access_type and could easily provoke migration failures unless all 3 of
> the fields were present so a practical test shows they are still
> required for TCG migration. I think ppc_set_irq()'s use of the interrupt
> fields in hw/ppc/ppc.c and the subsequent reference to pending
> interrupts in target/ppc may explain why I see freezes/hangs until a key
> is pressed in many cases.
> 
> 
> ATB,
> 
> Mark.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-13  7:19       ` David Gibson
@ 2017-09-13 17:17         ` Mark Cave-Ayland
  2017-09-14  3:54           ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-13 17:17 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, lvivier, qemu-ppc, qemu-devel

On 13/09/17 08:19, David Gibson wrote:

>> When pausing a VM, does execution stop at the end of the current TB
>> rather than immediately? If so, perhaps someone could confirm that
>> guarantee is good enough for access_type?
> 
> I'm pretty sure it has to; we'd have to come up out of an individual
> TB in order to get to the main loop where we check the "please pause"
> flag.  I'm not sure if that helps us here though - I *think* access
> type is about carrying information from the point where we trigger an
> exception to the point where we actually start processing the
> exception.
> 
> This code is really ugly and I've never understood it well :(. It's
> always seemed bogus to me that we have an essentially global variable
> to carry information over that small gap, though.  Unfortunately it's
> unlikely that I'd be able to dive into this and work out if it's
> really needed any time soon.

>From my testing yesterday it does appear to be required for TCG (or
unless this is exposing another bug in the Mac migration) so I can check
it works here and then maybe someone else confirm it works on KVM?

A couple of things I've noticed: the heathrow VMStateDescription looks
good, however I can see that the OpenPIC timers won't likely migrate
correctly without adding a few more fields - that's easy to fix.

Another thing is that if we're changing the migration stream for Mac
models Ben has some OpenPIC and other related changes in his wip queue
that it might make sense to put those in first before properly fixing
the Mac machine migration.


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-13 17:11     ` Mark Cave-Ayland
@ 2017-09-13 17:58       ` Laurent Vivier
  2017-09-14  3:52         ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Laurent Vivier @ 2017-09-13 17:58 UTC (permalink / raw)
  To: Mark Cave-Ayland, David Gibson; +Cc: aik, qemu-ppc, qemu-devel

On 13/09/2017 19:11, Mark Cave-Ayland wrote:
> On 13/09/17 08:12, David Gibson wrote:
> 
>> This is subtly incorrect.  It sets the DECR on load to exactly the
>> value that was saved.  That effectively means that the DECR is frozen
>> for the migration downtime, which probably isn't what we want.
>>
>> Instead we need to save the DECR as an offset from the timebase, and
>> restore it relative to the (downtime adjusted) timebase on the
>> destination.
>>
>> The complication with that is that the timebase is generally migrated
>> at the machine level, not the cpu level: the timebase is generally
>> synchronized between cpus in the machine, and migrating it at the
>> individual cpu could break that.  Which means we probably need a
>> machine level hook to handle the decrementer too, even though it
>> logically *is* per-cpu, because otherwise we'll be trying to restore
>> it before the timebase is restored.
> 
> I know that we discussed this in-depth last year, however I was working
> along the lines that Laurent's patch fixed this along the lines of our
> previous discussion:
> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00338.html (and
> indeed Laurent's analysis at
> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html).
> 
> However looking again at the this patch in the context you mentioned
> above, I'm starting to wonder if the right solution now is for the
> machine init function for g3beige/mac99 to do the same as spapr, e.g.
> add cpu_ppc_clock_vm_state_change() as a vm_change_state_handler and
> then add VMSTATE_PPC_TIMEBASE_V from the machine PPCTimeBase into my new
> subsection?
> 
> Laurent, do you think that your state change handler would work
> correctly in this way?

I think all is explained in the second link you have mentioned, it seems 
we don't need a state handler as KVM DECR will no be updated by the migrated value:

hw/ppc/ppc.c

    736 static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
    737                                  QEMUTimer *timer,
    738                                  void (*raise_excp)(void *),
    739                                  void (*lower_excp)(PowerPCCPU *),
    740                                  uint32_t decr, uint32_t value)
    741 {
...
    749     if (kvm_enabled()) {
    750         /* KVM handles decrementer exceptions, we don't need our own timer */
    751         return;
    752     }
...

But this allows to migrate it for TCG. And it should be correct because in case of TCG I think [need to check] timebase is stopped too (so offset is 0)

David, do you agree with that?

Thanks,
Laurent

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-13 16:44                   ` Mark Cave-Ayland
  2017-09-13 17:13                     ` Greg Kurz
@ 2017-09-14  3:30                     ` David Gibson
  1 sibling, 0 replies; 33+ messages in thread
From: David Gibson @ 2017-09-14  3:30 UTC (permalink / raw)
  To: Mark Cave-Ayland
  Cc: Alexey Kardashevskiy, lvivier, qemu-devel,
	Dr. David Alan Gilbert, Greg Kurz, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 3096 bytes --]

On Wed, Sep 13, 2017 at 05:44:54PM +0100, Mark Cave-Ayland wrote:
> On 13/09/17 07:02, David Gibson wrote:
> 
> >>> Alexey - do you recall from your analysis why these fields were no
> >>> longer deemed necessary, and how your TCG tests were configured?
> >>
> >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> >> from David as he left the team, fixed to compile and pushed away. I am also
> >> very suspicions we did not try migrating TCG or anything but pseries. My
> >> guest that things did not break (if they did not which I am not sure about,
> >> for the TCG case) because the interrupt controller (XICS) or the
> >> pseries-guest took care of resending an interrupt which does not seem to be
> >> the case for mac99.
> > 
> > Right, that's probably true.  The main point, though, is that these
> > fields were dropped a *long* time ago, when migration was barely
> > working to begin with.  In particular I'm pretty sure most of the
> > non-pseries platforms were already pretty broken for migration
> > (amongst other things).
> > 
> > Polishing the mac platforms up to working again, including migration,
> > is a reasonable goal.  But it can't be at the expense of pseries,
> > which is already working, used in production, and much better tested
> > than mac99 or g3beige ever were.
> 
> Oh I completely agree since I'm well aware pseries likely has more users
> than the Mac machines - my question was directed more about why we
> support backwards migration.
> 
> I spent several hours yesterday poking my Darwin test case with trying
> the different combinations of pending_interrupts, irq_input_state and
> access_type and could easily provoke migration failures unless all 3 of
> the fields were present so a practical test shows they are still
> required for TCG migration. I think ppc_set_irq()'s use of the interrupt
> fields in hw/ppc/ppc.c and the subsequent reference to pending
> interrupts in target/ppc may explain why I see freezes/hangs until a key
> is pressed in many cases.

Ok, I think we need to consider (pending_interrupts and irq_input_state)
separately from access_type.  The first two are pretty closely related
to each other, and I've got at least a rough idea of what the problems
there might be.  access_type I'm pretty sure has to be an unrelated
problem, and I've got much less of a handle on it.

I suspect we could work around the problems with pending_interrupts
and irq_input_state by having a post_load hook in the board level
interrupt controller to reassert its output irq line based on its
current state.  I believe the relevant irq inputs to the cpu are
effectively level triggered, so I think that will be enough.

access_type I don't have any good ideas for yet.  We really need to
work out what the exact race is here that's causing its state to be
lost harmfully.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
  2017-09-13 17:13                     ` Greg Kurz
@ 2017-09-14  3:48                       ` David Gibson
  0 siblings, 0 replies; 33+ messages in thread
From: David Gibson @ 2017-09-14  3:48 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Mark Cave-Ayland, Alexey Kardashevskiy, lvivier, qemu-devel,
	Dr. David Alan Gilbert, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 3061 bytes --]

On Wed, Sep 13, 2017 at 07:13:09PM +0200, Greg Kurz wrote:
> On Wed, 13 Sep 2017 17:44:54 +0100
> Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> 
> > On 13/09/17 07:02, David Gibson wrote:
> > 
> > >>> Alexey - do you recall from your analysis why these fields were no
> > >>> longer deemed necessary, and how your TCG tests were configured?  
> > >>
> > >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> > >> from David as he left the team, fixed to compile and pushed away. I am also
> > >> very suspicions we did not try migrating TCG or anything but pseries. My
> > >> guest that things did not break (if they did not which I am not sure about,
> > >> for the TCG case) because the interrupt controller (XICS) or the
> > >> pseries-guest took care of resending an interrupt which does not seem to be
> > >> the case for mac99.  
> > > 
> > > Right, that's probably true.  The main point, though, is that these
> > > fields were dropped a *long* time ago, when migration was barely
> > > working to begin with.  In particular I'm pretty sure most of the
> > > non-pseries platforms were already pretty broken for migration
> > > (amongst other things).
> > > 
> > > Polishing the mac platforms up to working again, including migration,
> > > is a reasonable goal.  But it can't be at the expense of pseries,
> > > which is already working, used in production, and much better tested
> > > than mac99 or g3beige ever were.  
> > 
> > Oh I completely agree since I'm well aware pseries likely has more users
> > than the Mac machines - my question was directed more about why we
> > support backwards migration.
> > 
> 
> Downstream support backward migration because end users/customers ask for it
> for maximum flexibility when it comes to move workloads around different systems
> with different QEMU versions. This is fairly usual in data centers with many
> systems.

One specific case where this matters is if you want to update the qemu
version on an entire cluster of hosts.  Management layers like
oVirt/RHV allow you to do that without disrupting guests by migrating
them off each host, one by one, allowing you to remove it from the
cluster, update and then reinsert after which guests may be migrated
onto it again.

But that process could obviously take a fair while for a big cluster;
days or even weeks.   In the meantime you have a cluster with mixed
qemu versions, and since oVirt/RHV/whatever will also migrate guests
around the cluster to load balance, that means the possibility of a
backwards migration.

> As others already said, breaking things upstream may turn downstream work
> into a nightmare (and FWIW, most of the people working on ppc are also
> involved in downstream work).

Right.  And I even used to work as a RHV support tech too :).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-13 17:58       ` Laurent Vivier
@ 2017-09-14  3:52         ` David Gibson
  2017-09-15 12:45           ` Mark Cave-Ayland
  0 siblings, 1 reply; 33+ messages in thread
From: David Gibson @ 2017-09-14  3:52 UTC (permalink / raw)
  To: Laurent Vivier; +Cc: Mark Cave-Ayland, aik, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3241 bytes --]

On Wed, Sep 13, 2017 at 07:58:23PM +0200, Laurent Vivier wrote:
> On 13/09/2017 19:11, Mark Cave-Ayland wrote:
> > On 13/09/17 08:12, David Gibson wrote:
> > 
> >> This is subtly incorrect.  It sets the DECR on load to exactly the
> >> value that was saved.  That effectively means that the DECR is frozen
> >> for the migration downtime, which probably isn't what we want.
> >>
> >> Instead we need to save the DECR as an offset from the timebase, and
> >> restore it relative to the (downtime adjusted) timebase on the
> >> destination.
> >>
> >> The complication with that is that the timebase is generally migrated
> >> at the machine level, not the cpu level: the timebase is generally
> >> synchronized between cpus in the machine, and migrating it at the
> >> individual cpu could break that.  Which means we probably need a
> >> machine level hook to handle the decrementer too, even though it
> >> logically *is* per-cpu, because otherwise we'll be trying to restore
> >> it before the timebase is restored.
> > 
> > I know that we discussed this in-depth last year, however I was working
> > along the lines that Laurent's patch fixed this along the lines of our
> > previous discussion:
> > https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00338.html (and
> > indeed Laurent's analysis at
> > https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html).
> > 
> > However looking again at the this patch in the context you mentioned
> > above, I'm starting to wonder if the right solution now is for the
> > machine init function for g3beige/mac99 to do the same as spapr, e.g.
> > add cpu_ppc_clock_vm_state_change() as a vm_change_state_handler and
> > then add VMSTATE_PPC_TIMEBASE_V from the machine PPCTimeBase into my new
> > subsection?
> > 
> > Laurent, do you think that your state change handler would work
> > correctly in this way?
> 
> I think all is explained in the second link you have mentioned, it seems 
> we don't need a state handler as KVM DECR will no be updated by the migrated value:
> 
> hw/ppc/ppc.c
> 
>     736 static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>     737                                  QEMUTimer *timer,
>     738                                  void (*raise_excp)(void *),
>     739                                  void (*lower_excp)(PowerPCCPU *),
>     740                                  uint32_t decr, uint32_t value)
>     741 {
> ...
>     749     if (kvm_enabled()) {
>     750         /* KVM handles decrementer exceptions, we don't need our own timer */
>     751         return;
>     752     }
> ...
> 
> But this allows to migrate it for TCG. And it should be correct because in case of TCG I think [need to check] timebase is stopped too (so offset is 0)
> 
> David, do you agree with that?

Yes, I think so.  Some details might be different, but the basic idea
of migrating the timebase and decrementers at machine level should be
the same for pseries and g3beige/whatever.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream
  2017-09-13 17:17         ` Mark Cave-Ayland
@ 2017-09-14  3:54           ` David Gibson
  0 siblings, 0 replies; 33+ messages in thread
From: David Gibson @ 2017-09-14  3:54 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: aik, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2296 bytes --]

On Wed, Sep 13, 2017 at 06:17:21PM +0100, Mark Cave-Ayland wrote:
> On 13/09/17 08:19, David Gibson wrote:
> 
> >> When pausing a VM, does execution stop at the end of the current TB
> >> rather than immediately? If so, perhaps someone could confirm that
> >> guarantee is good enough for access_type?
> > 
> > I'm pretty sure it has to; we'd have to come up out of an individual
> > TB in order to get to the main loop where we check the "please pause"
> > flag.  I'm not sure if that helps us here though - I *think* access
> > type is about carrying information from the point where we trigger an
> > exception to the point where we actually start processing the
> > exception.
> > 
> > This code is really ugly and I've never understood it well :(. It's
> > always seemed bogus to me that we have an essentially global variable
> > to carry information over that small gap, though.  Unfortunately it's
> > unlikely that I'd be able to dive into this and work out if it's
> > really needed any time soon.
> 
> >From my testing yesterday it does appear to be required for TCG (or
> unless this is exposing another bug in the Mac migration) so I can check
> it works here and then maybe someone else confirm it works on KVM?
> 
> A couple of things I've noticed: the heathrow VMStateDescription looks
> good, however I can see that the OpenPIC timers won't likely migrate
> correctly without adding a few more fields - that's easy to fix.

Right.  And since OpenPIC isn't used on any platforms that have real
production use in the wild, it's fine to bump the migration stream
version for it.

> Another thing is that if we're changing the migration stream for Mac
> models Ben has some OpenPIC and other related changes in his wip queue
> that it might make sense to put those in first before properly fixing
> the Mac machine migration.

That would have something to be said for it.

It's probably not essential, though, since I don't consider the
non-pseries platforms at this stage sufficiently mature that we
guarantee a stable migration stream format.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-14  3:52         ` David Gibson
@ 2017-09-15 12:45           ` Mark Cave-Ayland
  2017-09-19  8:36             ` David Gibson
  0 siblings, 1 reply; 33+ messages in thread
From: Mark Cave-Ayland @ 2017-09-15 12:45 UTC (permalink / raw)
  To: David Gibson, Laurent Vivier; +Cc: aik, qemu-ppc, qemu-devel

On 14/09/17 04:52, David Gibson wrote:

> On Wed, Sep 13, 2017 at 07:58:23PM +0200, Laurent Vivier wrote:
>> On 13/09/2017 19:11, Mark Cave-Ayland wrote:
>>> On 13/09/17 08:12, David Gibson wrote:
>>>
>>>> This is subtly incorrect.  It sets the DECR on load to exactly the
>>>> value that was saved.  That effectively means that the DECR is frozen
>>>> for the migration downtime, which probably isn't what we want.
>>>>
>>>> Instead we need to save the DECR as an offset from the timebase, and
>>>> restore it relative to the (downtime adjusted) timebase on the
>>>> destination.
>>>>
>>>> The complication with that is that the timebase is generally migrated
>>>> at the machine level, not the cpu level: the timebase is generally
>>>> synchronized between cpus in the machine, and migrating it at the
>>>> individual cpu could break that.  Which means we probably need a
>>>> machine level hook to handle the decrementer too, even though it
>>>> logically *is* per-cpu, because otherwise we'll be trying to restore
>>>> it before the timebase is restored.
>>>
>>> I know that we discussed this in-depth last year, however I was working
>>> along the lines that Laurent's patch fixed this along the lines of our
>>> previous discussion:
>>> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00338.html (and
>>> indeed Laurent's analysis at
>>> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html).
>>>
>>> However looking again at the this patch in the context you mentioned
>>> above, I'm starting to wonder if the right solution now is for the
>>> machine init function for g3beige/mac99 to do the same as spapr, e.g.
>>> add cpu_ppc_clock_vm_state_change() as a vm_change_state_handler and
>>> then add VMSTATE_PPC_TIMEBASE_V from the machine PPCTimeBase into my new
>>> subsection?
>>>
>>> Laurent, do you think that your state change handler would work
>>> correctly in this way?
>>
>> I think all is explained in the second link you have mentioned, it seems 
>> we don't need a state handler as KVM DECR will no be updated by the migrated value:
>>
>> hw/ppc/ppc.c
>>
>>     736 static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>>     737                                  QEMUTimer *timer,
>>     738                                  void (*raise_excp)(void *),
>>     739                                  void (*lower_excp)(PowerPCCPU *),
>>     740                                  uint32_t decr, uint32_t value)
>>     741 {
>> ...
>>     749     if (kvm_enabled()) {
>>     750         /* KVM handles decrementer exceptions, we don't need our own timer */
>>     751         return;
>>     752     }
>> ...
>>
>> But this allows to migrate it for TCG. And it should be correct because in case of TCG I think [need to check] timebase is stopped too (so offset is 0)
>>
>> David, do you agree with that?
> 
> Yes, I think so.  Some details might be different, but the basic idea
> of migrating the timebase and decrementers at machine level should be
> the same for pseries and g3beige/whatever.

>From my perspective, I'd really like to bury all of the timebase
migration logic into timebase_load()/timebase_save() so then as you
suggest above, everything will work regardless of the machine type and host.

Playing around with these functions I can see that they are very
PPC-specific - for example cpu_get_host_ticks() can only ever work
correctly migrating between 2 PPC hosts because here on Intel the value
returned is the TSC register which is completely unrelated to the
timebase frequency. Hence the calculated values are nonsense and the
guest inevitably ends up freezing.

I've had a go at converting this back to using the host clock to
calculate the required adjustment to tb_offset (see
https://github.com/mcayland/qemu/commit/68e31f0455a79d41b2ab703364c74ceb4d513154)
but then even that struck me as incorrect since unless there are other
host CPUs with an equivalent of tb_offset, this whole section of code
should only ever run if the host CPU is PPC, and maybe even only under KVM.

My current thoughts are now as follows:

- The existing timebase_load() should never adjust tb_offset unless the
host CPU is also PPC (how do we actually determine this?). Otherwise it
should always be 0.

- The per-CPU decrementer values should be *stored* in the normal
vmstate SPR array as per my latest patch at
https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02346.html.

BUT:

- The per-CPU decrementer values should be *restored* in timebase_load()
but again should not be adjusted by tb_offset unless the host CPU is
also PPC.

The real question is whether tb_offset is used for a TCG PPC guest on a
PPC host or whether it also remains at 0 as it does here on an Intel
host? If it does remain at 0 for a TCG PPC guest on a PPC host then we
don't need to work out whether we are running on a PPC host at all: we
can just skip the timebase adjustments made by timebase_load() if
!kvm_enabled() and everything should just work.

If this all seems reasonable then what I need to do for the PPC
g3beige/mac99 machines is add the relevant machine state, include the
existing vmstate_change_handler and then add the code to timebase_load()
to set the decrementer. I'm happy to do this as long as everyone agrees
this is a sensible approach?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration
  2017-09-15 12:45           ` Mark Cave-Ayland
@ 2017-09-19  8:36             ` David Gibson
  0 siblings, 0 replies; 33+ messages in thread
From: David Gibson @ 2017-09-19  8:36 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: Laurent Vivier, aik, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7876 bytes --]

On Fri, Sep 15, 2017 at 01:45:11PM +0100, Mark Cave-Ayland wrote:
> On 14/09/17 04:52, David Gibson wrote:
> 
> > On Wed, Sep 13, 2017 at 07:58:23PM +0200, Laurent Vivier wrote:
> >> On 13/09/2017 19:11, Mark Cave-Ayland wrote:
> >>> On 13/09/17 08:12, David Gibson wrote:
> >>>
> >>>> This is subtly incorrect.  It sets the DECR on load to exactly the
> >>>> value that was saved.  That effectively means that the DECR is frozen
> >>>> for the migration downtime, which probably isn't what we want.
> >>>>
> >>>> Instead we need to save the DECR as an offset from the timebase, and
> >>>> restore it relative to the (downtime adjusted) timebase on the
> >>>> destination.
> >>>>
> >>>> The complication with that is that the timebase is generally migrated
> >>>> at the machine level, not the cpu level: the timebase is generally
> >>>> synchronized between cpus in the machine, and migrating it at the
> >>>> individual cpu could break that.  Which means we probably need a
> >>>> machine level hook to handle the decrementer too, even though it
> >>>> logically *is* per-cpu, because otherwise we'll be trying to restore
> >>>> it before the timebase is restored.
> >>>
> >>> I know that we discussed this in-depth last year, however I was working
> >>> along the lines that Laurent's patch fixed this along the lines of our
> >>> previous discussion:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00338.html (and
> >>> indeed Laurent's analysis at
> >>> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01489.html).
> >>>
> >>> However looking again at the this patch in the context you mentioned
> >>> above, I'm starting to wonder if the right solution now is for the
> >>> machine init function for g3beige/mac99 to do the same as spapr, e.g.
> >>> add cpu_ppc_clock_vm_state_change() as a vm_change_state_handler and
> >>> then add VMSTATE_PPC_TIMEBASE_V from the machine PPCTimeBase into my new
> >>> subsection?
> >>>
> >>> Laurent, do you think that your state change handler would work
> >>> correctly in this way?
> >>
> >> I think all is explained in the second link you have mentioned, it seems 
> >> we don't need a state handler as KVM DECR will no be updated by the migrated value:
> >>
> >> hw/ppc/ppc.c
> >>
> >>     736 static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
> >>     737                                  QEMUTimer *timer,
> >>     738                                  void (*raise_excp)(void *),
> >>     739                                  void (*lower_excp)(PowerPCCPU *),
> >>     740                                  uint32_t decr, uint32_t value)
> >>     741 {
> >> ...
> >>     749     if (kvm_enabled()) {
> >>     750         /* KVM handles decrementer exceptions, we don't need our own timer */
> >>     751         return;
> >>     752     }
> >> ...
> >>
> >> But this allows to migrate it for TCG. And it should be correct because in case of TCG I think [need to check] timebase is stopped too (so offset is 0)
> >>
> >> David, do you agree with that?
> > 
> > Yes, I think so.  Some details might be different, but the basic idea
> > of migrating the timebase and decrementers at machine level should be
> > the same for pseries and g3beige/whatever.
> 
> >From my perspective, I'd really like to bury all of the timebase
> migration logic into timebase_load()/timebase_save() so then as you
> suggest above, everything will work regardless of the machine type and host.

That would certainly be ideal if we can manage it.

> Playing around with these functions I can see that they are very
> PPC-specific - for example cpu_get_host_ticks() can only ever work
> correctly migrating between 2 PPC hosts because here on Intel the value
> returned is the TSC register which is completely unrelated to the
> timebase frequency. Hence the calculated values are nonsense and the
> guest inevitably ends up freezing.

Ugh.  That sounds like it needs fixing.

> 
> I've had a go at converting this back to using the host clock to
> calculate the required adjustment to tb_offset (see
> https://github.com/mcayland/qemu/commit/68e31f0455a79d41b2ab703364c74ceb4d513154)
> but then even that struck me as incorrect since unless there are other
> host CPUs with an equivalent of tb_offset, this whole section of code
> should only ever run if the host CPU is PPC, and maybe even only under KVM.

Hm, maybe.  It certainly should be possible to do something
equivalent, recalculating guest TB as necessary from guest generic
clock, frequency and offsets in real time units.  It might end up
being desirable to have a special KVM path which works with raw TB
offsets for simplicity, though.

> My current thoughts are now as follows:
> 
> - The existing timebase_load() should never adjust tb_offset unless the
> host CPU is also PPC (how do we actually determine this?). Otherwise it
> should always be 0.

That doesn't sound right.  But to work out what does makes sense, we
need to figure out what "tb_offset" means in the non-KVM case, which
isn't immediately obvious to me.

> - The per-CPU decrementer values should be *stored* in the normal
> vmstate SPR array as per my latest patch at
> https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02346.html.

Uh.. maybe.  It kind of makes sense, but the lack of symmetry with the
load path is a bit messy, and may not play well with the way vmstate
descriptions work.

> BUT:
> 
> - The per-CPU decrementer values should be *restored* in timebase_load()
> but again should not be adjusted by tb_offset unless the host CPU is
> also PPC.

That doesn't seem right.  Regardless of whether we're KVM or TCG, we
should recalculate the DECR values based on the stored value adjusted
by how many guest timebase ticks have passed since then (which we'll
need to calculate based on qemu_clock_ns() combined with various other
bits of info (basically the same stuff we already need to compute the
correct timebase on load).

Or to put it another way timebase_load() should calculate (and/or
load) both the guest TB at save time and the guest TB at load time,
then adjust all the DECRs based on the delta between them.

For extra fun, to cover some of the embedded cpus, we'll need to
consider how to handle the PIT, which is sort-of-but-not-quite like
the DECR.

> The real question is whether tb_offset is used for a TCG PPC guest on a
> PPC host or whether it also remains at 0 as it does here on an Intel
> host? If it does remain at 0 for a TCG PPC guest on a PPC host then we
> don't need to work out whether we are running on a PPC host at all: we
> can just skip the timebase adjustments made by timebase_load() if
> !kvm_enabled() and everything should just work.

It kind of sounds like we need to rework that value entirely - or at
least rename it to something less misleading.  Basically to represent
the guest TB we want a tuple of a snapshot TB value with the qemu
clock time at which that snapshot was taken.  Based on that, plus the
tb frequency we can calculate a future TB value - and it's a
migratable quantity, which raw tb_offset is not (even between ppc
hosts).

Obviously there are several equivalent ways of representing that.

> If this all seems reasonable then what I need to do for the PPC
> g3beige/mac99 machines is add the relevant machine state, include the
> existing vmstate_change_handler and then add the code to timebase_load()
> to set the decrementer. I'm happy to do this as long as everyone agrees
> this is a sensible approach?
> 
> 
> ATB,
> 
> Mark.
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-09-19 10:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-10 14:37 [Qemu-devel] [PATCH 0/4] ppc: migration fixes for TCG Mark Cave-Ayland
2017-09-10 14:37 ` [Qemu-devel] [PATCH 1/4] ppc: change CPUPPCState access_type from int to uint8_t Mark Cave-Ayland
2017-09-10 16:30   ` Laurent Vivier
2017-09-10 18:00     ` Mark Cave-Ayland
2017-09-10 14:37 ` [Qemu-devel] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription Mark Cave-Ayland
2017-09-11  7:50   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2017-09-11  9:30     ` Dr. David Alan Gilbert
2017-09-11 10:48       ` David Gibson
2017-09-11 16:46         ` Mark Cave-Ayland
2017-09-11 17:19           ` Dr. David Alan Gilbert
2017-09-13  7:03             ` David Gibson
2017-09-12 16:21         ` Dr. David Alan Gilbert
2017-09-12 16:41           ` Mark Cave-Ayland
2017-09-12 16:46             ` Mark Cave-Ayland
2017-09-13  2:23               ` Alexey Kardashevskiy
2017-09-13  6:02                 ` David Gibson
2017-09-13 16:44                   ` Mark Cave-Ayland
2017-09-13 17:13                     ` Greg Kurz
2017-09-14  3:48                       ` David Gibson
2017-09-14  3:30                     ` David Gibson
2017-09-10 14:37 ` [Qemu-devel] [PATCH 3/4] ppc: add CPU access_type into the migration stream Mark Cave-Ayland
2017-09-11 10:57   ` David Gibson
2017-09-11 16:52     ` Mark Cave-Ayland
2017-09-13  7:19       ` David Gibson
2017-09-13 17:17         ` Mark Cave-Ayland
2017-09-14  3:54           ` David Gibson
2017-09-10 14:37 ` [Qemu-devel] [PATCH 4/4] ppc: ensure we update the decrementer value during migration Mark Cave-Ayland
2017-09-13  7:12   ` David Gibson
2017-09-13 17:11     ` Mark Cave-Ayland
2017-09-13 17:58       ` Laurent Vivier
2017-09-14  3:52         ` David Gibson
2017-09-15 12:45           ` Mark Cave-Ayland
2017-09-19  8:36             ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.