All of lore.kernel.org
 help / color / mirror / Atom feed
* ARM Snapshots Not Backwards-Compatible
@ 2021-02-03  4:01 Aaron Lindsay
  2021-02-03  8:21 ` Philippe Mathieu-Daudé
  2021-02-03 10:01 ` Peter Maydell
  0 siblings, 2 replies; 22+ messages in thread
From: Aaron Lindsay @ 2021-02-03  4:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell

Hello,

I'm attempting to restore an AArch64 snapshot taken on QEMU 4.1.0 on
QEMU 5.2.0, using system mode. My previous impression, possibly from
https://wiki.qemu.org/Features/Migration/Troubleshooting#Basics was that
this ought to work:

> Note that QEMU supports migrating forward between QEMU versions

Note that I'm using qemu-system-aarch64 with -loadvm.

However, I've run into several issues I thought I should report. The
first of them was that this commit changed the address of CBAR, which
resulted in a mismatch of the register IDs in `cpu_post_load` in
target/arm/machine.c:
https://patchwork.kernel.org/project/qemu-devel/patch/20190927144249.29999-2-peter.maydell@linaro.org/

The second was that several system registers have changed which bits are
allowed to be written in different circumstances, seemingly as a result
of a combination of bugfixes and implementation of additional behavior.
These hit errors detected in `write_list_to_cpustate` in
target/arm/helper.c.

The third is that meanings of the bits in env->features (as defined by
`enum arm_features` in target/arm/cpu.h) has shifted. For example,
ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
added since 4.1.0. Heck, even I have added a field there in the past.
Unfortunately, these additions/removals mean that when env->features is
saved on one version and restored on another the bits can mean different
things. Notably, the removal of the *VFP features means that a snapshot
of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
ARM_FEATURE_M on 5.2.0!

My guess, given the numerous issues and the additional complexity
required to properly implement backwards-compatible snapshotting, is
that this is not a primary goal. However, if it is a goal, what steps
can/should we take to support it more thoroughly?

Thanks!

-Aaron

p.s. Now for an admission: the snapshots I'm testing with were
originally taken with `-cpu max`. This was unintentional, and I
understand if the response is that I can't expect `-cpu max` checkpoints
to work across QEMU versions... but I also don't think that all of these
issues can be blamed on that (notably CBAR and env->features).


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03  4:01 ARM Snapshots Not Backwards-Compatible Aaron Lindsay
@ 2021-02-03  8:21 ` Philippe Mathieu-Daudé
  2021-02-03 10:27   ` Dr. David Alan Gilbert
  2021-02-03 10:01 ` Peter Maydell
  1 sibling, 1 reply; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-03  8:21 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: Peter Maydell, Andrew Jones, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, qemu-arm

Cc'ing migration team and qemu-arm@ list.

On 2/3/21 5:01 AM, Aaron Lindsay wrote:
> Hello,
> 
> I'm attempting to restore an AArch64 snapshot taken on QEMU 4.1.0 on
> QEMU 5.2.0, using system mode. My previous impression, possibly from
> https://wiki.qemu.org/Features/Migration/Troubleshooting#Basics was that
> this ought to work:
> 
>> Note that QEMU supports migrating forward between QEMU versions
> 
> Note that I'm using qemu-system-aarch64 with -loadvm.
> 
> However, I've run into several issues I thought I should report. The
> first of them was that this commit changed the address of CBAR, which
> resulted in a mismatch of the register IDs in `cpu_post_load` in
> target/arm/machine.c:
> https://patchwork.kernel.org/project/qemu-devel/patch/20190927144249.29999-2-peter.maydell@linaro.org/
> 
> The second was that several system registers have changed which bits are
> allowed to be written in different circumstances, seemingly as a result
> of a combination of bugfixes and implementation of additional behavior.
> These hit errors detected in `write_list_to_cpustate` in
> target/arm/helper.c.
> 
> The third is that meanings of the bits in env->features (as defined by
> `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> added since 4.1.0. Heck, even I have added a field there in the past.
> Unfortunately, these additions/removals mean that when env->features is
> saved on one version and restored on another the bits can mean different
> things. Notably, the removal of the *VFP features means that a snapshot
> of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> ARM_FEATURE_M on 5.2.0!
> 
> My guess, given the numerous issues and the additional complexity
> required to properly implement backwards-compatible snapshotting, is
> that this is not a primary goal. However, if it is a goal, what steps
> can/should we take to support it more thoroughly?
> 
> Thanks!
> 
> -Aaron
> 
> p.s. Now for an admission: the snapshots I'm testing with were
> originally taken with `-cpu max`. This was unintentional, and I
> understand if the response is that I can't expect `-cpu max` checkpoints
> to work across QEMU versions... but I also don't think that all of these
> issues can be blamed on that (notably CBAR and env->features).
> 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03  4:01 ARM Snapshots Not Backwards-Compatible Aaron Lindsay
  2021-02-03  8:21 ` Philippe Mathieu-Daudé
@ 2021-02-03 10:01 ` Peter Maydell
  2021-02-03 14:58   ` Aaron Lindsay
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Maydell @ 2021-02-03 10:01 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: QEMU Developers

On Wed, 3 Feb 2021 at 04:01, Aaron Lindsay <aaron@os.amperecomputing.com> wrote:
>
> Hello,
>
> I'm attempting to restore an AArch64 snapshot taken on QEMU 4.1.0 on
> QEMU 5.2.0, using system mode. My previous impression, possibly from
> https://wiki.qemu.org/Features/Migration/Troubleshooting#Basics was that
> this ought to work:
>
> > Note that QEMU supports migrating forward between QEMU versions
>
> Note that I'm using qemu-system-aarch64 with -loadvm.

You don't say what machine type and command line you're using. Strictly,
Strictly speaking, the intended guarantee only covers versioned
machines, eg "virt-4.1" on QEMU 4.1 to "virt-4.1" on QEMU 5.2.

> The third is that meanings of the bits in env->features (as defined by
> `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> added since 4.1.0. Heck, even I have added a field there in the past.
> Unfortunately, these additions/removals mean that when env->features is
> saved on one version and restored on another the bits can mean different
> things. Notably, the removal of the *VFP features means that a snapshot
> of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> ARM_FEATURE_M on 5.2.0!

Ow. I didn't realize the env->features was in the migration state :-(
There is no reason for it to be, because it's a constant property
of the CPU. The easy fix is to replace
       VMSTATE_UINT64(env.features, ARMCPU),
in target/arm/machine.c with whatever the syntax is for "ignore
64 bits of data here". Then we'll ignore whatever is coming in
from the source, which we don't need, and we'll stop sending it
out if we're the destination.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03  8:21 ` Philippe Mathieu-Daudé
@ 2021-02-03 10:27   ` Dr. David Alan Gilbert
  2021-02-03 10:38     ` Peter Maydell
  0 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-03 10:27 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Peter Maydell, Andrew Jones, Juan Quintela, qemu-devel,
	Aaron Lindsay, qemu-arm

* Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> Cc'ing migration team and qemu-arm@ list.

I'll have to leave the detail of that to the ARM peole; but from a
migration point of view I think we do want the 64 bit ARM migrations to
be stable now.  Please tie incompatible changes to machine types.

Dave

> On 2/3/21 5:01 AM, Aaron Lindsay wrote:
> > Hello,
> > 
> > I'm attempting to restore an AArch64 snapshot taken on QEMU 4.1.0 on
> > QEMU 5.2.0, using system mode. My previous impression, possibly from
> > https://wiki.qemu.org/Features/Migration/Troubleshooting#Basics was that
> > this ought to work:
> > 
> >> Note that QEMU supports migrating forward between QEMU versions
> > 
> > Note that I'm using qemu-system-aarch64 with -loadvm.
> > 
> > However, I've run into several issues I thought I should report. The
> > first of them was that this commit changed the address of CBAR, which
> > resulted in a mismatch of the register IDs in `cpu_post_load` in
> > target/arm/machine.c:
> > https://patchwork.kernel.org/project/qemu-devel/patch/20190927144249.29999-2-peter.maydell@linaro.org/
> > 
> > The second was that several system registers have changed which bits are
> > allowed to be written in different circumstances, seemingly as a result
> > of a combination of bugfixes and implementation of additional behavior.
> > These hit errors detected in `write_list_to_cpustate` in
> > target/arm/helper.c.
> > 
> > The third is that meanings of the bits in env->features (as defined by
> > `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> > ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> > ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> > added since 4.1.0. Heck, even I have added a field there in the past.
> > Unfortunately, these additions/removals mean that when env->features is
> > saved on one version and restored on another the bits can mean different
> > things. Notably, the removal of the *VFP features means that a snapshot
> > of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> > ARM_FEATURE_M on 5.2.0!
> > 
> > My guess, given the numerous issues and the additional complexity
> > required to properly implement backwards-compatible snapshotting, is
> > that this is not a primary goal. However, if it is a goal, what steps
> > can/should we take to support it more thoroughly?
> > 
> > Thanks!
> > 
> > -Aaron
> > 
> > p.s. Now for an admission: the snapshots I'm testing with were
> > originally taken with `-cpu max`. This was unintentional, and I
> > understand if the response is that I can't expect `-cpu max` checkpoints
> > to work across QEMU versions... but I also don't think that all of these
> > issues can be blamed on that (notably CBAR and env->features).
> > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:27   ` Dr. David Alan Gilbert
@ 2021-02-03 10:38     ` Peter Maydell
  2021-02-03 10:49       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Maydell @ 2021-02-03 10:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Andrew Jones, Juan Quintela, QEMU Developers, Aaron Lindsay,
	qemu-arm, Philippe Mathieu-Daudé

On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > Cc'ing migration team and qemu-arm@ list.
>
> I'll have to leave the detail of that to the ARM peole; but from a
> migration point of view I think we do want the 64 bit ARM migrations to
> be stable now.  Please tie incompatible changes to machine types.

That is the intention, but because there's no upstream testing
of migration compat, we never notice if we get it wrong.
What is x86 doing to keep cross-version migration working ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:38     ` Peter Maydell
@ 2021-02-03 10:49       ` Dr. David Alan Gilbert
  2021-02-03 10:52         ` Peter Maydell
  0 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-03 10:49 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Andrew Jones, Juan Quintela, QEMU Developers, Aaron Lindsay,
	qemu-arm, Philippe Mathieu-Daudé

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >
> > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > Cc'ing migration team and qemu-arm@ list.
> >
> > I'll have to leave the detail of that to the ARM peole; but from a
> > migration point of view I think we do want the 64 bit ARM migrations to
> > be stable now.  Please tie incompatible changes to machine types.
> 
> That is the intention, but because there's no upstream testing
> of migration compat, we never notice if we get it wrong.
> What is x86 doing to keep cross-version migration working ?

I know there used to be some of our team running Avocado tests for
compatibility regularly, I'm not sure of the current status.
It's something we also do regularly around when we do downstream
releases, so we tend to catch them then, although even on x86 that
often turns out to be a bit late.

Avocado can take a separate qemu path for source/destination.

Dave

> thanks
> -- PMM
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:49       ` Dr. David Alan Gilbert
@ 2021-02-03 10:52         ` Peter Maydell
  2021-02-03 10:59           ` Dr. David Alan Gilbert
  2021-02-03 12:44           ` Andrew Jones
  0 siblings, 2 replies; 22+ messages in thread
From: Peter Maydell @ 2021-02-03 10:52 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Andrew Jones, Juan Quintela, QEMU Developers, Aaron Lindsay,
	qemu-arm, Philippe Mathieu-Daudé

On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> * Peter Maydell (peter.maydell@linaro.org) wrote:
> > On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > >
> > > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > > Cc'ing migration team and qemu-arm@ list.
> > >
> > > I'll have to leave the detail of that to the ARM peole; but from a
> > > migration point of view I think we do want the 64 bit ARM migrations to
> > > be stable now.  Please tie incompatible changes to machine types.
> >
> > That is the intention, but because there's no upstream testing
> > of migration compat, we never notice if we get it wrong.
> > What is x86 doing to keep cross-version migration working ?
>
> I know there used to be some of our team running Avocado tests for
> compatibility regularly, I'm not sure of the current status.
> It's something we also do regularly around when we do downstream
> releases, so we tend to catch them then, although even on x86 that
> often turns out to be a bit late.

So downstream testing only? I think that unless we either (a) start
doing migration-compat testing consistently upstream or (b) RedHat or
some other downstream start testing and reporting compat issues
to us for aarch64 as they do for x86-64, in practice we're just
not going to have working migration compat despite our best
intentions. (None of the issues Aaron raises were deliberate
compat breaks -- they're all "we made a change we didn't think
affected migration but it turns out that it does".)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:52         ` Peter Maydell
@ 2021-02-03 10:59           ` Dr. David Alan Gilbert
  2021-02-03 16:04             ` Philippe Mathieu-Daudé
  2021-02-03 12:44           ` Andrew Jones
  1 sibling, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-03 10:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Andrew Jones, Juan Quintela, QEMU Developers, Aaron Lindsay,
	qemu-arm, Philippe Mathieu-Daudé

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > >
> > > > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > > > Cc'ing migration team and qemu-arm@ list.
> > > >
> > > > I'll have to leave the detail of that to the ARM peole; but from a
> > > > migration point of view I think we do want the 64 bit ARM migrations to
> > > > be stable now.  Please tie incompatible changes to machine types.
> > >
> > > That is the intention, but because there's no upstream testing
> > > of migration compat, we never notice if we get it wrong.
> > > What is x86 doing to keep cross-version migration working ?
> >
> > I know there used to be some of our team running Avocado tests for
> > compatibility regularly, I'm not sure of the current status.
> > It's something we also do regularly around when we do downstream
> > releases, so we tend to catch them then, although even on x86 that
> > often turns out to be a bit late.
> 
> So downstream testing only?

I thought there used to be some regular avocado testing of upstream but
I'm not sure if it's all architectures and I'm not sure if it's still
happening; I haven't seen any migration issues from it for a while,
which makes me think it isn't.

> I think that unless we either (a) start
> doing migration-compat testing consistently upstream or (b) RedHat or
> some other downstream start testing and reporting compat issues
> to us for aarch64 as they do for x86-64, in practice we're just
> not going to have working migration compat despite our best
> intentions. (None of the issues Aaron raises were deliberate
> compat breaks -- they're all "we made a change we didn't think
> affected migration but it turns out that it does".)

I'd agree; we still hit this too often on x86 as well.

Dave

> thanks
> -- PMM
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:52         ` Peter Maydell
  2021-02-03 10:59           ` Dr. David Alan Gilbert
@ 2021-02-03 12:44           ` Andrew Jones
  2021-02-03 15:45             ` aaron--- via
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-02-03 12:44 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, QEMU Developers, Dr. David Alan Gilbert,
	Aaron Lindsay, qemu-arm, Philippe Mathieu-Daudé

On Wed, Feb 03, 2021 at 10:52:59AM +0000, Peter Maydell wrote:
> On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > >
> > > > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > > > Cc'ing migration team and qemu-arm@ list.
> > > >
> > > > I'll have to leave the detail of that to the ARM peole; but from a
> > > > migration point of view I think we do want the 64 bit ARM migrations to
> > > > be stable now.  Please tie incompatible changes to machine types.
> > >
> > > That is the intention, but because there's no upstream testing
> > > of migration compat, we never notice if we get it wrong.
> > > What is x86 doing to keep cross-version migration working ?
> >
> > I know there used to be some of our team running Avocado tests for
> > compatibility regularly, I'm not sure of the current status.
> > It's something we also do regularly around when we do downstream
> > releases, so we tend to catch them then, although even on x86 that
> > often turns out to be a bit late.
> 
> So downstream testing only?

Not even downstream for the Arm architecture, at least not at Red Hat. The
support we have for Arm Virt is still limited to the use cases for which
it has been deployed. To this day that hasn't included migration[*].

> I think that unless we either (a) start
> doing migration-compat testing consistently upstream or

This is the best choice and it can certainly be an additional approach
regardless of what goes on downstream. I can look into this. A pointer
to the x86 tests would be a good start. It's pretty simple to do a
quick migration test between two versions of QEMU, but we need the
whole build two versions of QEMU stuff first, which I hope already
exists.

> (b) RedHat or
> some other downstream start testing and reporting compat issues
> to us for aarch64 as they do for x86-64,

Red Hat isn't currently allocating resources to this type of testing
for AArch64. And, even if it were to start, it would likely focus on
compat testing of RHEL arm-virt machine types, which aren't exactly the
same thing as the upstream arm-virt machine types. In fact, currently,
there are only two (virt-rhel8.2.0 and virt-rhel8.3.0).

Thanks,
drew

[*] Migration support hasn't been a real high priority for AArch64 KVM,
    because it currently requires identical host CPU types and identical
    host kernels to work. It'd be nice if the destination host kernel
    could at least be a later version, but not even that is guaranteed
    to work at this time.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:01 ` Peter Maydell
@ 2021-02-03 14:58   ` Aaron Lindsay
  2021-02-03 15:02     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 22+ messages in thread
From: Aaron Lindsay @ 2021-02-03 14:58 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Andrew Jones, Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	qemu-arm, philmd

On Feb 03 10:01, Peter Maydell wrote:
> On Wed, 3 Feb 2021 at 04:01, Aaron Lindsay <aaron@os.amperecomputing.com> wrote:
> > Note that I'm using qemu-system-aarch64 with -loadvm.
> 
> You don't say what machine type and command line you're using. Strictly,
> Strictly speaking, the intended guarantee only covers versioned
> machines, eg "virt-4.1" on QEMU 4.1 to "virt-4.1" on QEMU 5.2.

Sorry for the omission, `-M virt` was used to capture the snapshots on
4.1.0 and I'm using `-M virt-4.1` to restore it on 5.2.0. You get a
fairly helpful error message about that one if you screw it up:

"qemu-system-aarch64: Machine type received is 'virt-4.1' and local is 'virt-5.2'"

I'm restoring using the following command-line on 5.2.0. Snapshots were taken
with the same, except `-M virt` and without `-loadvm` on 4.1.0:

$ qemu-system-aarch64 -smp 1 -M virt-4.1 -cpu max -m 8G -icount 0 -nographic -semihosting -kernel ./Image -append 'console=ttyAMA0 root=/dev/vda3 rw earlycon=smh' -drive if=none,id=image,file=rootfs.qcow2,format=qcow2 -device virtio-blk-pci,drive=image -loadvm "snapshot-name"

I suspect that some of the differences I raised in this section:

> > These hit errors detected in `write_list_to_cpustate` in
> > target/arm/helper.c.

may be from the fact that I'm running on -max, which understandably may cause
changes in the ID registers between QEMU versions. I'm less concerned about
issues related to my usage of -max than I am about the others.

> > The third is that meanings of the bits in env->features (as defined by
> > `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> > ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> > ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> > added since 4.1.0. Heck, even I have added a field there in the past.
> > Unfortunately, these additions/removals mean that when env->features is
> > saved on one version and restored on another the bits can mean different
> > things. Notably, the removal of the *VFP features means that a snapshot
> > of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> > ARM_FEATURE_M on 5.2.0!
> 
> Ow. I didn't realize the env->features was in the migration state :-(
> There is no reason for it to be, because it's a constant property
> of the CPU. The easy fix is to replace
>        VMSTATE_UINT64(env.features, ARMCPU),
> in target/arm/machine.c with whatever the syntax is for "ignore
> 64 bits of data here". Then we'll ignore whatever is coming in
> from the source, which we don't need, and we'll stop sending it
> out if we're the destination.

I'll look into this.

-Aaron


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 14:58   ` Aaron Lindsay
@ 2021-02-03 15:02     ` Philippe Mathieu-Daudé
  2021-02-03 15:10       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-03 15:02 UTC (permalink / raw)
  To: Aaron Lindsay, Peter Maydell
  Cc: Andrew Jones, qemu-arm, qemu-devel, Dr. David Alan Gilbert,
	Juan Quintela

On 2/3/21 3:58 PM, Aaron Lindsay wrote:
> On Feb 03 10:01, Peter Maydell wrote:
>>> The third is that meanings of the bits in env->features (as defined by
>>> `enum arm_features` in target/arm/cpu.h) has shifted. For example,
>>> ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
>>> ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
>>> added since 4.1.0. Heck, even I have added a field there in the past.
>>> Unfortunately, these additions/removals mean that when env->features is
>>> saved on one version and restored on another the bits can mean different
>>> things. Notably, the removal of the *VFP features means that a snapshot
>>> of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
>>> ARM_FEATURE_M on 5.2.0!
>>
>> Ow. I didn't realize the env->features was in the migration state :-(
>> There is no reason for it to be, because it's a constant property
>> of the CPU. The easy fix is to replace
>>        VMSTATE_UINT64(env.features, ARMCPU),
>> in target/arm/machine.c with whatever the syntax is for "ignore
>> 64 bits of data here". Then we'll ignore whatever is coming in
>> from the source, which we don't need, and we'll stop sending it
>> out if we're the destination.
> 
> I'll look into this.

I think this is:

  VMSTATE_UNUSED(sizeof(uint64_t))

> 
> -Aaron
> 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 15:02     ` Philippe Mathieu-Daudé
@ 2021-02-03 15:10       ` Dr. David Alan Gilbert
  2021-02-03 15:26         ` Peter Maydell
  2021-02-03 15:42         ` ARM Snapshots Not Backwards-Compatible aaron--- via
  0 siblings, 2 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-03 15:10 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Peter Maydell, Andrew Jones, Juan Quintela, qemu-devel,
	Aaron Lindsay, qemu-arm

* Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> On 2/3/21 3:58 PM, Aaron Lindsay wrote:
> > On Feb 03 10:01, Peter Maydell wrote:
> >>> The third is that meanings of the bits in env->features (as defined by
> >>> `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> >>> ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> >>> ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> >>> added since 4.1.0. Heck, even I have added a field there in the past.
> >>> Unfortunately, these additions/removals mean that when env->features is
> >>> saved on one version and restored on another the bits can mean different
> >>> things. Notably, the removal of the *VFP features means that a snapshot
> >>> of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> >>> ARM_FEATURE_M on 5.2.0!
> >>
> >> Ow. I didn't realize the env->features was in the migration state :-(
> >> There is no reason for it to be, because it's a constant property
> >> of the CPU. The easy fix is to replace
> >>        VMSTATE_UINT64(env.features, ARMCPU),
> >> in target/arm/machine.c with whatever the syntax is for "ignore
> >> 64 bits of data here". Then we'll ignore whatever is coming in
> >> from the source, which we don't need, and we'll stop sending it
> >> out if we're the destination.
> > 
> > I'll look into this.
> 
> I think this is:
> 
>   VMSTATE_UNUSED(sizeof(uint64_t))

It's interesting that on x86 we've got a longterm request to *add* cpu
features to the stream to detect screwups caused by using mismatched
CPUs; so it's not necessarily a bad idea to include it once you realise
it's there.

Dave

> > 
> > -Aaron
> > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 15:10       ` Dr. David Alan Gilbert
@ 2021-02-03 15:26         ` Peter Maydell
  2021-02-03 15:54           ` Aaron Lindsay
  2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
  2021-02-03 15:42         ` ARM Snapshots Not Backwards-Compatible aaron--- via
  1 sibling, 2 replies; 22+ messages in thread
From: Peter Maydell @ 2021-02-03 15:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Andrew Jones, Juan Quintela, QEMU Developers, Aaron Lindsay,
	qemu-arm, Philippe Mathieu-Daudé

On Wed, 3 Feb 2021 at 15:11, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> It's interesting that on x86 we've got a longterm request to *add* cpu
> features to the stream to detect screwups caused by using mismatched
> CPUs; so it's not necessarily a bad idea to include it once you realise
> it's there.

I think we would want to do that by checking the ID registers,
not the legacy ad-hoc feature-flags word. In fact I think for
KVM at least we already perform the check on the ID registers,
in that the kernel will return an error if we pass it a value
for an ID register and it's not the value it should be.
I forget whether we try to check this for TCG (the mechanism
for handling sysreg migration there is different).

As a side note I suspect that the first two issues Aaron ran
into are TCG-only and don't affect KVM.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 15:10       ` Dr. David Alan Gilbert
  2021-02-03 15:26         ` Peter Maydell
@ 2021-02-03 15:42         ` aaron--- via
  1 sibling, 0 replies; 22+ messages in thread
From: aaron--- via @ 2021-02-03 15:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Peter Maydell, Andrew Jones, Juan Quintela, qemu-devel, qemu-arm,
	Philippe Mathieu-Daudé

On Feb 03 15:10, Dr. David Alan Gilbert wrote:
> * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > On 2/3/21 3:58 PM, Aaron Lindsay wrote:
> > > On Feb 03 10:01, Peter Maydell wrote:
> > >>> The third is that meanings of the bits in env->features (as defined by
> > >>> `enum arm_features` in target/arm/cpu.h) has shifted. For example,
> > >>> ARM_FEATURE_PXN, ARM_FEATURE_CRC, ARM_FEATURE_VFP, ARM_FEATURE_VFP3,
> > >>> ARM_FEATURE_VFP4 have all been removed and ARM_FEATURE_V8_1M has been
> > >>> added since 4.1.0. Heck, even I have added a field there in the past.
> > >>> Unfortunately, these additions/removals mean that when env->features is
> > >>> saved on one version and restored on another the bits can mean different
> > >>> things. Notably, the removal of the *VFP features means that a snapshot
> > >>> of a CPU reporting it supports ARM_FEATURE_VFP3 on 4.1.0 thinks it's now
> > >>> ARM_FEATURE_M on 5.2.0!
> > >>
> > >> Ow. I didn't realize the env->features was in the migration state :-(
> > >> There is no reason for it to be, because it's a constant property
> > >> of the CPU. The easy fix is to replace
> > >>        VMSTATE_UINT64(env.features, ARMCPU),
> > >> in target/arm/machine.c with whatever the syntax is for "ignore
> > >> 64 bits of data here". Then we'll ignore whatever is coming in
> > >> from the source, which we don't need, and we'll stop sending it
> > >> out if we're the destination.
> > > 
> > > I'll look into this.
> > 
> > I think this is:
> > 
> >   VMSTATE_UNUSED(sizeof(uint64_t))
> 
> It's interesting that on x86 we've got a longterm request to *add* cpu
> features to the stream to detect screwups caused by using mismatched
> CPUs; so it's not necessarily a bad idea to include it once you realise
> it's there.

Another approach would be to assign integer values to the members of
`enum arm_features` in target/arm/cpu.h instead of letting the compiler
assign them for us. Of course, this only fixes the problem moving
forward and doesn't allow future QEMU versions to gain the ability to
load snapshots taken with previous QEMU versions. But seeing as how
no one else has raised the issue, I'm not sure how many people this
impacts.

I've tested the `VMSTATE_UNUSED(sizeof(uint64_t))` change and it appears
to have the same effect as me manually fixing up env.features (which is
to say: it works).

Maybe we could change it to `VMSTATE_UINT64(env.incoming_features,
ARMCPU),` instead and add a check in `cpu_post_load` that the
incoming_features match the expected ones (along with something matching
in `cpu_pre_save` to populate it)? We could ignore the incoming_features
field entirely in cpu_post_load for old snapshot versions, but check
that they match moving forward (perhaps generating a mask from the
current value of the enum to ignore fields which have been removed)?

I can send a patch for whichever path we settle on.

-Aaron


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 12:44           ` Andrew Jones
@ 2021-02-03 15:45             ` aaron--- via
  2021-02-03 15:53               ` Andrew Jones
  0 siblings, 1 reply; 22+ messages in thread
From: aaron--- via @ 2021-02-03 15:45 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Juan Quintela, QEMU Developers,
	Dr. David Alan Gilbert, qemu-arm, Philippe Mathieu-Daudé

On Feb 03 13:44, Andrew Jones wrote:
> On Wed, Feb 03, 2021 at 10:52:59AM +0000, Peter Maydell wrote:
> > On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > >
> > > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > > On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > > >
> > > > > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > > > > Cc'ing migration team and qemu-arm@ list.
> > > > >
> > > > > I'll have to leave the detail of that to the ARM peole; but from a
> > > > > migration point of view I think we do want the 64 bit ARM migrations to
> > > > > be stable now.  Please tie incompatible changes to machine types.
> > > >
> > > > That is the intention, but because there's no upstream testing
> > > > of migration compat, we never notice if we get it wrong.
> > > > What is x86 doing to keep cross-version migration working ?
> > >
> > > I know there used to be some of our team running Avocado tests for
> > > compatibility regularly, I'm not sure of the current status.
> > > It's something we also do regularly around when we do downstream
> > > releases, so we tend to catch them then, although even on x86 that
> > > often turns out to be a bit late.
> > 
> > So downstream testing only?
> 
> Not even downstream for the Arm architecture, at least not at Red Hat. The
> support we have for Arm Virt is still limited to the use cases for which
> it has been deployed. To this day that hasn't included migration[*].
> 
> > I think that unless we either (a) start
> > doing migration-compat testing consistently upstream or
> 
> This is the best choice and it can certainly be an additional approach
> regardless of what goes on downstream. I can look into this. A pointer
> to the x86 tests would be a good start. It's pretty simple to do a
> quick migration test between two versions of QEMU, but we need the
> whole build two versions of QEMU stuff first, which I hope already
> exists.

Does this mean that this is largely an issue of developing the tests,
and not a need for a place to host them? Or would additional
hardware/hosting be required for these tests to be run on?

-Aaron


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 15:45             ` aaron--- via
@ 2021-02-03 15:53               ` Andrew Jones
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-02-03 15:53 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: Peter Maydell, Juan Quintela, QEMU Developers,
	Dr. David Alan Gilbert, qemu-arm, Philippe Mathieu-Daudé

On Wed, Feb 03, 2021 at 10:45:47AM -0500, Aaron Lindsay wrote:
> On Feb 03 13:44, Andrew Jones wrote:
> > On Wed, Feb 03, 2021 at 10:52:59AM +0000, Peter Maydell wrote:
> > > On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > >
> > > > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > > > On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > > > >
> > > > > > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > > > > > > Cc'ing migration team and qemu-arm@ list.
> > > > > >
> > > > > > I'll have to leave the detail of that to the ARM peole; but from a
> > > > > > migration point of view I think we do want the 64 bit ARM migrations to
> > > > > > be stable now.  Please tie incompatible changes to machine types.
> > > > >
> > > > > That is the intention, but because there's no upstream testing
> > > > > of migration compat, we never notice if we get it wrong.
> > > > > What is x86 doing to keep cross-version migration working ?
> > > >
> > > > I know there used to be some of our team running Avocado tests for
> > > > compatibility regularly, I'm not sure of the current status.
> > > > It's something we also do regularly around when we do downstream
> > > > releases, so we tend to catch them then, although even on x86 that
> > > > often turns out to be a bit late.
> > > 
> > > So downstream testing only?
> > 
> > Not even downstream for the Arm architecture, at least not at Red Hat. The
> > support we have for Arm Virt is still limited to the use cases for which
> > it has been deployed. To this day that hasn't included migration[*].
> > 
> > > I think that unless we either (a) start
> > > doing migration-compat testing consistently upstream or
> > 
> > This is the best choice and it can certainly be an additional approach
> > regardless of what goes on downstream. I can look into this. A pointer
> > to the x86 tests would be a good start. It's pretty simple to do a
> > quick migration test between two versions of QEMU, but we need the
> > whole build two versions of QEMU stuff first, which I hope already
> > exists.
> 
> Does this mean that this is largely an issue of developing the tests,
> and not a need for a place to host them? Or would additional
> hardware/hosting be required for these tests to be run on?
>

I wouldn't expect the new tests to require new resources. Whatever
upstream already has for the tests that are already being run should be
sufficient. I think we just need to see if we can develop/enable these
types of tests in the already present CI. At least for starters...

Thanks,
drew



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 15:26         ` Peter Maydell
@ 2021-02-03 15:54           ` Aaron Lindsay
  2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
  1 sibling, 0 replies; 22+ messages in thread
From: Aaron Lindsay @ 2021-02-03 15:54 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Andrew Jones, Juan Quintela, Dr. David Alan Gilbert,
	QEMU Developers, qemu-arm, Philippe Mathieu-Daudé

On Feb 03 15:26, Peter Maydell wrote:
> On Wed, 3 Feb 2021 at 15:11, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > It's interesting that on x86 we've got a longterm request to *add* cpu
> > features to the stream to detect screwups caused by using mismatched
> > CPUs; so it's not necessarily a bad idea to include it once you realise
> > it's there.
> 
> I think we would want to do that by checking the ID registers,
> not the legacy ad-hoc feature-flags word. In fact I think for
> KVM at least we already perform the check on the ID registers,
> in that the kernel will return an error if we pass it a value
> for an ID register and it's not the value it should be.
> I forget whether we try to check this for TCG (the mechanism
> for handling sysreg migration there is different).

Ah, I sent my reply where I mentioned attempting to find a way to
selectively check env.features before I saw this. I'll just send the
patch for `VMSTATE_UNUSED(sizeof(uint64_t))`.

-Aaron


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: ARM Snapshots Not Backwards-Compatible
  2021-02-03 10:59           ` Dr. David Alan Gilbert
@ 2021-02-03 16:04             ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-03 16:04 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, avocado-devel
  Cc: Peter Maydell, Andrew Jones, Juan Quintela, QEMU Developers,
	Aaron Lindsay, qemu-arm

On 2/3/21 11:59 AM, Dr. David Alan Gilbert wrote:
> * Peter Maydell (peter.maydell@linaro.org) wrote:
>> On Wed, 3 Feb 2021 at 10:49, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>>>
>>> * Peter Maydell (peter.maydell@linaro.org) wrote:
>>>> On Wed, 3 Feb 2021 at 10:28, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>>>>>
>>>>> * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
>>>>>> Cc'ing migration team and qemu-arm@ list.
>>>>>
>>>>> I'll have to leave the detail of that to the ARM peole; but from a
>>>>> migration point of view I think we do want the 64 bit ARM migrations to
>>>>> be stable now.  Please tie incompatible changes to machine types.
>>>>
>>>> That is the intention, but because there's no upstream testing
>>>> of migration compat, we never notice if we get it wrong.
>>>> What is x86 doing to keep cross-version migration working ?
>>>
>>> I know there used to be some of our team running Avocado tests for
>>> compatibility regularly, I'm not sure of the current status.
>>> It's something we also do regularly around when we do downstream
>>> releases, so we tend to catch them then, although even on x86 that
>>> often turns out to be a bit late.
>>
>> So downstream testing only?
> 
> I thought there used to be some regular avocado testing of upstream but
> I'm not sure if it's all architectures and I'm not sure if it's still
> happening; I haven't seen any migration issues from it for a while,
> which makes me think it isn't.
> 
>> I think that unless we either (a) start
>> doing migration-compat testing consistently upstream or (b) RedHat or
>> some other downstream start testing and reporting compat issues
>> to us for aarch64 as they do for x86-64, in practice we're just
>> not going to have working migration compat despite our best
>> intentions. (None of the issues Aaron raises were deliberate
>> compat breaks -- they're all "we made a change we didn't think
>> affected migration but it turns out that it does".)
> 
> I'd agree; we still hit this too often on x86 as well.

Cc'ing avocado-devel, since previously discussed:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg654139.html



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] target/arm: Don't migrate CPUARMState.features
  2021-02-03 15:26         ` Peter Maydell
  2021-02-03 15:54           ` Aaron Lindsay
@ 2021-02-03 16:13           ` Aaron Lindsay
  2021-02-03 16:24             ` Philippe Mathieu-Daudé
                               ` (2 more replies)
  1 sibling, 3 replies; 22+ messages in thread
From: Aaron Lindsay @ 2021-02-03 16:13 UTC (permalink / raw)
  To: qemu-devel, Peter Maydell
  Cc: Andrew Jones, Juan Quintela, Dr . David Alan Gilbert,
	Aaron Lindsay, qemu-arm, philmd

As feature flags are added or removed, the meanings of bits in the
`features` field can change between QEMU versions, causing migration
failures. Additionally, migrating the field is not useful because it is
a constant function of the CPU being used.

Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/machine.c b/target/arm/machine.c
index c9e9fd0a12..7f2511b6ed 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -834,7 +834,7 @@ const VMStateDescription vmstate_arm_cpu = {
         VMSTATE_UINT64(env.exclusive_addr, ARMCPU),
         VMSTATE_UINT64(env.exclusive_val, ARMCPU),
         VMSTATE_UINT64(env.exclusive_high, ARMCPU),
-        VMSTATE_UINT64(env.features, ARMCPU),
+        VMSTATE_UNUSED(sizeof(uint64_t)),
         VMSTATE_UINT32(env.exception.syndrome, ARMCPU),
         VMSTATE_UINT32(env.exception.fsr, ARMCPU),
         VMSTATE_UINT64(env.exception.vaddress, ARMCPU),
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] target/arm: Don't migrate CPUARMState.features
  2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
@ 2021-02-03 16:24             ` Philippe Mathieu-Daudé
  2021-02-03 20:06             ` Andrew Jones
  2021-02-08 13:11             ` Peter Maydell
  2 siblings, 0 replies; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-03 16:24 UTC (permalink / raw)
  To: Aaron Lindsay, qemu-devel, Peter Maydell
  Cc: Andrew Jones, qemu-arm, Dr . David Alan Gilbert, Juan Quintela

On 2/3/21 5:13 PM, Aaron Lindsay wrote:
> As feature flags are added or removed, the meanings of bits in the
> `features` field can change between QEMU versions, causing migration
> failures. Additionally, migrating the field is not useful because it is
> a constant function of the CPU being used.

Please don't bury patches within mailing list threads.

BTW you found yet another 13 years old problem :)
918f5dca18d ("target-arm: Extend feature flags to 64 bits")
aa941b94450 ("Savevm/loadvm bits for ARM core, the PXA2xx peripherals
and Spitz hardware.")

> 
> Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
> Suggested-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/machine.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/arm/machine.c b/target/arm/machine.c
> index c9e9fd0a12..7f2511b6ed 100644
> --- a/target/arm/machine.c
> +++ b/target/arm/machine.c
> @@ -834,7 +834,7 @@ const VMStateDescription vmstate_arm_cpu = {
>          VMSTATE_UINT64(env.exclusive_addr, ARMCPU),
>          VMSTATE_UINT64(env.exclusive_val, ARMCPU),
>          VMSTATE_UINT64(env.exclusive_high, ARMCPU),
> -        VMSTATE_UINT64(env.features, ARMCPU),
> +        VMSTATE_UNUSED(sizeof(uint64_t)),
>          VMSTATE_UINT32(env.exception.syndrome, ARMCPU),
>          VMSTATE_UINT32(env.exception.fsr, ARMCPU),
>          VMSTATE_UINT64(env.exception.vaddress, ARMCPU),
> 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] target/arm: Don't migrate CPUARMState.features
  2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
  2021-02-03 16:24             ` Philippe Mathieu-Daudé
@ 2021-02-03 20:06             ` Andrew Jones
  2021-02-08 13:11             ` Peter Maydell
  2 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-02-03 20:06 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: Peter Maydell, Juan Quintela, qemu-devel,
	Dr . David Alan Gilbert, qemu-arm, philmd

On Wed, Feb 03, 2021 at 11:13:40AM -0500, Aaron Lindsay wrote:
> As feature flags are added or removed, the meanings of bits in the
> `features` field can change between QEMU versions, causing migration
> failures. Additionally, migrating the field is not useful because it is
> a constant function of the CPU being used.
> 
> Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
> Suggested-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/machine.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/arm/machine.c b/target/arm/machine.c
> index c9e9fd0a12..7f2511b6ed 100644
> --- a/target/arm/machine.c
> +++ b/target/arm/machine.c
> @@ -834,7 +834,7 @@ const VMStateDescription vmstate_arm_cpu = {
>          VMSTATE_UINT64(env.exclusive_addr, ARMCPU),
>          VMSTATE_UINT64(env.exclusive_val, ARMCPU),
>          VMSTATE_UINT64(env.exclusive_high, ARMCPU),
> -        VMSTATE_UINT64(env.features, ARMCPU),
> +        VMSTATE_UNUSED(sizeof(uint64_t)),
>          VMSTATE_UINT32(env.exception.syndrome, ARMCPU),
>          VMSTATE_UINT32(env.exception.fsr, ARMCPU),
>          VMSTATE_UINT64(env.exception.vaddress, ARMCPU),
> -- 
> 2.17.1
>

I tested this with a KVM guest. Prior to this patch, a migration of a 4.1
guest from a 4.1 build of QEMU to a build of the latest bits of QEMU would
fail with a segmentation fault. With this patch, the migration succeeds.
However, migrating a virt-4.1 machine type from a QEMU of the latest bits
to a 4.1 build of QEMU still fails with

  qemu-system-aarch64: error while loading state for instance 0x0 of device 'pl011'
  qemu-system-aarch64: load of migration failed: No such file or directory

This means "ping-pong" migrations fail, which is another thing we would
prefer to work. But, as that appears to be a different bug, regarding this
patch

Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] target/arm: Don't migrate CPUARMState.features
  2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
  2021-02-03 16:24             ` Philippe Mathieu-Daudé
  2021-02-03 20:06             ` Andrew Jones
@ 2021-02-08 13:11             ` Peter Maydell
  2 siblings, 0 replies; 22+ messages in thread
From: Peter Maydell @ 2021-02-08 13:11 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: Andrew Jones, Juan Quintela, QEMU Developers,
	Dr . David Alan Gilbert, qemu-arm, Philippe Mathieu-Daudé

On Wed, 3 Feb 2021 at 16:14, Aaron Lindsay <aaron@os.amperecomputing.com> wrote:
>
> As feature flags are added or removed, the meanings of bits in the
> `features` field can change between QEMU versions, causing migration
> failures. Additionally, migrating the field is not useful because it is
> a constant function of the CPU being used.
>
> Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com>
> Suggested-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/machine.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)



Applied to target-arm.next, thanks.

-- PMM


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-02-08 19:24 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-03  4:01 ARM Snapshots Not Backwards-Compatible Aaron Lindsay
2021-02-03  8:21 ` Philippe Mathieu-Daudé
2021-02-03 10:27   ` Dr. David Alan Gilbert
2021-02-03 10:38     ` Peter Maydell
2021-02-03 10:49       ` Dr. David Alan Gilbert
2021-02-03 10:52         ` Peter Maydell
2021-02-03 10:59           ` Dr. David Alan Gilbert
2021-02-03 16:04             ` Philippe Mathieu-Daudé
2021-02-03 12:44           ` Andrew Jones
2021-02-03 15:45             ` aaron--- via
2021-02-03 15:53               ` Andrew Jones
2021-02-03 10:01 ` Peter Maydell
2021-02-03 14:58   ` Aaron Lindsay
2021-02-03 15:02     ` Philippe Mathieu-Daudé
2021-02-03 15:10       ` Dr. David Alan Gilbert
2021-02-03 15:26         ` Peter Maydell
2021-02-03 15:54           ` Aaron Lindsay
2021-02-03 16:13           ` [PATCH] target/arm: Don't migrate CPUARMState.features Aaron Lindsay
2021-02-03 16:24             ` Philippe Mathieu-Daudé
2021-02-03 20:06             ` Andrew Jones
2021-02-08 13:11             ` Peter Maydell
2021-02-03 15:42         ` ARM Snapshots Not Backwards-Compatible aaron--- via

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.