* Kernel patch cases qemu live migration failed. @ 2020-10-15 4:06 ` 张东旭 0 siblings, 0 replies; 30+ messages in thread From: 张东旭 @ 2020-10-15 4:06 UTC (permalink / raw) To: Dave.Martin, marc.zyngier; +Cc: kvmarm, qemu-devel [-- Attachment #1: Type: text/plain, Size: 1193 bytes --] I'm so sorry for disturbing you.When I apply this kernel patch:KVM: arm64/sve: System register context switch and access supporthttps://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96ddqemu live migration will failed with messages: qemu-kvm: Invalid value 233 expecting positive value <= 232 qemu-kvm: Failed to load cpu:cpreg_vmstate_array_lenNew version kernel exclude ID_AA64ZFR0_EL1 register when host not support SVE feature, so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 register. I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 (migration source) new kernel version:4.18.0-147.5.el8 (migration destination) CentOS linux kernel version 4.18.0-111.el8 applied this patch. So 4.18.0-147.5.el8 also applied this patch. Migration source and destination hosts have the same hardware, just kernel version is different, and the hardware on either side of the migration not support SVE. Is there some good suggestions,which can make sure old version kernel live migration to new version kernel with qemu?Thanks a lot. [-- Attachment #2: Type: text/html, Size: 9109 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Kernel patch cases qemu live migration failed. @ 2020-10-15 4:06 ` 张东旭 0 siblings, 0 replies; 30+ messages in thread From: 张东旭 @ 2020-10-15 4:06 UTC (permalink / raw) To: Dave.Martin, marc.zyngier; +Cc: kvmarm, qemu-devel [-- Attachment #1.1: Type: text/plain, Size: 1193 bytes --] I'm so sorry for disturbing you.When I apply this kernel patch:KVM: arm64/sve: System register context switch and access supporthttps://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96ddqemu live migration will failed with messages: qemu-kvm: Invalid value 233 expecting positive value <= 232 qemu-kvm: Failed to load cpu:cpreg_vmstate_array_lenNew version kernel exclude ID_AA64ZFR0_EL1 register when host not support SVE feature, so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 register. I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 (migration source) new kernel version:4.18.0-147.5.el8 (migration destination) CentOS linux kernel version 4.18.0-111.el8 applied this patch. So 4.18.0-147.5.el8 also applied this patch. Migration source and destination hosts have the same hardware, just kernel version is different, and the hardware on either side of the migration not support SVE. Is there some good suggestions,which can make sure old version kernel live migration to new version kernel with qemu?Thanks a lot. [-- Attachment #1.2: Type: text/html, Size: 9109 bytes --] [-- Attachment #2: Type: text/plain, Size: 151 bytes --] _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 4:06 ` 张东旭 @ 2020-10-15 11:26 ` Marc Zyngier -1 siblings, 0 replies; 30+ messages in thread From: Marc Zyngier @ 2020-10-15 11:26 UTC (permalink / raw) To: xu910121; +Cc: qemu-devel, Dave.Martin, kvmarm Hi Please don't use my arm.com address anymore, nobody reads it... On 2020-10-15 05:06, 张东旭 wrote: > I'm so sorry for disturbing you. > > When I apply this kernel patch:KVM: arm64/sve: System register > context switch and access support > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd This patch was only introduced in 5.2, and wasn't supposed to be backported to anything else. > qemu live migration will failed with messages: > qemu-kvm: Invalid value 233 expecting positive value > <= 232 > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > support SVE feature, > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > register. > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > (migration source) > new kernel version:4.18.0-147.5.el8 (migration destination) > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > 4.18.0-147.5.el8 also applied this patch. > > Migration source and destination hosts have the same hardware, just > kernel version is different, > and the hardware on either side of the migration not support SVE. > > Is there some good suggestions,which can make sure old version > kernel live migration to new version kernel with qemu? I['m afraid you should take this with your distribution of choice, unless you can reproduce the problem with mainline kernels. Thanks, M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 11:26 ` Marc Zyngier 0 siblings, 0 replies; 30+ messages in thread From: Marc Zyngier @ 2020-10-15 11:26 UTC (permalink / raw) To: xu910121; +Cc: qemu-devel, Dave.Martin, kvmarm Hi Please don't use my arm.com address anymore, nobody reads it... On 2020-10-15 05:06, 张东旭 wrote: > I'm so sorry for disturbing you. > > When I apply this kernel patch:KVM: arm64/sve: System register > context switch and access support > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd This patch was only introduced in 5.2, and wasn't supposed to be backported to anything else. > qemu live migration will failed with messages: > qemu-kvm: Invalid value 233 expecting positive value > <= 232 > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > support SVE feature, > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > register. > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > (migration source) > new kernel version:4.18.0-147.5.el8 (migration destination) > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > 4.18.0-147.5.el8 also applied this patch. > > Migration source and destination hosts have the same hardware, just > kernel version is different, > and the hardware on either side of the migration not support SVE. > > Is there some good suggestions,which can make sure old version > kernel live migration to new version kernel with qemu? I['m afraid you should take this with your distribution of choice, unless you can reproduce the problem with mainline kernels. Thanks, M. -- Jazz is not dead. It just smells funny... _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 11:26 ` Marc Zyngier @ 2020-10-15 13:35 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 13:35 UTC (permalink / raw) To: Marc Zyngier; +Cc: kvmarm, xu910121, qemu-devel, Dave.Martin On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: > Hi > > Please don't use my arm.com address anymore, nobody reads it... > > On 2020-10-15 05:06, 张东旭 wrote: > > I'm so sorry for disturbing you. > > > > When I apply this kernel patch:KVM: arm64/sve: System register > > context switch and access support > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd > > This patch was only introduced in 5.2, and wasn't supposed to be > backported to anything else. All upstream patches are backport candidates. Welcome to RHEL :-) > > > qemu live migration will failed with messages: > > qemu-kvm: Invalid value 233 expecting positive value > > <= 232 > > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > > support SVE feature, > > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > > register. > > > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > > (migration source) > > new kernel version:4.18.0-147.5.el8 (migration destination) > > > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > > 4.18.0-147.5.el8 also applied this patch. > > > > Migration source and destination hosts have the same hardware, just > > kernel version is different, > > and the hardware on either side of the migration not support SVE. > > > > Is there some good suggestions,which can make sure old version > > kernel live migration to new version kernel with qemu? > > I['m afraid you should take this with your distribution of > choice, unless you can reproduce the problem with mainline kernels. I'll bet this reproduces when migrating from a mainline pre SVE register filtering kernel to a mainline that does the filtering. I don't have time to test that though. Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 13:35 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 13:35 UTC (permalink / raw) To: Marc Zyngier; +Cc: kvmarm, xu910121, qemu-devel, Dave.Martin On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: > Hi > > Please don't use my arm.com address anymore, nobody reads it... > > On 2020-10-15 05:06, 张东旭 wrote: > > I'm so sorry for disturbing you. > > > > When I apply this kernel patch:KVM: arm64/sve: System register > > context switch and access support > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd > > This patch was only introduced in 5.2, and wasn't supposed to be > backported to anything else. All upstream patches are backport candidates. Welcome to RHEL :-) > > > qemu live migration will failed with messages: > > qemu-kvm: Invalid value 233 expecting positive value > > <= 232 > > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > > support SVE feature, > > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > > register. > > > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > > (migration source) > > new kernel version:4.18.0-147.5.el8 (migration destination) > > > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > > 4.18.0-147.5.el8 also applied this patch. > > > > Migration source and destination hosts have the same hardware, just > > kernel version is different, > > and the hardware on either side of the migration not support SVE. > > > > Is there some good suggestions,which can make sure old version > > kernel live migration to new version kernel with qemu? > > I['m afraid you should take this with your distribution of > choice, unless you can reproduce the problem with mainline kernels. I'll bet this reproduces when migrating from a mainline pre SVE register filtering kernel to a mainline that does the filtering. I don't have time to test that though. Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 13:35 ` Andrew Jones @ 2020-10-15 13:52 ` Marc Zyngier -1 siblings, 0 replies; 30+ messages in thread From: Marc Zyngier @ 2020-10-15 13:52 UTC (permalink / raw) To: Andrew Jones; +Cc: kvmarm, xu910121, qemu-devel, Dave.Martin On 2020-10-15 14:35, Andrew Jones wrote: > On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: >> Hi >> >> Please don't use my arm.com address anymore, nobody reads it... >> >> On 2020-10-15 05:06, 张东旭 wrote: >> > I'm so sorry for disturbing you. >> > >> > When I apply this kernel patch:KVM: arm64/sve: System register >> > context switch and access support >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd >> >> This patch was only introduced in 5.2, and wasn't supposed to be >> backported to anything else. > > All upstream patches are backport candidates. Welcome to RHEL :-) Great! RHEL gets to keep the pieces! ;-) > >> >> > qemu live migration will failed with messages: >> > qemu-kvm: Invalid value 233 expecting positive value >> > <= 232 >> > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len >> > New version kernel exclude ID_AA64ZFR0_EL1 register when host not >> > support SVE feature, >> > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 >> > register. >> > >> > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 >> > (migration source) >> > new kernel version:4.18.0-147.5.el8 (migration destination) >> > >> > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So >> > 4.18.0-147.5.el8 also applied this patch. >> > >> > Migration source and destination hosts have the same hardware, just >> > kernel version is different, >> > and the hardware on either side of the migration not support SVE. >> > >> > Is there some good suggestions,which can make sure old version >> > kernel live migration to new version kernel with qemu? >> >> I['m afraid you should take this with your distribution of >> choice, unless you can reproduce the problem with mainline kernels. > > I'll bet this reproduces when migrating from a mainline pre SVE > register > filtering kernel to a mainline that does the filtering. I don't have > time to test that though. Probably. I'm not really sure where to find SVE HW though, other than emulation... M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 13:52 ` Marc Zyngier 0 siblings, 0 replies; 30+ messages in thread From: Marc Zyngier @ 2020-10-15 13:52 UTC (permalink / raw) To: Andrew Jones; +Cc: kvmarm, xu910121, qemu-devel, Dave.Martin On 2020-10-15 14:35, Andrew Jones wrote: > On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: >> Hi >> >> Please don't use my arm.com address anymore, nobody reads it... >> >> On 2020-10-15 05:06, 张东旭 wrote: >> > I'm so sorry for disturbing you. >> > >> > When I apply this kernel patch:KVM: arm64/sve: System register >> > context switch and access support >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd >> >> This patch was only introduced in 5.2, and wasn't supposed to be >> backported to anything else. > > All upstream patches are backport candidates. Welcome to RHEL :-) Great! RHEL gets to keep the pieces! ;-) > >> >> > qemu live migration will failed with messages: >> > qemu-kvm: Invalid value 233 expecting positive value >> > <= 232 >> > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len >> > New version kernel exclude ID_AA64ZFR0_EL1 register when host not >> > support SVE feature, >> > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 >> > register. >> > >> > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 >> > (migration source) >> > new kernel version:4.18.0-147.5.el8 (migration destination) >> > >> > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So >> > 4.18.0-147.5.el8 also applied this patch. >> > >> > Migration source and destination hosts have the same hardware, just >> > kernel version is different, >> > and the hardware on either side of the migration not support SVE. >> > >> > Is there some good suggestions,which can make sure old version >> > kernel live migration to new version kernel with qemu? >> >> I['m afraid you should take this with your distribution of >> choice, unless you can reproduce the problem with mainline kernels. > > I'll bet this reproduces when migrating from a mainline pre SVE > register > filtering kernel to a mainline that does the filtering. I don't have > time to test that though. Probably. I'm not really sure where to find SVE HW though, other than emulation... M. -- Jazz is not dead. It just smells funny... _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 13:52 ` Marc Zyngier @ 2020-10-15 14:41 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 14:41 UTC (permalink / raw) To: Marc Zyngier; +Cc: Dave.Martin, xu910121, kvmarm, qemu-devel On Thu, Oct 15, 2020 at 02:52:34PM +0100, Marc Zyngier wrote: > On 2020-10-15 14:35, Andrew Jones wrote: > > On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: > > > Hi > > > > > > Please don't use my arm.com address anymore, nobody reads it... > > > > > > On 2020-10-15 05:06, 张东旭 wrote: > > > > I'm so sorry for disturbing you. > > > > > > > > When I apply this kernel patch:KVM: arm64/sve: System register > > > > context switch and access support > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd > > > > > > This patch was only introduced in 5.2, and wasn't supposed to be > > > backported to anything else. > > > > All upstream patches are backport candidates. Welcome to RHEL :-) > > Great! RHEL gets to keep the pieces! ;-) > > > > > > > > > > qemu live migration will failed with messages: > > > > qemu-kvm: Invalid value 233 expecting positive value > > > > <= 232 > > > > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > > > > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > > > > support SVE feature, > > > > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > > > > register. > > > > > > > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > > > > (migration source) > > > > new kernel version:4.18.0-147.5.el8 (migration destination) > > > > > > > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > > > > 4.18.0-147.5.el8 also applied this patch. > > > > > > > > Migration source and destination hosts have the same hardware, just > > > > kernel version is different, > > > > and the hardware on either side of the migration not support SVE. > > > > > > > > Is there some good suggestions,which can make sure old version > > > > kernel live migration to new version kernel with qemu? > > > > > > I['m afraid you should take this with your distribution of > > > choice, unless you can reproduce the problem with mainline kernels. > > > > I'll bet this reproduces when migrating from a mainline pre SVE register > > filtering kernel to a mainline that does the filtering. I don't have > > time to test that though. > > Probably. I'm not really sure where to find SVE HW though, other > than emulation... > The reporter states neither the source nor destination hardware supports SVE. My guess is that what's happening is the reserved ID register ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on the old kernel, but the new kernel filters it out. Maybe it is a bug to filter it out of the count, as it's a reserved ID register and I suppose the other reserved ID registers are still showing up? Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 14:41 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 14:41 UTC (permalink / raw) To: Marc Zyngier; +Cc: Dave.Martin, xu910121, kvmarm, qemu-devel On Thu, Oct 15, 2020 at 02:52:34PM +0100, Marc Zyngier wrote: > On 2020-10-15 14:35, Andrew Jones wrote: > > On Thu, Oct 15, 2020 at 12:26:10PM +0100, Marc Zyngier wrote: > > > Hi > > > > > > Please don't use my arm.com address anymore, nobody reads it... > > > > > > On 2020-10-15 05:06, 张东旭 wrote: > > > > I'm so sorry for disturbing you. > > > > > > > > When I apply this kernel patch:KVM: arm64/sve: System register > > > > context switch and access support > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96dd > > > > > > This patch was only introduced in 5.2, and wasn't supposed to be > > > backported to anything else. > > > > All upstream patches are backport candidates. Welcome to RHEL :-) > > Great! RHEL gets to keep the pieces! ;-) > > > > > > > > > > qemu live migration will failed with messages: > > > > qemu-kvm: Invalid value 233 expecting positive value > > > > <= 232 > > > > qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len > > > > New version kernel exclude ID_AA64ZFR0_EL1 register when host not > > > > support SVE feature, > > > > so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 > > > > register. > > > > > > > > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 > > > > (migration source) > > > > new kernel version:4.18.0-147.5.el8 (migration destination) > > > > > > > > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So > > > > 4.18.0-147.5.el8 also applied this patch. > > > > > > > > Migration source and destination hosts have the same hardware, just > > > > kernel version is different, > > > > and the hardware on either side of the migration not support SVE. > > > > > > > > Is there some good suggestions,which can make sure old version > > > > kernel live migration to new version kernel with qemu? > > > > > > I['m afraid you should take this with your distribution of > > > choice, unless you can reproduce the problem with mainline kernels. > > > > I'll bet this reproduces when migrating from a mainline pre SVE register > > filtering kernel to a mainline that does the filtering. I don't have > > time to test that though. > > Probably. I'm not really sure where to find SVE HW though, other > than emulation... > The reporter states neither the source nor destination hardware supports SVE. My guess is that what's happening is the reserved ID register ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on the old kernel, but the new kernel filters it out. Maybe it is a bug to filter it out of the count, as it's a reserved ID register and I suppose the other reserved ID registers are still showing up? Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 14:41 ` Andrew Jones @ 2020-10-15 14:57 ` Peter Maydell -1 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-15 14:57 UTC (permalink / raw) To: Andrew Jones; +Cc: qemu-devel, Marc Zyngier, xu910121, Dave.Martin, kvmarm On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > The reporter states neither the source nor destination hardware supports > SVE. My guess is that what's happening is the reserved ID register > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > the old kernel, but the new kernel filters it out. Maybe it is a > bug to filter it out of the count, as it's a reserved ID register and > I suppose the other reserved ID registers are still showing up? Yeah, RES0 ID registers should show up in the list, because otherwise userspace has to annoyingly special case them when the architecture eventually defines behaviour for them. Dave's comment in the kernel commit message # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the # guest, but for compatibility with non-SVE aware KVM implementations # the register should not be enumerated at all for KVM_GET_REG_LIST # in this case. seems wrong to me -- for compatibility the register should remain present and behave as RAZ/WI if SVE is disabled in the guest, the same way it was before the kernel/KVM knew about SVE at all. thanks -- PMM ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 14:57 ` Peter Maydell 0 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-15 14:57 UTC (permalink / raw) To: Andrew Jones; +Cc: qemu-devel, Marc Zyngier, xu910121, Dave.Martin, kvmarm On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > The reporter states neither the source nor destination hardware supports > SVE. My guess is that what's happening is the reserved ID register > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > the old kernel, but the new kernel filters it out. Maybe it is a > bug to filter it out of the count, as it's a reserved ID register and > I suppose the other reserved ID registers are still showing up? Yeah, RES0 ID registers should show up in the list, because otherwise userspace has to annoyingly special case them when the architecture eventually defines behaviour for them. Dave's comment in the kernel commit message # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the # guest, but for compatibility with non-SVE aware KVM implementations # the register should not be enumerated at all for KVM_GET_REG_LIST # in this case. seems wrong to me -- for compatibility the register should remain present and behave as RAZ/WI if SVE is disabled in the guest, the same way it was before the kernel/KVM knew about SVE at all. thanks -- PMM _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 14:57 ` Peter Maydell @ 2020-10-19 9:25 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 9:25 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, xu910121, qemu-devel, kvmarm, Dave.Martin On Thu, Oct 15, 2020 at 03:57:02PM +0100, Peter Maydell wrote: > On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > > The reporter states neither the source nor destination hardware supports > > SVE. My guess is that what's happening is the reserved ID register > > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > > the old kernel, but the new kernel filters it out. Maybe it is a > > bug to filter it out of the count, as it's a reserved ID register and > > I suppose the other reserved ID registers are still showing up? > > Yeah, RES0 ID registers should show up in the list, because otherwise > userspace has to annoyingly special case them when the architecture > eventually defines behaviour for them. > > Dave's comment in the kernel commit message > # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the > # guest, but for compatibility with non-SVE aware KVM implementations > # the register should not be enumerated at all for KVM_GET_REG_LIST > # in this case. > seems wrong to me -- for compatibility the register should remain > present and behave as RAZ/WI if SVE is disabled in the guest, > the same way it was before the kernel/KVM knew about SVE at all. Yup, I agree with you and I'll try writing a patch for this. Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 9:25 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 9:25 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, xu910121, qemu-devel, kvmarm, Dave.Martin On Thu, Oct 15, 2020 at 03:57:02PM +0100, Peter Maydell wrote: > On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > > The reporter states neither the source nor destination hardware supports > > SVE. My guess is that what's happening is the reserved ID register > > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > > the old kernel, but the new kernel filters it out. Maybe it is a > > bug to filter it out of the count, as it's a reserved ID register and > > I suppose the other reserved ID registers are still showing up? > > Yeah, RES0 ID registers should show up in the list, because otherwise > userspace has to annoyingly special case them when the architecture > eventually defines behaviour for them. > > Dave's comment in the kernel commit message > # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the > # guest, but for compatibility with non-SVE aware KVM implementations > # the register should not be enumerated at all for KVM_GET_REG_LIST > # in this case. > seems wrong to me -- for compatibility the register should remain > present and behave as RAZ/WI if SVE is disabled in the guest, > the same way it was before the kernel/KVM knew about SVE at all. Yup, I agree with you and I'll try writing a patch for this. Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 9:25 ` Andrew Jones @ 2020-10-19 11:32 ` Dave Martin -1 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 11:32 UTC (permalink / raw) To: Andrew Jones; +Cc: Peter Maydell, xu910121, qemu-devel, kvmarm, Marc Zyngier On Mon, Oct 19, 2020 at 11:25:25AM +0200, Andrew Jones wrote: > On Thu, Oct 15, 2020 at 03:57:02PM +0100, Peter Maydell wrote: > > On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > > > The reporter states neither the source nor destination hardware supports > > > SVE. My guess is that what's happening is the reserved ID register > > > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > > > the old kernel, but the new kernel filters it out. Maybe it is a > > > bug to filter it out of the count, as it's a reserved ID register and > > > I suppose the other reserved ID registers are still showing up? > > > > Yeah, RES0 ID registers should show up in the list, because otherwise > > userspace has to annoyingly special case them when the architecture > > eventually defines behaviour for them. > > > > Dave's comment in the kernel commit message > > # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the > > # guest, but for compatibility with non-SVE aware KVM implementations > > # the register should not be enumerated at all for KVM_GET_REG_LIST > > # in this case. > > seems wrong to me -- for compatibility the register should remain > > present and behave as RAZ/WI if SVE is disabled in the guest, > > the same way it was before the kernel/KVM knew about SVE at all. > > Yup, I agree with you and I'll try writing a patch for this. > > Thanks, > drew I'm not quite sure about Peter's assessment here. I agree with the inconsistency identified here: we always enumerate all unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should behave exactly as an unallocated ID register. I'm not sure about the proposed fix. For one thing, I'm not sure that old hosts will accept writing of 0 to arbitrary ID regs. This may require some digging, but commit 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") may be the place to start. My original idea was that at the source end we should be conservative: enumerate and dump the minimum set of registers relevant to the target -- for compatibility with old hosts that don't handle the unallocated ID regs at all. At the destination end, modern hosts should be permissive, i.e., allow any ID reg to be set to 0, but don't require the setting of any reg that older source hosts might not send. So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, parhaps we should move all ID_UNALLOCATED() regs (and possibly ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. Thoughts? ---Dave ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 11:32 ` Dave Martin 0 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 11:32 UTC (permalink / raw) To: Andrew Jones; +Cc: xu910121, qemu-devel, kvmarm, Marc Zyngier On Mon, Oct 19, 2020 at 11:25:25AM +0200, Andrew Jones wrote: > On Thu, Oct 15, 2020 at 03:57:02PM +0100, Peter Maydell wrote: > > On Thu, 15 Oct 2020 at 15:41, Andrew Jones <drjones@redhat.com> wrote: > > > The reporter states neither the source nor destination hardware supports > > > SVE. My guess is that what's happening is the reserved ID register > > > ID_UNALLOCATED(4,4) was showing up in the KVM_GET_REG_LIST count on > > > the old kernel, but the new kernel filters it out. Maybe it is a > > > bug to filter it out of the count, as it's a reserved ID register and > > > I suppose the other reserved ID registers are still showing up? > > > > Yeah, RES0 ID registers should show up in the list, because otherwise > > userspace has to annoyingly special case them when the architecture > > eventually defines behaviour for them. > > > > Dave's comment in the kernel commit message > > # ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the > > # guest, but for compatibility with non-SVE aware KVM implementations > > # the register should not be enumerated at all for KVM_GET_REG_LIST > > # in this case. > > seems wrong to me -- for compatibility the register should remain > > present and behave as RAZ/WI if SVE is disabled in the guest, > > the same way it was before the kernel/KVM knew about SVE at all. > > Yup, I agree with you and I'll try writing a patch for this. > > Thanks, > drew I'm not quite sure about Peter's assessment here. I agree with the inconsistency identified here: we always enumerate all unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should behave exactly as an unallocated ID register. I'm not sure about the proposed fix. For one thing, I'm not sure that old hosts will accept writing of 0 to arbitrary ID regs. This may require some digging, but commit 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") may be the place to start. My original idea was that at the source end we should be conservative: enumerate and dump the minimum set of registers relevant to the target -- for compatibility with old hosts that don't handle the unallocated ID regs at all. At the destination end, modern hosts should be permissive, i.e., allow any ID reg to be set to 0, but don't require the setting of any reg that older source hosts might not send. So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, parhaps we should move all ID_UNALLOCATED() regs (and possibly ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. Thoughts? ---Dave _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 11:32 ` Dave Martin @ 2020-10-19 11:43 ` Peter Maydell -1 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-19 11:43 UTC (permalink / raw) To: Dave Martin; +Cc: Marc Zyngier, Andrew Jones, xu910121, qemu-devel, kvmarm On Mon, 19 Oct 2020 at 12:32, Dave Martin <Dave.Martin@arm.com> wrote: > I'm not quite sure about Peter's assessment here. > > I agree with the inconsistency identified here: we always enumerate all > unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. > This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should > behave exactly as an unallocated ID register. > > I'm not sure about the proposed fix. > > For one thing, I'm not sure that old hosts will accept writing of 0 to > arbitrary ID regs. This may require some digging, but commit > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > may be the place to start. Well, ID regs are special in the architecture -- they always exist and must RAZ/WI, even if they're not actually given any fields yet. This is different from other "unused" parts of the system register encoding space, which UNDEF. Older hosts didn't permit writing 0 to all parts of the ID register space (and didn't list all ID registers in the KVM_GET_REG_LIST list), but that was a kernel bug which we've since fixed. (QEMU has workaround code for pre-4.15 kernels for this.) Across that older bugfix, migration works from an old kernel to a newer one, but wouldn't have worked from a post-bugfix kernel to a pre-4.15 one. > My original idea was that at the source end we should be conservative: > enumerate and dump the minimum set of registers relevant to the > target -- for compatibility with old hosts that don't handle the > unallocated ID regs at all. At the destination end, modern hosts > should be permissive, i.e., allow any ID reg to be set to 0, but don't > require the setting of any reg that older source hosts might not send. The problem is that you've actually removed registers from the list that were previously in it (because pre-SVE kernels put this ID register in the list as a RAZ/WI register, and now it's not in the list if SVE isn't supported). > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > parhaps we should move all ID_UNALLOCATED() regs (and possibly > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. What does this do as far as the user-facing list-of-registers is concerned? All these registers need to remain in the KVM_GET_REG_LIST list, or you break migration from an old kernel to a new one. thanks -- PMM ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 11:43 ` Peter Maydell 0 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-19 11:43 UTC (permalink / raw) To: Dave Martin; +Cc: Marc Zyngier, xu910121, qemu-devel, kvmarm On Mon, 19 Oct 2020 at 12:32, Dave Martin <Dave.Martin@arm.com> wrote: > I'm not quite sure about Peter's assessment here. > > I agree with the inconsistency identified here: we always enumerate all > unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. > This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should > behave exactly as an unallocated ID register. > > I'm not sure about the proposed fix. > > For one thing, I'm not sure that old hosts will accept writing of 0 to > arbitrary ID regs. This may require some digging, but commit > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > may be the place to start. Well, ID regs are special in the architecture -- they always exist and must RAZ/WI, even if they're not actually given any fields yet. This is different from other "unused" parts of the system register encoding space, which UNDEF. Older hosts didn't permit writing 0 to all parts of the ID register space (and didn't list all ID registers in the KVM_GET_REG_LIST list), but that was a kernel bug which we've since fixed. (QEMU has workaround code for pre-4.15 kernels for this.) Across that older bugfix, migration works from an old kernel to a newer one, but wouldn't have worked from a post-bugfix kernel to a pre-4.15 one. > My original idea was that at the source end we should be conservative: > enumerate and dump the minimum set of registers relevant to the > target -- for compatibility with old hosts that don't handle the > unallocated ID regs at all. At the destination end, modern hosts > should be permissive, i.e., allow any ID reg to be set to 0, but don't > require the setting of any reg that older source hosts might not send. The problem is that you've actually removed registers from the list that were previously in it (because pre-SVE kernels put this ID register in the list as a RAZ/WI register, and now it's not in the list if SVE isn't supported). > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > parhaps we should move all ID_UNALLOCATED() regs (and possibly > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. What does this do as far as the user-facing list-of-registers is concerned? All these registers need to remain in the KVM_GET_REG_LIST list, or you break migration from an old kernel to a new one. thanks -- PMM _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 11:43 ` Peter Maydell @ 2020-10-19 13:40 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 13:40 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, xu910121, Dave Martin, kvmarm, qemu-devel On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > On Mon, 19 Oct 2020 at 12:32, Dave Martin <Dave.Martin@arm.com> wrote: > > I'm not quite sure about Peter's assessment here. > > > > I agree with the inconsistency identified here: we always enumerate all > > unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. > > This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should > > behave exactly as an unallocated ID register. > > > > I'm not sure about the proposed fix. > > > > For one thing, I'm not sure that old hosts will accept writing of 0 to > > arbitrary ID regs. This may require some digging, but commit > > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > may be the place to start. > > Well, ID regs are special in the architecture -- they always exist > and must RAZ/WI, even if they're not actually given any fields yet. > This is different from other "unused" parts of the system register > encoding space, which UNDEF. Table D12-2 confirms the register should be RAZ, as it says the register is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? For the guest we inject an exception on writes, and for userspace we require the value to be preserved on write. > > Older hosts didn't permit writing 0 to all parts of the ID > register space (and didn't list all ID registers in the KVM_GET_REG_LIST > list), but that was a kernel bug which we've since fixed. > (QEMU has workaround code for pre-4.15 kernels for this.) > Across that older bugfix, migration works from an old kernel to > a newer one, but wouldn't have worked from a post-bugfix kernel > to a pre-4.15 one. > > > My original idea was that at the source end we should be conservative: > > enumerate and dump the minimum set of registers relevant to the > > target -- for compatibility with old hosts that don't handle the > > unallocated ID regs at all. At the destination end, modern hosts > > should be permissive, i.e., allow any ID reg to be set to 0, but don't > > require the setting of any reg that older source hosts might not send. > > The problem is that you've actually removed registers from > the list that were previously in it (because pre-SVE > kernels put this ID register in the list as a RAZ/WI register, > and now it's not in the list if SVE isn't supported). > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > What does this do as far as the user-facing list-of-registers > is concerned? All these registers need to remain in the > KVM_GET_REG_LIST list, or you break migration from an old > kernel to a new one. I think we should follow the spec, even for userspace access, and be RAZ for when the feature isn't implemented. As for writes, assuming the exception injection is what we want for the guest (not WI), then that's correct. For userspace, I think we should continue forcing preservation (which will force preservation of zero when it's RAZ). Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 13:40 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 13:40 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, xu910121, Dave Martin, kvmarm, qemu-devel On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > On Mon, 19 Oct 2020 at 12:32, Dave Martin <Dave.Martin@arm.com> wrote: > > I'm not quite sure about Peter's assessment here. > > > > I agree with the inconsistency identified here: we always enumerate all > > unallocated ID regs, but we enumerate ID_AA64ZFR0_EL1 conditionally. > > This doesn't feel right: on a non-SVE guest, ID_AA64ZFR0_EL1 should > > behave exactly as an unallocated ID register. > > > > I'm not sure about the proposed fix. > > > > For one thing, I'm not sure that old hosts will accept writing of 0 to > > arbitrary ID regs. This may require some digging, but commit > > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > may be the place to start. > > Well, ID regs are special in the architecture -- they always exist > and must RAZ/WI, even if they're not actually given any fields yet. > This is different from other "unused" parts of the system register > encoding space, which UNDEF. Table D12-2 confirms the register should be RAZ, as it says the register is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? For the guest we inject an exception on writes, and for userspace we require the value to be preserved on write. > > Older hosts didn't permit writing 0 to all parts of the ID > register space (and didn't list all ID registers in the KVM_GET_REG_LIST > list), but that was a kernel bug which we've since fixed. > (QEMU has workaround code for pre-4.15 kernels for this.) > Across that older bugfix, migration works from an old kernel to > a newer one, but wouldn't have worked from a post-bugfix kernel > to a pre-4.15 one. > > > My original idea was that at the source end we should be conservative: > > enumerate and dump the minimum set of registers relevant to the > > target -- for compatibility with old hosts that don't handle the > > unallocated ID regs at all. At the destination end, modern hosts > > should be permissive, i.e., allow any ID reg to be set to 0, but don't > > require the setting of any reg that older source hosts might not send. > > The problem is that you've actually removed registers from > the list that were previously in it (because pre-SVE > kernels put this ID register in the list as a RAZ/WI register, > and now it's not in the list if SVE isn't supported). > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > What does this do as far as the user-facing list-of-registers > is concerned? All these registers need to remain in the > KVM_GET_REG_LIST list, or you break migration from an old > kernel to a new one. I think we should follow the spec, even for userspace access, and be RAZ for when the feature isn't implemented. As for writes, assuming the exception injection is what we want for the guest (not WI), then that's correct. For userspace, I think we should continue forcing preservation (which will force preservation of zero when it's RAZ). Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 13:40 ` Andrew Jones @ 2020-10-19 14:18 ` Peter Maydell -1 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-19 14:18 UTC (permalink / raw) To: Andrew Jones; +Cc: Marc Zyngier, xu910121, Dave Martin, kvmarm, qemu-devel On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > Well, ID regs are special in the architecture -- they always exist > > and must RAZ/WI, even if they're not actually given any fields yet. > > This is different from other "unused" parts of the system register > > encoding space, which UNDEF. > > Table D12-2 confirms the register should be RAZ, as it says the register > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > For the guest we inject an exception on writes, and for userspace we > require the value to be preserved on write. Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say they'll UNDEF if you try to write to them). > I think we should follow the spec, even for userspace access, and be RAZ > for when the feature isn't implemented. As for writes, assuming the > exception injection is what we want for the guest (not WI), then that's > correct. For userspace, I think we should continue forcing preservation > (which will force preservation of zero when it's RAZ). Yes, that sounds right. thanks -- PMM ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 14:18 ` Peter Maydell 0 siblings, 0 replies; 30+ messages in thread From: Peter Maydell @ 2020-10-19 14:18 UTC (permalink / raw) To: Andrew Jones; +Cc: Marc Zyngier, xu910121, Dave Martin, kvmarm, qemu-devel On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > Well, ID regs are special in the architecture -- they always exist > > and must RAZ/WI, even if they're not actually given any fields yet. > > This is different from other "unused" parts of the system register > > encoding space, which UNDEF. > > Table D12-2 confirms the register should be RAZ, as it says the register > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > For the guest we inject an exception on writes, and for userspace we > require the value to be preserved on write. Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say they'll UNDEF if you try to write to them). > I think we should follow the spec, even for userspace access, and be RAZ > for when the feature isn't implemented. As for writes, assuming the > exception injection is what we want for the guest (not WI), then that's > correct. For userspace, I think we should continue forcing preservation > (which will force preservation of zero when it's RAZ). Yes, that sounds right. thanks -- PMM _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 14:18 ` Peter Maydell @ 2020-10-19 14:58 ` Dave Martin -1 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 14:58 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, Andrew Jones, xu910121, kvmarm, qemu-devel On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > Well, ID regs are special in the architecture -- they always exist > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > This is different from other "unused" parts of the system register > > > encoding space, which UNDEF. > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > For the guest we inject an exception on writes, and for userspace we > > require the value to be preserved on write. > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > they'll UNDEF if you try to write to them). > > > I think we should follow the spec, even for userspace access, and be RAZ > > for when the feature isn't implemented. As for writes, assuming the > > exception injection is what we want for the guest (not WI), then that's > > correct. For userspace, I think we should continue forcing preservation > > (which will force preservation of zero when it's RAZ). > > Yes, that sounds right. [...] > > > The problem is that you've actually removed registers from > > > the list that were previously in it (because pre-SVE > > > kernels put this ID register in the list as a RAZ/WI register, > > > and now it's not in the list if SVE isn't supported). Define "previously", though. IIUC, the full enumeration was added in v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): v4.15-rc1~110^2~27 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") And then ID_AA64FZR0_EL1 was removed from the enumeration, also in v4.15: v4.15-rc1~110^2~5 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") So, are there really two upstram kernel tags that are mismatched on this, or is this just a bisectability issue in v4.14..v4.15? It's a while since I looked at this, and I may have misunderstood the timeline. > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > What does this do as far as the user-facing list-of-registers > > > is concerned? All these registers need to remain in the > > > KVM_GET_REG_LIST list, or you break migration from an old > > > kernel to a new one. OK, I think I see where you are coming from, now. It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER distinction, and provide the same visibility for userspace as for MSR/ MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace view, and may also allow a bit of simplification in the code. Won't this will still break migration from the resulting kernel to a current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood something. Cheers ---Dave ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 14:58 ` Dave Martin 0 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 14:58 UTC (permalink / raw) To: Peter Maydell; +Cc: Marc Zyngier, xu910121, kvmarm, qemu-devel On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > Well, ID regs are special in the architecture -- they always exist > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > This is different from other "unused" parts of the system register > > > encoding space, which UNDEF. > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > For the guest we inject an exception on writes, and for userspace we > > require the value to be preserved on write. > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > they'll UNDEF if you try to write to them). > > > I think we should follow the spec, even for userspace access, and be RAZ > > for when the feature isn't implemented. As for writes, assuming the > > exception injection is what we want for the guest (not WI), then that's > > correct. For userspace, I think we should continue forcing preservation > > (which will force preservation of zero when it's RAZ). > > Yes, that sounds right. [...] > > > The problem is that you've actually removed registers from > > > the list that were previously in it (because pre-SVE > > > kernels put this ID register in the list as a RAZ/WI register, > > > and now it's not in the list if SVE isn't supported). Define "previously", though. IIUC, the full enumeration was added in v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): v4.15-rc1~110^2~27 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") And then ID_AA64FZR0_EL1 was removed from the enumeration, also in v4.15: v4.15-rc1~110^2~5 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") So, are there really two upstram kernel tags that are mismatched on this, or is this just a bisectability issue in v4.14..v4.15? It's a while since I looked at this, and I may have misunderstood the timeline. > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > What does this do as far as the user-facing list-of-registers > > > is concerned? All these registers need to remain in the > > > KVM_GET_REG_LIST list, or you break migration from an old > > > kernel to a new one. OK, I think I see where you are coming from, now. It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER distinction, and provide the same visibility for userspace as for MSR/ MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace view, and may also allow a bit of simplification in the code. Won't this will still break migration from the resulting kernel to a current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood something. Cheers ---Dave _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 14:58 ` Dave Martin @ 2020-10-19 15:23 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 15:23 UTC (permalink / raw) To: Dave Martin; +Cc: qemu-devel, Peter Maydell, xu910121, kvmarm, Marc Zyngier On Mon, Oct 19, 2020 at 03:58:40PM +0100, Dave Martin wrote: > On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > > Well, ID regs are special in the architecture -- they always exist > > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > > This is different from other "unused" parts of the system register > > > > encoding space, which UNDEF. > > > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > > For the guest we inject an exception on writes, and for userspace we > > > require the value to be preserved on write. > > > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > > they'll UNDEF if you try to write to them). > > > > > I think we should follow the spec, even for userspace access, and be RAZ > > > for when the feature isn't implemented. As for writes, assuming the > > > exception injection is what we want for the guest (not WI), then that's > > > correct. For userspace, I think we should continue forcing preservation > > > (which will force preservation of zero when it's RAZ). > > > > Yes, that sounds right. > > [...] > > > > > The problem is that you've actually removed registers from > > > > the list that were previously in it (because pre-SVE > > > > kernels put this ID register in the list as a RAZ/WI register, > > > > and now it's not in the list if SVE isn't supported). > > Define "previously", though. IIUC, the full enumeration was added in > v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): > > v4.15-rc1~110^2~27 > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > > And then ID_AA64FZR0_EL1 was removed from the enumeration, also in > v4.15: > > v4.15-rc1~110^2~5 > 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") > > > So, are there really two upstram kernel tags that are mismatched on > this, or is this just a bisectability issue in v4.14..v4.15? > > It's a while since I looked at this, and I may have misunderstood the > timeline. > > > > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > > > What does this do as far as the user-facing list-of-registers > > > > is concerned? All these registers need to remain in the > > > > KVM_GET_REG_LIST list, or you break migration from an old > > > > kernel to a new one. > > OK, I think I see where you are coming from, now. > > It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER > distinction, and provide the same visibility for userspace as for MSR/ > MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace > view, and may also allow a bit of simplification in the code. > > Won't this will still break migration from the resulting kernel to a > current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood > something. > Yes, but, while neither direction old -> new nor new -> old is actually something that people should do when using host cpu passthrough (they should only ever migrate between identical hosts, both hardware and host kernel version), migrating from old -> new makes more sense, as that's the upgrade path, and it's more supportable - we can workaround things on the new side. So, long story short, new -> old will fail due to making this change, but it's still probably the right thing to do, as we'll be defining a better pattern for ID registers going forward, and we never claimed new -> old migrations would work anyway with host passthrough. Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 15:23 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-19 15:23 UTC (permalink / raw) To: Dave Martin; +Cc: qemu-devel, xu910121, kvmarm, Marc Zyngier On Mon, Oct 19, 2020 at 03:58:40PM +0100, Dave Martin wrote: > On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > > Well, ID regs are special in the architecture -- they always exist > > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > > This is different from other "unused" parts of the system register > > > > encoding space, which UNDEF. > > > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > > For the guest we inject an exception on writes, and for userspace we > > > require the value to be preserved on write. > > > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > > they'll UNDEF if you try to write to them). > > > > > I think we should follow the spec, even for userspace access, and be RAZ > > > for when the feature isn't implemented. As for writes, assuming the > > > exception injection is what we want for the guest (not WI), then that's > > > correct. For userspace, I think we should continue forcing preservation > > > (which will force preservation of zero when it's RAZ). > > > > Yes, that sounds right. > > [...] > > > > > The problem is that you've actually removed registers from > > > > the list that were previously in it (because pre-SVE > > > > kernels put this ID register in the list as a RAZ/WI register, > > > > and now it's not in the list if SVE isn't supported). > > Define "previously", though. IIUC, the full enumeration was added in > v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): > > v4.15-rc1~110^2~27 > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > > And then ID_AA64FZR0_EL1 was removed from the enumeration, also in > v4.15: > > v4.15-rc1~110^2~5 > 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") > > > So, are there really two upstram kernel tags that are mismatched on > this, or is this just a bisectability issue in v4.14..v4.15? > > It's a while since I looked at this, and I may have misunderstood the > timeline. > > > > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > > > What does this do as far as the user-facing list-of-registers > > > > is concerned? All these registers need to remain in the > > > > KVM_GET_REG_LIST list, or you break migration from an old > > > > kernel to a new one. > > OK, I think I see where you are coming from, now. > > It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER > distinction, and provide the same visibility for userspace as for MSR/ > MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace > view, and may also allow a bit of simplification in the code. > > Won't this will still break migration from the resulting kernel to a > current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood > something. > Yes, but, while neither direction old -> new nor new -> old is actually something that people should do when using host cpu passthrough (they should only ever migrate between identical hosts, both hardware and host kernel version), migrating from old -> new makes more sense, as that's the upgrade path, and it's more supportable - we can workaround things on the new side. So, long story short, new -> old will fail due to making this change, but it's still probably the right thing to do, as we'll be defining a better pattern for ID registers going forward, and we never claimed new -> old migrations would work anyway with host passthrough. Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-19 15:23 ` Andrew Jones @ 2020-10-19 16:36 ` Dave Martin -1 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 16:36 UTC (permalink / raw) To: Andrew Jones; +Cc: Marc Zyngier, xu910121, qemu-devel, kvmarm On Mon, Oct 19, 2020 at 05:23:11PM +0200, Andrew Jones wrote: > On Mon, Oct 19, 2020 at 03:58:40PM +0100, Dave Martin wrote: > > On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > > > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > > > Well, ID regs are special in the architecture -- they always exist > > > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > > > This is different from other "unused" parts of the system register > > > > > encoding space, which UNDEF. > > > > > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > > > For the guest we inject an exception on writes, and for userspace we > > > > require the value to be preserved on write. > > > > > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > > > they'll UNDEF if you try to write to them). > > > > > > > I think we should follow the spec, even for userspace access, and be RAZ > > > > for when the feature isn't implemented. As for writes, assuming the > > > > exception injection is what we want for the guest (not WI), then that's > > > > correct. For userspace, I think we should continue forcing preservation > > > > (which will force preservation of zero when it's RAZ). > > > > > > Yes, that sounds right. > > > > [...] > > > > > > > The problem is that you've actually removed registers from > > > > > the list that were previously in it (because pre-SVE > > > > > kernels put this ID register in the list as a RAZ/WI register, > > > > > and now it's not in the list if SVE isn't supported). > > > > Define "previously", though. IIUC, the full enumeration was added in > > v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): > > > > v4.15-rc1~110^2~27 > > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > > > > > And then ID_AA64FZR0_EL1 was removed from the enumeration, also in > > v4.15: > > > > v4.15-rc1~110^2~5 > > 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") > > > > > > So, are there really two upstram kernel tags that are mismatched on > > this, or is this just a bisectability issue in v4.14..v4.15? > > > > It's a while since I looked at this, and I may have misunderstood the > > timeline. > > > > > > > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > > > > > What does this do as far as the user-facing list-of-registers > > > > > is concerned? All these registers need to remain in the > > > > > KVM_GET_REG_LIST list, or you break migration from an old > > > > > kernel to a new one. > > > > OK, I think I see where you are coming from, now. > > > > It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER > > distinction, and provide the same visibility for userspace as for MSR/ > > MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace > > view, and may also allow a bit of simplification in the code. > > > > Won't this will still break migration from the resulting kernel to a > > current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood > > something. > > > > Yes, but, while neither direction old -> new nor new -> old is actually > something that people should do when using host cpu passthrough (they > should only ever migrate between identical hosts, both hardware and > host kernel version), migrating from old -> new makes more sense, as > that's the upgrade path, and it's more supportable - we can workaround > things on the new side. So, long story short, new -> old will fail due > to making this change, but it's still probably the right thing to do, > as we'll be defining a better pattern for ID registers going forward, > and we never claimed new -> old migrations would work anyway with host > passthrough. > > Thanks, > drew Ack, just wanted to make sure I understood the implications correctly. I'm still not sure I fully understand why we hit this problem (i.e., ZFR0 enumeration mismatch between host and guest) in the first place, unless I've misunderstood which patches make these changes, or unless RHEL has cherry-picked odd patches that weren't intended to be applied separately... Cheers ---Dave ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-19 16:36 ` Dave Martin 0 siblings, 0 replies; 30+ messages in thread From: Dave Martin @ 2020-10-19 16:36 UTC (permalink / raw) To: Andrew Jones; +Cc: Marc Zyngier, xu910121, qemu-devel, kvmarm On Mon, Oct 19, 2020 at 05:23:11PM +0200, Andrew Jones wrote: > On Mon, Oct 19, 2020 at 03:58:40PM +0100, Dave Martin wrote: > > On Mon, Oct 19, 2020 at 03:18:11PM +0100, Peter Maydell wrote: > > > On Mon, 19 Oct 2020 at 14:40, Andrew Jones <drjones@redhat.com> wrote: > > > > > > > > On Mon, Oct 19, 2020 at 12:43:33PM +0100, Peter Maydell wrote: > > > > > Well, ID regs are special in the architecture -- they always exist > > > > > and must RAZ/WI, even if they're not actually given any fields yet. > > > > > This is different from other "unused" parts of the system register > > > > > encoding space, which UNDEF. > > > > > > > > Table D12-2 confirms the register should be RAZ, as it says the register > > > > is "RO, but RAZ if SVE is not implemented". Does "RO" imply "WI", though? > > > > For the guest we inject an exception on writes, and for userspace we > > > > require the value to be preserved on write. > > > > > > Sorry, I mis-spoke. They're RAZ, but not WI, just RO (which is to say > > > they'll UNDEF if you try to write to them). > > > > > > > I think we should follow the spec, even for userspace access, and be RAZ > > > > for when the feature isn't implemented. As for writes, assuming the > > > > exception injection is what we want for the guest (not WI), then that's > > > > correct. For userspace, I think we should continue forcing preservation > > > > (which will force preservation of zero when it's RAZ). > > > > > > Yes, that sounds right. > > > > [...] > > > > > > > The problem is that you've actually removed registers from > > > > > the list that were previously in it (because pre-SVE > > > > > kernels put this ID register in the list as a RAZ/WI register, > > > > > and now it's not in the list if SVE isn't supported). > > > > Define "previously", though. IIUC, the full enumeration was added in > > v4.15 (with ID_AA64ZFR0_EL1 still not supported at all): > > > > v4.15-rc1~110^2~27 > > 93390c0a1b20 ("arm64: KVM: Hide unsupported AArch64 CPU features from guests") > > > > > > And then ID_AA64FZR0_EL1 was removed from the enumeration, also in > > v4.15: > > > > v4.15-rc1~110^2~5 > > 07d79fe7c223 ("arm64/sve: KVM: Hide SVE from CPU features exposed to guests") > > > > > > So, are there really two upstram kernel tags that are mismatched on > > this, or is this just a bisectability issue in v4.14..v4.15? > > > > It's a while since I looked at this, and I may have misunderstood the > > timeline. > > > > > > > > > > So, I think that instead of changing the ID_AA64ZFR0_EL1 behaviour, > > > > > > parhaps we should move all ID_UNALLOCATED() regs (and possibly > > > > > > ID_HIDDEN(), not sure about that) to have REG_HIDDEN_USER visibility. > > > > > > > > > > What does this do as far as the user-facing list-of-registers > > > > > is concerned? All these registers need to remain in the > > > > > KVM_GET_REG_LIST list, or you break migration from an old > > > > > kernel to a new one. > > > > OK, I think I see where you are coming from, now. > > > > It may make sense to get rid of the REG_HIDDEN_GUEST / REG_HIDDEN_USER > > distinction, and provide the same visibility for userspace as for MSR/ > > MRS all the time. This would restore ID_AA64ZFR0_EL1 into the userspace > > view, and may also allow a bit of simplification in the code. > > > > Won't this will still break migration from the resulting kernel to a > > current kernel that hides ID_AA64ZFR0_EL1? Or have I misunderstood > > something. > > > > Yes, but, while neither direction old -> new nor new -> old is actually > something that people should do when using host cpu passthrough (they > should only ever migrate between identical hosts, both hardware and > host kernel version), migrating from old -> new makes more sense, as > that's the upgrade path, and it's more supportable - we can workaround > things on the new side. So, long story short, new -> old will fail due > to making this change, but it's still probably the right thing to do, > as we'll be defining a better pattern for ID registers going forward, > and we never claimed new -> old migrations would work anyway with host > passthrough. > > Thanks, > drew Ack, just wanted to make sure I understood the implications correctly. I'm still not sure I fully understand why we hit this problem (i.e., ZFR0 enumeration mismatch between host and guest) in the first place, unless I've misunderstood which patches make these changes, or unless RHEL has cherry-picked odd patches that weren't intended to be applied separately... Cheers ---Dave _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. 2020-10-15 4:06 ` 张东旭 @ 2020-10-15 13:26 ` Andrew Jones -1 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 13:26 UTC (permalink / raw) To: 张东旭; +Cc: marc.zyngier, qemu-devel, Dave.Martin, kvmarm On Thu, Oct 15, 2020 at 12:06:39PM +0800, 张东旭 wrote: > I'm so sorry for disturbing you.When I apply this kernel patch:KVM: arm64/sve: System register context switch and access supporthttps://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96ddqemu live migration will failed with messages: qemu-kvm: Invalid value 233 expecting positive value <= 232 qemu-kvm: Failed to load cpu:cpreg_vmstate_array_lenNew version kernel exclude ID_AA64ZFR0_EL1 register when host not support SVE feature, so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 register. > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 (migration source) new kernel version:4.18.0-147.5.el8 (migration destination) > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So 4.18.0-147.5.el8 also applied this patch. > Migration source and destination hosts have the same hardware, just kernel version is different, and the hardware on either side of the migration not support SVE. > Is there some good suggestions,which can make sure old version kernel live migration to new version kernel with qemu?Thanks a lot. ARM KVM guests must be run with CPU host passthrough. This means the host hardware and host kernel versions must be identical in order to guarantee successful migrations. That said, upgrading the host kernel without shutting down the guests is a reasonable thing to try. Unfortunately, in this case, the only way to do it would be to hack QEMU on the destination to allow this extra register in KVM_GET_REG_LIST. It should be harmless, as it's not used. Allowing it is similar to allowing the destination to have a larger number of registers than the source in KVM_GET_REG_LIST (which is why the failing test is <=, not ==). I wouldn't post a real patch to resolve this issue, though, as everything is working as expected. The failing test is failing because it detected a risky migration. And, KVM's filtering of registers from KVM_GET_REG_LIST is also correct, even if previous KVM versions didn't do that. With CPU models you could request the reg list for a particular model and expect it to be the same across all kernel versions that support it, but we don't have those. Thanks, drew ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Kernel patch cases qemu live migration failed. @ 2020-10-15 13:26 ` Andrew Jones 0 siblings, 0 replies; 30+ messages in thread From: Andrew Jones @ 2020-10-15 13:26 UTC (permalink / raw) To: 张东旭; +Cc: marc.zyngier, qemu-devel, Dave.Martin, kvmarm On Thu, Oct 15, 2020 at 12:06:39PM +0800, 张东旭 wrote: > I'm so sorry for disturbing you.When I apply this kernel patch:KVM: arm64/sve: System register context switch and access supporthttps://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/kvm/sys_regs.c?id=73433762fcaeb9d59e84d299021c6b15466c96ddqemu live migration will failed with messages: qemu-kvm: Invalid value 233 expecting positive value <= 232 qemu-kvm: Failed to load cpu:cpreg_vmstate_array_lenNew version kernel exclude ID_AA64ZFR0_EL1 register when host not support SVE feature, so qemu ioctl kvm(KVM_GET_REG_LIST) will not contain ID_AA64ZFR0_EL1 register. > I'm using CentOS Linux kernel, old kernel version:4.18.0-80.11.el8 (migration source) new kernel version:4.18.0-147.5.el8 (migration destination) > CentOS linux kernel version 4.18.0-111.el8 applied this patch. So 4.18.0-147.5.el8 also applied this patch. > Migration source and destination hosts have the same hardware, just kernel version is different, and the hardware on either side of the migration not support SVE. > Is there some good suggestions,which can make sure old version kernel live migration to new version kernel with qemu?Thanks a lot. ARM KVM guests must be run with CPU host passthrough. This means the host hardware and host kernel versions must be identical in order to guarantee successful migrations. That said, upgrading the host kernel without shutting down the guests is a reasonable thing to try. Unfortunately, in this case, the only way to do it would be to hack QEMU on the destination to allow this extra register in KVM_GET_REG_LIST. It should be harmless, as it's not used. Allowing it is similar to allowing the destination to have a larger number of registers than the source in KVM_GET_REG_LIST (which is why the failing test is <=, not ==). I wouldn't post a real patch to resolve this issue, though, as everything is working as expected. The failing test is failing because it detected a risky migration. And, KVM's filtering of registers from KVM_GET_REG_LIST is also correct, even if previous KVM versions didn't do that. With CPU models you could request the reg list for a particular model and expect it to be the same across all kernel versions that support it, but we don't have those. Thanks, drew _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2020-10-19 16:38 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-15 4:06 Kernel patch cases qemu live migration failed 张东旭 2020-10-15 4:06 ` 张东旭 2020-10-15 11:26 ` Marc Zyngier 2020-10-15 11:26 ` Marc Zyngier 2020-10-15 13:35 ` Andrew Jones 2020-10-15 13:35 ` Andrew Jones 2020-10-15 13:52 ` Marc Zyngier 2020-10-15 13:52 ` Marc Zyngier 2020-10-15 14:41 ` Andrew Jones 2020-10-15 14:41 ` Andrew Jones 2020-10-15 14:57 ` Peter Maydell 2020-10-15 14:57 ` Peter Maydell 2020-10-19 9:25 ` Andrew Jones 2020-10-19 9:25 ` Andrew Jones 2020-10-19 11:32 ` Dave Martin 2020-10-19 11:32 ` Dave Martin 2020-10-19 11:43 ` Peter Maydell 2020-10-19 11:43 ` Peter Maydell 2020-10-19 13:40 ` Andrew Jones 2020-10-19 13:40 ` Andrew Jones 2020-10-19 14:18 ` Peter Maydell 2020-10-19 14:18 ` Peter Maydell 2020-10-19 14:58 ` Dave Martin 2020-10-19 14:58 ` Dave Martin 2020-10-19 15:23 ` Andrew Jones 2020-10-19 15:23 ` Andrew Jones 2020-10-19 16:36 ` Dave Martin 2020-10-19 16:36 ` Dave Martin 2020-10-15 13:26 ` Andrew Jones 2020-10-15 13:26 ` Andrew Jones
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.