* [RFC PATCH 0/3] fix migrate failed when vm is in booting @ 2017-09-06 13:05 wanghaibin 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin ` (3 more replies) 0 siblings, 4 replies; 29+ messages in thread From: wanghaibin @ 2017-09-06 13:05 UTC (permalink / raw) To: marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin We have a test scenario: vmlife and migrate fixed test. Here is a problem; VM migration failed caused the qemu core which gdb trace: #0 0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6 #1 0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6 #2 0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>) at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064 #3 0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728 #4 0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744 #5 0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434 #6 0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442 #7 0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4, current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f, start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798 #8 0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995 #9 0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0 #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6 qemu failed log : 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument I try to debug it, and I found the migrate at the moment that the vm is still booting. So just change the guest like : diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c index e9a53ba..d73fc03 100644 --- a/drivers/pci/host/pci-host-common.c +++ b/drivers/pci/host/pci-host-common.c @@ -23,6 +23,7 @@ #include <linux/of_pci.h> #include <linux/pci-ecam.h> #include <linux/platform_device.h> +#include <linux/delay.h> static int gen_pci_parse_request_of_pci_ranges(struct device *dev, struct list_head *resources, struct resource **bus_range) @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev, struct pci_bus *bus, *child; struct pci_config_window *cfg; struct list_head resources; + int i; type = of_get_property(np, "device_type", NULL); if (!type || strcmp(type, "pci")) { @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev, pcie_bus_configure_settings(child); } + for (i=0;i<20000; i++) + udelay(1000); pci_bus_add_devices(bus); return 0; } And migrate at this delay time, It must be failed. This patchset try to fix this problem. BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil. Thanks wanghaibin (3): kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset kvm: arm/arm64: vgic-its: fix return value for restore virt/kvm/arm/arm.c | 5 ++++- virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++------- virt/kvm/arm/vgic/vgic.h | 1 + 3 files changed, 24 insertions(+), 8 deletions(-) -- 1.8.3.1 _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin @ 2017-09-06 13:05 ` wanghaibin 2017-09-12 8:50 ` wanghaibin 2017-09-13 19:14 ` Christoffer Dall 2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin ` (2 subsequent siblings) 3 siblings, 2 replies; 29+ messages in thread From: wanghaibin @ 2017-09-06 13:05 UTC (permalink / raw) To: marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin We slightly refactor vgic_its_destroy, separate vgic_its_free_list() function for later patch invoke. The patch also take a functional change. If the its->device_list.next is NULL, we still should free the its. Honestly, I can't understand How does the its->device_list.next is NULL happened at this moment. Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> --- virt/kvm/arm/vgic/vgic-its.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c index aa6b68d..25d614f 100644 --- a/virt/kvm/arm/vgic/vgic-its.c +++ b/virt/kvm/arm/vgic/vgic-its.c @@ -1624,10 +1624,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *dev) kfree(dev); } -static void vgic_its_destroy(struct kvm_device *kvm_dev) +static void vgic_its_free_list(struct kvm *kvm, struct vgic_its *its) { - struct kvm *kvm = kvm_dev->kvm; - struct vgic_its *its = kvm_dev->private; struct list_head *cur, *temp; /* @@ -1653,7 +1651,14 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev) kfree(coll); } mutex_unlock(&its->its_lock); +} + +static void vgic_its_destroy(struct kvm_device *kvm_dev) +{ + struct kvm *kvm = kvm_dev->kvm; + struct vgic_its *its = kvm_dev->private; + vgic_its_free_list(kvm, its); kfree(its); } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin @ 2017-09-12 8:50 ` wanghaibin 2017-09-12 10:08 ` Auger Eric 2017-09-13 19:14 ` Christoffer Dall 1 sibling, 1 reply; 29+ messages in thread From: wanghaibin @ 2017-09-12 8:50 UTC (permalink / raw) To: wanghaibin; +Cc: cdall, marc.zyngier, andre.przywara, kvmarm, wu.wubin On 2017/9/6 21:05, wanghaibin wrote: > We slightly refactor vgic_its_destroy, separate vgic_its_free_list() > function for later patch invoke. > > The patch also take a functional change. If the its->device_list.next > is NULL, we still should free the its. Hi, Eric Does this its->device_list.next is NULL can happened ? Thanks > Honestly, I can't understand How does the its->device_list.next is NULL > happened at this moment. > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/vgic/vgic-its.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index aa6b68d..25d614f 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -1624,10 +1624,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *dev) > kfree(dev); > } > > -static void vgic_its_destroy(struct kvm_device *kvm_dev) > +static void vgic_its_free_list(struct kvm *kvm, struct vgic_its *its) > { > - struct kvm *kvm = kvm_dev->kvm; > - struct vgic_its *its = kvm_dev->private; > struct list_head *cur, *temp; > > /* > @@ -1653,7 +1651,14 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev) > kfree(coll); > } > mutex_unlock(&its->its_lock); > +} > + > +static void vgic_its_destroy(struct kvm_device *kvm_dev) > +{ > + struct kvm *kvm = kvm_dev->kvm; > + struct vgic_its *its = kvm_dev->private; > > + vgic_its_free_list(kvm, its); > kfree(its); > } > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-12 8:50 ` wanghaibin @ 2017-09-12 10:08 ` Auger Eric 2017-09-13 19:13 ` Christoffer Dall 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-12 10:08 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara Hi Wanghaibin, On 12/09/2017 10:50, wanghaibin wrote: > On 2017/9/6 21:05, wanghaibin wrote: > >> We slightly refactor vgic_its_destroy, separate vgic_its_free_list() >> function for later patch invoke. >> >> The patch also take a functional change. If the its->device_list.next >> is NULL, we still should free the its. > > > Hi, Eric > > Does this its->device_list.next is NULL can happened ? I don't get why we have this check. The kvm device is removed by kvm_destroy_devices which loops on all devices added to kvm->devices. kvm_ioctl_create_device only adds the device to kvm_devices once the lists have been initialized (in vgic_create_its). So it looks safe to me without the check. André, do we miss something? Thanks Eric > > Thanks > >> Honestly, I can't understand How does the its->device_list.next is NULL >> happened at this moment. >> >> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >> --- >> virt/kvm/arm/vgic/vgic-its.c | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >> index aa6b68d..25d614f 100644 >> --- a/virt/kvm/arm/vgic/vgic-its.c >> +++ b/virt/kvm/arm/vgic/vgic-its.c >> @@ -1624,10 +1624,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *dev) >> kfree(dev); >> } >> >> -static void vgic_its_destroy(struct kvm_device *kvm_dev) >> +static void vgic_its_free_list(struct kvm *kvm, struct vgic_its *its) >> { >> - struct kvm *kvm = kvm_dev->kvm; >> - struct vgic_its *its = kvm_dev->private; >> struct list_head *cur, *temp; >> >> /* >> @@ -1653,7 +1651,14 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev) >> kfree(coll); >> } >> mutex_unlock(&its->its_lock); >> +} >> + >> +static void vgic_its_destroy(struct kvm_device *kvm_dev) >> +{ >> + struct kvm *kvm = kvm_dev->kvm; >> + struct vgic_its *its = kvm_dev->private; >> >> + vgic_its_free_list(kvm, its); >> kfree(its); >> } >> > > > > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-12 10:08 ` Auger Eric @ 2017-09-13 19:13 ` Christoffer Dall 0 siblings, 0 replies; 29+ messages in thread From: Christoffer Dall @ 2017-09-13 19:13 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, kvmarm, wu.wubin, andre.przywara On Tue, Sep 12, 2017 at 12:08:27PM +0200, Auger Eric wrote: > Hi Wanghaibin, > > On 12/09/2017 10:50, wanghaibin wrote: > > On 2017/9/6 21:05, wanghaibin wrote: > > > >> We slightly refactor vgic_its_destroy, separate vgic_its_free_list() > >> function for later patch invoke. > >> > >> The patch also take a functional change. If the its->device_list.next > >> is NULL, we still should free the its. > > > > > > Hi, Eric > > > > Does this its->device_list.next is NULL can happened ? > > I don't get why we have this check. > > The kvm device is removed by kvm_destroy_devices which loops on all > devices added to kvm->devices. kvm_ioctl_create_device only adds the > device to kvm_devices once the lists have been initialized (in > vgic_create_its). So it looks safe to me without the check. > > André, do we miss something? Eric, I think I agree with you, and even if we need to check for initialization, we should have a separate flag instead of piggy-backing on the list_head. Thanks, -Christoffer ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin 2017-09-12 8:50 ` wanghaibin @ 2017-09-13 19:14 ` Christoffer Dall 2017-09-16 1:59 ` wanghaibin 1 sibling, 1 reply; 29+ messages in thread From: Christoffer Dall @ 2017-09-13 19:14 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin On Wed, Sep 06, 2017 at 09:05:08PM +0800, wanghaibin wrote: > We slightly refactor vgic_its_destroy, separate vgic_its_free_list() > function for later patch invoke. > > The patch also take a functional change. If the its->device_list.next I don't see a functional change in this patch? Thanks, -Christoffer > is NULL, we still should free the its. > Honestly, I can't understand How does the its->device_list.next is NULL > happened at this moment. > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/vgic/vgic-its.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index aa6b68d..25d614f 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -1624,10 +1624,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *dev) > kfree(dev); > } > > -static void vgic_its_destroy(struct kvm_device *kvm_dev) > +static void vgic_its_free_list(struct kvm *kvm, struct vgic_its *its) > { > - struct kvm *kvm = kvm_dev->kvm; > - struct vgic_its *its = kvm_dev->private; > struct list_head *cur, *temp; > > /* > @@ -1653,7 +1651,14 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev) > kfree(coll); > } > mutex_unlock(&its->its_lock); > +} > + > +static void vgic_its_destroy(struct kvm_device *kvm_dev) > +{ > + struct kvm *kvm = kvm_dev->kvm; > + struct vgic_its *its = kvm_dev->private; > > + vgic_its_free_list(kvm, its); > kfree(its); > } > > -- > 1.8.3.1 > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-13 19:14 ` Christoffer Dall @ 2017-09-16 1:59 ` wanghaibin 2017-09-16 22:17 ` Christoffer Dall 0 siblings, 1 reply; 29+ messages in thread From: wanghaibin @ 2017-09-16 1:59 UTC (permalink / raw) To: Christoffer Dall; +Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin On 2017/9/14 3:14, Christoffer Dall wrote: > On Wed, Sep 06, 2017 at 09:05:08PM +0800, wanghaibin wrote: >> We slightly refactor vgic_its_destroy, separate vgic_its_free_list() >> function for later patch invoke. >> >> The patch also take a functional change. If the its->device_list.next > > I don't see a functional change in this patch? I mean, if this check can its->device_list.next happened, this patch will free the its structure compared with the original implementation. Thanks > > Thanks, > -Christoffer > >> is NULL, we still should free the its. >> Honestly, I can't understand How does the its->device_list.next is NULL >> happened at this moment. >> >> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >> --- >> virt/kvm/arm/vgic/vgic-its.c | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >> index aa6b68d..25d614f 100644 >> --- a/virt/kvm/arm/vgic/vgic-its.c >> +++ b/virt/kvm/arm/vgic/vgic-its.c >> @@ -1624,10 +1624,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *dev) >> kfree(dev); >> } >> >> -static void vgic_its_destroy(struct kvm_device *kvm_dev) >> +static void vgic_its_free_list(struct kvm *kvm, struct vgic_its *its) >> { >> - struct kvm *kvm = kvm_dev->kvm; >> - struct vgic_its *its = kvm_dev->private; >> struct list_head *cur, *temp; >> >> /* >> @@ -1653,7 +1651,14 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev) >> kfree(coll); >> } >> mutex_unlock(&its->its_lock); >> +} >> + >> +static void vgic_its_destroy(struct kvm_device *kvm_dev) >> +{ >> + struct kvm *kvm = kvm_dev->kvm; >> + struct vgic_its *its = kvm_dev->private; >> >> + vgic_its_free_list(kvm, its); >> kfree(its); >> } >> >> -- >> 1.8.3.1 >> >> > > . > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function 2017-09-16 1:59 ` wanghaibin @ 2017-09-16 22:17 ` Christoffer Dall 0 siblings, 0 replies; 29+ messages in thread From: Christoffer Dall @ 2017-09-16 22:17 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin On Sat, Sep 16, 2017 at 09:59:21AM +0800, wanghaibin wrote: > On 2017/9/14 3:14, Christoffer Dall wrote: > > > On Wed, Sep 06, 2017 at 09:05:08PM +0800, wanghaibin wrote: > >> We slightly refactor vgic_its_destroy, separate vgic_its_free_list() > >> function for later patch invoke. > >> > >> The patch also take a functional change. If the its->device_list.next > > > > I don't see a functional change in this patch? > > > I mean, if this check can its->device_list.next happened, this patch will > free the its structure compared with the original implementation. > I really cannot see how this patch alone changes anything at all. Thanks, -Christoffer ^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin @ 2017-09-06 13:05 ` wanghaibin 2017-09-06 16:20 ` Auger Eric 2017-09-13 19:34 ` Christoffer Dall 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin 2017-09-20 1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 3 siblings, 2 replies; 29+ messages in thread From: wanghaibin @ 2017-09-06 13:05 UTC (permalink / raw) To: marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin This patch fix the migrate save tables failure. When the virtual machine is in booting and the devices haven't initialized, the all virtual dte/ite may be invalid. If migrate at this moment, the save tables interface traversal device list, and check the dte is valid or not. if not, it will return the -EINVAL. This patch try to free the its list resource when vm reboot or reset to avoid this. Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> --- virt/kvm/arm/arm.c | 5 ++++- virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ virt/kvm/arm/vgic/vgic.h | 1 + 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index a39a1e1..db7632d 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -46,6 +46,7 @@ #include <asm/kvm_coproc.h> #include <asm/kvm_psci.h> #include <asm/sections.h> +#include "vgic.h" #ifdef REQUIRES_VIRT __asm__(".arch_extension virt"); @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, * Ensure a rebooted VM will fault in RAM pages and detect if the * guest MMU is turned off and flush the caches as needed. */ - if (vcpu->arch.has_run_once) + if (vcpu->arch.has_run_once) { stage2_unmap_vm(vcpu->kvm); + vgic_its_free_resource(vcpu->kvm); + } vcpu_reset_hcr(vcpu); diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c index 25d614f..5c20352 100644 --- a/virt/kvm/arm/vgic/vgic-its.c +++ b/virt/kvm/arm/vgic/vgic-its.c @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, .has_attr = vgic_its_has_attr, }; +void vgic_its_free_resource(struct kvm *kvm) +{ + struct kvm_device *dev, *tmp; + + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { + if(dev->ops == &kvm_arm_vgic_its_ops) + vgic_its_free_list(kvm, dev->private); + } +} + int kvm_vgic_register_its_device(void) { return kvm_register_device_ops(&kvm_arm_vgic_its_ops, diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h index c2be5b7..fbcbdfd 100644 --- a/virt/kvm/arm/vgic/vgic.h +++ b/virt/kvm/arm/vgic/vgic.h @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, bool lock_all_vcpus(struct kvm *kvm); void unlock_all_vcpus(struct kvm *kvm); +void vgic_its_free_resource(struct kvm *kvm); #endif -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin @ 2017-09-06 16:20 ` Auger Eric 2017-09-07 1:32 ` wanghaibin 2017-09-13 19:34 ` Christoffer Dall 1 sibling, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-06 16:20 UTC (permalink / raw) To: wanghaibin, marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin Hi, On 06/09/2017 15:05, wanghaibin wrote: > This patch fix the migrate save tables failure. > > When the virtual machine is in booting and the devices haven't initialized, > the all virtual dte/ite may be invalid. If migrate at this moment, the save > tables interface traversal device list, and check the dte is valid or not. > if not, it will return the -EINVAL. The issue on save is less clear to me. We are not checking the "dte" are valid as it is said above. We are scrolling the ITS lists - which may be empty - and dump them in guest memory. On save() there are quite few checks that can cause a failure. vgic_its_check_id() can be among them. This typically requires the GITS_BASER to have been properly set. Failing on save looks OK to me in such situation. Sorry but I don't get the purpose of this patch. Does it fix a save failure? Thanks Eric > > This patch try to free the its list resource when vm reboot or reset to avoid this. > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/arm.c | 5 ++++- > virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ > virt/kvm/arm/vgic/vgic.h | 1 + > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c > index a39a1e1..db7632d 100644 > --- a/virt/kvm/arm/arm.c > +++ b/virt/kvm/arm/arm.c > @@ -46,6 +46,7 @@ > #include <asm/kvm_coproc.h> > #include <asm/kvm_psci.h> > #include <asm/sections.h> > +#include "vgic.h" > > #ifdef REQUIRES_VIRT > __asm__(".arch_extension virt"); > @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, > * Ensure a rebooted VM will fault in RAM pages and detect if the > * guest MMU is turned off and flush the caches as needed. > */ > - if (vcpu->arch.has_run_once) > + if (vcpu->arch.has_run_once) { > stage2_unmap_vm(vcpu->kvm); > + vgic_its_free_resource(vcpu->kvm); > + } > > vcpu_reset_hcr(vcpu); > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index 25d614f..5c20352 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, > .has_attr = vgic_its_has_attr, > }; > > +void vgic_its_free_resource(struct kvm *kvm) > +{ > + struct kvm_device *dev, *tmp; > + > + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { > + if(dev->ops == &kvm_arm_vgic_its_ops) > + vgic_its_free_list(kvm, dev->private); > + } > +} > + > int kvm_vgic_register_its_device(void) > { > return kvm_register_device_ops(&kvm_arm_vgic_its_ops, > diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h > index c2be5b7..fbcbdfd 100644 > --- a/virt/kvm/arm/vgic/vgic.h > +++ b/virt/kvm/arm/vgic/vgic.h > @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, > > bool lock_all_vcpus(struct kvm *kvm); > void unlock_all_vcpus(struct kvm *kvm); > +void vgic_its_free_resource(struct kvm *kvm); > > #endif > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-06 16:20 ` Auger Eric @ 2017-09-07 1:32 ` wanghaibin 2017-09-07 11:28 ` Auger Eric 0 siblings, 1 reply; 29+ messages in thread From: wanghaibin @ 2017-09-07 1:32 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara On 2017/9/7 0:20, Auger Eric wrote: > Hi, > > On 06/09/2017 15:05, wanghaibin wrote: >> This patch fix the migrate save tables failure. >> >> When the virtual machine is in booting and the devices haven't initialized, >> the all virtual dte/ite may be invalid. If migrate at this moment, the save >> tables interface traversal device list, and check the dte is valid or not. >> if not, it will return the -EINVAL. > > The issue on save is less clear to me. We are not checking the "dte" are > valid as it is said above. We are scrolling the ITS lists - which may be > empty - and dump them in guest memory. > > On save() there are quite few checks that can cause a failure. > vgic_its_check_id() can be among them. This typically requires the > GITS_BASER to have been properly set. Failing on save looks OK to me in > such situation. > > Sorry but I don't get the purpose of this patch. Does it fix a save failure? Yes, for save, vgic_its_check_id() func will check the L1 DTE valid or not through the code like : /* Each 1st level entry is represented by a 64-bit value. */ if (kvm_read_guest(its->dev->kvm, BASER_ADDRESS(baser) + index * sizeof(indirect_ptr), &indirect_ptr, sizeof(indirect_ptr))) return false; indirect_ptr = le64_to_cpu(indirect_ptr); /* check the valid bit of the first level entry */ if (!(indirect_ptr & BIT_ULL(63))) return false; If invalid , the save will return -EINVAL caused by the vgic_its_check_id() with return the false value. And form the cover letter, the problem happened when no one pci dev has been probed( guest driver haven't any mapd, mapti), So the L1 DTEs are all invalid currently. Just like you said, at this moment migrate, we are scrolling the ITS lists, next time check_id failed and save interface failed. I think the final reason is the device list free problem, at the reset/reboot, ITS dev/clo/itt lists are not be free and set NULL. So that, the save interface failed. This patch try to free the resource when vm reboot/reset. BTW: these lists will re-bulid when the reboot vm run the probe pci device step. Thanks > > Thanks > > Eric > > >> >> This patch try to free the its list resource when vm reboot or reset to avoid this. >> >> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >> --- >> virt/kvm/arm/arm.c | 5 ++++- >> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >> virt/kvm/arm/vgic/vgic.h | 1 + >> 3 files changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >> index a39a1e1..db7632d 100644 >> --- a/virt/kvm/arm/arm.c >> +++ b/virt/kvm/arm/arm.c >> @@ -46,6 +46,7 @@ >> #include <asm/kvm_coproc.h> >> #include <asm/kvm_psci.h> >> #include <asm/sections.h> >> +#include "vgic.h" >> >> #ifdef REQUIRES_VIRT >> __asm__(".arch_extension virt"); >> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >> * Ensure a rebooted VM will fault in RAM pages and detect if the >> * guest MMU is turned off and flush the caches as needed. >> */ >> - if (vcpu->arch.has_run_once) >> + if (vcpu->arch.has_run_once) { >> stage2_unmap_vm(vcpu->kvm); >> + vgic_its_free_resource(vcpu->kvm); >> + } >> >> vcpu_reset_hcr(vcpu); >> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >> index 25d614f..5c20352 100644 >> --- a/virt/kvm/arm/vgic/vgic-its.c >> +++ b/virt/kvm/arm/vgic/vgic-its.c >> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >> .has_attr = vgic_its_has_attr, >> }; >> >> +void vgic_its_free_resource(struct kvm *kvm) >> +{ >> + struct kvm_device *dev, *tmp; >> + >> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >> + if(dev->ops == &kvm_arm_vgic_its_ops) >> + vgic_its_free_list(kvm, dev->private); >> + } >> +} >> + >> int kvm_vgic_register_its_device(void) >> { >> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >> index c2be5b7..fbcbdfd 100644 >> --- a/virt/kvm/arm/vgic/vgic.h >> +++ b/virt/kvm/arm/vgic/vgic.h >> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >> >> bool lock_all_vcpus(struct kvm *kvm); >> void unlock_all_vcpus(struct kvm *kvm); >> +void vgic_its_free_resource(struct kvm *kvm); >> >> #endif >> > > . > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-07 1:32 ` wanghaibin @ 2017-09-07 11:28 ` Auger Eric 2017-09-10 18:46 ` Auger Eric 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-07 11:28 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara Hi Wanghaibin, On 07/09/2017 03:32, wanghaibin wrote: > On 2017/9/7 0:20, Auger Eric wrote: > >> Hi, >> >> On 06/09/2017 15:05, wanghaibin wrote: >>> This patch fix the migrate save tables failure. >>> >>> When the virtual machine is in booting and the devices haven't initialized, >>> the all virtual dte/ite may be invalid. If migrate at this moment, the save >>> tables interface traversal device list, and check the dte is valid or not. >>> if not, it will return the -EINVAL. >> >> The issue on save is less clear to me. We are not checking the "dte" are >> valid as it is said above. We are scrolling the ITS lists - which may be >> empty - and dump them in guest memory. >> >> On save() there are quite few checks that can cause a failure. >> vgic_its_check_id() can be among them. This typically requires the >> GITS_BASER to have been properly set. Failing on save looks OK to me in >> such situation. >> >> Sorry but I don't get the purpose of this patch. Does it fix a save failure? > > > Yes, for save, vgic_its_check_id() func will check the L1 DTE valid or not through > the code like : > > /* Each 1st level entry is represented by a 64-bit value. */ > if (kvm_read_guest(its->dev->kvm, > BASER_ADDRESS(baser) + index * sizeof(indirect_ptr), > &indirect_ptr, sizeof(indirect_ptr))) > return false; > > indirect_ptr = le64_to_cpu(indirect_ptr); > > /* check the valid bit of the first level entry */ > if (!(indirect_ptr & BIT_ULL(63))) > return false; > > If invalid , the save will return -EINVAL caused by the vgic_its_check_id() with return the false value. > > And form the cover letter, the problem happened when no one pci dev has been probed( guest driver haven't any > mapd, mapti), So the L1 DTEs are all invalid currently. Just like you said, at this moment migrate, we are scrolling > the ITS lists, next time check_id failed and save interface failed. > > I think the final reason is the device list free problem, at the reset/reboot, ITS dev/clo/itt lists are not be free > and set NULL. So that, the save interface failed. > This patch try to free the resource when vm reboot/reset. OK understood. Indeed none of the device/collection lists should be non empty at that stage, ie. when GITS_BASERn have not be written yet and are marked invalid. For solving the specific save() issue here, I think the best is to check the validity bit of the GITS_BASER (col, device) and if invalid do nothing. Then we need to have a more global discussion about whether, when and where the device and collection lists need to be freed. If you want I can respin with above suggestion and add the valid pointer to the entry_fn_t to handle the restore path. Up to you. Thanks Eric > BTW: these lists will re-bulid when the reboot vm run the probe pci device step. > > Thanks > >> >> Thanks >> >> Eric >> >> >>> >>> This patch try to free the its list resource when vm reboot or reset to avoid this. >>> >>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >>> --- >>> virt/kvm/arm/arm.c | 5 ++++- >>> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >>> virt/kvm/arm/vgic/vgic.h | 1 + >>> 3 files changed, 15 insertions(+), 1 deletion(-) >>> >>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >>> index a39a1e1..db7632d 100644 >>> --- a/virt/kvm/arm/arm.c >>> +++ b/virt/kvm/arm/arm.c >>> @@ -46,6 +46,7 @@ >>> #include <asm/kvm_coproc.h> >>> #include <asm/kvm_psci.h> >>> #include <asm/sections.h> >>> +#include "vgic.h" >>> >>> #ifdef REQUIRES_VIRT >>> __asm__(".arch_extension virt"); >>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >>> * Ensure a rebooted VM will fault in RAM pages and detect if the >>> * guest MMU is turned off and flush the caches as needed. >>> */ >>> - if (vcpu->arch.has_run_once) >>> + if (vcpu->arch.has_run_once) { >>> stage2_unmap_vm(vcpu->kvm); >>> + vgic_its_free_resource(vcpu->kvm); >>> + } >>> >>> vcpu_reset_hcr(vcpu); >>> >>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >>> index 25d614f..5c20352 100644 >>> --- a/virt/kvm/arm/vgic/vgic-its.c >>> +++ b/virt/kvm/arm/vgic/vgic-its.c >>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >>> .has_attr = vgic_its_has_attr, >>> }; >>> >>> +void vgic_its_free_resource(struct kvm *kvm) >>> +{ >>> + struct kvm_device *dev, *tmp; >>> + >>> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >>> + if(dev->ops == &kvm_arm_vgic_its_ops) >>> + vgic_its_free_list(kvm, dev->private); >>> + } >>> +} >>> + >>> int kvm_vgic_register_its_device(void) >>> { >>> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >>> index c2be5b7..fbcbdfd 100644 >>> --- a/virt/kvm/arm/vgic/vgic.h >>> +++ b/virt/kvm/arm/vgic/vgic.h >>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >>> >>> bool lock_all_vcpus(struct kvm *kvm); >>> void unlock_all_vcpus(struct kvm *kvm); >>> +void vgic_its_free_resource(struct kvm *kvm); >>> >>> #endif >>> >> >> . >> > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-07 11:28 ` Auger Eric @ 2017-09-10 18:46 ` Auger Eric 2017-09-12 11:15 ` wanghaibin 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-10 18:46 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara Hi Wanghaibin, On 07/09/2017 13:28, Auger Eric wrote: > Hi Wanghaibin, > > On 07/09/2017 03:32, wanghaibin wrote: >> On 2017/9/7 0:20, Auger Eric wrote: >> >>> Hi, >>> >>> On 06/09/2017 15:05, wanghaibin wrote: >>>> This patch fix the migrate save tables failure. >>>> >>>> When the virtual machine is in booting and the devices haven't initialized, >>>> the all virtual dte/ite may be invalid. If migrate at this moment, the save >>>> tables interface traversal device list, and check the dte is valid or not. >>>> if not, it will return the -EINVAL. >>> >>> The issue on save is less clear to me. We are not checking the "dte" are >>> valid as it is said above. We are scrolling the ITS lists - which may be >>> empty - and dump them in guest memory. >>> >>> On save() there are quite few checks that can cause a failure. >>> vgic_its_check_id() can be among them. This typically requires the >>> GITS_BASER to have been properly set. Failing on save looks OK to me in >>> such situation. >>> >>> Sorry but I don't get the purpose of this patch. Does it fix a save failure? >> >> >> Yes, for save, vgic_its_check_id() func will check the L1 DTE valid or not through >> the code like : >> >> /* Each 1st level entry is represented by a 64-bit value. */ >> if (kvm_read_guest(its->dev->kvm, >> BASER_ADDRESS(baser) + index * sizeof(indirect_ptr), >> &indirect_ptr, sizeof(indirect_ptr))) >> return false; >> >> indirect_ptr = le64_to_cpu(indirect_ptr); >> >> /* check the valid bit of the first level entry */ >> if (!(indirect_ptr & BIT_ULL(63))) >> return false; >> >> If invalid , the save will return -EINVAL caused by the vgic_its_check_id() with return the false value. >> >> And form the cover letter, the problem happened when no one pci dev has been probed( guest driver haven't any >> mapd, mapti), So the L1 DTEs are all invalid currently. Just like you said, at this moment migrate, we are scrolling >> the ITS lists, next time check_id failed and save interface failed. >> >> I think the final reason is the device list free problem, at the reset/reboot, ITS dev/clo/itt lists are not be free >> and set NULL. So that, the save interface failed. >> This patch try to free the resource when vm reboot/reset. > OK understood. Indeed none of the device/collection lists should be non > empty at that stage, ie. when GITS_BASERn have not be written yet and > are marked invalid. > > For solving the specific save() issue here, I think the best is to check > the validity bit of the GITS_BASER (col, device) and if invalid do nothing. Actually the above proposal does not work as GITS_BASERn is not properly reset. Maybe the best way is to introduce an ITS KVM device reset IOTCL in the control group. Upon this command we could properly reset the requested registers and the lists. Thanks Eric > > Then we need to have a more global discussion about whether, when and > where the device and collection lists need to be freed. > > If you want I can respin with above suggestion and add the valid pointer > to the entry_fn_t to handle the restore path. Up to you. > > Thanks > > Eric > > >> BTW: these lists will re-bulid when the reboot vm run the probe pci device step. > >> >> Thanks >> >>> >>> Thanks >>> >>> Eric >>> >>> >>>> >>>> This patch try to free the its list resource when vm reboot or reset to avoid this. >>>> >>>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >>>> --- >>>> virt/kvm/arm/arm.c | 5 ++++- >>>> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >>>> virt/kvm/arm/vgic/vgic.h | 1 + >>>> 3 files changed, 15 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >>>> index a39a1e1..db7632d 100644 >>>> --- a/virt/kvm/arm/arm.c >>>> +++ b/virt/kvm/arm/arm.c >>>> @@ -46,6 +46,7 @@ >>>> #include <asm/kvm_coproc.h> >>>> #include <asm/kvm_psci.h> >>>> #include <asm/sections.h> >>>> +#include "vgic.h" >>>> >>>> #ifdef REQUIRES_VIRT >>>> __asm__(".arch_extension virt"); >>>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >>>> * Ensure a rebooted VM will fault in RAM pages and detect if the >>>> * guest MMU is turned off and flush the caches as needed. >>>> */ >>>> - if (vcpu->arch.has_run_once) >>>> + if (vcpu->arch.has_run_once) { >>>> stage2_unmap_vm(vcpu->kvm); >>>> + vgic_its_free_resource(vcpu->kvm); >>>> + } >>>> >>>> vcpu_reset_hcr(vcpu); >>>> >>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >>>> index 25d614f..5c20352 100644 >>>> --- a/virt/kvm/arm/vgic/vgic-its.c >>>> +++ b/virt/kvm/arm/vgic/vgic-its.c >>>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >>>> .has_attr = vgic_its_has_attr, >>>> }; >>>> >>>> +void vgic_its_free_resource(struct kvm *kvm) >>>> +{ >>>> + struct kvm_device *dev, *tmp; >>>> + >>>> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >>>> + if(dev->ops == &kvm_arm_vgic_its_ops) >>>> + vgic_its_free_list(kvm, dev->private); >>>> + } >>>> +} >>>> + >>>> int kvm_vgic_register_its_device(void) >>>> { >>>> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >>>> index c2be5b7..fbcbdfd 100644 >>>> --- a/virt/kvm/arm/vgic/vgic.h >>>> +++ b/virt/kvm/arm/vgic/vgic.h >>>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >>>> >>>> bool lock_all_vcpus(struct kvm *kvm); >>>> void unlock_all_vcpus(struct kvm *kvm); >>>> +void vgic_its_free_resource(struct kvm *kvm); >>>> >>>> #endif >>>> >>> >>> . >>> >> >> >> ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-10 18:46 ` Auger Eric @ 2017-09-12 11:15 ` wanghaibin 2017-09-13 8:49 ` Auger Eric 0 siblings, 1 reply; 29+ messages in thread From: wanghaibin @ 2017-09-12 11:15 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara On 2017/9/11 2:46, Auger Eric wrote: > Hi Wanghaibin, > > On 07/09/2017 13:28, Auger Eric wrote: >> Hi Wanghaibin, >> >> On 07/09/2017 03:32, wanghaibin wrote: >>> On 2017/9/7 0:20, Auger Eric wrote: >>> >>>> Hi, >>>> >>>> On 06/09/2017 15:05, wanghaibin wrote: >>>>> This patch fix the migrate save tables failure. >>>>> >>>>> When the virtual machine is in booting and the devices haven't initialized, >>>>> the all virtual dte/ite may be invalid. If migrate at this moment, the save >>>>> tables interface traversal device list, and check the dte is valid or not. >>>>> if not, it will return the -EINVAL. >>>> >>>> The issue on save is less clear to me. We are not checking the "dte" are >>>> valid as it is said above. We are scrolling the ITS lists - which may be >>>> empty - and dump them in guest memory. >>>> >>>> On save() there are quite few checks that can cause a failure. >>>> vgic_its_check_id() can be among them. This typically requires the >>>> GITS_BASER to have been properly set. Failing on save looks OK to me in >>>> such situation. >>>> >>>> Sorry but I don't get the purpose of this patch. Does it fix a save failure? >>> >>> >>> Yes, for save, vgic_its_check_id() func will check the L1 DTE valid or not through >>> the code like : >>> >>> /* Each 1st level entry is represented by a 64-bit value. */ >>> if (kvm_read_guest(its->dev->kvm, >>> BASER_ADDRESS(baser) + index * sizeof(indirect_ptr), >>> &indirect_ptr, sizeof(indirect_ptr))) >>> return false; >>> >>> indirect_ptr = le64_to_cpu(indirect_ptr); >>> >>> /* check the valid bit of the first level entry */ >>> if (!(indirect_ptr & BIT_ULL(63))) >>> return false; >>> >>> If invalid , the save will return -EINVAL caused by the vgic_its_check_id() with return the false value. >>> >>> And form the cover letter, the problem happened when no one pci dev has been probed( guest driver haven't any >>> mapd, mapti), So the L1 DTEs are all invalid currently. Just like you said, at this moment migrate, we are scrolling >>> the ITS lists, next time check_id failed and save interface failed. >>> >>> I think the final reason is the device list free problem, at the reset/reboot, ITS dev/clo/itt lists are not be free >>> and set NULL. So that, the save interface failed. >>> This patch try to free the resource when vm reboot/reset. >> OK understood. Indeed none of the device/collection lists should be non >> empty at that stage, ie. when GITS_BASERn have not be written yet and >> are marked invalid. >> >> For solving the specific save() issue here, I think the best is to check >> the validity bit of the GITS_BASER (col, device) and if invalid do nothing. > > Actually the above proposal does not work as GITS_BASERn is not properly > reset. Maybe the best way is to introduce an ITS KVM device reset IOTCL > in the control group. Upon this command we could properly reset the > requested registers and the lists. Yes, It should free these lists when vits reset. This patch according the has_run_once and vcpu_init to mark the vcpu reset happened, and scrolling all kvm devices to find the vits device to free the lists. I think it's a little odd too. If we can add the reset IOCTL, I think it must be the best way. Thanks. > > Thanks > > Eric >> >> Then we need to have a more global discussion about whether, when and >> where the device and collection lists need to be freed. >> >> If you want I can respin with above suggestion and add the valid pointer >> to the entry_fn_t to handle the restore path. Up to you. All along, I want to contribute code to the community, so far, It has not been achieved. So I would like to collect the solutions for this problem and try to fix it first, can I? Thanks. >> >> Thanks >> >> Eric >> >> >>> BTW: these lists will re-bulid when the reboot vm run the probe pci device step. >> >>> >>> Thanks >>> >>>> >>>> Thanks >>>> >>>> Eric >>>> >>>> >>>>> >>>>> This patch try to free the its list resource when vm reboot or reset to avoid this. >>>>> >>>>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >>>>> --- >>>>> virt/kvm/arm/arm.c | 5 ++++- >>>>> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >>>>> virt/kvm/arm/vgic/vgic.h | 1 + >>>>> 3 files changed, 15 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >>>>> index a39a1e1..db7632d 100644 >>>>> --- a/virt/kvm/arm/arm.c >>>>> +++ b/virt/kvm/arm/arm.c >>>>> @@ -46,6 +46,7 @@ >>>>> #include <asm/kvm_coproc.h> >>>>> #include <asm/kvm_psci.h> >>>>> #include <asm/sections.h> >>>>> +#include "vgic.h" >>>>> >>>>> #ifdef REQUIRES_VIRT >>>>> __asm__(".arch_extension virt"); >>>>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >>>>> * Ensure a rebooted VM will fault in RAM pages and detect if the >>>>> * guest MMU is turned off and flush the caches as needed. >>>>> */ >>>>> - if (vcpu->arch.has_run_once) >>>>> + if (vcpu->arch.has_run_once) { >>>>> stage2_unmap_vm(vcpu->kvm); >>>>> + vgic_its_free_resource(vcpu->kvm); >>>>> + } >>>>> >>>>> vcpu_reset_hcr(vcpu); >>>>> >>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >>>>> index 25d614f..5c20352 100644 >>>>> --- a/virt/kvm/arm/vgic/vgic-its.c >>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c >>>>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >>>>> .has_attr = vgic_its_has_attr, >>>>> }; >>>>> >>>>> +void vgic_its_free_resource(struct kvm *kvm) >>>>> +{ >>>>> + struct kvm_device *dev, *tmp; >>>>> + >>>>> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >>>>> + if(dev->ops == &kvm_arm_vgic_its_ops) >>>>> + vgic_its_free_list(kvm, dev->private); >>>>> + } >>>>> +} >>>>> + >>>>> int kvm_vgic_register_its_device(void) >>>>> { >>>>> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >>>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >>>>> index c2be5b7..fbcbdfd 100644 >>>>> --- a/virt/kvm/arm/vgic/vgic.h >>>>> +++ b/virt/kvm/arm/vgic/vgic.h >>>>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >>>>> >>>>> bool lock_all_vcpus(struct kvm *kvm); >>>>> void unlock_all_vcpus(struct kvm *kvm); >>>>> +void vgic_its_free_resource(struct kvm *kvm); >>>>> >>>>> #endif >>>>> >>>> >>>> . >>>> >>> >>> >>> > > . > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-12 11:15 ` wanghaibin @ 2017-09-13 8:49 ` Auger Eric 0 siblings, 0 replies; 29+ messages in thread From: Auger Eric @ 2017-09-13 8:49 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara Hi Wanghaibin, On 12/09/2017 13:15, wanghaibin wrote: > On 2017/9/11 2:46, Auger Eric wrote: > >> Hi Wanghaibin, >> >> On 07/09/2017 13:28, Auger Eric wrote: >>> Hi Wanghaibin, >>> >>> On 07/09/2017 03:32, wanghaibin wrote: >>>> On 2017/9/7 0:20, Auger Eric wrote: >>>> >>>>> Hi, >>>>> >>>>> On 06/09/2017 15:05, wanghaibin wrote: >>>>>> This patch fix the migrate save tables failure. >>>>>> >>>>>> When the virtual machine is in booting and the devices haven't initialized, >>>>>> the all virtual dte/ite may be invalid. If migrate at this moment, the save >>>>>> tables interface traversal device list, and check the dte is valid or not. >>>>>> if not, it will return the -EINVAL. >>>>> >>>>> The issue on save is less clear to me. We are not checking the "dte" are >>>>> valid as it is said above. We are scrolling the ITS lists - which may be >>>>> empty - and dump them in guest memory. >>>>> >>>>> On save() there are quite few checks that can cause a failure. >>>>> vgic_its_check_id() can be among them. This typically requires the >>>>> GITS_BASER to have been properly set. Failing on save looks OK to me in >>>>> such situation. >>>>> >>>>> Sorry but I don't get the purpose of this patch. Does it fix a save failure? >>>> >>>> >>>> Yes, for save, vgic_its_check_id() func will check the L1 DTE valid or not through >>>> the code like : >>>> >>>> /* Each 1st level entry is represented by a 64-bit value. */ >>>> if (kvm_read_guest(its->dev->kvm, >>>> BASER_ADDRESS(baser) + index * sizeof(indirect_ptr), >>>> &indirect_ptr, sizeof(indirect_ptr))) >>>> return false; >>>> >>>> indirect_ptr = le64_to_cpu(indirect_ptr); >>>> >>>> /* check the valid bit of the first level entry */ >>>> if (!(indirect_ptr & BIT_ULL(63))) >>>> return false; >>>> >>>> If invalid , the save will return -EINVAL caused by the vgic_its_check_id() with return the false value. >>>> >>>> And form the cover letter, the problem happened when no one pci dev has been probed( guest driver haven't any >>>> mapd, mapti), So the L1 DTEs are all invalid currently. Just like you said, at this moment migrate, we are scrolling >>>> the ITS lists, next time check_id failed and save interface failed. >>>> >>>> I think the final reason is the device list free problem, at the reset/reboot, ITS dev/clo/itt lists are not be free >>>> and set NULL. So that, the save interface failed. >>>> This patch try to free the resource when vm reboot/reset. >>> OK understood. Indeed none of the device/collection lists should be non >>> empty at that stage, ie. when GITS_BASERn have not be written yet and >>> are marked invalid. >>> >>> For solving the specific save() issue here, I think the best is to check >>> the validity bit of the GITS_BASER (col, device) and if invalid do nothing. >> >> Actually the above proposal does not work as GITS_BASERn is not properly >> reset. Maybe the best way is to introduce an ITS KVM device reset IOTCL >> in the control group. Upon this command we could properly reset the >> requested registers and the lists. > > > Yes, It should free these lists when vits reset. > > This patch according the has_run_once and vcpu_init to mark the vcpu reset happened, > and scrolling all kvm devices to find the vits device to free the lists. > I think it's a little odd too. > > If we can add the reset IOCTL, I think it must be the best way. I looked at the qemu reset of the GICv3 and my understanding is the QEMU reset function sends reset values for each individual register. So my understanding is I should align QEMU ITS reset code. Then, we should free internal lists in the relevant register write when detecting the valid bit is not set. > > Thanks. > >> >> Thanks >> >> Eric >>> >>> Then we need to have a more global discussion about whether, when and >>> where the device and collection lists need to be freed. >>> >>> If you want I can respin with above suggestion and add the valid pointer >>> to the entry_fn_t to handle the restore path. Up to you. > > > All along, I want to contribute code to the community, so far, It has not been achieved. > So I would like to collect the solutions for this problem and try to fix it first, can I? OK sure. So I will send a QEMU patch today and I will review your kernel fixes then. Thanks Eric > > Thanks. > >>> >>> Thanks >>> >>> Eric >>> >>> >>>> BTW: these lists will re-bulid when the reboot vm run the probe pci device step. >>> >>>> >>>> Thanks >>>> >>>>> >>>>> Thanks >>>>> >>>>> Eric >>>>> >>>>> >>>>>> >>>>>> This patch try to free the its list resource when vm reboot or reset to avoid this. >>>>>> >>>>>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >>>>>> --- >>>>>> virt/kvm/arm/arm.c | 5 ++++- >>>>>> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >>>>>> virt/kvm/arm/vgic/vgic.h | 1 + >>>>>> 3 files changed, 15 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >>>>>> index a39a1e1..db7632d 100644 >>>>>> --- a/virt/kvm/arm/arm.c >>>>>> +++ b/virt/kvm/arm/arm.c >>>>>> @@ -46,6 +46,7 @@ >>>>>> #include <asm/kvm_coproc.h> >>>>>> #include <asm/kvm_psci.h> >>>>>> #include <asm/sections.h> >>>>>> +#include "vgic.h" >>>>>> >>>>>> #ifdef REQUIRES_VIRT >>>>>> __asm__(".arch_extension virt"); >>>>>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >>>>>> * Ensure a rebooted VM will fault in RAM pages and detect if the >>>>>> * guest MMU is turned off and flush the caches as needed. >>>>>> */ >>>>>> - if (vcpu->arch.has_run_once) >>>>>> + if (vcpu->arch.has_run_once) { >>>>>> stage2_unmap_vm(vcpu->kvm); >>>>>> + vgic_its_free_resource(vcpu->kvm); >>>>>> + } >>>>>> >>>>>> vcpu_reset_hcr(vcpu); >>>>>> >>>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >>>>>> index 25d614f..5c20352 100644 >>>>>> --- a/virt/kvm/arm/vgic/vgic-its.c >>>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c >>>>>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >>>>>> .has_attr = vgic_its_has_attr, >>>>>> }; >>>>>> >>>>>> +void vgic_its_free_resource(struct kvm *kvm) >>>>>> +{ >>>>>> + struct kvm_device *dev, *tmp; >>>>>> + >>>>>> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >>>>>> + if(dev->ops == &kvm_arm_vgic_its_ops) >>>>>> + vgic_its_free_list(kvm, dev->private); >>>>>> + } >>>>>> +} >>>>>> + >>>>>> int kvm_vgic_register_its_device(void) >>>>>> { >>>>>> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >>>>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >>>>>> index c2be5b7..fbcbdfd 100644 >>>>>> --- a/virt/kvm/arm/vgic/vgic.h >>>>>> +++ b/virt/kvm/arm/vgic/vgic.h >>>>>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >>>>>> >>>>>> bool lock_all_vcpus(struct kvm *kvm); >>>>>> void unlock_all_vcpus(struct kvm *kvm); >>>>>> +void vgic_its_free_resource(struct kvm *kvm); >>>>>> >>>>>> #endif >>>>>> >>>>> >>>>> . >>>>> >>>> >>>> >>>> >> >> . >> > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin 2017-09-06 16:20 ` Auger Eric @ 2017-09-13 19:34 ` Christoffer Dall 2017-09-13 21:13 ` Auger Eric 1 sibling, 1 reply; 29+ messages in thread From: Christoffer Dall @ 2017-09-13 19:34 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin Hi Wanghaibin, On Wed, Sep 06, 2017 at 09:05:09PM +0800, wanghaibin wrote: > This patch fix the migrate save tables failure. > > When the virtual machine is in booting and the devices haven't initialized, > the all virtual dte/ite may be invalid. If migrate at this moment, the save > tables interface traversal device list, and check the dte is valid or not. > if not, it will return the -EINVAL. > > This patch try to free the its list resource when vm reboot or reset to avoid this. I think the problem should be described the following way (feel free to use this in a commit message). When rebooting a VM, we currently don't have a way to reset the ITS. This results in the booting a VM with a pre-programmed ITS with existing cached state. This can lead to all sorts of interesting problems. One such problem is that if trying to migrate the VM after rebooting the VM, we try to traverse the device tables in guest memory. When using indirect tables, and the guest has re-initialized the ITS device table base register pointing to cleared memory, this results in trying to access address 0 in the guest physical address space, which in turn causes the ITS saving code to return an error to user space. The correct fix is to introduce a reset function as a device attribute for the ITS. Hope this helps, -Christoffer > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/arm.c | 5 ++++- > virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ > virt/kvm/arm/vgic/vgic.h | 1 + > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c > index a39a1e1..db7632d 100644 > --- a/virt/kvm/arm/arm.c > +++ b/virt/kvm/arm/arm.c > @@ -46,6 +46,7 @@ > #include <asm/kvm_coproc.h> > #include <asm/kvm_psci.h> > #include <asm/sections.h> > +#include "vgic.h" > > #ifdef REQUIRES_VIRT > __asm__(".arch_extension virt"); > @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, > * Ensure a rebooted VM will fault in RAM pages and detect if the > * guest MMU is turned off and flush the caches as needed. > */ > - if (vcpu->arch.has_run_once) > + if (vcpu->arch.has_run_once) { > stage2_unmap_vm(vcpu->kvm); > + vgic_its_free_resource(vcpu->kvm); > + } > > vcpu_reset_hcr(vcpu); > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index 25d614f..5c20352 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, > .has_attr = vgic_its_has_attr, > }; > > +void vgic_its_free_resource(struct kvm *kvm) > +{ > + struct kvm_device *dev, *tmp; > + > + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { > + if(dev->ops == &kvm_arm_vgic_its_ops) > + vgic_its_free_list(kvm, dev->private); > + } > +} > + > int kvm_vgic_register_its_device(void) > { > return kvm_register_device_ops(&kvm_arm_vgic_its_ops, > diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h > index c2be5b7..fbcbdfd 100644 > --- a/virt/kvm/arm/vgic/vgic.h > +++ b/virt/kvm/arm/vgic/vgic.h > @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, > > bool lock_all_vcpus(struct kvm *kvm); > void unlock_all_vcpus(struct kvm *kvm); > +void vgic_its_free_resource(struct kvm *kvm); > > #endif > -- > 1.8.3.1 > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-13 19:34 ` Christoffer Dall @ 2017-09-13 21:13 ` Auger Eric 2017-09-14 5:34 ` Christoffer Dall 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-13 21:13 UTC (permalink / raw) To: Christoffer Dall, wanghaibin Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin Hi Christoffer, On 13/09/2017 21:34, Christoffer Dall wrote: > Hi Wanghaibin, > > On Wed, Sep 06, 2017 at 09:05:09PM +0800, wanghaibin wrote: >> This patch fix the migrate save tables failure. >> >> When the virtual machine is in booting and the devices haven't initialized, >> the all virtual dte/ite may be invalid. If migrate at this moment, the save >> tables interface traversal device list, and check the dte is valid or not. >> if not, it will return the -EINVAL. >> >> This patch try to free the its list resource when vm reboot or reset to avoid this. > > I think the problem should be described the following way (feel free to > use this in a commit message). > > When rebooting a VM, we currently don't have a way to reset the ITS. > This results in the booting a VM with a pre-programmed ITS with existing > cached state. This can lead to all sorts of interesting problems. > > One such problem is that if trying to migrate the VM after rebooting the > VM, we try to traverse the device tables in guest memory. When using > indirect tables, and the guest has re-initialized the ITS device table > base register pointing to cleared memory, this results in trying to > access address 0 in the guest physical address space, which in turn > causes the ITS saving code to return an error to user space. > > The correct fix is to introduce a reset function as a device attribute > for the ITS. ... or reset the list when the userspace writes GITS_BASERn with a valid bit set to 0. For GICv3 we don't have a specific IOCTL for reset. In QEMU there is a reset callback called whenever the guest is rebooted or reset and in the GICv3 callback we perform user space write access for all the relevant registers. I missed that when doing the ITS QEMU integration. Shall we use the same trick as for GICv3 or add another KVM device group/attribute? Thanks Eric > > Hope this helps, > -Christoffer > >> >> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >> --- >> virt/kvm/arm/arm.c | 5 ++++- >> virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++ >> virt/kvm/arm/vgic/vgic.h | 1 + >> 3 files changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c >> index a39a1e1..db7632d 100644 >> --- a/virt/kvm/arm/arm.c >> +++ b/virt/kvm/arm/arm.c >> @@ -46,6 +46,7 @@ >> #include <asm/kvm_coproc.h> >> #include <asm/kvm_psci.h> >> #include <asm/sections.h> >> +#include "vgic.h" >> >> #ifdef REQUIRES_VIRT >> __asm__(".arch_extension virt"); >> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, >> * Ensure a rebooted VM will fault in RAM pages and detect if the >> * guest MMU is turned off and flush the caches as needed. >> */ >> - if (vcpu->arch.has_run_once) >> + if (vcpu->arch.has_run_once) { >> stage2_unmap_vm(vcpu->kvm); >> + vgic_its_free_resource(vcpu->kvm); >> + } >> >> vcpu_reset_hcr(vcpu); >> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >> index 25d614f..5c20352 100644 >> --- a/virt/kvm/arm/vgic/vgic-its.c >> +++ b/virt/kvm/arm/vgic/vgic-its.c >> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev, >> .has_attr = vgic_its_has_attr, >> }; >> >> +void vgic_its_free_resource(struct kvm *kvm) >> +{ >> + struct kvm_device *dev, *tmp; >> + >> + list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) { >> + if(dev->ops == &kvm_arm_vgic_its_ops) >> + vgic_its_free_list(kvm, dev->private); >> + } >> +} >> + >> int kvm_vgic_register_its_device(void) >> { >> return kvm_register_device_ops(&kvm_arm_vgic_its_ops, >> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h >> index c2be5b7..fbcbdfd 100644 >> --- a/virt/kvm/arm/vgic/vgic.h >> +++ b/virt/kvm/arm/vgic/vgic.h >> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write, >> >> bool lock_all_vcpus(struct kvm *kvm); >> void unlock_all_vcpus(struct kvm *kvm); >> +void vgic_its_free_resource(struct kvm *kvm); >> >> #endif >> -- >> 1.8.3.1 >> >> > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset 2017-09-13 21:13 ` Auger Eric @ 2017-09-14 5:34 ` Christoffer Dall 0 siblings, 0 replies; 29+ messages in thread From: Christoffer Dall @ 2017-09-14 5:34 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, kvmarm, wu.wubin, andre.przywara On Wed, Sep 13, 2017 at 11:13:55PM +0200, Auger Eric wrote: > Hi Christoffer, > > On 13/09/2017 21:34, Christoffer Dall wrote: > > Hi Wanghaibin, > > > > On Wed, Sep 06, 2017 at 09:05:09PM +0800, wanghaibin wrote: > >> This patch fix the migrate save tables failure. > >> > >> When the virtual machine is in booting and the devices haven't initialized, > >> the all virtual dte/ite may be invalid. If migrate at this moment, the save > >> tables interface traversal device list, and check the dte is valid or not. > >> if not, it will return the -EINVAL. > >> > >> This patch try to free the its list resource when vm reboot or reset to avoid this. > > > > I think the problem should be described the following way (feel free to > > use this in a commit message). > > > > When rebooting a VM, we currently don't have a way to reset the ITS. > > This results in the booting a VM with a pre-programmed ITS with existing > > cached state. This can lead to all sorts of interesting problems. > > > > One such problem is that if trying to migrate the VM after rebooting the > > VM, we try to traverse the device tables in guest memory. When using > > indirect tables, and the guest has re-initialized the ITS device table > > base register pointing to cleared memory, this results in trying to > > access address 0 in the guest physical address space, which in turn > > causes the ITS saving code to return an error to user space. > > > > The correct fix is to introduce a reset function as a device attribute > > for the ITS. > ... or reset the list when the userspace writes GITS_BASERn with a valid > bit set to 0. But I can't tell if that in any way matches something mandated by the architecture. Would the ITS invalidate its caches because BASERn is written with valid==0 ? To me, it feels like the BASERn trick is trying to guess that we have done a reset and we should probably do something, where what we really want is the system (userspace) to tell us when to reset to a clean state. > > For GICv3 we don't have a specific IOCTL for reset. In QEMU there is a > reset callback called whenever the guest is rebooted or reset and in the > GICv3 callback we perform user space write access for all the relevant > registers. I missed that when doing the ITS QEMU integration. Oh, we let userspace drive reset of the redistributors via setting the mmio registers currently. I sort of think we should perhaps provide a separate attribute for reset as well, but if what we have currently works, why not. > > Shall we use the same trick as for GICv3 or add another KVM device > group/attribute? I think we need a separate attribute, because I don't see that the architecture provides a meaningful way to reset the internal ITS state simply by writing some registers, so we would have to come up with something specific for the user space mmio accessors anyway, and we might as well expose a virtual reset pin to userspace. Thanks, -Christoffer ^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin 2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin @ 2017-09-06 13:05 ` wanghaibin 2017-09-06 15:18 ` Auger Eric ` (2 more replies) 2017-09-20 1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 3 siblings, 3 replies; 29+ messages in thread From: wanghaibin @ 2017-09-06 13:05 UTC (permalink / raw) To: marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin This patch fix the migrate restore tables failure. The same scene, at the destination, the restore tables interface traversal guest memory, and check the dte/ite is valid or not. If all dtes/ites are invalid, we will do try next one, and the last it will take the 1 return value, but currently, it be treated as error. That's not correct. This patch try to fix this problem. Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> --- virt/kvm/arm/vgic/vgic-its.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c index 5c20352..2c69aeb 100644 --- a/virt/kvm/arm/vgic/vgic-its.c +++ b/virt/kvm/arm/vgic/vgic-its.c @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, return PTR_ERR(dev); ret = vgic_its_restore_itt(its, dev); - if (ret) { + if (ret < 0) { vgic_its_free_device(its->dev->kvm, dev); return ret; } @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) vgic_its_restore_dte, NULL); } - if (ret > 0) - ret = -EINVAL; - return ret; } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin @ 2017-09-06 15:18 ` Auger Eric 2017-09-13 20:02 ` Christoffer Dall 2017-09-13 20:04 ` Christoffer Dall 2017-09-14 8:30 ` Auger Eric 2 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-06 15:18 UTC (permalink / raw) To: wanghaibin, marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin Hi Wanghaibin, On 06/09/2017 15:05, wanghaibin wrote: > This patch fix the migrate restore tables failure. > > The same scene, at the destination, the restore tables interface traversal guest > memory, and check the dte/ite is valid or not. > If all dtes/ites are invalid, we will do try next one, and the last it will take > the 1 return value, but currently, it be treated as error. That's not correct. There's indeed a bug here! In case all entries are invalid we shouldn't return an error. One solution could be to relax the error checking in scan_its_table() and do not return 1 when the whole length has been scanned. This would fix your issue. drawback of that change: at the moment we check the consistency of the entry data (next offset field). At the moment if the next_offset points to an entry outside of the table scope we are able to return an error (for top level tables). Otherwise, if we want to keep that check, I think we would need to add a bool *valid parameter to entry_fn_t. in scan_its_table() we would return 1 only if last fn() call returns valid and len <= 0. Thanks Eric > > This patch try to fix this problem. > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/vgic/vgic-its.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index 5c20352..2c69aeb 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, > return PTR_ERR(dev); > > ret = vgic_its_restore_itt(its, dev); > - if (ret) { > + if (ret < 0) { > vgic_its_free_device(its->dev->kvm, dev); > return ret; > } > @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) > vgic_its_restore_dte, NULL); > } > > - if (ret > 0) > - ret = -EINVAL; > - > return ret; > } > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-06 15:18 ` Auger Eric @ 2017-09-13 20:02 ` Christoffer Dall 2017-09-13 21:25 ` Auger Eric 0 siblings, 1 reply; 29+ messages in thread From: Christoffer Dall @ 2017-09-13 20:02 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, kvmarm, wu.wubin, andre.przywara Hi Eric, On Wed, Sep 06, 2017 at 05:18:35PM +0200, Auger Eric wrote: > Hi Wanghaibin, > > On 06/09/2017 15:05, wanghaibin wrote: > > This patch fix the migrate restore tables failure. > > > > The same scene, at the destination, the restore tables interface traversal guest > > memory, and check the dte/ite is valid or not. > > If all dtes/ites are invalid, we will do try next one, and the last it will take > > the 1 return value, but currently, it be treated as error. That's not correct. > There's indeed a bug here! In case all entries are invalid we shouldn't > return an error. > > One solution could be to relax the error checking in scan_its_table() > and do not return 1 when the whole length has been scanned. This would > fix your issue. > > drawback of that change: > at the moment we check the consistency of the entry data (next offset > field). At the moment if the next_offset points to an entry outside of > the table scope we are able to return an error (for top level tables). > > Otherwise, if we want to keep that check, I think we would need to add a > bool *valid parameter to entry_fn_t. in scan_its_table() we would return > 1 only if last fn() call returns valid and len <= 0. > I don't really understand what you're proposing. I think Wanghaibin's patch actually looks correct. Is there any problem with it that you see? Thanks, -Christoffer > > > > This patch try to fix this problem. > > > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > > --- > > virt/kvm/arm/vgic/vgic-its.c | 5 +---- > > 1 file changed, 1 insertion(+), 4 deletions(-) > > > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > > index 5c20352..2c69aeb 100644 > > --- a/virt/kvm/arm/vgic/vgic-its.c > > +++ b/virt/kvm/arm/vgic/vgic-its.c > > @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, > > return PTR_ERR(dev); > > > > ret = vgic_its_restore_itt(its, dev); > > - if (ret) { > > + if (ret < 0) { > > vgic_its_free_device(its->dev->kvm, dev); > > return ret; > > } > > @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) > > vgic_its_restore_dte, NULL); > > } > > > > - if (ret > 0) > > - ret = -EINVAL; > > - > > return ret; > > } > > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-13 20:02 ` Christoffer Dall @ 2017-09-13 21:25 ` Auger Eric 2017-09-14 5:35 ` Christoffer Dall 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-13 21:25 UTC (permalink / raw) To: Christoffer Dall; +Cc: marc.zyngier, kvmarm, wu.wubin, andre.przywara Hi, On 13/09/2017 22:02, Christoffer Dall wrote: > Hi Eric, > > On Wed, Sep 06, 2017 at 05:18:35PM +0200, Auger Eric wrote: >> Hi Wanghaibin, >> >> On 06/09/2017 15:05, wanghaibin wrote: >>> This patch fix the migrate restore tables failure. >>> >>> The same scene, at the destination, the restore tables interface traversal guest >>> memory, and check the dte/ite is valid or not. >>> If all dtes/ites are invalid, we will do try next one, and the last it will take >>> the 1 return value, but currently, it be treated as error. That's not correct. >> There's indeed a bug here! In case all entries are invalid we shouldn't >> return an error. >> >> One solution could be to relax the error checking in scan_its_table() >> and do not return 1 when the whole length has been scanned. This would >> fix your issue. If you do not return 1 in scan_its_table if the whole size has been scanned, you achieve the same thing as in this patch and you simplify the error handling. >> >> drawback of that change: >> at the moment we check the consistency of the entry data (next offset >> field). At the moment if the next_offset points to an entry outside of >> the table scope we are able to return an error (for top level tables). >> >> Otherwise, if we want to keep that check, I think we would need to add a >> bool *valid parameter to entry_fn_t. in scan_its_table() we would return >> 1 only if last fn() call returns valid and len <= 0. >> > > I don't really understand what you're proposing. The above method or Wanghaibin's patch removes a consistency check on the entry next offset field. But maybe this is better to drop it and have a code that gains in readability. Thanks Eric > > I think Wanghaibin's patch actually looks correct. Is there any problem > with it that you see? > > Thanks, > -Christoffer > >>> >>> This patch try to fix this problem. >>> >>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >>> --- >>> virt/kvm/arm/vgic/vgic-its.c | 5 +---- >>> 1 file changed, 1 insertion(+), 4 deletions(-) >>> >>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >>> index 5c20352..2c69aeb 100644 >>> --- a/virt/kvm/arm/vgic/vgic-its.c >>> +++ b/virt/kvm/arm/vgic/vgic-its.c >>> @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, >>> return PTR_ERR(dev); >>> >>> ret = vgic_its_restore_itt(its, dev); >>> - if (ret) { >>> + if (ret < 0) { >>> vgic_its_free_device(its->dev->kvm, dev); >>> return ret; >>> } >>> @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) >>> vgic_its_restore_dte, NULL); >>> } >>> >>> - if (ret > 0) >>> - ret = -EINVAL; >>> - >>> return ret; >>> } >>> >>> ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-13 21:25 ` Auger Eric @ 2017-09-14 5:35 ` Christoffer Dall 0 siblings, 0 replies; 29+ messages in thread From: Christoffer Dall @ 2017-09-14 5:35 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, kvmarm, wu.wubin, andre.przywara On Wed, Sep 13, 2017 at 11:25:01PM +0200, Auger Eric wrote: > Hi, > > On 13/09/2017 22:02, Christoffer Dall wrote: > > Hi Eric, > > > > On Wed, Sep 06, 2017 at 05:18:35PM +0200, Auger Eric wrote: > >> Hi Wanghaibin, > >> > >> On 06/09/2017 15:05, wanghaibin wrote: > >>> This patch fix the migrate restore tables failure. > >>> > >>> The same scene, at the destination, the restore tables interface traversal guest > >>> memory, and check the dte/ite is valid or not. > >>> If all dtes/ites are invalid, we will do try next one, and the last it will take > >>> the 1 return value, but currently, it be treated as error. That's not correct. > >> There's indeed a bug here! In case all entries are invalid we shouldn't > >> return an error. > >> > >> One solution could be to relax the error checking in scan_its_table() > >> and do not return 1 when the whole length has been scanned. This would > >> fix your issue. > If you do not return 1 in scan_its_table if the whole size has been > scanned, you achieve the same thing as in this patch and you simplify > the error handling. Yes, but then you have to rework the scan function otherwise to work with indirect tables, and I'm not sure that becomes more pretty. > >> > >> drawback of that change: > >> at the moment we check the consistency of the entry data (next offset > >> field). At the moment if the next_offset points to an entry outside of > >> the table scope we are able to return an error (for top level tables). > >> > >> Otherwise, if we want to keep that check, I think we would need to add a > >> bool *valid parameter to entry_fn_t. in scan_its_table() we would return > >> 1 only if last fn() call returns valid and len <= 0. > >> > > > > I don't really understand what you're proposing. > The above method or Wanghaibin's patch removes a consistency check on > the entry next offset field. But maybe this is better to drop it and > have a code that gains in readability. > Can you show me an alternative better patch and we can compare? Thanks, -Christoffer ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin 2017-09-06 15:18 ` Auger Eric @ 2017-09-13 20:04 ` Christoffer Dall 2017-09-14 8:30 ` Auger Eric 2 siblings, 0 replies; 29+ messages in thread From: Christoffer Dall @ 2017-09-13 20:04 UTC (permalink / raw) To: wanghaibin; +Cc: marc.zyngier, andre.przywara, kvmarm, wu.wubin Hi Wanghaibin, On Wed, Sep 06, 2017 at 09:05:10PM +0800, wanghaibin wrote: > This patch fix the migrate restore tables failure. > > The same scene, at the destination, the restore tables interface traversal guest > memory, and check the dte/ite is valid or not. > If all dtes/ites are invalid, we will do try next one, and the last it will take > the 1 return value, but currently, it be treated as error. That's not correct. > > This patch try to fix this problem. This commit message needs some work. It could sound something like this: When restoring the ITS, and scanning the ITS tables, we currently treat empty tables as an error and cancel the restore process. The problem is that we don't handle a return value == 1 from scan_its_table as indicating that we have simply scanned an empty table. Hope this helps, -Christoffer > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/vgic/vgic-its.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index 5c20352..2c69aeb 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, > return PTR_ERR(dev); > > ret = vgic_its_restore_itt(its, dev); > - if (ret) { > + if (ret < 0) { > vgic_its_free_device(its->dev->kvm, dev); > return ret; > } > @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) > vgic_its_restore_dte, NULL); > } > > - if (ret > 0) > - ret = -EINVAL; > - > return ret; > } > > -- > 1.8.3.1 > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin 2017-09-06 15:18 ` Auger Eric 2017-09-13 20:04 ` Christoffer Dall @ 2017-09-14 8:30 ` Auger Eric 2017-09-16 2:02 ` wanghaibin 2 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-14 8:30 UTC (permalink / raw) To: wanghaibin, marc.zyngier, cdall, kvmarm, andre.przywara; +Cc: wu.wubin Hi Wanghaibin, On 06/09/2017 15:05, wanghaibin wrote: > This patch fix the migrate restore tables failure. > > The same scene, at the destination, the restore tables interface traversal guest > memory, and check the dte/ite is valid or not. > If all dtes/ites are invalid, we will do try next one, and the last it will take > the 1 return value, but currently, it be treated as error. That's not correct. > > This patch try to fix this problem. > > Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> > --- > virt/kvm/arm/vgic/vgic-its.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c > index 5c20352..2c69aeb 100644 > --- a/virt/kvm/arm/vgic/vgic-its.c > +++ b/virt/kvm/arm/vgic/vgic-its.c > @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, > return PTR_ERR(dev); > > ret = vgic_its_restore_itt(its, dev); > - if (ret) { > + if (ret < 0) { > vgic_its_free_device(its->dev->kvm, dev); > return ret; > } > @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) > vgic_its_restore_dte, NULL); > } > > - if (ret > 0) > - ret = -EINVAL; Reworking error handling in scan_its_table as I previously proposed indeed don't work for L2 tables, as pointed out by Christoffer. As such I eventually think your approach is the best. The only remark I have left is restore table IOCTL is likely to return 1 with your change. 1 is not documented in the API as a returned value. I think we should return 0 in this case. We are lucky because QEMU kvm_device_access does not interpret positive values as errors. Thanks Eric > - > return ret; > } > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore 2017-09-14 8:30 ` Auger Eric @ 2017-09-16 2:02 ` wanghaibin 0 siblings, 0 replies; 29+ messages in thread From: wanghaibin @ 2017-09-16 2:02 UTC (permalink / raw) To: Auger Eric; +Cc: marc.zyngier, cdall, kvmarm, wu.wubin, andre.przywara On 2017/9/14 16:30, Auger Eric wrote: > Hi Wanghaibin, > > On 06/09/2017 15:05, wanghaibin wrote: >> This patch fix the migrate restore tables failure. >> >> The same scene, at the destination, the restore tables interface traversal guest >> memory, and check the dte/ite is valid or not. >> If all dtes/ites are invalid, we will do try next one, and the last it will take >> the 1 return value, but currently, it be treated as error. That's not correct. >> >> This patch try to fix this problem. >> >> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> >> --- >> virt/kvm/arm/vgic/vgic-its.c | 5 +---- >> 1 file changed, 1 insertion(+), 4 deletions(-) >> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c >> index 5c20352..2c69aeb 100644 >> --- a/virt/kvm/arm/vgic/vgic-its.c >> +++ b/virt/kvm/arm/vgic/vgic-its.c >> @@ -2025,7 +2025,7 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, >> return PTR_ERR(dev); >> >> ret = vgic_its_restore_itt(its, dev); >> - if (ret) { >> + if (ret < 0) { >> vgic_its_free_device(its->dev->kvm, dev); >> return ret; >> } >> @@ -2147,9 +2147,6 @@ static int vgic_its_restore_device_tables(struct vgic_its *its) >> vgic_its_restore_dte, NULL); >> } >> >> - if (ret > 0) >> - ret = -EINVAL; > Reworking error handling in scan_its_table as I previously proposed > indeed don't work for L2 tables, as pointed out by Christoffer. As such > I eventually think your approach is the best. > > The only remark I have left is restore table IOCTL is likely to return 1 > with your change. 1 is not documented in the API as a returned value. I > think we should return 0 in this case. We are lucky because QEMU > kvm_device_access does not interpret positive values as errors. Will fix it. Thanks. > > Thanks > > Eric > >> - >> return ret; >> } >> >> > > . > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/3] fix migrate failed when vm is in booting 2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin ` (2 preceding siblings ...) 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin @ 2017-09-20 1:57 ` wanghaibin 2017-09-20 7:16 ` Auger Eric 3 siblings, 1 reply; 29+ messages in thread From: wanghaibin @ 2017-09-20 1:57 UTC (permalink / raw) To: wanghaibin Cc: cdall, Huangweidong (C), marc.zyngier, andre.przywara, kvmarm, wu.wubin On 2017/9/6 21:05, wanghaibin wrote: > We have a test scenario: vmlife and migrate fixed test. > > Here is a problem; VM migration failed caused the qemu core which gdb trace: I tested migration with these patches in these days, and here is another scene can expose another problem. scene: migrate at the moment which vm boot in UEFI (EDK2, before the guest kernel boot) while vm reboot. core file(at destination) gdb trace: (gdb) bt #0 0x0000ffff978dae84 in raise () from /usr/lib64/libc.so.6 #1 0x0000ffff978dcb80 in abort () from /usr/lib64/libc.so.6 #2 0x0000000000465468 in kvm_device_access () #3 0x00000000004a2d68 in kvm_arm_its_post_load () #4 0x0000000000629038 in gicv3_its_post_load () #5 0x00000000006c042c in vmstate_load_state () #6 0x000000000047acf8 in qemu_loadvm_section_start_full () #7 0x000000000047b234 in qemu_loadvm_state_main () #8 0x000000000047cdb8 in qemu_loadvm_state () #9 0x00000000006bf3cc in process_incoming_migration_co () #10 0x00000000007b40f0 in coroutine_trampoline () #11 0x0000ffff978ebfa0 in ?? () from /usr/lib64/libc.so.6 The reason of the bug is vITS device doesn't reset it's registers value correctly, in qemu vits device reset logic, the iidr is set to 0, and the kvm_arm_its_post_load check the iidr is 0 and return (without put any registers). Here maybe a problem when migrate at that moment (vm boot in UEFI), the restore interface can check the corresponding BASERn registers valid successfully, and flush info from guest memory to kvm will failed in future (for example, the BASER[CT].valid=1, but the CTE maybe invalid). So, we should fix the vITS device reset interface, and there may be a rule that needs clarification: In arm-vgic-its.txt, it has been described about ITS restore sequence: d) restore the ITS in the following order: 1. Restore GITS_CBASER 2. Restore all other GITS_ registers, except GITS_CTLR! 3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES) 4. Restore GITS_CTLR But this sequence is inappropriate for reset interface based on our currently implementation , for example, reset interface write creadr/baser with the initial value must rely on its disable firstly. So, minimal change (fix in qemu) maybe like: diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c index 39903d5..9602dd3 100644 --- a/hw/intc/arm_gicv3_its_kvm.c +++ b/hw/intc/arm_gicv3_its_kvm.c @@ -160,7 +160,8 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) int i; if (!s->iidr) { - return; + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, + GITS_CTLR, &s->ctlr, true, &error_abort); } kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, @@ -190,8 +191,10 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) KVM_DEV_ARM_ITS_RESTORE_TABLES, NULL, true, &error_abort); - kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, - GITS_CTLR, &s->ctlr, true, &error_abort); + if (s->iidr) { + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, + GITS_CTLR, &s->ctlr, true, &error_abort); + } } Any suggestion ? Thanks. > > #0 0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6 > #1 0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6 > #2 0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>) > at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064 > #3 0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728 > #4 0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744 > #5 0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434 > #6 0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442 > #7 0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4, > current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f, > start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798 > #8 0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995 > #9 0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0 > #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6 > > qemu failed log : > 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument > > I try to debug it, and I found the migrate at the moment that the vm is still booting. > > So just change the guest like : > diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c > index e9a53ba..d73fc03 100644 > --- a/drivers/pci/host/pci-host-common.c > +++ b/drivers/pci/host/pci-host-common.c > @@ -23,6 +23,7 @@ > #include <linux/of_pci.h> > #include <linux/pci-ecam.h> > #include <linux/platform_device.h> > +#include <linux/delay.h> > > static int gen_pci_parse_request_of_pci_ranges(struct device *dev, > struct list_head *resources, struct resource **bus_range) > @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev, > struct pci_bus *bus, *child; > struct pci_config_window *cfg; > struct list_head resources; > + int i; > > type = of_get_property(np, "device_type", NULL); > if (!type || strcmp(type, "pci")) { > @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev, > pcie_bus_configure_settings(child); > } > > + for (i=0;i<20000; i++) > + udelay(1000); > pci_bus_add_devices(bus); > return 0; > } > > And migrate at this delay time, It must be failed. > > This patchset try to fix this problem. > > BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil. > > Thanks > > > > wanghaibin (3): > kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function > kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset > kvm: arm/arm64: vgic-its: fix return value for restore > > virt/kvm/arm/arm.c | 5 ++++- > virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++------- > virt/kvm/arm/vgic/vgic.h | 1 + > 3 files changed, 24 insertions(+), 8 deletions(-) > _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/3] fix migrate failed when vm is in booting 2017-09-20 1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin @ 2017-09-20 7:16 ` Auger Eric 2017-09-21 12:17 ` wanghaibin 0 siblings, 1 reply; 29+ messages in thread From: Auger Eric @ 2017-09-20 7:16 UTC (permalink / raw) To: wanghaibin Cc: cdall, Huangweidong (C), marc.zyngier, andre.przywara, kvmarm, wu.wubin Hi Wanghaibin, On 20/09/2017 03:57, wanghaibin wrote: > On 2017/9/6 21:05, wanghaibin wrote: > >> We have a test scenario: vmlife and migrate fixed test. >> >> Here is a problem; VM migration failed caused the qemu core which gdb trace: > > > I tested migration with these patches in these days, and here is another scene can expose another problem. > > scene: migrate at the moment which vm boot in UEFI (EDK2, before the guest kernel boot) while vm reboot. > core file(at destination) gdb trace: > (gdb) bt > #0 0x0000ffff978dae84 in raise () from /usr/lib64/libc.so.6 > #1 0x0000ffff978dcb80 in abort () from /usr/lib64/libc.so.6 > #2 0x0000000000465468 in kvm_device_access () > #3 0x00000000004a2d68 in kvm_arm_its_post_load () > #4 0x0000000000629038 in gicv3_its_post_load () > #5 0x00000000006c042c in vmstate_load_state () > #6 0x000000000047acf8 in qemu_loadvm_section_start_full () > #7 0x000000000047b234 in qemu_loadvm_state_main () > #8 0x000000000047cdb8 in qemu_loadvm_state () > #9 0x00000000006bf3cc in process_incoming_migration_co () > #10 0x00000000007b40f0 in coroutine_trampoline () > #11 0x0000ffff978ebfa0 in ?? () from /usr/lib64/libc.so.6 > > The reason of the bug is vITS device doesn't reset it's registers value correctly, in qemu vits device reset logic, the iidr is set to 0, > and the kvm_arm_its_post_load check the iidr is 0 and return (without put any registers). That's correct this is what currently prevents from writing the registers on reset. My initial goal was that this code would be executed only on restore() and not on reset(), hence this check. > > Here maybe a problem when migrate at that moment (vm boot in UEFI), the restore interface can check the corresponding BASERn registers > valid successfully, and flush info from guest memory to kvm will failed in future (for example, the BASER[CT].valid=1, but the CTE maybe invalid). > > So, we should fix the vITS device reset interface, and there may be a rule that needs clarification: > In arm-vgic-its.txt, it has been described about ITS restore sequence: > d) restore the ITS in the following order: > 1. Restore GITS_CBASER > 2. Restore all other GITS_ registers, except GITS_CTLR! > 3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES) > 4. Restore GITS_CTLR > > But this sequence is inappropriate for reset interface based on our currently implementation , for example, reset interface write creadr/baser with > the initial value must rely on its disable firstly. After the discussion with Christoffer, I think the preferred solution is to introduce a new KVM ITS device reset IOCTL and not write individual registers as this sequence would do. Anyway as you point ed out, this sequence does not work as is. Typically writing CREADR/CWRITER with 0 fails at the moment because CBASER == 0. So may I suggest I follow up with an implementation of the reset IOCTL + some cleanup patchs? Please an you respin the restore fix according to what we discussed? Thanks Eric > > So, minimal change (fix in qemu) maybe like: > > diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c > index 39903d5..9602dd3 100644 > --- a/hw/intc/arm_gicv3_its_kvm.c > +++ b/hw/intc/arm_gicv3_its_kvm.c > @@ -160,7 +160,8 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) > int i; > > if (!s->iidr) { > - return; > + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, > + GITS_CTLR, &s->ctlr, true, &error_abort); > } > > kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, > @@ -190,8 +191,10 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) > KVM_DEV_ARM_ITS_RESTORE_TABLES, NULL, true, > &error_abort); > > - kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, > - GITS_CTLR, &s->ctlr, true, &error_abort); > + if (s->iidr) { > + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, > + GITS_CTLR, &s->ctlr, true, &error_abort); > + } > } > > Any suggestion ? > > Thanks. > >> >> #0 0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6 >> #1 0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6 >> #2 0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>) >> at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064 >> #3 0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728 >> #4 0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744 >> #5 0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434 >> #6 0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442 >> #7 0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4, >> current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f, >> start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798 >> #8 0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995 >> #9 0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0 >> #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6 >> >> qemu failed log : >> 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument >> >> I try to debug it, and I found the migrate at the moment that the vm is still booting. >> >> So just change the guest like : >> diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c >> index e9a53ba..d73fc03 100644 >> --- a/drivers/pci/host/pci-host-common.c >> +++ b/drivers/pci/host/pci-host-common.c >> @@ -23,6 +23,7 @@ >> #include <linux/of_pci.h> >> #include <linux/pci-ecam.h> >> #include <linux/platform_device.h> >> +#include <linux/delay.h> >> >> static int gen_pci_parse_request_of_pci_ranges(struct device *dev, >> struct list_head *resources, struct resource **bus_range) >> @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev, >> struct pci_bus *bus, *child; >> struct pci_config_window *cfg; >> struct list_head resources; >> + int i; >> >> type = of_get_property(np, "device_type", NULL); >> if (!type || strcmp(type, "pci")) { >> @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev, >> pcie_bus_configure_settings(child); >> } >> >> + for (i=0;i<20000; i++) >> + udelay(1000); >> pci_bus_add_devices(bus); >> return 0; >> } >> >> And migrate at this delay time, It must be failed. >> >> This patchset try to fix this problem. >> >> BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil. >> >> Thanks >> >> >> >> wanghaibin (3): >> kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function >> kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset >> kvm: arm/arm64: vgic-its: fix return value for restore >> >> virt/kvm/arm/arm.c | 5 ++++- >> virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++------- >> virt/kvm/arm/vgic/vgic.h | 1 + >> 3 files changed, 24 insertions(+), 8 deletions(-) >> > > > > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm > _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/3] fix migrate failed when vm is in booting 2017-09-20 7:16 ` Auger Eric @ 2017-09-21 12:17 ` wanghaibin 0 siblings, 0 replies; 29+ messages in thread From: wanghaibin @ 2017-09-21 12:17 UTC (permalink / raw) To: Auger Eric Cc: cdall, Huangweidong (C), marc.zyngier, andre.przywara, kvmarm, wu.wubin On 2017/9/20 15:16, Auger Eric wrote: > Hi Wanghaibin, > > On 20/09/2017 03:57, wanghaibin wrote: >> On 2017/9/6 21:05, wanghaibin wrote: >> >>> We have a test scenario: vmlife and migrate fixed test. >>> >>> Here is a problem; VM migration failed caused the qemu core which gdb trace: >> >> >> I tested migration with these patches in these days, and here is another scene can expose another problem. >> >> scene: migrate at the moment which vm boot in UEFI (EDK2, before the guest kernel boot) while vm reboot. >> core file(at destination) gdb trace: >> (gdb) bt >> #0 0x0000ffff978dae84 in raise () from /usr/lib64/libc.so.6 >> #1 0x0000ffff978dcb80 in abort () from /usr/lib64/libc.so.6 >> #2 0x0000000000465468 in kvm_device_access () >> #3 0x00000000004a2d68 in kvm_arm_its_post_load () >> #4 0x0000000000629038 in gicv3_its_post_load () >> #5 0x00000000006c042c in vmstate_load_state () >> #6 0x000000000047acf8 in qemu_loadvm_section_start_full () >> #7 0x000000000047b234 in qemu_loadvm_state_main () >> #8 0x000000000047cdb8 in qemu_loadvm_state () >> #9 0x00000000006bf3cc in process_incoming_migration_co () >> #10 0x00000000007b40f0 in coroutine_trampoline () >> #11 0x0000ffff978ebfa0 in ?? () from /usr/lib64/libc.so.6 >> >> The reason of the bug is vITS device doesn't reset it's registers value correctly, in qemu vits device reset logic, the iidr is set to 0, >> and the kvm_arm_its_post_load check the iidr is 0 and return (without put any registers). > > That's correct this is what currently prevents from writing the > registers on reset. My initial goal was that this code would be executed > only on restore() and not on reset(), hence this check. > >> >> Here maybe a problem when migrate at that moment (vm boot in UEFI), the restore interface can check the corresponding BASERn registers >> valid successfully, and flush info from guest memory to kvm will failed in future (for example, the BASER[CT].valid=1, but the CTE maybe invalid). >> >> So, we should fix the vITS device reset interface, and there may be a rule that needs clarification: >> In arm-vgic-its.txt, it has been described about ITS restore sequence: >> d) restore the ITS in the following order: >> 1. Restore GITS_CBASER >> 2. Restore all other GITS_ registers, except GITS_CTLR! >> 3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES) >> 4. Restore GITS_CTLR >> >> But this sequence is inappropriate for reset interface based on our currently implementation , for example, reset interface write creadr/baser with >> the initial value must rely on its disable firstly. > > After the discussion with Christoffer, I think the preferred solution is > to introduce a new KVM ITS device reset IOCTL and not write individual > registers as this sequence would do. Anyway as you point ed out, this > sequence does not work as is. Typically writing CREADR/CWRITER with 0 > fails at the moment because CBASER == 0. > > So may I suggest I follow up with an implementation of the reset IOCTL + > some cleanup patchs? Please an you respin the restore fix according to > what we discussed? Sure, Thanks. > > Thanks > > Eric >> >> So, minimal change (fix in qemu) maybe like: >> >> diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c >> index 39903d5..9602dd3 100644 >> --- a/hw/intc/arm_gicv3_its_kvm.c >> +++ b/hw/intc/arm_gicv3_its_kvm.c >> @@ -160,7 +160,8 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) >> int i; >> >> if (!s->iidr) { >> - return; >> + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, >> + GITS_CTLR, &s->ctlr, true, &error_abort); >> } >> >> kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, >> @@ -190,8 +191,10 @@ static void kvm_arm_its_post_load(GICv3ITSState *s) >> KVM_DEV_ARM_ITS_RESTORE_TABLES, NULL, true, >> &error_abort); >> >> - kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, >> - GITS_CTLR, &s->ctlr, true, &error_abort); >> + if (s->iidr) { >> + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS, >> + GITS_CTLR, &s->ctlr, true, &error_abort); >> + } >> } >> >> Any suggestion ? >> >> Thanks. >> >>> >>> #0 0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6 >>> #1 0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6 >>> #2 0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>) >>> at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064 >>> #3 0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728 >>> #4 0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744 >>> #5 0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434 >>> #6 0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442 >>> #7 0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4, >>> current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f, >>> start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798 >>> #8 0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995 >>> #9 0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0 >>> #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6 >>> >>> qemu failed log : >>> 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument >>> >>> I try to debug it, and I found the migrate at the moment that the vm is still booting. >>> >>> So just change the guest like : >>> diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c >>> index e9a53ba..d73fc03 100644 >>> --- a/drivers/pci/host/pci-host-common.c >>> +++ b/drivers/pci/host/pci-host-common.c >>> @@ -23,6 +23,7 @@ >>> #include <linux/of_pci.h> >>> #include <linux/pci-ecam.h> >>> #include <linux/platform_device.h> >>> +#include <linux/delay.h> >>> >>> static int gen_pci_parse_request_of_pci_ranges(struct device *dev, >>> struct list_head *resources, struct resource **bus_range) >>> @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev, >>> struct pci_bus *bus, *child; >>> struct pci_config_window *cfg; >>> struct list_head resources; >>> + int i; >>> >>> type = of_get_property(np, "device_type", NULL); >>> if (!type || strcmp(type, "pci")) { >>> @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev, >>> pcie_bus_configure_settings(child); >>> } >>> >>> + for (i=0;i<20000; i++) >>> + udelay(1000); >>> pci_bus_add_devices(bus); >>> return 0; >>> } >>> >>> And migrate at this delay time, It must be failed. >>> >>> This patchset try to fix this problem. >>> >>> BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil. >>> >>> Thanks >>> >>> >>> >>> wanghaibin (3): >>> kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function >>> kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset >>> kvm: arm/arm64: vgic-its: fix return value for restore >>> >>> virt/kvm/arm/arm.c | 5 ++++- >>> virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++------- >>> virt/kvm/arm/vgic/vgic.h | 1 + >>> 3 files changed, 24 insertions(+), 8 deletions(-) >>> >> >> >> >> _______________________________________________ >> kvmarm mailing list >> kvmarm@lists.cs.columbia.edu >> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm >> > > . > _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2017-09-21 12:20 UTC | newest] Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin 2017-09-12 8:50 ` wanghaibin 2017-09-12 10:08 ` Auger Eric 2017-09-13 19:13 ` Christoffer Dall 2017-09-13 19:14 ` Christoffer Dall 2017-09-16 1:59 ` wanghaibin 2017-09-16 22:17 ` Christoffer Dall 2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin 2017-09-06 16:20 ` Auger Eric 2017-09-07 1:32 ` wanghaibin 2017-09-07 11:28 ` Auger Eric 2017-09-10 18:46 ` Auger Eric 2017-09-12 11:15 ` wanghaibin 2017-09-13 8:49 ` Auger Eric 2017-09-13 19:34 ` Christoffer Dall 2017-09-13 21:13 ` Auger Eric 2017-09-14 5:34 ` Christoffer Dall 2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin 2017-09-06 15:18 ` Auger Eric 2017-09-13 20:02 ` Christoffer Dall 2017-09-13 21:25 ` Auger Eric 2017-09-14 5:35 ` Christoffer Dall 2017-09-13 20:04 ` Christoffer Dall 2017-09-14 8:30 ` Auger Eric 2017-09-16 2:02 ` wanghaibin 2017-09-20 1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin 2017-09-20 7:16 ` Auger Eric 2017-09-21 12:17 ` wanghaibin
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.