* [PATCH 0/2] page hinting add passthrough support
@ 2020-01-07 14:46 weiqi
  2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: weiqi @ 2020-01-07 14:46 UTC (permalink / raw)
  To: alexander.h.duyck, alex.williamson
  Cc: kvm, linux-kernel, pbonzini, x86, wei qi

From: wei qi <weiqi4@huawei.com>

I have implemented dynamic updating of the IOMMU table so that page
hinting also works with device pass-through. It seems to work fine.

Test:
start a 4G VM with 2M hugetlb and ixgbevf passthrough,
  GuestOS: linux-5.2.6 + (mm / virtio: Provide support for free page reporting)
  HostOS: 5.5-rc4
  Host: Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz

After enabling page hinting, pages that are free in the guest OS can be
freed on the host.

Before:
 # cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages
 5620
 5620

After starting the VM:
 # numastat -c qemu

 Per-node process memory usage (in MBs)
 PID              Node 0 Node 1 Total
 ---------------  ------ ------ -----
 24463 (qemu_hotr      6      6    12
 24479 (qemu_tls_      0      8     8
 70718 (qemu-syst     58    539   597
 ---------------  ------ ------ -----
 Total                64    553   616

 # cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages
 5595
 5366

The modification to QEMU:

 +int kvm_discard_range(struct kvm_discard_msg discard_msg)
 +{
 +    return kvm_vm_ioctl(kvm_state, KVM_DISCARD_RANGE, &discard_msg);
 +}

 static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
 {
     ..................
 +        discard_msg.in_addr = elem->in_addr[i];
 +        discard_msg.iov_len = elem->in_sg[i].iov_len;

          ram_block_discard_range(rb, ram_offset, size);
 +        kvm_discard_range(discard_msg);

I then tested network bandwidth further; performance seems OK.

Is there any hidden problem in this implementation?
And is there a plan to support pass-through?

wei qi (2):
  vfio: add mmap/munmap API for page hinting
  KVM: add support for page hinting

 arch/x86/kvm/mmu/mmu.c          |  79 ++++++++++++++++++++
 arch/x86/kvm/x86.c              |  96 ++++++++++++++++++++++++
 drivers/vfio/vfio.c             | 109 ++++++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_type1.c | 157 +++++++++++++++++++++++++++++++++++++++-
 include/linux/kvm_host.h        |  41 +++++++++++
 include/linux/vfio.h            |  17 ++++-
 include/uapi/linux/kvm.h        |   7 ++
 virt/kvm/vfio.c                 |  11 ---
 8 files changed, 503 insertions(+), 14 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH 1/2] vfio: add mmap/munmap API for page hinting
  2020-01-07 14:46 [PATCH 0/2] page hinting add passthrough support weiqi
@ 2020-01-07 14:46 ` weiqi
  2020-01-07 15:22   ` Alex Williamson
                     ` (2 more replies)
  2020-01-07 14:46 ` [PATCH 2/2] KVM: add support for page hinting weiqi
  2020-01-07 16:37 ` [PATCH 0/2] page hinting add passthrough support Alexander Duyck
  2 siblings, 3 replies; 8+ messages in thread
From: weiqi @ 2020-01-07 14:46 UTC (permalink / raw)
  To: alexander.h.duyck, alex.williamson
  Cc: kvm, linux-kernel, pbonzini, x86, wei qi

From: wei qi <weiqi4@huawei.com>

add mmap/munmap API for page hinting.

Signed-off-by: wei qi <weiqi4@huawei.com>
---
 drivers/vfio/vfio.c             | 109 ++++++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_type1.c | 157 +++++++++++++++++++++++++++++++++++++++-
 include/linux/vfio.h            |  17 ++++-
 3 files changed, 280 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index c848262..c7e9103 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1866,6 +1866,115 @@ int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
 }
 EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
 
+int vfio_mmap_pages(struct device *dev, unsigned long user_pfn,
+			unsigned long page_size, int prot,
+			unsigned long pfn)
+{
+	struct vfio_container *container;
+	struct vfio_group *group;
+	struct vfio_iommu_driver *driver;
+	int ret;
+
+	if (!dev || !user_pfn || !page_size)
+		return -EINVAL;
+
+	group = vfio_group_get_from_dev(dev);
+	if (!group)
+		return -ENODEV;
+
+	ret = vfio_group_add_container_user(group);
+	if (ret)
+		goto err_pin_pages;
+
+	container = group->container;
+	driver = container->iommu_driver;
+	if (likely(driver && driver->ops->mmap_pages))
+		ret = driver->ops->mmap_pages(container->iommu_data, user_pfn,
+					page_size, prot, pfn);
+	else
+		ret = -ENOTTY;
+
+	vfio_group_try_dissolve_container(group);
+
+err_pin_pages:
+	vfio_group_put(group);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_mmap_pages);
+
+int vfio_munmap_pages(struct device *dev, unsigned long user_pfn,
+			unsigned long page_size)
+{
+	struct vfio_container *container;
+	struct vfio_group *group;
+	struct vfio_iommu_driver *driver;
+	int ret;
+
+	if (!dev || !user_pfn || !page_size)
+		return -EINVAL;
+
+	group = vfio_group_get_from_dev(dev);
+	if (!group)
+		return -ENODEV;
+
+	ret = vfio_group_add_container_user(group);
+	if (ret)
+		goto err_pin_pages;
+
+	container = group->container;
+	driver = container->iommu_driver;
+	if (likely(driver && driver->ops->munmap_pages))
+		ret = driver->ops->munmap_pages(container->iommu_data, user_pfn,
+					page_size);
+	else
+		ret = -ENOTTY;
+
+	vfio_group_try_dissolve_container(group);
+
+err_pin_pages:
+	vfio_group_put(group);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vfio_munmap_pages);
+
+int vfio_dma_find(struct device *dev, unsigned long user_pfn, int npage,
+		unsigned long *phys_pfn)
+{
+	struct vfio_container *container;
+	struct vfio_group *group;
+	struct vfio_iommu_driver *driver;
+	int ret;
+
+	if (!dev || !user_pfn || !npage || !phys_pfn)
+		return -EINVAL;
+
+	if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
+		return -E2BIG;
+
+	group = vfio_group_get_from_dev(dev);
+	if (!group)
+		return -ENODEV;
+
+	ret = vfio_group_add_container_user(group);
+	if (ret)
+		goto err_pin_pages;
+
+	container = group->container;
+	driver = container->iommu_driver;
+	if (driver && driver->ops->dma_find)
+		ret = driver->ops->dma_find(container->iommu_data, user_pfn,
+					npage, phys_pfn);
+	else
+		ret = -ENOTTY;
+
+	vfio_group_try_dissolve_container(group);
+
+err_pin_pages:
+	vfio_group_put(group);
+	return ret;
+}
+EXPORT_SYMBOL(vfio_dma_find);
+
 /*
  * Pin a set of guest PFNs and return their associated host PFNs for local
  * domain only.
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2ada8e6..df115dc 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -414,7 +414,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 		goto out;
 
 	/* Lock all the consecutive pages from pfn_base */
-	for (vaddr += PAGE_SIZE, iova += PAGE_SIZE; pinned < npage;
+	for (vaddr += PAGE_SIZE, iova += PAGE_SIZE; (pinned < npage && pinned < 512);
 	     pinned++, vaddr += PAGE_SIZE, iova += PAGE_SIZE) {
 		ret = vaddr_get_pfn(current->mm, vaddr, dma->prot, &pfn);
 		if (ret)
@@ -768,7 +768,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 		phys_addr_t phys, next;
 
 		phys = iommu_iova_to_phys(domain->domain, iova);
-		if (WARN_ON(!phys)) {
+		if (!phys) {
 			iova += PAGE_SIZE;
 			continue;
 		}
@@ -1154,6 +1154,156 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	return ret;
 }
 
+static int vfio_iommu_type1_munmap_pages(void *iommu_data,
+				unsigned long user_pfn,
+				unsigned long page_size)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	struct vfio_domain *domain;
+	struct vfio_dma *dma;
+	dma_addr_t iova = user_pfn << PAGE_SHIFT;
+	int ret = 0;
+	phys_addr_t phys;
+	size_t unmapped;
+	long unlocked = 0;
+
+	if (!iommu || !user_pfn || !page_size)
+		return -EINVAL;
+
+	/* Supported for v2 version only */
+	if (!iommu->v2)
+		return -EACCES;
+
+	mutex_lock(&iommu->lock);
+	dma = vfio_find_dma(iommu, iova, page_size);
+	if (!dma) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	domain = list_first_entry(&iommu->domain_list,
+				struct vfio_domain, next);
+	phys = iommu_iova_to_phys(domain->domain, iova);
+	if (!phys) {
+		goto out_unlock;
+	} else {
+		unmapped = iommu_unmap(domain->domain, iova, page_size);
+		unlocked = vfio_unpin_pages_remote(dma, iova,
+					phys >> PAGE_SHIFT,
+					unmapped >> PAGE_SHIFT, true);
+	}
+
+out_unlock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
+static int vfio_iommu_type1_mmap_pages(void *iommu_data,
+				unsigned long user_pfn,
+				unsigned long page_size, int prot,
+				unsigned long pfn)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	struct vfio_domain *domain;
+	struct vfio_dma *dma;
+	dma_addr_t iova = user_pfn << PAGE_SHIFT;
+	int ret = 0;
+	size_t unmapped;
+	phys_addr_t phys;
+	long unlocked = 0;
+
+	if (!iommu || !user_pfn || !page_size || !pfn)
+		return -EINVAL;
+
+	/* Supported for v2 version only */
+	if (!iommu->v2)
+		return -EACCES;
+
+	mutex_lock(&iommu->lock);
+
+	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+		ret = -EACCES;
+		goto out_unlock;
+	}
+
+	dma = vfio_find_dma(iommu, iova, page_size);
+	if (!dma) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	domain = list_first_entry(&iommu->domain_list,
+				struct vfio_domain, next);
+
+	phys = iommu_iova_to_phys(domain->domain, iova);
+	if (phys) {
+		unmapped = iommu_unmap(domain->domain, iova, page_size);
+		unlocked = vfio_unpin_pages_remote(dma, iova,
+					phys >> PAGE_SHIFT,
+					unmapped >> PAGE_SHIFT, false);
+	}
+
+	ret = vfio_iommu_map(iommu, iova, pfn, page_size >> PAGE_SHIFT, prot);
+	if (ret) {
+		pr_warn("%s: gfn: %lx, pfn: %lx, npages:%lu\n", __func__,
+			user_pfn, pfn, page_size >> PAGE_SHIFT);
+	}
+
+out_unlock:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
+u64 vfio_iommu_iova_to_phys(struct vfio_iommu *iommu, dma_addr_t iova)
+{
+	struct vfio_domain *d;
+	u64 phys;
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		phys = iommu_iova_to_phys(d->domain, iova);
+		if (phys)
+			return phys;
+	}
+	return 0;
+}
+
+static int vfio_iommu_type1_dma_find(void *iommu_data,
+				unsigned long user_pfn,
+				int npage, unsigned long *phys_pfn)
+{
+	struct vfio_iommu *iommu = iommu_data;
+	int i = 0;
+	struct vfio_dma *dma;
+	u64 phys;
+	dma_addr_t iova;
+
+	if (!iommu || !user_pfn)
+		return -EINVAL;
+
+	/* Supported for v2 version only */
+	if (!iommu->v2)
+		return -EACCES;
+
+	mutex_lock(&iommu->lock);
+
+	iova = user_pfn << PAGE_SHIFT;
+	dma = vfio_find_dma(iommu, iova, PAGE_SIZE);
+	if (!dma)
+		goto unpin_exit;
+
+	if (((user_pfn + npage) << PAGE_SHIFT) <= (dma->iova + dma->size))
+		i = npage;
+	else
+		goto unpin_exit;
+
+	phys = vfio_iommu_iova_to_phys(iommu, iova);
+	*phys_pfn = phys >> PAGE_SHIFT;
+
+unpin_exit:
+	mutex_unlock(&iommu->lock);
+	return i;
+}
+
 static int vfio_bus_type(struct device *dev, void *data)
 {
 	struct bus_type **bus = data;
@@ -2336,6 +2486,9 @@ static int vfio_iommu_type1_unregister_notifier(void *iommu_data,
 	.detach_group = vfio_iommu_type1_detach_group,
 	.pin_pages = vfio_iommu_type1_pin_pages,
 	.unpin_pages = vfio_iommu_type1_unpin_pages,
+	.mmap_pages = vfio_iommu_type1_mmap_pages,
+	.munmap_pages = vfio_iommu_type1_munmap_pages,
+	.dma_find = vfio_iommu_type1_dma_find,
 	.register_notifier = vfio_iommu_type1_register_notifier,
 	.unregister_notifier = vfio_iommu_type1_unregister_notifier,
 };
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e42a711..d7df495 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -77,6 +77,15 @@ struct vfio_iommu_driver_ops {
 				     unsigned long *phys_pfn);
 	int (*unpin_pages)(void *iommu_data,
 			   unsigned long *user_pfn, int npage);
+	int (*mmap_pages)(void *iommu_data,
+			  unsigned long user_pfn,
+			  unsigned long page_size,
+			  int prot, unsigned long pfn);
+	int (*munmap_pages)(void *iommu_data,
+			    unsigned long user_pfn,
+			    unsigned long page_size);
+	int (*dma_find)(void *iommu_data, unsigned long user_pfn,
+			int npage, unsigned long *phys_pfn);
 	int (*register_notifier)(void *iommu_data,
 				 unsigned long *events,
 				 struct notifier_block *nb);
@@ -106,7 +115,13 @@ extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
 			  int npage, int prot, unsigned long *phys_pfn);
 extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
 			    int npage);
-
+extern int vfio_dma_find(struct device *dev, unsigned long user_pfn, int npage,
+			 unsigned long *phys_pfn);
+extern int vfio_mmap_pages(struct device *dev, unsigned long user_pfn,
+			   unsigned long page_size, int prot,
+			   unsigned long pfn);
+extern int vfio_munmap_pages(struct device *dev, unsigned long user_pfn,
+			     unsigned long page_size);
 /* each type has independent events */
 enum vfio_notify_type {
 	VFIO_IOMMU_NOTIFY = 0,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] vfio: add mmap/munmap API for page hinting
  2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
@ 2020-01-07 15:22   ` Alex Williamson
  2020-01-10 18:10   ` kbuild test robot
  2020-01-10 18:10   ` [RFC PATCH] vfio: vfio_iommu_iova_to_phys() can be static kbuild test robot
  2 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2020-01-07 15:22 UTC (permalink / raw)
  To: weiqi; +Cc: alexander.h.duyck, kvm, linux-kernel, pbonzini, x86

On Tue, 7 Jan 2020 22:46:38 +0800
weiqi <weiqi4@huawei.com> wrote:

> From: wei qi <weiqi4@huawei.com>
>
> add mmap/munmap API for page hinting.

AIUI, this is arbitrarily chunking IOMMU mappings into 512 pages (what
happens with 1G pages?) and creating a back channel for KVM to map and
unmap ranges that the user has mapped (why's it called "mmap"?).  Can't
we do this via the existing user API rather than directed via another
module?  For example, userspace can choose to map chunks of IOVA space
in whatever granularity they choose.  Clearly they can then unmap and
re-map chunks from those previous mappings.  Why can't KVM tell
userspace how and when to do this?  I'm really not in favor of back
channel paths like this, especially to unmap what a user has told us to
map.
Thanks,
Alex

^ permalink raw reply	[flat|nested] 8+ messages in thread
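
The alternative raised in the review above (userspace maps IOVA space in chunks of its own choosing, then unmaps and re-maps individual chunks through the existing VFIO user API) could look roughly like the sketch below. It is not part of the series. The helper names (`fill_chunk_map`, `fill_chunk_unmap`, `map_region_chunked`, `remap_chunk`) and the 2M `CHUNK_SIZE` are assumptions made for illustration; only `VFIO_IOMMU_MAP_DMA`, `VFIO_IOMMU_UNMAP_DMA` and the two structures come from the real `<linux/vfio.h>` uapi header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Assumed granularity: one 2M huge page per mapping (hypothetical choice). */
#define CHUNK_SIZE (2ULL << 20)

/* Prepare a map request for one chunk; no ioctl is issued here. */
void fill_chunk_map(struct vfio_iommu_type1_dma_map *map,
		    uint64_t iova, void *vaddr)
{
	memset(map, 0, sizeof(*map));
	map->argsz = sizeof(*map);
	map->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map->vaddr = (uintptr_t)vaddr;
	map->iova = iova;
	map->size = CHUNK_SIZE;
}

/* Prepare an unmap request covering exactly one chunk. */
void fill_chunk_unmap(struct vfio_iommu_type1_dma_unmap *unmap, uint64_t iova)
{
	memset(unmap, 0, sizeof(*unmap));
	unmap->argsz = sizeof(*unmap);
	unmap->iova = iova;
	unmap->size = CHUNK_SIZE;
}

/*
 * Map [iova, iova + len) as independent CHUNK_SIZE mappings so that each
 * chunk can later be unmapped on its own. len is assumed to be a multiple
 * of CHUNK_SIZE.
 */
int map_region_chunked(int container_fd, uint64_t iova, void *vaddr,
		       uint64_t len)
{
	uint64_t off;

	for (off = 0; off < len; off += CHUNK_SIZE) {
		struct vfio_iommu_type1_dma_map map;

		fill_chunk_map(&map, iova + off, (char *)vaddr + off);
		if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) < 0)
			return -1;
	}
	return 0;
}

/*
 * Drop the IOMMU mapping of one chunk and map it again, e.g. after the
 * backing page was discarded and later faulted back in.
 */
int remap_chunk(int container_fd, uint64_t iova, void *vaddr)
{
	struct vfio_iommu_type1_dma_unmap unmap;
	struct vfio_iommu_type1_dma_map map;

	fill_chunk_unmap(&unmap, iova);
	if (ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap) < 0)
		return -1;

	fill_chunk_map(&map, iova, vaddr);
	if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) < 0)
		return -1;
	return 0;
}
```

Here `container_fd` stands for an already configured VFIO container file descriptor. Mapping per chunk up front is what makes the later per-chunk unmap possible without a kernel-side back channel.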
* Re: [PATCH 1/2] vfio: add mmap/munmap API for page hinting
  2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
  2020-01-07 15:22   ` Alex Williamson
@ 2020-01-10 18:10   ` kbuild test robot
  2020-01-10 18:10   ` [RFC PATCH] vfio: vfio_iommu_iova_to_phys() can be static kbuild test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2020-01-10 18:10 UTC (permalink / raw)
  To: weiqi
  Cc: kbuild-all, alexander.h.duyck, alex.williamson, kvm, linux-kernel,
	pbonzini, x86, wei qi

Hi weiqi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/linux-next]
[also build test WARNING on vfio/next vhost/linux-next v5.5-rc5 next-20200109]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/weiqi/page-hinting-add-passthrough-support/20200108-152941
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-129-g341daf20-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

sparse warnings: (new ones prefixed by >>)

>> drivers/vfio/vfio_iommu_type1.c:1275:5: sparse: sparse: symbol 'vfio_iommu_iova_to_phys' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [RFC PATCH] vfio: vfio_iommu_iova_to_phys() can be static
  2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
  2020-01-07 15:22   ` Alex Williamson
  2020-01-10 18:10   ` kbuild test robot
@ 2020-01-10 18:10   ` kbuild test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2020-01-10 18:10 UTC (permalink / raw)
  To: weiqi
  Cc: kbuild-all, alexander.h.duyck, alex.williamson, kvm, linux-kernel,
	pbonzini, x86, wei qi


Fixes: 3b764b3397df ("vfio: add mmap/munmap API for page hinting")
Signed-off-by: kbuild test robot <lkp@intel.com>
---
 vfio_iommu_type1.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 3939f9573f74f..f25c107e2c709 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1272,7 +1272,7 @@ static int vfio_iommu_type1_mmap_pages(void *iommu_data,
 	return ret;
 }
 
-u64 vfio_iommu_iova_to_phys(struct vfio_iommu *iommu, dma_addr_t iova)
+static u64 vfio_iommu_iova_to_phys(struct vfio_iommu *iommu, dma_addr_t iova)
 {
 	struct vfio_domain *d;
 	u64 phys;

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [PATCH 2/2] KVM: add support for page hinting 2020-01-07 14:46 [PATCH 0/2] page hinting add passthrough support weiqi 2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi @ 2020-01-07 14:46 ` weiqi 2020-02-18 11:45 ` kbuild test robot 2020-01-07 16:37 ` [PATCH 0/2] page hinting add passthrough support Alexander Duyck 2 siblings, 1 reply; 8+ messages in thread From: weiqi @ 2020-01-07 14:46 UTC (permalink / raw) To: alexander.h.duyck, alex.williamson Cc: kvm, linux-kernel, pbonzini, x86, wei qi From: wei qi <weiqi4@huawei.com> add support for page hinting. Signed-off-by: wei qi <weiqi4@huawei.com> --- arch/x86/kvm/mmu/mmu.c | 79 +++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/kvm_host.h | 41 +++++++++++++++++++++ include/uapi/linux/kvm.h | 7 ++++ virt/kvm/vfio.c | 11 ------ 5 files changed, 223 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6f92b40..0cf2584 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4259,6 +4259,71 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code, return kvm_mtrr_check_gfn_range_consistency(vcpu, gfn, page_num); } +#include <linux/vfio.h> +static void kvm_vfio_mmap_range(struct kvm_vcpu *vcpu, struct kvm_device *tmp, + gfn_t gfn, kvm_pfn_t pfn) +{ + struct kvm_vfio *kv = tmp->private; + struct kvm_vfio_group *kvg; + + list_for_each_entry(kvg, &kv->group_list, node) { + struct vfio_group *group = kvg->vfio_group; + struct vfio_device *it, *device = NULL; + + list_for_each_entry(it, &group->device_list, group_next) { + int flags; + unsigned long page_size; + struct kvm_memory_slot *memslot; + int size; + unsigned long old_pfn = 0; + gfn_t gfn_base; + kvm_pfn_t pfn_base; + + device = it; + memslot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); + page_size = kvm_host_page_size(vcpu->kvm, gfn); + + /* only discard free pages, just check 2M hugetlb */ + if 
(page_size >> PAGE_SHIFT != 512)
+			return;
+
+		gfn_base = ((gfn << PAGE_SHIFT) & (~(page_size - 1)))
+				>> PAGE_SHIFT;
+		pfn_base = ((pfn << PAGE_SHIFT) & (~(page_size - 1)))
+				>> PAGE_SHIFT;
+
+		while ((gfn << PAGE_SHIFT) & (page_size - 1))
+			page_size >>= 1;
+
+		while (__gfn_to_hva_memslot(memslot, gfn) &
+				(page_size - 1))
+			page_size >>= 1;
+
+		size = vfio_dma_find(device->dev, gfn_base,
+				page_size >> PAGE_SHIFT, &old_pfn);
+		if (!size) {
+			pr_err("%s:not find dma: gfn: %llx, size: %lu.\n",
+				__func__, gfn_base,
+				page_size >> PAGE_SHIFT);
+			return;
+		}
+		if (!old_pfn)
+			pr_err("%s: not find pfn: gfn: %llx, size: %lu.\n",
+				__func__, gfn_base,
+				page_size >> PAGE_SHIFT);
+
+		if (pfn_base == old_pfn)
+			return;
+
+		flags = IOMMU_READ;
+		if (!(memslot->flags & KVM_MEM_READONLY))
+			flags |= IOMMU_WRITE;
+		vfio_mmap_pages(device->dev, gfn,
+				page_size, flags, pfn);
+		}
+	}
+}
+
 static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 			  bool prefault)
 {
@@ -4317,6 +4382,20 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 			 prefault, lpage_disallowed);
 out_unlock:
 	spin_unlock(&vcpu->kvm->mmu_lock);
+
+	if (!is_noslot_pfn(pfn) && gfn) {
+		struct kvm_device *tmp;
+
+		list_for_each_entry(tmp, &vcpu->kvm->devices, vm_node) {
+			if (tmp->ops && tmp->ops->name &&
+				(!strcmp(tmp->ops->name, "kvm-vfio"))) {
+				spin_lock(&vcpu->kvm->discard_lock);
+				kvm_vfio_mmap_range(vcpu, tmp, gfn, pfn);
+				spin_unlock(&vcpu->kvm->discard_lock);
+			}
+		}
+	}
+
 	kvm_release_pfn_clean(pfn);
 	return r;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cf91713..264c65e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4837,6 +4837,92 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	return r;
 }
 
+#include <linux/vfio.h>
+static void kvm_vfio_ummap_range(struct kvm *kvm, struct kvm_device *tmp,
+			gfn_t gfn, int npages, unsigned long hva)
+{
+	struct kvm_vfio *kv = tmp->private;
+	struct kvm_vfio_group *kvg;
+
+	list_for_each_entry(kvg, &kv->group_list, node) {
+		struct vfio_group *group = kvg->vfio_group;
+		struct vfio_device *it, *device = NULL;
+
+		list_for_each_entry(it, &group->device_list, group_next) {
+			unsigned long page_size, page_size_base;
+			unsigned long addr;
+			int size;
+			unsigned long old_pfn = 0;
+			int ret = 0;
+			size_t unmapped = npages;
+			gfn_t iova_gfn = gfn;
+			unsigned long iova_hva = hva;
+
+			device = it;
+			while (unmapped) {
+				addr = gfn_to_hva(kvm, iova_gfn);
+				page_size_base = page_size =
+						kvm_host_page_size(kvm,
+								iova_gfn);
+
+				if (addr != iova_hva)
+					return;
+
+				while ((iova_gfn << PAGE_SHIFT) &
+						(page_size - 1))
+					page_size >>= 1;
+
+				while (addr & (page_size - 1))
+					page_size >>= 1;
+
+				if (page_size_base != page_size)
+					return;
+
+				size = vfio_dma_find(device->dev, iova_gfn,
+						page_size >> PAGE_SHIFT,
+						&old_pfn);
+				if (!size)
+					return;
+
+				if (!old_pfn)
+					return;
+
+				ret = vfio_munmap_pages(device->dev,
+						iova_gfn, page_size);
+				unmapped -= page_size >> PAGE_SHIFT;
+				iova_hva += page_size;
+				iova_gfn += page_size >> PAGE_SHIFT;
+			}
+		}
+	}
+}
+
+
+static int kvm_vm_ioctl_discard_range(struct kvm *kvm,
+				struct kvm_discard_msg *msg)
+{
+	gfn_t gfn, end_gfn;
+	int idx;
+	struct kvm_device *tmp;
+	unsigned long hva = msg->iov_base;
+	int npages = msg->iov_len >> PAGE_SHIFT;
+
+	gfn = gpa_to_gfn(msg->in_addr);
+	end_gfn = gpa_to_gfn(msg->in_addr + msg->iov_len);
+
+	idx = srcu_read_lock(&kvm->srcu);
+
+	list_for_each_entry(tmp, &kvm->devices, vm_node) {
+		if (tmp->ops->name && (!strcmp(tmp->ops->name, "kvm-vfio"))) {
+			spin_lock(&kvm->discard_lock);
+			kvm_vfio_ummap_range(kvm, tmp, gfn, npages, hva);
+			spin_unlock(&kvm->discard_lock);
+		}
+	}
+	srcu_read_unlock(&kvm->srcu, idx);
+	return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -5134,6 +5220,16 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	case KVM_SET_PMU_EVENT_FILTER:
 		r = kvm_vm_ioctl_set_pmu_event_filter(kvm, argp);
 		break;
+	case KVM_DISCARD_RANGE: {
+		struct kvm_discard_msg discard_msg;
+
+		r = -EFAULT;
+		if (copy_from_user(&discard_msg, argp, sizeof(discard_msg)))
+			goto out;
+
+		r = kvm_vm_ioctl_discard_range(kvm, &discard_msg);
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 538c25e..6667e6b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -442,6 +442,7 @@ struct kvm_memslots {
 
 struct kvm {
 	spinlock_t mmu_lock;
+	spinlock_t discard_lock;
 	struct mutex slots_lock;
 	struct mm_struct *mm; /* userspace tied to this vm */
 	struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM];
@@ -502,6 +503,46 @@ struct kvm {
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
 };
+struct vfio_device {
+	struct kref kref;
+	struct device *dev;
+	const struct vfio_device_ops *ops;
+	struct vfio_group *group;
+	struct list_head group_next;
+	void *device_data;
+};
+
+struct vfio_group {
+	struct kref kref;
+	int minor;
+	atomic_t container_users;
+	struct iommu_group *iommu_group;
+	struct vfio_container *container;
+	struct list_head device_list;
+	struct mutex device_lock;
+	struct device *dev;
+	struct notifier_block nb;
+	struct list_head vfio_next;
+	struct list_head container_next;
+	struct list_head unbound_list;
+	struct mutex unbound_lock;
+	atomic_t opened;
+	wait_queue_head_t container_q;
+	bool noiommu;
+	struct kvm *kvm;
+	struct blocking_notifier_head notifier;
+};
+
+struct kvm_vfio_group {
+	struct list_head node;
+	struct vfio_group *vfio_group;
+};
+
+struct kvm_vfio {
+	struct list_head group_list;
+	struct mutex lock;
+	bool noncoherent;
+};
 
 #define kvm_err(fmt, ...) \
 	pr_err("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f0a16b4..53331fe 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1264,6 +1264,13 @@ struct kvm_vfio_spapr_tce {
 					struct kvm_userspace_memory_region)
 #define KVM_SET_TSS_ADDR          _IO(KVMIO,   0x47)
 #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO,  0x48, __u64)
+struct kvm_discard_msg {
+	__u64 iov_len;
+	__u64 iov_base;
+	__u64 in_addr;
+};
+#define KVM_DISCARD_RANGE _IOW(KVMIO, 0x49, struct kvm_discard_msg)
+
 /* enable ucontrol for s390 */
 struct kvm_s390_ucas_mapping {
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 8fcbc50..f6dc61e 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -21,17 +21,6 @@
 #include <asm/kvm_ppc.h>
 #endif
 
-struct kvm_vfio_group {
-	struct list_head node;
-	struct vfio_group *vfio_group;
-};
-
-struct kvm_vfio {
-	struct list_head group_list;
-	struct mutex lock;
-	bool noncoherent;
-};
-
 static struct vfio_group *kvm_vfio_group_get_external_user(struct file *filep)
 {
 	struct vfio_group *vfio_group;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread
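
The address arithmetic in the patch above (rounding a frame number down to its huge-page base, halving the block size until an address is aligned, and turning a discard message into a starting gfn and a page count) can be tried out in user space. What follows is only a minimal sketch of that arithmetic, not the kernel code: SKETCH_PAGE_SHIFT assumes 4 KiB base pages, struct discard_msg_sketch merely copies the field layout of struct kvm_discard_msg from the patch, and every function name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, as on x86-64. */
#define SKETCH_PAGE_SHIFT 12

/* Same field layout as struct kvm_discard_msg in the patch (illustrative copy). */
struct discard_msg_sketch {
	uint64_t iov_len;	/* length of the range in bytes */
	uint64_t iov_base;	/* host virtual address of the range */
	uint64_t in_addr;	/* guest physical address of the range */
};

/*
 * Round a frame number down to the first frame of the block that
 * contains it. block_size is in bytes and must be a power of two.
 * This is the same expression the patch uses for gfn_base and pfn_base.
 */
static inline uint64_t frame_base(uint64_t fn, uint64_t block_size)
{
	assert(block_size && !(block_size & (block_size - 1)));
	return ((fn << SKETCH_PAGE_SHIFT) & ~(block_size - 1))
			>> SKETCH_PAGE_SHIFT;
}

/*
 * Halve block_size until addr is aligned to it, as the two while
 * loops in the patch do for the guest address and the host address.
 */
static inline uint64_t largest_aligned_size(uint64_t addr, uint64_t block_size)
{
	while (addr & (block_size - 1))
		block_size >>= 1;
	return block_size;
}

/* First guest frame number covered by a discard message. */
static inline uint64_t msg_first_gfn(const struct discard_msg_sketch *m)
{
	return m->in_addr >> SKETCH_PAGE_SHIFT;
}

/* Number of base pages covered by a discard message. */
static inline uint64_t msg_npages(const struct discard_msg_sketch *m)
{
	return m->iov_len >> SKETCH_PAGE_SHIFT;
}
```

With a 2 MiB block, frame 0x12345 rounds down to 0x12200, and a 2 MiB discard message covers 512 base pages, which matches the `page_size >> PAGE_SHIFT != 512` check at the top of the mapping path.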
* Re: [PATCH 2/2] KVM: add support for page hinting
  2020-01-07 14:46 ` [PATCH 2/2] KVM: add support for page hinting weiqi
@ 2020-02-18 11:45   ` kbuild test robot
  0 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2020-02-18 11:45 UTC (permalink / raw)
  To: weiqi
  Cc: kbuild-all, alexander.h.duyck, alex.williamson, kvm,
	linux-kernel, pbonzini, x86, wei qi

[-- Attachment #1: Type: text/plain, Size: 3281 bytes --]

Hi weiqi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on kvm/linux-next]
[also build test ERROR on vfio/next vhost/linux-next]
[cannot apply to v5.6-rc2 next-20200217]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/weiqi/page-hinting-add-passthrough-support/20200108-152941
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-lkp (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   ld: arch/x86/kvm/x86.o: in function `kvm_vfio_ummap_range':
>> arch/x86/kvm/x86.c:4880: undefined reference to `vfio_dma_find'
>> ld: arch/x86/kvm/x86.c:4889: undefined reference to `vfio_munmap_pages'
   ld: arch/x86/kvm/mmu/mmu.o: in function `kvm_vfio_mmap_range':
>> arch/x86/kvm/mmu/mmu.c:4302: undefined reference to `vfio_dma_find'
>> ld: arch/x86/kvm/mmu/mmu.c:4321: undefined reference to `vfio_mmap_pages'

vim +4880 arch/x86/kvm/x86.c

  4838	
  4839	#include <linux/vfio.h>
  4840	static void kvm_vfio_ummap_range(struct kvm *kvm, struct kvm_device *tmp,
  4841				gfn_t gfn, int npages, unsigned long hva)
  4842	{
  4843		struct kvm_vfio *kv = tmp->private;
  4844		struct kvm_vfio_group *kvg;
  4845	
  4846		list_for_each_entry(kvg, &kv->group_list, node) {
  4847			struct vfio_group *group = kvg->vfio_group;
  4848			struct vfio_device *it, *device = NULL;
  4849	
  4850			list_for_each_entry(it, &group->device_list, group_next) {
  4851				unsigned long page_size, page_size_base;
  4852				unsigned long addr;
  4853				int size;
  4854				unsigned long old_pfn = 0;
  4855				int ret = 0;
  4856				size_t unmapped = npages;
  4857				gfn_t iova_gfn = gfn;
  4858				unsigned long iova_hva = hva;
  4859	
  4860				device = it;
  4861				while (unmapped) {
  4862					addr = gfn_to_hva(kvm, iova_gfn);
  4863					page_size_base = page_size =
  4864							kvm_host_page_size(kvm,
  4865									iova_gfn);
  4866	
  4867					if (addr != iova_hva)
  4868						return;
  4869	
  4870					while ((iova_gfn << PAGE_SHIFT) &
  4871							(page_size - 1))
  4872						page_size >>= 1;
  4873	
  4874					while (addr & (page_size - 1))
  4875						page_size >>= 1;
  4876	
  4877					if (page_size_base != page_size)
  4878						return;
  4879	
> 4880					size = vfio_dma_find(device->dev, iova_gfn,
  4881							page_size >> PAGE_SHIFT,
  4882							&old_pfn);
  4883					if (!size)
  4884						return;
  4885	
  4886					if (!old_pfn)
  4887						return;
  4888	
> 4889					ret = vfio_munmap_pages(device->dev,
  4890							iova_gfn, page_size);
  4891					unmapped -= page_size >> PAGE_SHIFT;
  4892					iova_hva += page_size;
  4893					iova_gfn += page_size >> PAGE_SHIFT;
  4894				}
  4895			}
  4896		}
  4897	}
  4898	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28600 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH 0/2] page hinting add passthrough support
  2020-01-07 14:46 [PATCH 0/2] page hinting add passthrough support weiqi
  2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
  2020-01-07 14:46 ` [PATCH 2/2] KVM: add support for page hinting weiqi
@ 2020-01-07 16:37 ` Alexander Duyck
  2 siblings, 0 replies; 8+ messages in thread
From: Alexander Duyck @ 2020-01-07 16:37 UTC (permalink / raw)
  To: weiqi, alex.williamson; +Cc: kvm, linux-kernel, pbonzini, x86

On Tue, 2020-01-07 at 22:46 +0800, weiqi wrote:
> From: wei qi <weiqi4@huawei.com>
>
>
> I just implemented dynamically updating the iommu table to support pass-through,
> It seen to work fine.
>
> Test:
> start a 4G vm with 2M hugetlb and ixgbevf passthrough,
>     GuestOS: linux-5.2.6 + (mm / virtio: Provide support for free page reporting)
>     HostOS: 5.5-rc4
>     Host: Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz
>
> after enable page hinting, free pages at GuestOS can be free at host.
>
>
> before,
>  # cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages
>  5620
>  5620
> after start VM,
>  # numastat -c qemu
>
>  Per-node process memory usage (in MBs)
>  PID              Node 0 Node 1 Total
>  ---------------  ------ ------ -----
>  24463 (qemu_hotr      6      6    12
>  24479 (qemu_tls_      0      8     8
>  70718 (qemu-syst     58    539   597
>  ---------------  ------ ------ -----
>  Total                64    553   616
>  # cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages
>  5595
>  5366
>
> the modify at qemu,
>  +int kvm_discard_range(struct kvm_discard_msg discard_msg)
>  +{
>  +    return kvm_vm_ioctl(kvm_state, KVM_DISCARD_RANGE, &discard_msg);
>  +}
>
>  static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
>  {
>          ..................
>  +        discard_msg.in_addr = elem->in_addr[i];
>  +        discard_msg.iov_len = elem->in_sg[i].iov_len;
>
>           ram_block_discard_range(rb, ram_offset, size);
>  +        kvm_discard_range(discard_msg);
>
> then, further test network bandwidth, performance seem ok.
>
> Is there any hidden problem in this implementation?

How is it you are avoiding triggering the call to qemu_balloon_inhibit in
QEMU?

> And, is there plan to support pass-throughyour?

It wasn't something I was immediately planning to do. Before we got there
we would need to really address the fact that the host has no idea what
pages the device could be accessing since normally the entire guest is
pinned.

I guess these patches are a step toward addressing that?

^ permalink raw reply	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-02-18 11:46 UTC | newest]

Thread overview: 8+ messages
2020-01-07 14:46 [PATCH 0/2] page hinting add passthrough support weiqi
2020-01-07 14:46 ` [PATCH 1/2] vfio: add mmap/munmap API for page hinting weiqi
2020-01-07 15:22   ` Alex Williamson
2020-01-10 18:10   ` kbuild test robot
2020-01-10 18:10   ` [RFC PATCH] vfio: vfio_iommu_iova_to_phys() can be static kbuild test robot
2020-01-07 14:46 ` [PATCH 2/2] KVM: add support for page hinting weiqi
2020-02-18 11:45   ` kbuild test robot
2020-01-07 16:37 ` [PATCH 0/2] page hinting add passthrough support Alexander Duyck