From: David Hildenbrand <david@redhat.com> To: qemu-devel@nongnu.org Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>, Wei Yang <richard.weiyang@linux.alibaba.com>, "Michael S . Tsirkin" <mst@redhat.com>, David Hildenbrand <david@redhat.com>, "Dr . David Alan Gilbert" <dgilbert@redhat.com>, Peter Xu <peterx@redhat.com>, Auger Eric <eric.auger@redhat.com>, Alex Williamson <alex.williamson@redhat.com>, teawater <teawaterz@linux.alibaba.com>, Igor Mammedov <imammedo@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Marek Kedzierski <mkedzier@redhat.com> Subject: [PATCH RESEND v7 13/13] vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus Date: Tue, 13 Apr 2021 11:55:31 +0200 [thread overview] Message-ID: <20210413095531.25603-14-david@redhat.com> (raw) In-Reply-To: <20210413095531.25603-1-david@redhat.com> We support coordinated discarding of RAM using the RamDiscardManager for the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards, keeping uncoordinated discards (e.g., via virtio-balloon) disabled if possible. This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://" by the block layer has to be implemented/unlocked separately. For now, virtio-mem only supports x86-64; we don't restrict RamDiscardManager to x86-64, though: arm64 and s390x are supposed to work as well, and we'll test once unlocking virtio-mem support. The spapr IOMMUs will need special care, to be tackled later, e.g.., once supporting virtio-mem. Note: The block size of a virtio-mem device has to be set to sane sizes, depending on the maximum hotplug size - to not run out of vfio mappings. The default virtio-mem block size is usually in the range of a couple of MBs. The maximum number of mapping is 64k, shared with other users. Assume you want to hotplug 256GB using virtio-mem - the block size would have to be set to at least 8 MiB (resulting in 32768 separate mappings). Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> --- hw/vfio/common.c | 65 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 53 insertions(+), 12 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 8a9bbf2791..3f0d111360 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -135,6 +135,29 @@ static const char *index_to_str(VFIODevice *vbasedev, int index) } } +static int vfio_ram_block_discard_disable(VFIOContainer *container, bool state) +{ + switch (container->iommu_type) { + case VFIO_TYPE1v2_IOMMU: + case VFIO_TYPE1_IOMMU: + /* + * We support coordinated discarding of RAM via the RamDiscardManager. + */ + return ram_block_uncoordinated_discard_disable(state); + default: + /* + * VFIO_SPAPR_TCE_IOMMU most probably works just fine with + * RamDiscardManager, however, it is completely untested. + * + * VFIO_SPAPR_TCE_v2_IOMMU with "DMA memory preregistering" does + * completely the opposite of managing mapping/pinning dynamically as + * required by RamDiscardManager. We would have to special-case sections + * with a RamDiscardManager. + */ + return ram_block_discard_disable(state); + } +} + int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex, int action, int fd, Error **errp) { @@ -1977,15 +2000,25 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, * new memory, it will not yet set ram_block_discard_set_required() and * therefore, neither stops us here or deals with the sudden memory * consumption of inflated memory. + * + * We do support discarding of memory coordinated via the RamDiscardManager + * with some IOMMU types. vfio_ram_block_discard_disable() handles the + * details once we know which type of IOMMU we are using. */ - ret = ram_block_discard_disable(true); - if (ret) { - error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"); - return ret; - } QLIST_FOREACH(container, &space->containers, next) { if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { + ret = vfio_ram_block_discard_disable(container, true); + if (ret) { + error_setg_errno(errp, -ret, + "Cannot set discarding of RAM broken"); + if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, + &container->fd)) { + error_report("vfio: error disconnecting group %d from" + " container", group->groupid); + } + return ret; + } group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); vfio_kvm_device_add_group(group); @@ -2023,6 +2056,12 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, goto free_container_exit; } + ret = vfio_ram_block_discard_disable(container, true); + if (ret) { + error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"); + goto free_container_exit; + } + switch (container->iommu_type) { case VFIO_TYPE1v2_IOMMU: case VFIO_TYPE1_IOMMU: @@ -2070,7 +2109,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, if (ret) { error_setg_errno(errp, errno, "failed to enable container"); ret = -errno; - goto free_container_exit; + goto enable_discards_exit; } } else { container->prereg_listener = vfio_prereg_listener; @@ -2082,7 +2121,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, ret = -1; error_propagate_prepend(errp, container->error, "RAM memory listener initialization failed: "); - goto free_container_exit; + goto enable_discards_exit; } } @@ -2095,7 +2134,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, if (v2) { memory_listener_unregister(&container->prereg_listener); } - goto free_container_exit; + goto enable_discards_exit; } if (v2) { @@ -2110,7 +2149,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, if (ret) { error_setg_errno(errp, -ret, "failed to remove existing window"); - goto free_container_exit; + goto enable_discards_exit; } } else { /* The default table uses 4K pages */ @@ -2151,6 +2190,9 @@ listener_release_exit: vfio_kvm_device_del_group(group); vfio_listener_release(container); +enable_discards_exit: + vfio_ram_block_discard_disable(container, false); + free_container_exit: g_free(container); @@ -2158,7 +2200,6 @@ close_fd_exit: close(fd); put_space_exit: - ram_block_discard_disable(false); vfio_put_address_space(space); return ret; @@ -2280,7 +2321,7 @@ void vfio_put_group(VFIOGroup *group) } if (!group->ram_block_discard_allowed) { - ram_block_discard_disable(false); + vfio_ram_block_discard_disable(group->container, false); } vfio_kvm_device_del_group(group); vfio_disconnect_container(group); @@ -2334,7 +2375,7 @@ int vfio_get_device(VFIOGroup *group, const char *name, if (!group->ram_block_discard_allowed) { group->ram_block_discard_allowed = true; - ram_block_discard_disable(false); + vfio_ram_block_discard_disable(group->container, false); } } -- 2.30.2
prev parent reply other threads:[~2021-04-13 10:04 UTC|newest] Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-04-13 9:55 [PATCH RESEND v7 00/13] virtio-mem: vfio support David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 01/13] memory: Introduce RamDiscardManager for RAM memory regions David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 02/13] memory: Helpers to copy/free a MemoryRegionSection David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 03/13] virtio-mem: Factor out traversing unplugged ranges David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 04/13] virtio-mem: Don't report errors when ram_block_discard_range() fails David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 05/13] virtio-mem: Implement RamDiscardManager interface David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 06/13] vfio: Support for RamDiscardManager in the !vIOMMU case David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 07/13] vfio: Query and store the maximum number of possible DMA mappings David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 08/13] vfio: Sanity check maximum number of DMA mappings with RamDiscardManager David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 09/13] vfio: Support for RamDiscardManager in the vIOMMU case David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 10/13] softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require) David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 11/13] softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types David Hildenbrand 2021-04-13 9:55 ` [PATCH RESEND v7 12/13] virtio-mem: Require only coordinated discards David Hildenbrand 2021-04-13 9:55 ` David Hildenbrand [this message]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210413095531.25603-14-david@redhat.com \ --to=david@redhat.com \ --cc=alex.williamson@redhat.com \ --cc=dgilbert@redhat.com \ --cc=eric.auger@redhat.com \ --cc=imammedo@redhat.com \ --cc=mkedzier@redhat.com \ --cc=mst@redhat.com \ --cc=pankaj.gupta.linux@gmail.com \ --cc=pbonzini@redhat.com \ --cc=peterx@redhat.com \ --cc=qemu-devel@nongnu.org \ --cc=richard.weiyang@linux.alibaba.com \ --cc=teawaterz@linux.alibaba.com \ --subject='Re: [PATCH RESEND v7 13/13] vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).