From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v7 00/13] virtio-mem: vfio support
Date: Wed, 24 Feb 2021 10:48:56 +0100
Message-Id: <20210224094910.44986-1-david@redhat.com>
Cc: Pankaj Gupta, Wei Yang, David Hildenbrand, "Michael S. Tsirkin",
 "Dr. David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson,
 teawater, Paolo Bonzini, Igor Mammedov, Marek Kedzierski

A virtio-mem device manages a memory region in guest physical address
space, represented as a single (currently large) memory region in QEMU,
mapped into system memory address space. Before the guest is allowed to
use memory blocks, it must coordinate with the hypervisor (plug blocks).
After a reboot, all memory is usually unplugged - when the guest comes up,
it detects the virtio-mem device and selects memory blocks to plug (based
on resize requests from the hypervisor).

Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
device (triggered by the guest). When unplugging blocks, we discard the
memory - similar to memory balloon inflation. In contrast to memory
ballooning, we always know which memory blocks a guest may actually use -
especially during a reboot, after a crash, or after kexec (and during
hibernation as well). Guests have agreed not to access unplugged memory
again, especially not via DMA.

The issue with vfio is that it cannot deal with random discards - for this
reason, virtio-mem and vfio are currently mutually exclusive. In
particular, vfio would currently map the whole memory region (even when
only few or no blocks are plugged), resulting in all pages getting pinned
and therefore in a higher memory consumption than expected (making
virtio-mem basically useless in these environments).
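To illustrate the pinning problem described above, here is a minimal
sketch that compares the memory pinned when the whole region is mapped
with the memory pinned when only plugged blocks are mapped. This is not
QEMU code: struct demo_region and the demo_*() helpers are made up for
illustration, and the plugged state is modeled as a plain bool array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a virtio-mem managed region; not a QEMU type. */
struct demo_region {
    uint64_t region_size;   /* size of the whole managed region, in bytes */
    uint64_t block_size;    /* virtio-mem block size, in bytes */
    const bool *plugged;    /* one entry per block: true if plugged */
};

/* Number of blocks in the region (region must be a multiple of blocks). */
static uint64_t demo_nr_blocks(const struct demo_region *r)
{
    assert(r->block_size != 0 && r->region_size % r->block_size == 0);
    return r->region_size / r->block_size;
}

/* Bytes pinned if the whole region is mapped, ignoring the plug state. */
static uint64_t demo_pinned_whole_region(const struct demo_region *r)
{
    return r->region_size;
}

/* Bytes pinned if only the plugged blocks are mapped. */
static uint64_t demo_pinned_plugged_only(const struct demo_region *r)
{
    uint64_t nr_blocks = demo_nr_blocks(r);
    uint64_t bytes = 0;

    for (uint64_t i = 0; i < nr_blocks; i++) {
        if (r->plugged[i]) {
            bytes += r->block_size;
        }
    }
    return bytes;
}
```

With a region of eight 2 MB blocks of which three are plugged, mapping the
whole region would pin 16 MB, while mapping only the plugged blocks would
pin 6 MB.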

To make vfio work nicely with virtio-mem, we have to map only the plugged
blocks, and map/unmap properly when plugging/unplugging blocks (including
discarding of RAM when unplugging). We achieve that by using a new notifier
mechanism that communicates changes (blocks getting plugged/unplugged).

It's important to map memory in the granularity in which we could see
unmaps again (-> virtio-mem block size) - e.g., when plugging 100 MB of
consecutive memory with a block size of 2 MB, we need 50 mappings. When
unmapping, we can use a single vfio_unmap call for the applicable range.
We expect that the block size of virtio-mem devices, configured by the
user, will be fairly large in the future when used with vfio (e.g.,
128 MB, 1 GB, ...), to not run out of mappings and to improve hot(un)plug
performance, but it will depend on the setup.

More info regarding virtio-mem can be found at:
    https://virtio-mem.gitlab.io/

v7 is located at:
    git@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v7

v6 -> v7:
- s/RamDiscardMgr/RamDiscardManager/
- "memory: Introduce RamDiscardManager for RAM memory regions"
-- Make RamDiscardManager/RamDiscardListener eat MemoryRegionSections
-- Replace notify_discard_all callback by double_discard_supported
-- Reshuffle the individual hunks in memory.h
-- Provide function wrappers for RamDiscardManager calls
- "memory: Helpers to copy/free a MemoryRegionSection"
-- Added
- "virtio-mem: Implement RamDiscardManager interface"
-- Work on MemoryRegionSections instead of ranges
-- Minor optimizations
- "vfio: Support for RamDiscardManager in the !vIOMMU case"
-- Simplify based on new interfaces / MemoryRegionSections
-- Minor cleanups and optimizations
-- Add a comment regarding dirty bitmap sync.
-- Don't store "offset_within_region" in VFIORamDiscardListener
- "vfio: Support for RamDiscardManager in the vIOMMU case"
-- Adjust to new interface
- "softmmu/physmem: Don't use atomic operations in ..."
-- Rename variables
- "softmmu/physmem: Extend ram_block_discard_(require|disable) ..."
-- Rename variables
- Rebased and retested

v5 -> v6:
- "memory: Introduce RamDiscardMgr for RAM memory regions"
-- Fix variable names in one prototype.
- "virtio-mem: Don't report errors when ram_block_discard_range() fails"
-- Added
- "virtio-mem: Implement RamDiscardMgr interface"
-- Don't report an error if discarding fails
- Rebased and retested

v4 -> v5:
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Added more assertions for granularity vs. iommu supported pagesize
- "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
-- Fix accounting of mappings
- "vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus"
-- Fence off SPAPR and add some comments regarding future support.
-- Tweak patch description
- Rebase and retest

v3 -> v4:
- "vfio: Query and store the maximum number of DMA mappings"
-- Limit the patch to querying and storing only
-- Renamed to "vfio: Query and store the maximum number of possible DMA
   mappings"
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Remove sanity checks / warning the user
- "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
-- Perform sanity checks by looking at the number of memslots and all
   registered RamDiscardMgr sections
- Rebase and retest
- Reshuffled the patches slightly

v2 -> v3:
- Rebased + retested
- Fixed some typos
- Added RB's

v1 -> v2:
- "memory: Introduce RamDiscardMgr for RAM memory regions"
-- Fix some errors in the documentation
-- Make register_listener() notify about populated parts and
   unregister_listener() notify about discarding populated parts, to
   simplify future locking inside virtio-mem, when handling requests via
   a separate thread.
- "vfio: Query and store the maximum number of DMA mappings"
-- Query number of mappings and track mappings (except for vIOMMU)
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Adapt to RamDiscardMgr changes and warn via generic DMA reservation
- "vfio: Support for RamDiscardMgr in the vIOMMU case"
-- Use vmstate priority to handle migration dependencies

RFC -> v1:
- VFIO migration code. Due to missing kernel support, I cannot really test
  if that part works.
- Understand/test/document vIOMMU implications, also regarding migration
- Nicer ram_block_discard_disable/require handling.
- s/SparseRAMHandler/RamDiscardMgr/, refactorings, cleanups,
  documentation, testing, ...

David Hildenbrand (13):
  memory: Introduce RamDiscardManager for RAM memory regions
  memory: Helpers to copy/free a MemoryRegionSection
  virtio-mem: Factor out traversing unplugged ranges
  virtio-mem: Don't report errors when ram_block_discard_range() fails
  virtio-mem: Implement RamDiscardManager interface
  vfio: Support for RamDiscardManager in the !vIOMMU case
  vfio: Query and store the maximum number of possible DMA mappings
  vfio: Sanity check maximum number of DMA mappings with
    RamDiscardManager
  vfio: Support for RamDiscardManager in the vIOMMU case
  softmmu/physmem: Don't use atomic operations in
    ram_block_discard_(disable|require)
  softmmu/physmem: Extend ram_block_discard_(require|disable) by two
    discard types
  virtio-mem: Require only coordinated discards
  vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus

 hw/vfio/common.c               | 315 +++++++++++++++++++++++++-
 hw/virtio/virtio-mem.c         | 391 ++++++++++++++++++++++++++++-----
 include/exec/memory.h          | 324 +++++++++++++++++++++++++--
 include/hw/vfio/vfio-common.h  |  12 +
 include/hw/virtio/virtio-mem.h |   3 +
 include/migration/vmstate.h    |   1 +
 softmmu/memory.c               |  98 +++++++++
 softmmu/physmem.c              | 108 ++++++---
 8 files changed, 1133 insertions(+), 119 deletions(-)

-- 
2.29.2