From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: kvm@vger.kernel.org
Cc: will.deacon@arm.com, robin.murphy@arm.com,
	lorenzo.pieralisi@arm.com, marc.zyngier@arm.com
Subject: [PATCH v2 kvmtool 09/10] Introduce reserved memory regions
Date: Thu, 22 Jun 2017 18:05:35 +0100
Message-ID: <20170622170536.14319-10-jean-philippe.brucker@arm.com>
In-Reply-To: <20170622170536.14319-1-jean-philippe.brucker@arm.com>

When passing devices to the guest, there might be address ranges
unavailable to the device. For instance, if address 0x10000000 corresponds
to an MSI doorbell, any transaction from a device to that address will be
directed to the MSI controller and might not even reach the IOMMU.
0x10000000 is therefore reserved by the physical IOMMU in the guest's
physical space.

This patch introduces a simple API to register reserved ranges of
addresses that should not or cannot be provided to the guest. For the
moment it only checks that a reserved range does not overlap any user
memory (we don't consider MMIO) and aborts otherwise.
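
As an illustration of the overlap check described above, here is a minimal
standalone sketch. `struct range`, `range_end()` and `ranges_overlap()` are
hypothetical names used only for this example, not kvmtool's. The sketch
compares inclusive end addresses, so a range ending at the top of the
64-bit space does not wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory bank; for illustration only. */
struct range {
	uint64_t start;
	uint64_t size;	/* assumed to be non-zero */
};

/* Last address covered by the range (inclusive). */
static uint64_t range_end(struct range r)
{
	return r.start + r.size - 1;
}

/* Two ranges overlap unless one lies entirely below the other. */
static bool ranges_overlap(struct range a, struct range b)
{
	return !(a.start > range_end(b) || range_end(a) < b.start);
}
```

With this test, adjacent ranges such as [0x0-0xfff] and [0x1000-0x1fff] do
not count as overlapping.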

It should be possible instead to poke holes in the guest-physical memory
map and report them via the architecture's preferred route:
* ARM and PowerPC can add reserved-memory nodes to the DT they provide to
  the guest.
* x86 could poke holes in the memory map reported with e820. This requires
  postponing creation of the memory map until at least VFIO is initialized.
* MIPS could describe the reserved ranges with the "memmap=mm$ss" kernel
  parameter.
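
Whatever the reporting route, the carving itself amounts to splitting a RAM
range around a reserved hole. The sketch below only illustrates that idea
and is not part of this patch. `struct range` and `range_carve()` are
hypothetical names, and a size of zero is assumed to mean "empty".

```c
#include <stdint.h>

/* Hypothetical range type; size == 0 means "empty" in this sketch. */
struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Split `ram` around `hole`. The part of `ram` below the hole is written to
 * *lo and the part above it to *hi; either may come out empty. If the two
 * ranges do not overlap, *lo receives all of `ram`. Both inputs are assumed
 * to be non-empty.
 */
static void range_carve(struct range ram, struct range hole,
			struct range *lo, struct range *hi)
{
	uint64_t ram_end = ram.start + ram.size - 1;
	uint64_t hole_end = hole.start + hole.size - 1;

	*lo = (struct range){ 0, 0 };
	*hi = (struct range){ 0, 0 };

	if (hole.start > ram_end || hole_end < ram.start) {
		/* No overlap: nothing to carve. */
		*lo = ram;
		return;
	}

	if (hole.start > ram.start)
		*lo = (struct range){ ram.start, hole.start - ram.start };
	if (hole_end < ram_end)
		*hi = (struct range){ hole_end + 1, ram_end - hole_end };
}
```

For example, carving a 4 KiB hole at 0x10000000 out of 512 MiB of RAM
starting at 0 leaves [0x0-0xfffffff] below the hole and
[0x10001000-0x1fffffff] above it.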

This would also require calling KVM_SET_USER_MEMORY_REGION for all memory
regions at the end of kvmtool initialisation. Extra care should be taken
to ensure we don't break any architecture, since they currently rely on
having a linear address space with at most two memory blocks.

This patch doesn't implement any address space carving. If an abort is
encountered, the user can try to rebuild kvmtool with different addresses
or change the IOMMU reserved regions if possible.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
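
When two reserved regions overlap, kvm__register_mem() in this patch grows
the existing bank to cover both rather than failing. The arithmetic can be
sketched in isolation as follows. `struct range` and `range_merge()` are
hypothetical names for this example only, and the result is only
meaningful when the two inputs overlap.

```c
#include <stdint.h>

/* Hypothetical range type; size is assumed to be non-zero. */
struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Return the smallest range covering both a and b, using inclusive end
 * addresses as the patch does. Note that a result spanning the whole
 * 64-bit space cannot be represented, since its size would wrap to 0.
 */
static struct range range_merge(struct range a, struct range b)
{
	uint64_t a_end = a.start + a.size - 1;
	uint64_t b_end = b.start + b.size - 1;
	uint64_t start = a.start < b.start ? a.start : b.start;
	uint64_t end = a_end > b_end ? a_end : b_end;

	return (struct range){ .start = start, .size = end - start + 1 };
}
```

Merging [0x1000-0x1fff] with [0x1800-0x27ff] this way gives a single range
[0x1000-0x27ff].
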
 include/kvm/kvm.h | 12 +++++++++-
 kvm.c             | 68 +++++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 65 insertions(+), 15 deletions(-)

diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 57061566..8546787a 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -35,10 +35,12 @@ enum {
 };
 
 enum kvm_mem_type {
+	KVM_MEM_TYPE_RESERVED	= 1 << 0,
 	KVM_MEM_TYPE_RAM	= 1 << 1,
 	KVM_MEM_TYPE_DEVICE	= 1 << 2,
 
-	KVM_MEM_TYPE_ALL	= KVM_MEM_TYPE_RAM
+	KVM_MEM_TYPE_ALL	= KVM_MEM_TYPE_RESERVED
+				| KVM_MEM_TYPE_RAM
 				| KVM_MEM_TYPE_DEVICE
 };
 
@@ -115,6 +117,12 @@ static inline int kvm__register_dev_mem(struct kvm *kvm, u64 guest_phys,
 				 KVM_MEM_TYPE_DEVICE);
 }
 
+static inline int kvm__reserve_mem(struct kvm *kvm, u64 guest_phys, u64 size)
+{
+	return kvm__register_mem(kvm, guest_phys, size, NULL,
+				 KVM_MEM_TYPE_RESERVED);
+}
+
 int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce,
 		       void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr),
 			void *ptr);
@@ -150,6 +158,8 @@ static inline const char *kvm_mem_type_to_string(enum kvm_mem_type type)
 		return "RAM";
 	case KVM_MEM_TYPE_DEVICE:
 		return "device";
+	case KVM_MEM_TYPE_RESERVED:
+		return "reserved";
 	}
 
 	return "???";
diff --git a/kvm.c b/kvm.c
index 39ce88c4..9078a026 100644
--- a/kvm.c
+++ b/kvm.c
@@ -177,18 +177,55 @@ int kvm__exit(struct kvm *kvm)
 }
 core_exit(kvm__exit);
 
-/*
- * Note: KVM_SET_USER_MEMORY_REGION assumes that we don't pass overlapping
- * memory regions to it. Therefore, be careful if you use this function for
- * registering memory regions for emulating hardware.
- */
 int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size,
 		      void *userspace_addr, enum kvm_mem_type type)
 {
 	struct kvm_userspace_memory_region mem;
+	struct kvm_mem_bank *merged = NULL;
 	struct kvm_mem_bank *bank;
 	int ret;
 
+	/* Check for overlap */
+	list_for_each_entry(bank, &kvm->mem_banks, list) {
+		u64 bank_end = bank->guest_phys_addr + bank->size - 1;
+		u64 end = guest_phys + size - 1;
+		if (guest_phys > bank_end || end < bank->guest_phys_addr)
+			continue;
+
+		/* Merge overlapping reserved regions */
+		if (bank->type == KVM_MEM_TYPE_RESERVED &&
+		    type == KVM_MEM_TYPE_RESERVED) {
+			bank->guest_phys_addr = min(bank->guest_phys_addr, guest_phys);
+			bank->size = max(bank_end, end) - bank->guest_phys_addr + 1;
+
+			if (merged) {
+				/*
+				 * This is at least the second merge, remove
+				 * previous result.
+				 */
+				list_del(&merged->list);
+				free(merged);
+			}
+
+			guest_phys = bank->guest_phys_addr;
+			size = bank->size;
+			merged = bank;
+
+			/* Keep checking that we don't overlap another region */
+			continue;
+		}
+
+		pr_err("%s region [%llx-%llx] would overlap %s region [%llx-%llx]",
+		       kvm_mem_type_to_string(type), guest_phys, guest_phys + size - 1,
+		       kvm_mem_type_to_string(bank->type), bank->guest_phys_addr,
+		       bank->guest_phys_addr + bank->size - 1);
+
+		return -EINVAL;
+	}
+
+	if (merged)
+		return 0;
+
 	bank = malloc(sizeof(*bank));
 	if (!bank)
 		return -ENOMEM;
@@ -199,18 +236,21 @@ int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size,
 	bank->size			= size;
 	bank->type			= type;
 
-	mem = (struct kvm_userspace_memory_region) {
-		.slot			= kvm->mem_slots++,
-		.guest_phys_addr	= guest_phys,
-		.memory_size		= size,
-		.userspace_addr		= (unsigned long)userspace_addr,
-	};
+	if (type != KVM_MEM_TYPE_RESERVED) {
+		mem = (struct kvm_userspace_memory_region) {
+			.slot			= kvm->mem_slots++,
+			.guest_phys_addr	= guest_phys,
+			.memory_size		= size,
+			.userspace_addr		= (unsigned long)userspace_addr,
+		};
 
-	ret = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &mem);
-	if (ret < 0)
-		return -errno;
+		ret = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &mem);
+		if (ret < 0)
+			return -errno;
+	}
 
 	list_add(&bank->list, &kvm->mem_banks);
+
 	return 0;
 }
 
-- 
2.13.1
