All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>,
	Janosch Frank <frankja@linux.vnet.ibm.com>
Cc: KVM <kvm@vger.kernel.org>, Cornelia Huck <cohuck@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Thomas Huth <thuth@redhat.com>,
	Ulrich Weigand <Ulrich.Weigand@de.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Michael Mueller <mimu@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: [PATCH v4 02/36] KVM: s390/interrupt: do not pin adapter interrupt pages
Date: Mon, 24 Feb 2020 06:40:33 -0500	[thread overview]
Message-ID: <20200224114107.4646-3-borntraeger@de.ibm.com> (raw)
In-Reply-To: <20200224114107.4646-1-borntraeger@de.ibm.com>

From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>

The adapter interrupt page containing the indicator bits is currently
pinned. That means that a guest with many devices can pin a lot of
memory pages in the host. This also complicates the reference tracking
which is needed for memory management handling of protected virtual
machines. It might also have some strange side effects for madvise
MADV_DONTNEED and other things.

We can simply try to get the userspace page set the bits and free the
page. By storing the userspace address in the irq routing entry instead
of the guest address we can actually avoid many lookups and list walks
so that this variant is very likely not slower.

If userspace messes around with the memory slots the worst thing that
can happen is that we write to some other memory within that process.
As we get the the page with FOLL_WRITE this can also not be used to
write to shared read-only pages.

Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch simplification]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/devices/s390_flic.rst |  11 +-
 arch/s390/include/asm/kvm_host.h             |   3 -
 arch/s390/kvm/interrupt.c                    | 170 ++++++-------------
 3 files changed, 51 insertions(+), 133 deletions(-)

diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst
index 954190da7d04..ea96559ba501 100644
--- a/Documentation/virt/kvm/devices/s390_flic.rst
+++ b/Documentation/virt/kvm/devices/s390_flic.rst
@@ -108,16 +108,9 @@ Groups:
       mask or unmask the adapter, as specified in mask
 
     KVM_S390_IO_ADAPTER_MAP
-      perform a gmap translation for the guest address provided in addr,
-      pin a userspace page for the translated address and add it to the
-      list of mappings
-
-      .. note:: A new mapping will be created unconditionally; therefore,
-	        the calling code should avoid making duplicate mappings.
-
+      This is now a no-op. The mapping is purely done by the irq route.
     KVM_S390_IO_ADAPTER_UNMAP
-      release a userspace page for the translated address specified in addr
-      from the list of mappings
+      This is now a no-op. The mapping is purely done by the irq route.
 
   KVM_DEV_FLIC_AISM
     modify the adapter-interruption-suppression mode for a given isc if the
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 1726224e7772..d058289385a5 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -701,9 +701,6 @@ struct s390_io_adapter {
 	bool masked;
 	bool swap;
 	bool suppressible;
-	struct rw_semaphore maps_lock;
-	struct list_head maps;
-	atomic_t nr_maps;
 };
 
 #define MAX_S390_IO_ADAPTERS ((MAX_ISC + 1) * 8)
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index c06c89d370a7..d29f575fb372 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2,7 +2,7 @@
 /*
  * handling kvm guest interrupts
  *
- * Copyright IBM Corp. 2008, 2015
+ * Copyright IBM Corp. 2008, 2020
  *
  *    Author(s): Carsten Otte <cotte@de.ibm.com>
  */
@@ -2327,9 +2327,6 @@ static int register_io_adapter(struct kvm_device *dev,
 	if (!adapter)
 		return -ENOMEM;
 
-	INIT_LIST_HEAD(&adapter->maps);
-	init_rwsem(&adapter->maps_lock);
-	atomic_set(&adapter->nr_maps, 0);
 	adapter->id = adapter_info.id;
 	adapter->isc = adapter_info.isc;
 	adapter->maskable = adapter_info.maskable;
@@ -2354,87 +2351,12 @@ int kvm_s390_mask_adapter(struct kvm *kvm, unsigned int id, bool masked)
 	return ret;
 }
 
-static int kvm_s390_adapter_map(struct kvm *kvm, unsigned int id, __u64 addr)
-{
-	struct s390_io_adapter *adapter = get_io_adapter(kvm, id);
-	struct s390_map_info *map;
-	int ret;
-
-	if (!adapter || !addr)
-		return -EINVAL;
-
-	map = kzalloc(sizeof(*map), GFP_KERNEL);
-	if (!map) {
-		ret = -ENOMEM;
-		goto out;
-	}
-	INIT_LIST_HEAD(&map->list);
-	map->guest_addr = addr;
-	map->addr = gmap_translate(kvm->arch.gmap, addr);
-	if (map->addr == -EFAULT) {
-		ret = -EFAULT;
-		goto out;
-	}
-	ret = get_user_pages_fast(map->addr, 1, FOLL_WRITE, &map->page);
-	if (ret < 0)
-		goto out;
-	BUG_ON(ret != 1);
-	down_write(&adapter->maps_lock);
-	if (atomic_inc_return(&adapter->nr_maps) < MAX_S390_ADAPTER_MAPS) {
-		list_add_tail(&map->list, &adapter->maps);
-		ret = 0;
-	} else {
-		put_page(map->page);
-		ret = -EINVAL;
-	}
-	up_write(&adapter->maps_lock);
-out:
-	if (ret)
-		kfree(map);
-	return ret;
-}
-
-static int kvm_s390_adapter_unmap(struct kvm *kvm, unsigned int id, __u64 addr)
-{
-	struct s390_io_adapter *adapter = get_io_adapter(kvm, id);
-	struct s390_map_info *map, *tmp;
-	int found = 0;
-
-	if (!adapter || !addr)
-		return -EINVAL;
-
-	down_write(&adapter->maps_lock);
-	list_for_each_entry_safe(map, tmp, &adapter->maps, list) {
-		if (map->guest_addr == addr) {
-			found = 1;
-			atomic_dec(&adapter->nr_maps);
-			list_del(&map->list);
-			put_page(map->page);
-			kfree(map);
-			break;
-		}
-	}
-	up_write(&adapter->maps_lock);
-
-	return found ? 0 : -EINVAL;
-}
-
 void kvm_s390_destroy_adapters(struct kvm *kvm)
 {
 	int i;
-	struct s390_map_info *map, *tmp;
 
-	for (i = 0; i < MAX_S390_IO_ADAPTERS; i++) {
-		if (!kvm->arch.adapters[i])
-			continue;
-		list_for_each_entry_safe(map, tmp,
-					 &kvm->arch.adapters[i]->maps, list) {
-			list_del(&map->list);
-			put_page(map->page);
-			kfree(map);
-		}
+	for (i = 0; i < MAX_S390_IO_ADAPTERS; i++)
 		kfree(kvm->arch.adapters[i]);
-	}
 }
 
 static int modify_io_adapter(struct kvm_device *dev,
@@ -2456,11 +2378,14 @@ static int modify_io_adapter(struct kvm_device *dev,
 		if (ret > 0)
 			ret = 0;
 		break;
+	/*
+	 * The following operations are no longer needed and therefore no-ops.
+	 * The gpa to hva translation is done when an IRQ route is set up. The
+	 * set_irq code uses get_user_pages_remote() to do the actual write.
+	 */
 	case KVM_S390_IO_ADAPTER_MAP:
-		ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr);
-		break;
 	case KVM_S390_IO_ADAPTER_UNMAP:
-		ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr);
+		ret = 0;
 		break;
 	default:
 		ret = -EINVAL;
@@ -2699,19 +2624,15 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
 	return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit;
 }
 
-static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter,
-					  u64 addr)
+static struct page *get_map_page(struct kvm *kvm, u64 uaddr)
 {
-	struct s390_map_info *map;
+	struct page *page = NULL;
 
-	if (!adapter)
-		return NULL;
-
-	list_for_each_entry(map, &adapter->maps, list) {
-		if (map->guest_addr == addr)
-			return map;
-	}
-	return NULL;
+	down_read(&kvm->mm->mmap_sem);
+	get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
+			      &page, NULL, NULL);
+	up_read(&kvm->mm->mmap_sem);
+	return page;
 }
 
 static int adapter_indicators_set(struct kvm *kvm,
@@ -2720,30 +2641,35 @@ static int adapter_indicators_set(struct kvm *kvm,
 {
 	unsigned long bit;
 	int summary_set, idx;
-	struct s390_map_info *info;
+	struct page *ind_page, *summary_page;
 	void *map;
 
-	info = get_map_info(adapter, adapter_int->ind_addr);
-	if (!info)
+	ind_page = get_map_page(kvm, adapter_int->ind_addr);
+	if (!ind_page)
 		return -1;
-	map = page_address(info->page);
-	bit = get_ind_bit(info->addr, adapter_int->ind_offset, adapter->swap);
-	set_bit(bit, map);
-	idx = srcu_read_lock(&kvm->srcu);
-	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
-	info = get_map_info(adapter, adapter_int->summary_addr);
-	if (!info) {
-		srcu_read_unlock(&kvm->srcu, idx);
+	summary_page = get_map_page(kvm, adapter_int->summary_addr);
+	if (!summary_page) {
+		put_page(ind_page);
 		return -1;
 	}
-	map = page_address(info->page);
-	bit = get_ind_bit(info->addr, adapter_int->summary_offset,
-			  adapter->swap);
+
+	idx = srcu_read_lock(&kvm->srcu);
+	map = page_address(ind_page);
+	bit = get_ind_bit(adapter_int->ind_addr,
+			  adapter_int->ind_offset, adapter->swap);
+	set_bit(bit, map);
+	mark_page_dirty(kvm, adapter_int->ind_addr >> PAGE_SHIFT);
+	set_page_dirty_lock(ind_page);
+	map = page_address(summary_page);
+	bit = get_ind_bit(adapter_int->summary_addr,
+			  adapter_int->summary_offset, adapter->swap);
 	summary_set = test_and_set_bit(bit, map);
-	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
+	mark_page_dirty(kvm, adapter_int->summary_addr >> PAGE_SHIFT);
+	set_page_dirty_lock(summary_page);
 	srcu_read_unlock(&kvm->srcu, idx);
+
+	put_page(ind_page);
+	put_page(summary_page);
 	return summary_set ? 0 : 1;
 }
 
@@ -2765,9 +2691,7 @@ static int set_adapter_int(struct kvm_kernel_irq_routing_entry *e,
 	adapter = get_io_adapter(kvm, e->adapter.adapter_id);
 	if (!adapter)
 		return -1;
-	down_read(&adapter->maps_lock);
 	ret = adapter_indicators_set(kvm, adapter, &e->adapter);
-	up_read(&adapter->maps_lock);
 	if ((ret > 0) && !adapter->masked) {
 		ret = kvm_s390_inject_airq(kvm, adapter);
 		if (ret == 0)
@@ -2818,23 +2742,27 @@ int kvm_set_routing_entry(struct kvm *kvm,
 			  struct kvm_kernel_irq_routing_entry *e,
 			  const struct kvm_irq_routing_entry *ue)
 {
-	int ret;
+	u64 uaddr;
 
 	switch (ue->type) {
+	/* we store the userspace addresses instead of the guest addresses */
 	case KVM_IRQ_ROUTING_S390_ADAPTER:
 		e->set = set_adapter_int;
-		e->adapter.summary_addr = ue->u.adapter.summary_addr;
-		e->adapter.ind_addr = ue->u.adapter.ind_addr;
+		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);
+		if (uaddr == -EFAULT)
+			return -EFAULT;
+		e->adapter.summary_addr = uaddr;
+		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);
+		if (uaddr == -EFAULT)
+			return -EFAULT;
+		e->adapter.ind_addr = uaddr;
 		e->adapter.summary_offset = ue->u.adapter.summary_offset;
 		e->adapter.ind_offset = ue->u.adapter.ind_offset;
 		e->adapter.adapter_id = ue->u.adapter.adapter_id;
-		ret = 0;
-		break;
+		return 0;
 	default:
-		ret = -EINVAL;
+		return -EINVAL;
 	}
-
-	return ret;
 }
 
 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, struct kvm *kvm,
-- 
2.25.0

  parent reply	other threads:[~2020-02-24 11:41 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-24 11:40 [PATCH v4 00/36] KVM: s390: Add support for protected VMs Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 01/36] mm/gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
2020-02-24 11:40 ` Christian Borntraeger [this message]
2020-02-25 10:18   ` [PATCH v4 02/36] KVM: s390/interrupt: do not pin adapter interrupt pages Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 03/36] s390/protvirt: introduce host side setup Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 04/36] s390/protvirt: add ultravisor initialization Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 05/36] s390/mm: provide memory management functions for protected KVM guests Christian Borntraeger
2020-02-25 10:32   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 06/36] s390/mm: add (non)secure page access exceptions handlers Christian Borntraeger
2020-02-25 10:37   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 07/36] KVM: s390: protvirt: Add UV debug trace Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 08/36] KVM: s390: add new variants of UV CALL Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 09/36] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
2020-02-25 17:46   ` David Hildenbrand
2020-02-25 21:44     ` Christian Borntraeger
2020-02-25 22:29       ` David Hildenbrand
2020-02-25 21:48     ` [PATCH v4.5 " Christian Borntraeger
2020-02-25 22:37       ` David Hildenbrand
2020-02-26  8:12         ` Christian Borntraeger
2020-02-26  8:28           ` David Hildenbrand
2020-02-26  9:12             ` Christian Borntraeger
2020-02-26  9:15               ` David Hildenbrand
2020-02-26 10:01       ` Cornelia Huck
2020-02-26 10:52         ` Christian Borntraeger
2020-02-26 10:38       ` Cornelia Huck
2020-02-26 11:03         ` Christian Borntraeger
2020-02-26 12:26       ` Cornelia Huck
2020-02-26 13:31         ` Christian Borntraeger
2020-02-26 16:54           ` Cornelia Huck
2020-02-26 17:00             ` [PATCH v4.6 " Christian Borntraeger
2020-02-26 17:08               ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 10/36] KVM: s390: protvirt: Secure memory is not mergeable Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 11/36] KVM: s390/mm: Make pages accessible before destroying the guest Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 12/36] KVM: s390: protvirt: Handle SE notification interceptions Christian Borntraeger
2020-02-25 11:11   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 13/36] KVM: s390: protvirt: Instruction emulation Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 14/36] KVM: s390: protvirt: Implement interrupt injection Christian Borntraeger
2020-02-25 12:07   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 15/36] KVM: s390: protvirt: Add SCLP interrupt handling Christian Borntraeger
2020-02-25 12:11   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 16/36] KVM: s390: protvirt: Handle spec exception loops Christian Borntraeger
2020-02-24 19:14   ` David Hildenbrand
2020-02-24 11:40 ` [PATCH v4 17/36] KVM: s390: protvirt: Add new gprs location handling Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 18/36] KVM: S390: protvirt: Introduce instruction data area bounce buffer Christian Borntraeger
2020-02-24 19:13   ` David Hildenbrand
2020-02-25  7:50     ` Christian Borntraeger
2020-02-25  8:18       ` David Hildenbrand
2020-02-25 17:21       ` Cornelia Huck
2020-02-25 18:39         ` Christian Borntraeger
2020-02-25 17:19   ` Cornelia Huck
2020-02-25 18:37     ` Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 19/36] KVM: s390: protvirt: handle secure guest prefix pages Christian Borntraeger
2020-02-25 12:15   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 20/36] KVM: s390/mm: handle guest unpin events Christian Borntraeger
2020-02-25 12:18   ` Cornelia Huck
2020-02-25 14:21     ` Christian Borntraeger
2020-02-25 14:30       ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 21/36] KVM: s390: protvirt: Write sthyi data to instruction data area Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 22/36] KVM: s390: protvirt: STSI handling Christian Borntraeger
2020-02-24 19:00   ` David Hildenbrand
2020-02-24 11:40 ` [PATCH v4 23/36] KVM: s390: protvirt: disallow one_reg Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 24/36] KVM: s390: protvirt: Do only reset registers that are accessible Christian Borntraeger
2020-02-25 12:32   ` Cornelia Huck
2020-02-25 12:51     ` Janosch Frank
2020-02-25 13:06       ` Cornelia Huck
2020-02-25 13:08         ` Christian Borntraeger
2020-02-25 13:16           ` Cornelia Huck
2020-02-25 13:07     ` Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 25/36] KVM: s390: protvirt: Only sync fmt4 registers Christian Borntraeger
2020-02-25 12:36   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 26/36] KVM: s390: protvirt: Add program exception injection Christian Borntraeger
2020-02-24 11:40 ` [PATCH v4 27/36] KVM: s390: protvirt: UV calls in support of diag308 0, 1 Christian Borntraeger
2020-02-25 12:51   ` Cornelia Huck
2020-02-24 11:40 ` [PATCH v4 28/36] KVM: s390: protvirt: Report CPU state to Ultravisor Christian Borntraeger
2020-02-24 19:05   ` David Hildenbrand
2020-02-25  8:29     ` Christian Borntraeger
2020-02-25  8:41       ` David Hildenbrand
2020-02-25 13:01       ` Cornelia Huck
2020-02-25 13:21         ` Christian Borntraeger
2020-02-25 13:44           ` Cornelia Huck
2020-02-24 11:41 ` [PATCH v4 29/36] KVM: s390: protvirt: Support cmd 5 operation state Christian Borntraeger
2020-02-24 19:08   ` David Hildenbrand
2020-02-25  7:53     ` Christian Borntraeger
2020-02-25 13:21       ` Cornelia Huck
2020-02-24 11:41 ` [PATCH v4 30/36] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112 Christian Borntraeger
2020-02-24 11:41 ` [PATCH v4 31/36] KVM: s390: protvirt: do not inject interrupts after start Christian Borntraeger
2020-02-24 11:41 ` [PATCH v4 32/36] KVM: s390: protvirt: Add UV cpu reset calls Christian Borntraeger
2020-02-24 11:41 ` [PATCH v4 33/36] DOCUMENTATION: Protected virtual machine introduction and IPL Christian Borntraeger
2020-02-25 16:22   ` Cornelia Huck
2020-02-25 16:42     ` Christian Borntraeger
2020-02-24 11:41 ` [PATCH v4 34/36] s390: protvirt: Add sysfs firmware interface for Ultravisor information Christian Borntraeger
2020-02-25 13:30   ` Cornelia Huck
2020-02-25 13:37   ` Cornelia Huck
2020-02-24 11:41 ` [PATCH v4 35/36] KVM: s390: protvirt: introduce and enable KVM_CAP_S390_PROTECTED Christian Borntraeger
2020-02-25 13:22   ` Cornelia Huck
2020-02-24 11:41 ` [PATCH v4 36/36] KVM: s390: protvirt: Add KVM api documentation Christian Borntraeger
2020-02-25 15:50   ` Cornelia Huck
2020-02-25 19:30     ` Christian Borntraeger
2020-02-27  8:47       ` [PATCH v4.1 " Christian Borntraeger
2020-02-27  9:04         ` Cornelia Huck
2020-02-26  9:35 ` [PATCH v4 00/36] KVM: s390: Add support for protected VMs Christian Borntraeger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200224114107.4646-3-borntraeger@de.ibm.com \
    --to=borntraeger@de.ibm.com \
    --cc=Ulrich.Weigand@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.vnet.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mimu@linux.ibm.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.