KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 00/42] KVM: s390: Add support for protected VMs
@ 2020-02-14 22:26 Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
                   ` (41 more replies)
  0 siblings, 42 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm, Will Deacon,
	Sean Christopherson

mm people: This series contains a "pretty small" common code memory
management change that will allow paging, guest backing with files etc
almost just like normal VMs. It should be a no-op for all architectures
not opting in. And it should be usable for others that also try to get
notified on "the pages are in the process of being used for things like
I/O". At the end of the series are two sample patches as these hooks
seem to be useful for other with error handling/call information.  I
would suggest to keep the patch as is and add the additional things when
intel/arm know exactly what they need.

mm-related patches CCed on linux-mm, the complete list can be found on
the KVM and linux-s390 list. 

Andrew, any chance to either take " mm:gup/writeback: add callbacks for
inaccessible pages" or ACK so that I can take it?

Overview
--------
Protected VMs (PVM) are KVM VMs, where KVM can't access the VM's state
like guest memory and guest registers anymore. Instead the PVMs are
mostly managed by a new entity called Ultravisor (UV), which provides
an API, so KVM and the PV can request management actions.

PVMs are encrypted at rest and protected from hypervisor access while
running. They switch from a normal operation into protected mode, so
we can still use the standard boot process to load a encrypted blob
and then move it into protected mode.

Rebooting is only possible by passing through the unprotected/normal
mode and switching to protected again.

All patches are in the protvirtv4 branch of the korg s390 kvm git
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git/log/?h=protvirtv4

Claudio presented the technology at his presentation at KVM Forum
2019.

https://static.sched.com/hosted_files/kvmforum2019/3b/ibm_protected_vms_s390x.pdf


v-> v2
- rebase on top of kvm/master
- pipe through rc and rrc. This might have created some churn here and
  there
- turn off sclp masking when rebooting into "unsecure"
- memory management simplification
- prefix page handling now via intercept 112
- io interrupt intervention request fix (do not use GISA)
- api.txt conversion to rst
- sample patches on top of mm/gup/writeback
- tons of review feedback
- kvm_uv debug feature fixes and unifications
- ultravisor information for /sys/firmware
- 

RFCv2 -> v1 (you can diff the protvirtv2 and the protvirtv3 branch)
- tons of review feedback integrated (see mail thread)
- memory management now complete and working
- Documentation patches merged
- interrupt patches merged
- CONFIG_KVM_S390_PROTECTED_VIRTUALIZATION_HOST removed
- SIDA interface integrated into memop
- for merged patches I removed reviews that were not in all patches

Christian Borntraeger (5):
  KVM: s390/mm: Make pages accessible before destroying the guest
  KVM: s390: protvirt: Add SCLP interrupt handling
  KVM: s390: protvirt: do not inject interrupts after start
  s390/uv: Fix handling of length extensions (already in s390 tree)
  KVM: s390: rstify new ioctls in api.rst

Claudio Imbrenda (6):
  mm:gup/writeback: add callbacks for inaccessible pages
  s390/mm: provide memory management functions for protected KVM guests
  KVM: s390/mm: handle guest unpin events
  example for future extension: mm:gup/writeback: add callbacks for
    inaccessible pages: error cases
  example for future extension: mm:gup/writeback: add callbacks for
    inaccessible pages: source indication
  potential fixup for "s390/mm: provide memory management functions for
    protected KVM guests"

Janosch Frank (25):
  KVM: s390: protvirt: Add UV debug trace
  KVM: s390: add new variants of UV CALL
  KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  KVM: s390: protvirt: Add KVM api documentation
  KVM: s390: protvirt: Secure memory is not mergeable
  KVM: s390: protvirt: Handle SE notification interceptions
  KVM: s390: protvirt: Instruction emulation
  KVM: s390: protvirt: Handle spec exception loops
  KVM: s390: protvirt: Add new gprs location handling
  KVM: S390: protvirt: Introduce instruction data area bounce buffer
  KVM: s390: protvirt: handle secure guest prefix pages
  KVM: s390: protvirt: Write sthyi data to instruction data area
  KVM: s390: protvirt: STSI handling
  KVM: s390: protvirt: disallow one_reg
  KVM: s390: protvirt: Do only reset registers that are accessible
  KVM: s390: protvirt: Only sync fmt4 registers
  KVM: s390: protvirt: Add program exception injection
  KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling
  KVM: s390: protvirt: UV calls in support of diag308 0, 1
  KVM: s390: protvirt: Report CPU state to Ultravisor
  KVM: s390: protvirt: Support cmd 5 operation state
  KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and
    112
  KVM: s390: protvirt: Add UV cpu reset calls
  DOCUMENTATION: Protected virtual machine introduction and IPL
  s390: protvirt: Add sysfs firmware interface for Ultravisor
    information

Michael Mueller (2):
  KVM: s390: protvirt: Add interruption injection controls
  KVM: s390: protvirt: Implement interruption injection

Ulrich Weigand (1):
  KVM: s390/interrupt: do not pin adapter interrupt pages

Vasily Gorbik (3):
  s390/protvirt: introduce host side setup
  s390/protvirt: add ultravisor initialization
  s390/mm: add (non)secure page access exceptions handlers

 .../admin-guide/kernel-parameters.txt         |   5 +
 Documentation/virt/kvm/api.rst                | 108 +++-
 Documentation/virt/kvm/devices/s390_flic.rst  |  11 +-
 Documentation/virt/kvm/index.rst              |   2 +
 Documentation/virt/kvm/s390-pv-boot.rst       |  83 +++
 Documentation/virt/kvm/s390-pv.rst            | 116 +++++
 MAINTAINERS                                   |   1 +
 arch/s390/boot/Makefile                       |   2 +-
 arch/s390/boot/uv.c                           |  23 +-
 arch/s390/include/asm/gmap.h                  |   6 +
 arch/s390/include/asm/kvm_host.h              | 113 ++++-
 arch/s390/include/asm/mmu.h                   |   2 +
 arch/s390/include/asm/mmu_context.h           |   1 +
 arch/s390/include/asm/page.h                  |   5 +
 arch/s390/include/asm/pgtable.h               |  35 +-
 arch/s390/include/asm/uv.h                    | 251 ++++++++-
 arch/s390/kernel/Makefile                     |   1 +
 arch/s390/kernel/pgm_check.S                  |   4 +-
 arch/s390/kernel/setup.c                      |   9 +-
 arch/s390/kernel/uv.c                         | 412 +++++++++++++++
 arch/s390/kvm/Makefile                        |   2 +-
 arch/s390/kvm/intercept.c                     | 111 +++-
 arch/s390/kvm/interrupt.c                     | 391 ++++++++------
 arch/s390/kvm/kvm-s390.c                      | 479 ++++++++++++++++--
 arch/s390/kvm/kvm-s390.h                      |  40 ++
 arch/s390/kvm/priv.c                          |  11 +-
 arch/s390/kvm/pv.c                            | 295 +++++++++++
 arch/s390/mm/fault.c                          |  78 +++
 arch/s390/mm/gmap.c                           |  65 ++-
 include/linux/gfp.h                           |  12 +
 include/uapi/linux/kvm.h                      |  44 +-
 mm/gup.c                                      |  15 +-
 mm/page-writeback.c                           |   5 +
 33 files changed, 2438 insertions(+), 300 deletions(-)
 create mode 100644 Documentation/virt/kvm/s390-pv-boot.rst
 create mode 100644 Documentation/virt/kvm/s390-pv.rst
 create mode 100644 arch/s390/kernel/uv.c
 create mode 100644 arch/s390/kvm/pv.c

-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17  9:14   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages Christian Borntraeger
                   ` (40 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm, Will Deacon,
	Sean Christopherson

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

With the introduction of protected KVM guests on s390 there is now a
concept of inaccessible pages. These pages need to be made accessible
before the host can access them.

While cpu accesses will trigger a fault that can be resolved, I/O
accesses will just fail.  We need to add a callback into architecture
code for places that will do I/O, namely when writeback is started or
when a page reference is taken.

This is not only to enable paging, file backing etc, it is also
necessary to protect the host against a malicious user space. For
example a bad QEMU could simply start direct I/O on such protected
memory.  We do not want userspace to be able to trigger I/O errors and
thus we the logic is "whenever somebody accesses that page (gup) or
doing I/O, make sure that this page can be accessed. When the guest
tries to access that page we will wait in the page fault handler for
writeback to have finished and for the page_ref to be the expected
value.

If wanted by others, the callbacks can be extended with error handlin
and a parameter from where this is called.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 include/linux/gfp.h | 6 ++++++
 mm/gup.c            | 2 ++
 mm/page-writeback.c | 1 +
 3 files changed, 9 insertions(+)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index e5b817cb86e7..be2754841369 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *page, int order) { }
 #ifndef HAVE_ARCH_ALLOC_PAGE
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
+#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
+static inline int arch_make_page_accessible(struct page *page)
+{
+	return 0;
+}
+#endif
 
 struct page *
 __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
diff --git a/mm/gup.c b/mm/gup.c
index 1b521e0ac1de..a1c15d029f7c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -276,6 +276,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			page = ERR_PTR(-ENOMEM);
 			goto out;
 		}
+		arch_make_page_accessible(page);
 	}
 	if (flags & FOLL_TOUCH) {
 		if ((flags & FOLL_WRITE) &&
@@ -1919,6 +1920,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
+		arch_make_page_accessible(page);
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2caf780a42e7..4c020e4ae71c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2807,6 +2807,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
 	}
 	unlock_page_memcg(page);
+	arch_make_page_accessible(page);
 	return ret;
 
 }
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17  9:43   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 03/42] s390/protvirt: introduce host side setup Christian Borntraeger
                   ` (39 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>

The adapter interrupt page containing the indicator bits is currently
pinned. That means that a guest with many devices can pin a lot of
memory pages in the host. This also complicates the reference tracking
which is needed for memory management handling of protected virtual
machines. It might also have some strange side effects for madvise
MADV_DONTNEED and other things.

We can simply try to get the userspace page set the bits and free the
page. By storing the userspace address in the irq routing entry instead
of the guest address we can actually avoid many lookups and list walks
so that this variant is very likely not slower.

If userspace messes around with the memory slots the worst thing that
can happen is that we write to some other memory within that process.
As we get the the page with FOLL_WRITE this can also not be used to
write to shared read-only pages.

Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
[borntraeger@de.ibm.com: patch simplification]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/devices/s390_flic.rst |  11 +-
 arch/s390/include/asm/kvm_host.h             |   3 -
 arch/s390/kvm/interrupt.c                    | 170 ++++++-------------
 3 files changed, 53 insertions(+), 131 deletions(-)

diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst
index 954190da7d04..ea96559ba501 100644
--- a/Documentation/virt/kvm/devices/s390_flic.rst
+++ b/Documentation/virt/kvm/devices/s390_flic.rst
@@ -108,16 +108,9 @@ Groups:
       mask or unmask the adapter, as specified in mask
 
     KVM_S390_IO_ADAPTER_MAP
-      perform a gmap translation for the guest address provided in addr,
-      pin a userspace page for the translated address and add it to the
-      list of mappings
-
-      .. note:: A new mapping will be created unconditionally; therefore,
-	        the calling code should avoid making duplicate mappings.
-
+      This is now a no-op. The mapping is purely done by the irq route.
     KVM_S390_IO_ADAPTER_UNMAP
-      release a userspace page for the translated address specified in addr
-      from the list of mappings
+      This is now a no-op. The mapping is purely done by the irq route.
 
   KVM_DEV_FLIC_AISM
     modify the adapter-interruption-suppression mode for a given isc if the
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 1726224e7772..d058289385a5 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -701,9 +701,6 @@ struct s390_io_adapter {
 	bool masked;
 	bool swap;
 	bool suppressible;
-	struct rw_semaphore maps_lock;
-	struct list_head maps;
-	atomic_t nr_maps;
 };
 
 #define MAX_S390_IO_ADAPTERS ((MAX_ISC + 1) * 8)
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index c06c89d370a7..0cebebf56515 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2327,9 +2327,6 @@ static int register_io_adapter(struct kvm_device *dev,
 	if (!adapter)
 		return -ENOMEM;
 
-	INIT_LIST_HEAD(&adapter->maps);
-	init_rwsem(&adapter->maps_lock);
-	atomic_set(&adapter->nr_maps, 0);
 	adapter->id = adapter_info.id;
 	adapter->isc = adapter_info.isc;
 	adapter->maskable = adapter_info.maskable;
@@ -2354,87 +2351,12 @@ int kvm_s390_mask_adapter(struct kvm *kvm, unsigned int id, bool masked)
 	return ret;
 }
 
-static int kvm_s390_adapter_map(struct kvm *kvm, unsigned int id, __u64 addr)
-{
-	struct s390_io_adapter *adapter = get_io_adapter(kvm, id);
-	struct s390_map_info *map;
-	int ret;
-
-	if (!adapter || !addr)
-		return -EINVAL;
-
-	map = kzalloc(sizeof(*map), GFP_KERNEL);
-	if (!map) {
-		ret = -ENOMEM;
-		goto out;
-	}
-	INIT_LIST_HEAD(&map->list);
-	map->guest_addr = addr;
-	map->addr = gmap_translate(kvm->arch.gmap, addr);
-	if (map->addr == -EFAULT) {
-		ret = -EFAULT;
-		goto out;
-	}
-	ret = get_user_pages_fast(map->addr, 1, FOLL_WRITE, &map->page);
-	if (ret < 0)
-		goto out;
-	BUG_ON(ret != 1);
-	down_write(&adapter->maps_lock);
-	if (atomic_inc_return(&adapter->nr_maps) < MAX_S390_ADAPTER_MAPS) {
-		list_add_tail(&map->list, &adapter->maps);
-		ret = 0;
-	} else {
-		put_page(map->page);
-		ret = -EINVAL;
-	}
-	up_write(&adapter->maps_lock);
-out:
-	if (ret)
-		kfree(map);
-	return ret;
-}
-
-static int kvm_s390_adapter_unmap(struct kvm *kvm, unsigned int id, __u64 addr)
-{
-	struct s390_io_adapter *adapter = get_io_adapter(kvm, id);
-	struct s390_map_info *map, *tmp;
-	int found = 0;
-
-	if (!adapter || !addr)
-		return -EINVAL;
-
-	down_write(&adapter->maps_lock);
-	list_for_each_entry_safe(map, tmp, &adapter->maps, list) {
-		if (map->guest_addr == addr) {
-			found = 1;
-			atomic_dec(&adapter->nr_maps);
-			list_del(&map->list);
-			put_page(map->page);
-			kfree(map);
-			break;
-		}
-	}
-	up_write(&adapter->maps_lock);
-
-	return found ? 0 : -EINVAL;
-}
-
 void kvm_s390_destroy_adapters(struct kvm *kvm)
 {
 	int i;
-	struct s390_map_info *map, *tmp;
 
-	for (i = 0; i < MAX_S390_IO_ADAPTERS; i++) {
-		if (!kvm->arch.adapters[i])
-			continue;
-		list_for_each_entry_safe(map, tmp,
-					 &kvm->arch.adapters[i]->maps, list) {
-			list_del(&map->list);
-			put_page(map->page);
-			kfree(map);
-		}
+	for (i = 0; i < MAX_S390_IO_ADAPTERS; i++)
 		kfree(kvm->arch.adapters[i]);
-	}
 }
 
 static int modify_io_adapter(struct kvm_device *dev,
@@ -2456,12 +2378,13 @@ static int modify_io_adapter(struct kvm_device *dev,
 		if (ret > 0)
 			ret = 0;
 		break;
+	/*
+	 * We resolve the gpa to hva when setting the IRQ routing. the set_irq
+	 * code uses get_user_pages_remote to do the actual write.
+	 */
 	case KVM_S390_IO_ADAPTER_MAP:
-		ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr);
-		break;
 	case KVM_S390_IO_ADAPTER_UNMAP:
-		ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr);
-		break;
+		return 0;
 	default:
 		ret = -EINVAL;
 	}
@@ -2699,19 +2622,21 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
 	return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit;
 }
 
-static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter,
-					  u64 addr)
+static struct page *get_map_page(struct kvm *kvm,
+				 struct s390_io_adapter *adapter,
+				 u64 uaddr)
 {
-	struct s390_map_info *map;
+	struct page *page = NULL;
 
 	if (!adapter)
 		return NULL;
-
-	list_for_each_entry(map, &adapter->maps, list) {
-		if (map->guest_addr == addr)
-			return map;
-	}
-	return NULL;
+	if (!uaddr)
+		return NULL;
+	down_read(&kvm->mm->mmap_sem);
+	get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
+			      &page, NULL, NULL);
+	up_read(&kvm->mm->mmap_sem);
+	return page;
 }
 
 static int adapter_indicators_set(struct kvm *kvm,
@@ -2720,30 +2645,35 @@ static int adapter_indicators_set(struct kvm *kvm,
 {
 	unsigned long bit;
 	int summary_set, idx;
-	struct s390_map_info *info;
+	struct page *ind_page, *summary_page;
 	void *map;
 
-	info = get_map_info(adapter, adapter_int->ind_addr);
-	if (!info)
+	ind_page = get_map_page(kvm, adapter, adapter_int->ind_addr);
+	if (!ind_page)
 		return -1;
-	map = page_address(info->page);
-	bit = get_ind_bit(info->addr, adapter_int->ind_offset, adapter->swap);
-	set_bit(bit, map);
-	idx = srcu_read_lock(&kvm->srcu);
-	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
-	info = get_map_info(adapter, adapter_int->summary_addr);
-	if (!info) {
-		srcu_read_unlock(&kvm->srcu, idx);
+	summary_page = get_map_page(kvm, adapter, adapter_int->summary_addr);
+	if (!summary_page) {
+		put_page(ind_page);
 		return -1;
 	}
-	map = page_address(info->page);
-	bit = get_ind_bit(info->addr, adapter_int->summary_offset,
-			  adapter->swap);
+
+	idx = srcu_read_lock(&kvm->srcu);
+	map = page_address(ind_page);
+	bit = get_ind_bit(adapter_int->ind_addr,
+			  adapter_int->ind_offset, adapter->swap);
+	set_bit(bit, map);
+	mark_page_dirty(kvm, adapter_int->ind_addr >> PAGE_SHIFT);
+	set_page_dirty_lock(ind_page);
+	map = page_address(summary_page);
+	bit = get_ind_bit(adapter_int->summary_addr,
+			  adapter_int->summary_offset, adapter->swap);
 	summary_set = test_and_set_bit(bit, map);
-	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
+	mark_page_dirty(kvm, adapter_int->summary_addr >> PAGE_SHIFT);
+	set_page_dirty_lock(summary_page);
 	srcu_read_unlock(&kvm->srcu, idx);
+
+	put_page(ind_page);
+	put_page(summary_page);
 	return summary_set ? 0 : 1;
 }
 
@@ -2765,9 +2695,7 @@ static int set_adapter_int(struct kvm_kernel_irq_routing_entry *e,
 	adapter = get_io_adapter(kvm, e->adapter.adapter_id);
 	if (!adapter)
 		return -1;
-	down_read(&adapter->maps_lock);
 	ret = adapter_indicators_set(kvm, adapter, &e->adapter);
-	up_read(&adapter->maps_lock);
 	if ((ret > 0) && !adapter->masked) {
 		ret = kvm_s390_inject_airq(kvm, adapter);
 		if (ret == 0)
@@ -2818,23 +2746,27 @@ int kvm_set_routing_entry(struct kvm *kvm,
 			  struct kvm_kernel_irq_routing_entry *e,
 			  const struct kvm_irq_routing_entry *ue)
 {
-	int ret;
+	u64 uaddr;
 
 	switch (ue->type) {
+	/* we store the userspace addresses instead of the guest addresses */
 	case KVM_IRQ_ROUTING_S390_ADAPTER:
 		e->set = set_adapter_int;
-		e->adapter.summary_addr = ue->u.adapter.summary_addr;
-		e->adapter.ind_addr = ue->u.adapter.ind_addr;
+		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);
+		if (uaddr == -EFAULT)
+			return -EFAULT;
+		e->adapter.summary_addr = uaddr;
+		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);
+		if (uaddr == -EFAULT)
+			return -EFAULT;
+		e->adapter.ind_addr = uaddr;
 		e->adapter.summary_offset = ue->u.adapter.summary_offset;
 		e->adapter.ind_offset = ue->u.adapter.ind_offset;
 		e->adapter.adapter_id = ue->u.adapter.adapter_id;
-		ret = 0;
-		break;
+		return 0;
 	default:
-		ret = -EINVAL;
+		return -EINVAL;
 	}
-
-	return ret;
 }
 
 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, struct kvm *kvm,
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 03/42] s390/protvirt: introduce host side setup
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17  9:53   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 04/42] s390/protvirt: add ultravisor initialization Christian Borntraeger
                   ` (38 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Vasily Gorbik <gor@linux.ibm.com>

Add "prot_virt" command line option which controls if the kernel
protected VMs support is enabled at early boot time. This has to be
done early, because it needs large amounts of memory and will disable
some features like STP time sync for the lpar.

Extend ultravisor info definitions and expose it via uv_info struct
filled in during startup.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 ++
 arch/s390/boot/Makefile                       |  2 +-
 arch/s390/boot/uv.c                           | 21 +++++++-
 arch/s390/include/asm/uv.h                    | 45 +++++++++++++++-
 arch/s390/kernel/Makefile                     |  1 +
 arch/s390/kernel/setup.c                      |  4 --
 arch/s390/kernel/uv.c                         | 52 +++++++++++++++++++
 7 files changed, 122 insertions(+), 8 deletions(-)
 create mode 100644 arch/s390/kernel/uv.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index dbc22d684627..b0beae9b9e36 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3795,6 +3795,11 @@
 			before loading.
 			See Documentation/admin-guide/blockdev/ramdisk.rst.
 
+	prot_virt=	[S390] enable hosting protected virtual machines
+			isolated from the hypervisor (if hardware supports
+			that).
+			Format: <bool>
+
 	psi=		[KNL] Enable or disable pressure stall information
 			tracking.
 			Format: <bool>
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index e2c47d3a1c89..30f1811540c5 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -37,7 +37,7 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
 obj-y	:= head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o
 obj-y	+= string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
 obj-y	+= version.o pgm_check_info.o ctype.o text_dma.o
-obj-$(CONFIG_PROTECTED_VIRTUALIZATION_GUEST)	+= uv.o
+obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
 obj-$(CONFIG_RELOCATABLE)	+= machine_kexec_reloc.o
 obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 targets	:= bzImage startup.a section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
index ed007f4a6444..af9e1cc93c68 100644
--- a/arch/s390/boot/uv.c
+++ b/arch/s390/boot/uv.c
@@ -3,7 +3,13 @@
 #include <asm/facility.h>
 #include <asm/sections.h>
 
+/* will be used in arch/s390/kernel/uv.c */
+#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
 int __bootdata_preserved(prot_virt_guest);
+#endif
+#if IS_ENABLED(CONFIG_KVM)
+struct uv_info __bootdata_preserved(uv_info);
+#endif
 
 void uv_query_info(void)
 {
@@ -18,7 +24,20 @@ void uv_query_info(void)
 	if (uv_call(0, (uint64_t)&uvcb))
 		return;
 
-	if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
+	if (IS_ENABLED(CONFIG_KVM)) {
+		memcpy(uv_info.inst_calls_list, uvcb.inst_calls_list, sizeof(uv_info.inst_calls_list));
+		uv_info.uv_base_stor_len = uvcb.uv_base_stor_len;
+		uv_info.guest_base_stor_len = uvcb.conf_base_phys_stor_len;
+		uv_info.guest_virt_base_stor_len = uvcb.conf_base_virt_stor_len;
+		uv_info.guest_virt_var_stor_len = uvcb.conf_virt_var_stor_len;
+		uv_info.guest_cpu_stor_len = uvcb.cpu_stor_len;
+		uv_info.max_sec_stor_addr = ALIGN(uvcb.max_guest_stor_addr, PAGE_SIZE);
+		uv_info.max_num_sec_conf = uvcb.max_num_sec_conf;
+		uv_info.max_guest_cpus = uvcb.max_guest_cpus;
+	}
+
+	if (IS_ENABLED(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) &&
+	    test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
 	    test_bit_inv(BIT_UVC_CMD_REMOVE_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list))
 		prot_virt_guest = 1;
 }
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 4093a2856929..34b1114dcc38 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -44,7 +44,19 @@ struct uv_cb_qui {
 	struct uv_cb_header header;
 	u64 reserved08;
 	u64 inst_calls_list[4];
-	u64 reserved30[15];
+	u64 reserved30[2];
+	u64 uv_base_stor_len;
+	u64 reserved48;
+	u64 conf_base_phys_stor_len;
+	u64 conf_base_virt_stor_len;
+	u64 conf_virt_var_stor_len;
+	u64 cpu_stor_len;
+	u32 reserved70[3];
+	u32 max_num_sec_conf;
+	u64 max_guest_stor_addr;
+	u8  reserved88[158-136];
+	u16 max_guest_cpus;
+	u64 reserveda0;
 } __packed __aligned(8);
 
 struct uv_cb_share {
@@ -69,6 +81,19 @@ static inline int uv_call(unsigned long r1, unsigned long r2)
 	return cc;
 }
 
+struct uv_info {
+	unsigned long inst_calls_list[4];
+	unsigned long uv_base_stor_len;
+	unsigned long guest_base_stor_len;
+	unsigned long guest_virt_base_stor_len;
+	unsigned long guest_virt_var_stor_len;
+	unsigned long guest_cpu_stor_len;
+	unsigned long max_sec_stor_addr;
+	unsigned int max_num_sec_conf;
+	unsigned short max_guest_cpus;
+};
+extern struct uv_info uv_info;
+
 #ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
 extern int prot_virt_guest;
 
@@ -121,11 +146,27 @@ static inline int uv_remove_shared(unsigned long addr)
 	return share(addr, UVC_CMD_REMOVE_SHARED_ACCESS);
 }
 
-void uv_query_info(void);
 #else
 #define is_prot_virt_guest() 0
 static inline int uv_set_shared(unsigned long addr) { return 0; }
 static inline int uv_remove_shared(unsigned long addr) { return 0; }
+#endif
+
+#if IS_ENABLED(CONFIG_KVM)
+extern int prot_virt_host;
+
+static inline int is_prot_virt_host(void)
+{
+	return prot_virt_host;
+}
+#else
+#define is_prot_virt_host() 0
+#endif
+
+#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
+	IS_ENABLED(CONFIG_KVM)
+void uv_query_info(void);
+#else
 static inline void uv_query_info(void) {}
 #endif
 
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
index 2b1203cf7be6..22bfb8d5084e 100644
--- a/arch/s390/kernel/Makefile
+++ b/arch/s390/kernel/Makefile
@@ -78,6 +78,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_events.o perf_regs.o
 obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_diag.o
 
 obj-$(CONFIG_TRACEPOINTS)	+= trace.o
+obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
 
 # vdso
 obj-y				+= vdso64/
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index b2c2f75860e8..a2496382175e 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -92,10 +92,6 @@ char elf_platform[ELF_PLATFORM_SIZE];
 
 unsigned long int_hwcap = 0;
 
-#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
-int __bootdata_preserved(prot_virt_guest);
-#endif
-
 int __bootdata(noexec_disabled);
 int __bootdata(memory_end_set);
 unsigned long __bootdata(memory_end);
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
new file mode 100644
index 000000000000..b1f936710360
--- /dev/null
+++ b/arch/s390/kernel/uv.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Common Ultravisor functions and initialization
+ *
+ * Copyright IBM Corp. 2019, 2020
+ */
+#define KMSG_COMPONENT "prot_virt"
+#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/sizes.h>
+#include <linux/bitmap.h>
+#include <linux/memblock.h>
+#include <asm/facility.h>
+#include <asm/sections.h>
+#include <asm/uv.h>
+
+/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */
+#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
+int __bootdata_preserved(prot_virt_guest);
+#endif
+
+#if IS_ENABLED(CONFIG_KVM)
+int prot_virt_host;
+EXPORT_SYMBOL(prot_virt_host);
+struct uv_info __bootdata_preserved(uv_info);
+EXPORT_SYMBOL(uv_info);
+
+static int __init prot_virt_setup(char *val)
+{
+	bool enabled;
+	int rc;
+
+	rc = kstrtobool(val, &enabled);
+	if (!rc && enabled)
+		prot_virt_host = 1;
+
+	if (is_prot_virt_guest() && prot_virt_host) {
+		prot_virt_host = 0;
+		pr_warn("Protected virtualization not available in protected guests.");
+	}
+
+	if (prot_virt_host && !test_facility(158)) {
+		prot_virt_host = 0;
+		pr_warn("Protected virtualization not supported by the hardware.");
+	}
+
+	return rc;
+}
+early_param("prot_virt", prot_virt_setup);
+#endif
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 04/42] s390/protvirt: add ultravisor initialization
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (2 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 03/42] s390/protvirt: introduce host side setup Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17  9:57   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests Christian Borntraeger
                   ` (37 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Vasily Gorbik <gor@linux.ibm.com>

Before being able to host protected virtual machines, donate some of
the memory to the ultravisor. Besides that the ultravisor might impose
addressing limitations for memory used to back protected VM storage. Treat
that limit as protected virtualization host's virtual memory limit.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h | 15 +++++++++++
 arch/s390/kernel/setup.c   |  5 ++++
 arch/s390/kernel/uv.c      | 51 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 34b1114dcc38..f5b55e3972b3 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -23,12 +23,14 @@
 #define UVC_RC_NO_RESUME	0x0007
 
 #define UVC_CMD_QUI			0x0001
+#define UVC_CMD_INIT_UV			0x000f
 #define UVC_CMD_SET_SHARED_ACCESS	0x1000
 #define UVC_CMD_REMOVE_SHARED_ACCESS	0x1001
 
 /* Bits in installed uv calls */
 enum uv_cmds_inst {
 	BIT_UVC_CMD_QUI = 0,
+	BIT_UVC_CMD_INIT_UV = 1,
 	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
 	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
 };
@@ -59,6 +61,14 @@ struct uv_cb_qui {
 	u64 reserveda0;
 } __packed __aligned(8);
 
+struct uv_cb_init {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 stor_origin;
+	u64 stor_len;
+	u64 reserved28[4];
+} __packed __aligned(8);
+
 struct uv_cb_share {
 	struct uv_cb_header header;
 	u64 reserved08[3];
@@ -159,8 +169,13 @@ static inline int is_prot_virt_host(void)
 {
 	return prot_virt_host;
 }
+
+void setup_uv(void);
+void adjust_to_uv_max(unsigned long *vmax);
 #else
 #define is_prot_virt_host() 0
+static inline void setup_uv(void) {}
+static inline void adjust_to_uv_max(unsigned long *vmax) {}
 #endif
 
 #if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index a2496382175e..e02727812e67 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -560,6 +560,9 @@ static void __init setup_memory_end(void)
 			vmax = _REGION1_SIZE; /* 4-level kernel page table */
 	}
 
+	if (prot_virt_host)
+		adjust_to_uv_max(&vmax);
+
 	/* module area is at the end of the kernel address space. */
 	MODULES_END = vmax;
 	MODULES_VADDR = MODULES_END - MODULES_LEN;
@@ -1134,6 +1137,8 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	memblock_trim_memory(1UL << (MAX_ORDER - 1 + PAGE_SHIFT));
 
+	if (prot_virt_host)
+		setup_uv();
 	setup_memory_end();
 	setup_memory();
 	dma_contiguous_reserve(memory_end);
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index b1f936710360..1424994f5489 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -49,4 +49,55 @@ static int __init prot_virt_setup(char *val)
 	return rc;
 }
 early_param("prot_virt", prot_virt_setup);
+
+static int __init uv_init(unsigned long stor_base, unsigned long stor_len)
+{
+	struct uv_cb_init uvcb = {
+		.header.cmd = UVC_CMD_INIT_UV,
+		.header.len = sizeof(uvcb),
+		.stor_origin = stor_base,
+		.stor_len = stor_len,
+	};
+
+	if (uv_call(0, (uint64_t)&uvcb)) {
+		pr_err("Ultravisor init failed with rc: 0x%x rrc: 0%x\n",
+		       uvcb.header.rc, uvcb.header.rrc);
+		return -1;
+	}
+	return 0;
+}
+
+void __init setup_uv(void)
+{
+	unsigned long uv_stor_base;
+
+	if (!prot_virt_host)
+		return;
+
+	uv_stor_base = (unsigned long)memblock_alloc_try_nid(
+		uv_info.uv_base_stor_len, SZ_1M, SZ_2G,
+		MEMBLOCK_ALLOC_ACCESSIBLE, NUMA_NO_NODE);
+	if (!uv_stor_base) {
+		pr_warn("Failed to reserve %lu bytes for ultravisor base storage\n",
+			uv_info.uv_base_stor_len);
+		goto fail;
+	}
+
+	if (uv_init(uv_stor_base, uv_info.uv_base_stor_len)) {
+		memblock_free(uv_stor_base, uv_info.uv_base_stor_len);
+		goto fail;
+	}
+
+	pr_info("Reserving %luMB as ultravisor base storage\n",
+		uv_info.uv_base_stor_len >> 20);
+	return;
+fail:
+	pr_info("Disabling support for protected virtualization");
+	prot_virt_host = 0;
+}
+
+void adjust_to_uv_max(unsigned long *vmax)
+{
+	*vmax = min_t(unsigned long, *vmax, uv_info.max_sec_stor_addr);
+}
 #endif
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (3 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 04/42] s390/protvirt: add ultravisor initialization Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 10:21   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 06/42] s390/mm: add (non)secure page access exceptions handlers Christian Borntraeger
                   ` (36 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

This provides the basic ultravisor calls and page table handling to cope
with secure guests:
- provide arch_make_page_accessible
- make pages accessible after unmapping of secure guests
- provide the ultravisor commands convert to/from secure
- provide the ultravisor commands pin/unpin shared
- provide callbacks to make pages secure (inacccessible)
 - we check for the expected pin count to only make pages secure if the
   host is not accessing them
 - we fence hugetlbfs for secure pages

Co-developed-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/gmap.h        |   4 +
 arch/s390/include/asm/mmu.h         |   2 +
 arch/s390/include/asm/mmu_context.h |   1 +
 arch/s390/include/asm/page.h        |   5 +
 arch/s390/include/asm/pgtable.h     |  35 ++++-
 arch/s390/include/asm/uv.h          |  31 ++++
 arch/s390/kernel/uv.c               | 223 ++++++++++++++++++++++++++++
 7 files changed, 296 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 37f96b6f0e61..3c4926aa78f4 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -9,6 +9,7 @@
 #ifndef _ASM_S390_GMAP_H
 #define _ASM_S390_GMAP_H
 
+#include <linux/radix-tree.h>
 #include <linux/refcount.h>
 
 /* Generic bits for GMAP notification on DAT table entry changes. */
@@ -31,6 +32,7 @@
  * @table: pointer to the page directory
  * @asce: address space control element for gmap page table
  * @pfault_enabled: defines if pfaults are applicable for the guest
+ * @guest_handle: protected virtual machine handle for the ultravisor
  * @host_to_rmap: radix tree with gmap_rmap lists
  * @children: list of shadow gmap structures
  * @pt_list: list of all page tables used in the shadow guest address space
@@ -54,6 +56,8 @@ struct gmap {
 	unsigned long asce_end;
 	void *private;
 	bool pfault_enabled;
+	/* only set for protected virtual machines */
+	unsigned long guest_handle;
 	/* Additional data for shadow guest address spaces */
 	struct radix_tree_root host_to_rmap;
 	struct list_head children;
diff --git a/arch/s390/include/asm/mmu.h b/arch/s390/include/asm/mmu.h
index bcfb6371086f..e21b618ad432 100644
--- a/arch/s390/include/asm/mmu.h
+++ b/arch/s390/include/asm/mmu.h
@@ -16,6 +16,8 @@ typedef struct {
 	unsigned long asce;
 	unsigned long asce_limit;
 	unsigned long vdso_base;
+	/* The mmu context belongs to a secure guest. */
+	atomic_t is_protected;
 	/*
 	 * The following bitfields need a down_write on the mm
 	 * semaphore when they are written to. As they are only
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 8d04e6f3f796..afa836014076 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -23,6 +23,7 @@ static inline int init_new_context(struct task_struct *tsk,
 	INIT_LIST_HEAD(&mm->context.gmap_list);
 	cpumask_clear(&mm->context.cpu_attach_mask);
 	atomic_set(&mm->context.flush_count, 0);
+	atomic_set(&mm->context.is_protected, 0);
 	mm->context.gmap_asce = 0;
 	mm->context.flush_mm = 0;
 	mm->context.compat_mm = test_thread_flag(TIF_31BIT);
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index 85e944f04c70..4ebcf891ff3c 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -153,6 +153,11 @@ static inline int devmem_is_allowed(unsigned long pfn)
 #define HAVE_ARCH_FREE_PAGE
 #define HAVE_ARCH_ALLOC_PAGE
 
+#if IS_ENABLED(CONFIG_PGSTE)
+int arch_make_page_accessible(struct page *page);
+#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 #define __PAGE_OFFSET		0x0UL
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 137a3920ca36..cc7a1adacb94 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -19,6 +19,7 @@
 #include <linux/atomic.h>
 #include <asm/bug.h>
 #include <asm/page.h>
+#include <asm/uv.h>
 
 extern pgd_t swapper_pg_dir[];
 extern void paging_init(void);
@@ -520,6 +521,15 @@ static inline int mm_has_pgste(struct mm_struct *mm)
 	return 0;
 }
 
+static inline int mm_is_protected(struct mm_struct *mm)
+{
+#ifdef CONFIG_PGSTE
+	if (unlikely(atomic_read(&mm->context.is_protected)))
+		return 1;
+#endif
+	return 0;
+}
+
 static inline int mm_alloc_pgste(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
@@ -1061,7 +1071,12 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long addr, pte_t *ptep)
 {
-	return ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
+	pte_t res;
+
+	res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
+	if (mm_is_protected(mm) && pte_present(res))
+		uv_convert_from_secure(pte_val(res) & PAGE_MASK);
+	return res;
 }
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
@@ -1073,7 +1088,12 @@ void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
 				     unsigned long addr, pte_t *ptep)
 {
-	return ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID));
+	pte_t res;
+
+	res = ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID));
+	if (mm_is_protected(vma->vm_mm) && pte_present(res))
+		uv_convert_from_secure(pte_val(res) & PAGE_MASK);
+	return res;
 }
 
 /*
@@ -1088,12 +1108,17 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 					    unsigned long addr,
 					    pte_t *ptep, int full)
 {
+	pte_t res;
+
 	if (full) {
-		pte_t pte = *ptep;
+		res = *ptep;
 		*ptep = __pte(_PAGE_INVALID);
-		return pte;
+	} else {
+		res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
 	}
-	return ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID));
+	if (mm_is_protected(mm) && pte_present(res))
+		uv_convert_from_secure(pte_val(res) & PAGE_MASK);
+	return res;
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index f5b55e3972b3..e45963cc7f40 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -15,6 +15,7 @@
 #include <linux/errno.h>
 #include <linux/bug.h>
 #include <asm/page.h>
+#include <asm/gmap.h>
 
 #define UVC_RC_EXECUTED		0x0001
 #define UVC_RC_INV_CMD		0x0002
@@ -24,6 +25,10 @@
 
 #define UVC_CMD_QUI			0x0001
 #define UVC_CMD_INIT_UV			0x000f
+#define UVC_CMD_CONV_TO_SEC_STOR	0x0200
+#define UVC_CMD_CONV_FROM_SEC_STOR	0x0201
+#define UVC_CMD_PIN_PAGE_SHARED		0x0341
+#define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
 #define UVC_CMD_SET_SHARED_ACCESS	0x1000
 #define UVC_CMD_REMOVE_SHARED_ACCESS	0x1001
 
@@ -31,8 +36,12 @@
 enum uv_cmds_inst {
 	BIT_UVC_CMD_QUI = 0,
 	BIT_UVC_CMD_INIT_UV = 1,
+	BIT_UVC_CMD_CONV_TO_SEC_STOR = 6,
+	BIT_UVC_CMD_CONV_FROM_SEC_STOR = 7,
 	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
 	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
+	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
+	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
 };
 
 struct uv_cb_header {
@@ -69,6 +78,19 @@ struct uv_cb_init {
 	u64 reserved28[4];
 } __packed __aligned(8);
 
+struct uv_cb_cts {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 gaddr;
+} __packed __aligned(8);
+
+struct uv_cb_cfs {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 paddr;
+} __packed __aligned(8);
+
 struct uv_cb_share {
 	struct uv_cb_header header;
 	u64 reserved08[3];
@@ -170,12 +192,21 @@ static inline int is_prot_virt_host(void)
 	return prot_virt_host;
 }
 
+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);
+int uv_convert_from_secure(unsigned long paddr);
+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
+
 void setup_uv(void);
 void adjust_to_uv_max(unsigned long *vmax);
 #else
 #define is_prot_virt_host() 0
 static inline void setup_uv(void) {}
 static inline void adjust_to_uv_max(unsigned long *vmax) {}
+
+static inline int uv_convert_from_secure(unsigned long paddr)
+{
+	return 0;
+}
 #endif
 
 #if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 1424994f5489..9a6c309864a0 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -12,6 +12,8 @@
 #include <linux/sizes.h>
 #include <linux/bitmap.h>
 #include <linux/memblock.h>
+#include <linux/pagemap.h>
+#include <linux/swap.h>
 #include <asm/facility.h>
 #include <asm/sections.h>
 #include <asm/uv.h>
@@ -100,4 +102,225 @@ void adjust_to_uv_max(unsigned long *vmax)
 {
 	*vmax = min_t(unsigned long, *vmax, uv_info.max_sec_stor_addr);
 }
+
+/*
+ * Requests the Ultravisor to pin the page in the shared state. This will
+ * cause an intercept when the guest attempts to unshare the pinned page.
+ */
+static int uv_pin_shared(unsigned long paddr)
+{
+	struct uv_cb_cfs uvcb = {
+		.header.cmd = UVC_CMD_PIN_PAGE_SHARED,
+		.header.len = sizeof(uvcb),
+		.paddr = paddr,
+	};
+
+	if (uv_call(0, (u64)&uvcb))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Requests the Ultravisor to encrypt a guest page and make it
+ * accessible to the host for paging (export).
+ *
+ * @paddr: Absolute host address of page to be exported
+ */
+int uv_convert_from_secure(unsigned long paddr)
+{
+	struct uv_cb_cfs uvcb = {
+		.header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,
+		.header.len = sizeof(uvcb),
+		.paddr = paddr
+	};
+
+	if (uv_call(0, (u64)&uvcb))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Calculate the expected ref_count for a page that would otherwise have no
+ * further pins. This was cribbed from similar functions in other places in
+ * the kernel, but with some slight modifications. We know that a secure
+ * page can not be a huge page for example.
+ */
+static int expected_page_refs(struct page *page)
+{
+	int res;
+
+	res = page_mapcount(page);
+	if (PageSwapCache(page)) {
+		res++;
+	} else if (page_mapping(page)) {
+		res++;
+		if (page_has_private(page))
+			res++;
+	}
+	return res;
+}
+
+static int make_secure_pte(pte_t *ptep, unsigned long addr,
+			   struct page *exp_page, struct uv_cb_header *uvcb)
+{
+	pte_t entry = READ_ONCE(*ptep);
+	struct page *page;
+	int expected, rc = 0;
+
+	if (!pte_present(entry))
+		return -ENXIO;
+	if (pte_val(entry) & _PAGE_INVALID)
+		return -ENXIO;
+
+	page = pte_page(entry);
+	if (page != exp_page)
+		return -ENXIO;
+	if (PageWriteback(page))
+		return -EAGAIN;
+	expected = expected_page_refs(page);
+	if (!page_ref_freeze(page, expected))
+		return -EBUSY;
+	set_bit(PG_arch_1, &page->flags);
+	rc = uv_call(0, (u64)uvcb);
+	page_ref_unfreeze(page, expected);
+	/* Return -ENXIO if the page was not mapped, -EINVAL otherwise */
+	if (rc)
+		rc = uvcb->rc == 0x10a ? -ENXIO : -EINVAL;
+	return rc;
+}
+
+/*
+ * Requests the Ultravisor to make a page accessible to a guest.
+ * If it's brought in the first time, it will be cleared. If
+ * it has been exported before, it will be decrypted and integrity
+ * checked.
+ */
+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
+{
+	struct vm_area_struct *vma;
+	unsigned long uaddr;
+	struct page *page;
+	int rc, local_drain = 0;
+	spinlock_t *ptelock;
+	pte_t *ptep;
+
+again:
+	rc = -EFAULT;
+	down_read(&gmap->mm->mmap_sem);
+
+	uaddr = __gmap_translate(gmap, gaddr);
+	if (IS_ERR_VALUE(uaddr))
+		goto out;
+	vma = find_vma(gmap->mm, uaddr);
+	if (!vma)
+		goto out;
+	/*
+	 * Secure pages cannot be huge and userspace should not combine both.
+	 * In case userspace does it anyway this will result in an -EFAULT for
+	 * the unpack. The guest is thus never reaching secure mode. If
+	 * userspace is playing dirty tricky with mapping huge pages later
+	 * on this will result in a segmenation fault.
+	 */
+	if (is_vm_hugetlb_page(vma))
+		goto out;
+
+	rc = -ENXIO;
+	page = follow_page(vma, uaddr, FOLL_WRITE);
+	if (IS_ERR_OR_NULL(page))
+		goto out;
+
+	lock_page(page);
+	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
+	rc = make_secure_pte(ptep, uaddr, page, uvcb);
+	pte_unmap_unlock(ptep, ptelock);
+	unlock_page(page);
+out:
+	up_read(&gmap->mm->mmap_sem);
+
+	if (rc == -EAGAIN) {
+		wait_on_page_writeback(page);
+	} else if (rc == -EBUSY) {
+		/*
+		 * If we have tried a local drain and the page refcount
+		 * still does not match our expected safe value, try with a
+		 * system wide drain. This is needed if the pagevecs holding
+		 * the page are on a different CPU.
+		 */
+		if (local_drain) {
+			lru_add_drain_all();
+			/* We give up here, and let the caller try again */
+			return -EAGAIN;
+		}
+		/*
+		 * We are here if the page refcount does not match the
+		 * expected safe value. The main culprits are usually
+		 * pagevecs. With lru_add_drain() we drain the pagevecs
+		 * on the local CPU so that hopefully the refcount will
+		 * reach the expected safe value.
+		 */
+		lru_add_drain();
+		local_drain = 1;
+		/* And now we try again immediately after draining */
+		goto again;
+	} else if (rc == -ENXIO) {
+		if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
+			return -EFAULT;
+		return -EAGAIN;
+	}
+	return rc;
+}
+EXPORT_SYMBOL_GPL(gmap_make_secure);
+
+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)
+{
+	struct uv_cb_cts uvcb = {
+		.header.cmd = UVC_CMD_CONV_TO_SEC_STOR,
+		.header.len = sizeof(uvcb),
+		.guest_handle = gmap->guest_handle,
+		.gaddr = gaddr,
+	};
+
+	return gmap_make_secure(gmap, gaddr, &uvcb);
+}
+EXPORT_SYMBOL_GPL(gmap_convert_to_secure);
+
+/**
+ * To be called with the page locked or with an extra reference!
+ */
+int arch_make_page_accessible(struct page *page)
+{
+	int rc = 0;
+
+	/* Hugepage cannot be protected, so nothing to do */
+	if (PageHuge(page))
+		return 0;
+
+	/*
+	 * PG_arch_1 is used in 3 places:
+	 * 1. for kernel page tables during early boot
+	 * 2. for storage keys of huge pages and KVM
+	 * 3. As an indication that this page might be secure. This can
+	 *    overindicate, e.g. we set the bit before calling
+	 *    convert_to_secure.
+	 * As secure pages are never huge, all 3 variants can co-exists.
+	 */
+	if (!test_bit(PG_arch_1, &page->flags))
+		return 0;
+
+	rc = uv_pin_shared(page_to_phys(page));
+	if (!rc) {
+		clear_bit(PG_arch_1, &page->flags);
+		return 0;
+	}
+
+	rc = uv_convert_from_secure(page_to_phys(page));
+	if (!rc) {
+		clear_bit(PG_arch_1, &page->flags);
+		return 0;
+	}
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(arch_make_page_accessible);
+
 #endif
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 06/42] s390/mm: add (non)secure page access exceptions handlers
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (4 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace Christian Borntraeger
                   ` (35 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm, Janosch Frank

From: Vasily Gorbik <gor@linux.ibm.com>

Add exceptions handlers performing transparent transition of non-secure
pages to secure (import) upon guest access and secure pages to
non-secure (export) upon hypervisor access.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
[frankja@linux.ibm.com: adding checks for failures]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[imbrenda@linux.ibm.com:  adding a check for gmap fault]
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kernel/pgm_check.S |  4 +-
 arch/s390/mm/fault.c         | 78 ++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/arch/s390/kernel/pgm_check.S b/arch/s390/kernel/pgm_check.S
index eee3a482195a..2c27907a5ffc 100644
--- a/arch/s390/kernel/pgm_check.S
+++ b/arch/s390/kernel/pgm_check.S
@@ -78,8 +78,8 @@ PGM_CHECK(do_dat_exception)		/* 39 */
 PGM_CHECK(do_dat_exception)		/* 3a */
 PGM_CHECK(do_dat_exception)		/* 3b */
 PGM_CHECK_DEFAULT			/* 3c */
-PGM_CHECK_DEFAULT			/* 3d */
-PGM_CHECK_DEFAULT			/* 3e */
+PGM_CHECK(do_secure_storage_access)	/* 3d */
+PGM_CHECK(do_non_secure_storage_access)	/* 3e */
 PGM_CHECK_DEFAULT			/* 3f */
 PGM_CHECK(monitor_event_exception)	/* 40 */
 PGM_CHECK_DEFAULT			/* 41 */
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 7b0bb475c166..7bd86ebc882f 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -38,6 +38,7 @@
 #include <asm/irq.h>
 #include <asm/mmu_context.h>
 #include <asm/facility.h>
+#include <asm/uv.h>
 #include "../kernel/entry.h"
 
 #define __FAIL_ADDR_MASK -4096L
@@ -816,3 +817,80 @@ static int __init pfault_irq_init(void)
 early_initcall(pfault_irq_init);
 
 #endif /* CONFIG_PFAULT */
+
+#if IS_ENABLED(CONFIG_PGSTE)
+void do_secure_storage_access(struct pt_regs *regs)
+{
+	unsigned long addr = regs->int_parm_long & __FAIL_ADDR_MASK;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+	struct page *page;
+	int rc;
+
+	switch (get_fault_type(regs)) {
+	case USER_FAULT:
+		mm = current->mm;
+		down_read(&mm->mmap_sem);
+		vma = find_vma(mm, addr);
+		if (!vma) {
+			up_read(&mm->mmap_sem);
+			do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP);
+			break;
+		}
+		page = follow_page(vma, addr, FOLL_WRITE | FOLL_GET);
+		if (IS_ERR_OR_NULL(page)) {
+			up_read(&mm->mmap_sem);
+			break;
+		}
+		if (arch_make_page_accessible(page))
+			send_sig(SIGSEGV, current, 0);
+		put_page(page);
+		up_read(&mm->mmap_sem);
+		break;
+	case KERNEL_FAULT:
+		page = phys_to_page(addr);
+		if (unlikely(!try_get_page(page)))
+			break;
+		rc = arch_make_page_accessible(page);
+		put_page(page);
+		if (rc)
+			BUG();
+		break;
+	case VDSO_FAULT:
+		/* fallthrough */
+	case GMAP_FAULT:
+		/* fallthrough */
+	default:
+		do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP);
+		WARN_ON_ONCE(1);
+	}
+}
+NOKPROBE_SYMBOL(do_secure_storage_access);
+
+void do_non_secure_storage_access(struct pt_regs *regs)
+{
+	unsigned long gaddr = regs->int_parm_long & __FAIL_ADDR_MASK;
+	struct gmap *gmap = (struct gmap *)S390_lowcore.gmap;
+
+	if (get_fault_type(regs) != GMAP_FAULT) {
+		do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP);
+		WARN_ON_ONCE(1);
+		return;
+	}
+
+	if (gmap_convert_to_secure(gmap, gaddr) == -EINVAL)
+		send_sig(SIGSEGV, current, 0);
+}
+NOKPROBE_SYMBOL(do_non_secure_storage_access);
+
+#else
+void do_secure_storage_access(struct pt_regs *regs)
+{
+	default_trap_handler(regs);
+}
+
+void do_non_secure_storage_access(struct pt_regs *regs)
+{
+	default_trap_handler(regs);
+}
+#endif
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (5 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 06/42] s390/mm: add (non)secure page access exceptions handlers Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 10:41   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 08/42] KVM: s390: add new variants of UV CALL Christian Borntraeger
                   ` (34 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Let's have some debug traces which stay around for longer than the
guest.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c |  9 ++++++++-
 arch/s390/kvm/kvm-s390.h | 11 +++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d7ff30e45589..cc7793525a69 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -220,6 +220,7 @@ static struct kvm_s390_vm_cpu_subfunc kvm_s390_available_subfunc;
 static struct gmap_notifier gmap_notifier;
 static struct gmap_notifier vsie_gmap_notifier;
 debug_info_t *kvm_s390_dbf;
+debug_info_t *kvm_s390_dbf_uv;
 
 /* Section: not file related */
 int kvm_arch_hardware_enable(void)
@@ -460,7 +461,12 @@ int kvm_arch_init(void *opaque)
 	if (!kvm_s390_dbf)
 		return -ENOMEM;
 
-	if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view))
+	kvm_s390_dbf_uv = debug_register("kvm-uv", 32, 1, 7 * sizeof(long));
+	if (!kvm_s390_dbf_uv)
+		goto out;
+
+	if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view) ||
+	    debug_register_view(kvm_s390_dbf_uv, &debug_sprintf_view))
 		goto out;
 
 	kvm_s390_cpu_feat_init();
@@ -487,6 +493,7 @@ void kvm_arch_exit(void)
 {
 	kvm_s390_gib_destroy();
 	debug_unregister(kvm_s390_dbf);
+	debug_unregister(kvm_s390_dbf_uv);
 }
 
 /* Section: device related */
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 6d9448dbd052..83dabb18e4d9 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -25,6 +25,17 @@
 #define IS_ITDB_VALID(vcpu)	((*(char *)vcpu->arch.sie_block->itdba == TDB_FORMAT1))
 
 extern debug_info_t *kvm_s390_dbf;
+extern debug_info_t *kvm_s390_dbf_uv;
+
+#define KVM_UV_EVENT(d_kvm, d_loglevel, d_string, d_args...)\
+do { \
+	debug_sprintf_event((d_kvm)->arch.dbf, d_loglevel, d_string "\n", \
+	  d_args); \
+	debug_sprintf_event(kvm_s390_dbf_uv, d_loglevel, \
+			    "%d: " d_string "\n", (d_kvm)->userspace_pid, \
+			    d_args); \
+} while (0)
+
 #define KVM_EVENT(d_loglevel, d_string, d_args...)\
 do { \
 	debug_sprintf_event(kvm_s390_dbf, d_loglevel, d_string "\n", \
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 08/42] KVM: s390: add new variants of UV CALL
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (6 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 10:42   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
                   ` (33 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

This adds two new helper functions for doing UV CALLs.

The first variant handles UV CALLs that might have longer busy
conditions or just need longer when doing partial completion. We should
schedule when necessary.

The second variant handles UV CALLs that only need the handle but have
no payload (e.g. destroying a VM). We can provide a simple wrapper for
those.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h | 65 +++++++++++++++++++++++++++++++++++---
 1 file changed, 60 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index e45963cc7f40..bc452a15ac3f 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 #include <linux/errno.h>
 #include <linux/bug.h>
+#include <linux/sched.h>
 #include <asm/page.h>
 #include <asm/gmap.h>
 
@@ -91,6 +92,19 @@ struct uv_cb_cfs {
 	u64 paddr;
 } __packed __aligned(8);
 
+/*
+ * A common UV call struct for calls that take no payload
+ * Examples:
+ * Destroy cpu/config
+ * Verify
+ */
+struct uv_cb_nodata {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 handle;
+	u64 reserved20[4];
+} __packed __aligned(8);
+
 struct uv_cb_share {
 	struct uv_cb_header header;
 	u64 reserved08[3];
@@ -98,21 +112,62 @@ struct uv_cb_share {
 	u64 reserved28;
 } __packed __aligned(8);
 
-static inline int uv_call(unsigned long r1, unsigned long r2)
+static inline int __uv_call(unsigned long r1, unsigned long r2)
 {
 	int cc;
 
 	asm volatile(
-		"0:	.insn rrf,0xB9A40000,%[r1],%[r2],0,0\n"
-		"		brc	3,0b\n"
-		"		ipm	%[cc]\n"
-		"		srl	%[cc],28\n"
+		"	.insn rrf,0xB9A40000,%[r1],%[r2],0,0\n"
+		"	ipm	%[cc]\n"
+		"	srl	%[cc],28\n"
 		: [cc] "=d" (cc)
 		: [r1] "a" (r1), [r2] "a" (r2)
 		: "memory", "cc");
 	return cc;
 }
 
+static inline int uv_call(unsigned long r1, unsigned long r2)
+{
+	int cc;
+
+	do {
+		cc = __uv_call(r1, r2);
+	} while (cc > 1);
+	return cc;
+}
+
+/* Low level uv_call that avoids stalls for long running busy conditions  */
+static inline int uv_call_sched(unsigned long r1, unsigned long r2)
+{
+	int cc;
+
+	do {
+		cc = __uv_call(r1, r2);
+		cond_resched();
+	} while (cc > 1);
+	return cc;
+}
+
+/*
+ * special variant of uv_call that only transports the cpu or guest
+ * handle and the command, like destroy or verify.
+ */
+static inline int uv_cmd_nodata(u64 handle, u16 cmd, u16 *rc, u16 *rrc)
+{
+	struct uv_cb_nodata uvcb = {
+		.header.cmd = cmd,
+		.header.len = sizeof(uvcb),
+		.handle = handle,
+	};
+	int cc;
+
+	WARN(!handle, "No handle provided to Ultravisor call cmd %x\n", cmd);
+	cc = uv_call_sched(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	return cc ? -EINVAL : 0;
+}
+
 struct uv_info {
 	unsigned long inst_calls_list[4];
 	unsigned long uv_base_stor_len;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (7 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 08/42] KVM: s390: add new variants of UV CALL Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 10:56   ` David Hildenbrand
  2020-02-18  8:39   ` [PATCH v2.1] " Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 10/42] KVM: s390: protvirt: Add KVM api documentation Christian Borntraeger
                   ` (32 subsequent siblings)
  41 siblings, 2 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

This contains 3 main changes:
1. changes in SIE control block handling for secure guests
2. helper functions for create/destroy/unpack secure guests
3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
machines

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  24 ++-
 arch/s390/include/asm/uv.h       |  69 ++++++++
 arch/s390/kvm/Makefile           |   2 +-
 arch/s390/kvm/kvm-s390.c         | 202 +++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         |  27 ++++
 arch/s390/kvm/pv.c               | 262 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h         |  35 +++++
 7 files changed, 617 insertions(+), 4 deletions(-)
 create mode 100644 arch/s390/kvm/pv.c

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index d058289385a5..1aa2382fe363 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -160,7 +160,13 @@ struct kvm_s390_sie_block {
 	__u8	reserved08[4];		/* 0x0008 */
 #define PROG_IN_SIE (1<<0)
 	__u32	prog0c;			/* 0x000c */
-	__u8	reserved10[16];		/* 0x0010 */
+	union {
+		__u8	reserved10[16];		/* 0x0010 */
+		struct {
+			__u64	pv_handle_cpu;
+			__u64	pv_handle_config;
+		};
+	};
 #define PROG_BLOCK_SIE	(1<<0)
 #define PROG_REQUEST	(1<<1)
 	atomic_t prog20;		/* 0x0020 */
@@ -233,7 +239,7 @@ struct kvm_s390_sie_block {
 #define ECB3_RI  0x01
 	__u8    ecb3;			/* 0x0063 */
 	__u32	scaol;			/* 0x0064 */
-	__u8	reserved68;		/* 0x0068 */
+	__u8	sdf;			/* 0x0068 */
 	__u8    epdx;			/* 0x0069 */
 	__u8    reserved6a[2];		/* 0x006a */
 	__u32	todpr;			/* 0x006c */
@@ -645,6 +651,11 @@ struct kvm_guestdbg_info_arch {
 	unsigned long last_bp;
 };
 
+struct kvm_s390_pv_vcpu {
+	u64 handle;
+	unsigned long stor_base;
+};
+
 struct kvm_vcpu_arch {
 	struct kvm_s390_sie_block *sie_block;
 	/* if vsie is active, currently executed shadow sie control block */
@@ -673,6 +684,7 @@ struct kvm_vcpu_arch {
 	__u64 cputm_start;
 	bool gs_enabled;
 	bool skey_enabled;
+	struct kvm_s390_pv_vcpu pv;
 };
 
 struct kvm_vm_stat {
@@ -843,6 +855,13 @@ struct kvm_s390_gisa_interrupt {
 	DECLARE_BITMAP(kicked_mask, KVM_MAX_VCPUS);
 };
 
+struct kvm_s390_pv {
+	u64 handle;
+	u64 guest_len;
+	unsigned long stor_base;
+	void *stor_var;
+};
+
 struct kvm_arch{
 	void *sca;
 	int use_esca;
@@ -878,6 +897,7 @@ struct kvm_arch{
 	DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
 	DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS);
 	struct kvm_s390_gisa_interrupt gisa_int;
+	struct kvm_s390_pv pv;
 };
 
 #define KVM_HVA_ERR_BAD		(-1UL)
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index bc452a15ac3f..839cb3a89986 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -23,11 +23,19 @@
 #define UVC_RC_INV_STATE	0x0003
 #define UVC_RC_INV_LEN		0x0005
 #define UVC_RC_NO_RESUME	0x0007
+#define UVC_RC_NEED_DESTROY	0x8000
 
 #define UVC_CMD_QUI			0x0001
 #define UVC_CMD_INIT_UV			0x000f
+#define UVC_CMD_CREATE_SEC_CONF		0x0100
+#define UVC_CMD_DESTROY_SEC_CONF	0x0101
+#define UVC_CMD_CREATE_SEC_CPU		0x0120
+#define UVC_CMD_DESTROY_SEC_CPU		0x0121
 #define UVC_CMD_CONV_TO_SEC_STOR	0x0200
 #define UVC_CMD_CONV_FROM_SEC_STOR	0x0201
+#define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
+#define UVC_CMD_UNPACK_IMG		0x0301
+#define UVC_CMD_VERIFY_IMG		0x0302
 #define UVC_CMD_PIN_PAGE_SHARED		0x0341
 #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
 #define UVC_CMD_SET_SHARED_ACCESS	0x1000
@@ -37,10 +45,17 @@
 enum uv_cmds_inst {
 	BIT_UVC_CMD_QUI = 0,
 	BIT_UVC_CMD_INIT_UV = 1,
+	BIT_UVC_CMD_CREATE_SEC_CONF = 2,
+	BIT_UVC_CMD_DESTROY_SEC_CONF = 3,
+	BIT_UVC_CMD_CREATE_SEC_CPU = 4,
+	BIT_UVC_CMD_DESTROY_SEC_CPU = 5,
 	BIT_UVC_CMD_CONV_TO_SEC_STOR = 6,
 	BIT_UVC_CMD_CONV_FROM_SEC_STOR = 7,
 	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
 	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
+	BIT_UVC_CMD_SET_SEC_PARMS = 11,
+	BIT_UVC_CMD_UNPACK_IMG = 13,
+	BIT_UVC_CMD_VERIFY_IMG = 14,
 	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
 	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
 };
@@ -52,6 +67,7 @@ struct uv_cb_header {
 	u16 rrc;	/* Return Reason Code */
 } __packed __aligned(8);
 
+/* Query Ultravisor Information */
 struct uv_cb_qui {
 	struct uv_cb_header header;
 	u64 reserved08;
@@ -71,6 +87,7 @@ struct uv_cb_qui {
 	u64 reserveda0;
 } __packed __aligned(8);
 
+/* Initialize Ultravisor */
 struct uv_cb_init {
 	struct uv_cb_header header;
 	u64 reserved08[2];
@@ -79,6 +96,35 @@ struct uv_cb_init {
 	u64 reserved28[4];
 } __packed __aligned(8);
 
+/* Create Guest Configuration */
+struct uv_cb_cgc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 conf_base_stor_origin;
+	u64 conf_virt_stor_origin;
+	u64 reserved30;
+	u64 guest_stor_origin;
+	u64 guest_stor_len;
+	u64 guest_sca;
+	u64 guest_asce;
+	u64 reserved58[5];
+} __packed __aligned(8);
+
+/* Create Secure CPU */
+struct uv_cb_csc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 cpu_handle;
+	u64 guest_handle;
+	u64 stor_origin;
+	u8  reserved30[6];
+	u16 num;
+	u64 state_origin;
+	u64 reserved40[4];
+} __packed __aligned(8);
+
+/* Convert to Secure */
 struct uv_cb_cts {
 	struct uv_cb_header header;
 	u64 reserved08[2];
@@ -86,12 +132,34 @@ struct uv_cb_cts {
 	u64 gaddr;
 } __packed __aligned(8);
 
+/* Convert from Secure / Pin Page Shared */
 struct uv_cb_cfs {
 	struct uv_cb_header header;
 	u64 reserved08[2];
 	u64 paddr;
 } __packed __aligned(8);
 
+/* Set Secure Config Parameter */
+struct uv_cb_ssc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 sec_header_origin;
+	u32 sec_header_len;
+	u32 reserved2c;
+	u64 reserved30[4];
+} __packed __aligned(8);
+
+/* Unpack */
+struct uv_cb_unp {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 gaddr;
+	u64 tweak[2];
+	u64 reserved38[3];
+} __packed __aligned(8);
+
 /*
  * A common UV call struct for calls that take no payload
  * Examples:
@@ -105,6 +173,7 @@ struct uv_cb_nodata {
 	u64 reserved20[4];
 } __packed __aligned(8);
 
+/* Set Shared Access */
 struct uv_cb_share {
 	struct uv_cb_header header;
 	u64 reserved08[3];
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 05ee90a5ea08..12decca22e7c 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o  $(KVM)/async_pf.o $(KVM)/irqch
 ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
 
 kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
-kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
+kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
 
 obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index cc7793525a69..f3cd469c2e7b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -44,6 +44,7 @@
 #include <asm/cpacf.h>
 #include <asm/timex.h>
 #include <asm/ap.h>
+#include <asm/uv.h>
 #include "kvm-s390.h"
 #include "gaccess.h"
 
@@ -234,8 +235,10 @@ int kvm_arch_check_processor_compat(void)
 	return 0;
 }
 
+/* forward declarations */
 static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
 			      unsigned long end);
+static int sca_switch_to_extended(struct kvm *kvm);
 
 static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
 {
@@ -571,6 +574,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_BPB:
 		r = test_facility(82);
 		break;
+	case KVM_CAP_S390_PROTECTED:
+		r = is_prot_virt_host();
+		break;
 	default:
 		r = 0;
 	}
@@ -2165,6 +2171,114 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
 	return r;
 }
 
+static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
+{
+	int r = 0;
+	void __user *argp = (void __user *)cmd->data;
+
+	switch (cmd->cmd) {
+	case KVM_PV_VM_CREATE: {
+		r = -EINVAL;
+		if (kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = kvm_s390_pv_alloc_vm(kvm);
+		if (r)
+			break;
+
+		mutex_lock(&kvm->lock);
+		kvm_s390_vcpu_block_all(kvm);
+		/* FMT 4 SIE needs esca */
+		r = sca_switch_to_extended(kvm);
+		if (r) {
+			kvm_s390_pv_dealloc_vm(kvm);
+			kvm_s390_vcpu_unblock_all(kvm);
+			mutex_unlock(&kvm->lock);
+			break;
+		}
+		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
+		kvm_s390_vcpu_unblock_all(kvm);
+		mutex_unlock(&kvm->lock);
+		break;
+	}
+	case KVM_PV_VM_DESTROY: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		/* All VCPUs have to be destroyed before this call. */
+		mutex_lock(&kvm->lock);
+		kvm_s390_vcpu_block_all(kvm);
+		r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			kvm_s390_pv_dealloc_vm(kvm);
+		kvm_s390_vcpu_unblock_all(kvm);
+		mutex_unlock(&kvm->lock);
+		break;
+	}
+	case KVM_PV_VM_SET_SEC_PARMS: {
+		struct kvm_s390_pv_sec_parm parms = {};
+		void *hdr;
+
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&parms, argp, sizeof(parms)))
+			break;
+
+		/* Currently restricted to 8KB */
+		r = -EINVAL;
+		if (parms.length > PAGE_SIZE * 2)
+			break;
+
+		r = -ENOMEM;
+		hdr = vmalloc(parms.length);
+		if (!hdr)
+			break;
+
+		r = -EFAULT;
+		if (!copy_from_user(hdr, (void __user *)parms.origin,
+				    parms.length))
+			r = kvm_s390_pv_set_sec_parms(kvm, hdr, parms.length,
+						      &cmd->rc, &cmd->rrc);
+
+		vfree(hdr);
+		break;
+	}
+	case KVM_PV_VM_UNPACK: {
+		struct kvm_s390_pv_unp unp = {};
+
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&unp, argp, sizeof(unp)))
+			break;
+
+		r = kvm_s390_pv_unpack(kvm, unp.addr, unp.size, unp.tweak,
+				       &cmd->rc, &cmd->rrc);
+		break;
+	}
+	case KVM_PV_VM_VERIFY: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+				  UVC_CMD_VERIFY_IMG, &cmd->rc, &cmd->rrc);
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT VERIFY: rc %x rrc %x", cmd->rc,
+			     cmd->rrc);
+		break;
+	}
+	default:
+		return -ENOTTY;
+	}
+	return r;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -2262,6 +2376,25 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		mutex_unlock(&kvm->slots_lock);
 		break;
 	}
+	case KVM_S390_PV_COMMAND: {
+		struct kvm_pv_cmd args;
+
+		r = 0;
+		if (!is_prot_virt_host()) {
+			r = -EINVAL;
+			break;
+		}
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		r = kvm_s390_handle_pv(kvm, &args);
+		if (copy_to_user(argp, &args, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
@@ -2525,6 +2658,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
+	u16 rc, rrc;
+
 	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
 	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
 	kvm_s390_clear_local_irqs(vcpu);
@@ -2537,6 +2672,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 	if (vcpu->kvm->arch.use_cmma)
 		kvm_s390_vcpu_unsetup_cmma(vcpu);
+	if (kvm_s390_pv_handle_cpu(vcpu))
+		kvm_s390_pv_destroy_cpu(vcpu, &rc, &rrc);
 	free_page((unsigned long)(vcpu->arch.sie_block));
 }
 
@@ -2558,10 +2695,15 @@ static void kvm_free_vcpus(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	u16 rc, rrc;
 	kvm_free_vcpus(kvm);
 	sca_dispose(kvm);
-	debug_unregister(kvm->arch.dbf);
 	kvm_s390_gisa_destroy(kvm);
+	if (kvm_s390_pv_is_protected(kvm)) {
+		kvm_s390_pv_destroy_vm(kvm, &rc, &rrc);
+		kvm_s390_pv_dealloc_vm(kvm);
+	}
+	debug_unregister(kvm->arch.dbf);
 	free_page((unsigned long)kvm->arch.sie_page2);
 	if (!kvm_is_ucontrol(kvm))
 		gmap_remove(kvm->arch.gmap);
@@ -2657,6 +2799,9 @@ static int sca_switch_to_extended(struct kvm *kvm)
 	unsigned int vcpu_idx;
 	u32 scaol, scaoh;
 
+	if (kvm->arch.use_esca)
+		return 0;
+
 	new_sca = alloc_pages_exact(sizeof(*new_sca), GFP_KERNEL|__GFP_ZERO);
 	if (!new_sca)
 		return -ENOMEM;
@@ -2908,6 +3053,7 @@ static void kvm_s390_vcpu_setup_model(struct kvm_vcpu *vcpu)
 static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	int rc = 0;
+	u16 uvrc, uvrrc;
 
 	atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH |
 						    CPUSTAT_SM |
@@ -2975,6 +3121,9 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 
 	kvm_s390_vcpu_crypto_setup(vcpu);
 
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
+
 	return rc;
 }
 
@@ -4352,6 +4501,38 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
 	return -ENOIOCTLCMD;
 }
 
+static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
+				   struct kvm_pv_cmd *cmd)
+{
+	int r = 0;
+
+	if (!kvm_s390_pv_is_protected(vcpu->kvm))
+		return -EINVAL;
+
+	if (cmd->flags)
+		return -EINVAL;
+
+	switch (cmd->cmd) {
+	case KVM_PV_VCPU_CREATE: {
+		if (kvm_s390_pv_handle_cpu(vcpu))
+			return -EINVAL;
+
+		r = kvm_s390_pv_create_cpu(vcpu, &cmd->rc, &cmd->rrc);
+		break;
+	}
+	case KVM_PV_VCPU_DESTROY: {
+		if (!kvm_s390_pv_handle_cpu(vcpu))
+			return -EINVAL;
+
+		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
+		break;
+	}
+	default:
+		r = -ENOTTY;
+	}
+	return r;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -4493,6 +4674,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 					   irq_state.len);
 		break;
 	}
+	case KVM_S390_PV_COMMAND_VCPU: {
+		struct kvm_pv_cmd args;
+
+		r = 0;
+		if (!is_prot_virt_host()) {
+			r = -EINVAL;
+			break;
+		}
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		r = kvm_s390_handle_pv_vcpu(vcpu, &args);
+		if (copy_to_user(argp, &args, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 83dabb18e4d9..d5503dd0d1e4 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -207,6 +207,33 @@ static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm)
 	return kvm->arch.user_cpu_state_ctrl != 0;
 }
 
+/* implemented in pv.c */
+void kvm_s390_pv_dealloc_vm(struct kvm *kvm);
+int kvm_s390_pv_alloc_vm(struct kvm *kvm);
+int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
+int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
+int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
+int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
+int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
+			      u16 *rrc);
+int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
+		       unsigned long tweak, u16 *rc, u16 *rrc);
+
+static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
+{
+	return !!kvm->arch.pv.handle;
+}
+
+static inline u64 kvm_s390_pv_handle(struct kvm *kvm)
+{
+	return kvm->arch.pv.handle;
+}
+
+static inline u64 kvm_s390_pv_handle_cpu(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.pv.handle;
+}
+
 /* implemented in interrupt.c */
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu);
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
new file mode 100644
index 000000000000..bf00cde1ead8
--- /dev/null
+++ b/arch/s390/kvm/pv.c
@@ -0,0 +1,262 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hosting Secure Execution virtual machines
+ *
+ * Copyright IBM Corp. 2019
+ *    Author(s): Janosch Frank <frankja@linux.ibm.com>
+ */
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/pagemap.h>
+#include <linux/sched/signal.h>
+#include <asm/pgalloc.h>
+#include <asm/gmap.h>
+#include <asm/uv.h>
+#include <asm/gmap.h>
+#include <asm/mman.h>
+#include "kvm-s390.h"
+
+void kvm_s390_pv_dealloc_vm(struct kvm *kvm)
+{
+	vfree(kvm->arch.pv.stor_var);
+	free_pages(kvm->arch.pv.stor_base,
+		   get_order(uv_info.guest_base_stor_len));
+	memset(&kvm->arch.pv, 0, sizeof(kvm->arch.pv));
+}
+
+int kvm_s390_pv_alloc_vm(struct kvm *kvm)
+{
+	unsigned long base = uv_info.guest_base_stor_len;
+	unsigned long virt = uv_info.guest_virt_var_stor_len;
+	unsigned long npages = 0, vlen = 0;
+	struct kvm_memory_slot *memslot;
+
+	kvm->arch.pv.stor_var = NULL;
+	kvm->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, get_order(base));
+	if (!kvm->arch.pv.stor_base)
+		return -ENOMEM;
+
+	/*
+	 * Calculate current guest storage for allocation of the
+	 * variable storage, which is based on the length in MB.
+	 *
+	 * Slots are sorted by GFN
+	 */
+	mutex_lock(&kvm->slots_lock);
+	memslot = kvm_memslots(kvm)->memslots;
+	npages = memslot->base_gfn + memslot->npages;
+	mutex_unlock(&kvm->slots_lock);
+
+	kvm->arch.pv.guest_len = npages * PAGE_SIZE;
+
+	/* Allocate variable storage */
+	vlen = ALIGN(virt * ((npages * PAGE_SIZE) / HPAGE_SIZE), PAGE_SIZE);
+	vlen += uv_info.guest_virt_base_stor_len;
+	kvm->arch.pv.stor_var = vzalloc(vlen);
+	if (!kvm->arch.pv.stor_var)
+		goto out_err;
+	return 0;
+
+out_err:
+	kvm_s390_pv_dealloc_vm(kvm);
+	return -ENOMEM;
+}
+
+int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int cc;
+
+	cc = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+			   UVC_CMD_DESTROY_SEC_CONF, rc, rrc);
+	WRITE_ONCE(kvm->arch.gmap->guest_handle, 0);
+	atomic_set(&kvm->mm->context.is_protected, 0);
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
+	return cc;
+}
+
+int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
+{
+	int cc = 0;
+
+	if (kvm_s390_pv_handle_cpu(vcpu)) {
+		cc = uv_cmd_nodata(kvm_s390_pv_handle_cpu(vcpu),
+				   UVC_CMD_DESTROY_SEC_CPU, rc, rrc);
+
+		VCPU_EVENT(vcpu, 3, "PROTVIRT DESTROY VCPU: rc %x rrc %x",
+			   *rc, *rrc);
+		KVM_UV_EVENT(vcpu->kvm, 3, "PROTVIRT DESTROY VCPU: rc %x rrc %x",
+			     *rc, *rrc);
+	}
+
+	free_pages(vcpu->arch.pv.stor_base,
+		   get_order(uv_info.guest_cpu_stor_len));
+	vcpu->arch.sie_block->pv_handle_cpu = 0;
+	vcpu->arch.sie_block->pv_handle_config = 0;
+	memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv));
+	vcpu->arch.sie_block->sdf = 0;
+	return cc;
+}
+
+int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
+{
+	struct uv_cb_csc uvcb = {
+		.header.cmd = UVC_CMD_CREATE_SEC_CPU,
+		.header.len = sizeof(uvcb),
+	};
+	int cc;
+
+	if (kvm_s390_pv_handle_cpu(vcpu))
+		return -EINVAL;
+
+	vcpu->arch.pv.stor_base = __get_free_pages(GFP_KERNEL,
+						   get_order(uv_info.guest_cpu_stor_len));
+	if (!vcpu->arch.pv.stor_base)
+		return -ENOMEM;
+
+	/* Input */
+	uvcb.guest_handle = kvm_s390_pv_handle(vcpu->kvm);
+	uvcb.num = vcpu->arch.sie_block->icpua;
+	uvcb.state_origin = (u64)vcpu->arch.sie_block;
+	uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	VCPU_EVENT(vcpu, 3, "PROTVIRT CREATE VCPU: handle %llx rc %x rrc %x",
+		   uvcb.cpu_handle, uvcb.header.rc, uvcb.header.rrc);
+	KVM_UV_EVENT(vcpu->kvm, 3,
+		     "PROTVIRT CREATE VCPU: cpu %d handle %llx rc %x rrc %x",
+		     vcpu->vcpu_id, uvcb.cpu_handle, uvcb.header.rc,
+		     uvcb.header.rrc);
+
+	if (cc) {
+		u16 dummy;
+
+		kvm_s390_pv_destroy_cpu(vcpu, &dummy, &dummy);
+		return -EINVAL;
+	}
+
+	/* Output */
+	vcpu->arch.pv.handle = uvcb.cpu_handle;
+	vcpu->arch.sie_block->pv_handle_cpu = uvcb.cpu_handle;
+	vcpu->arch.sie_block->pv_handle_config = kvm_s390_pv_handle(vcpu->kvm);
+	vcpu->arch.sie_block->sdf = 2;
+	return 0;
+}
+
+int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	u16 drc, drrc;
+	int cc;
+
+	struct uv_cb_cgc uvcb = {
+		.header.cmd = UVC_CMD_CREATE_SEC_CONF,
+		.header.len = sizeof(uvcb)
+	};
+
+	if (kvm_s390_pv_handle(kvm))
+		return -EINVAL;
+
+	/* Inputs */
+	uvcb.guest_stor_origin = 0; /* MSO is 0 for KVM */
+	uvcb.guest_stor_len = kvm->arch.pv.guest_len;
+	uvcb.guest_asce = kvm->arch.gmap->asce;
+	uvcb.guest_sca = (unsigned long)kvm->arch.sca;
+	uvcb.conf_base_stor_origin = (u64)kvm->arch.pv.stor_base;
+	uvcb.conf_virt_stor_origin = (u64)kvm->arch.pv.stor_var;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT CREATE VM: handle %llx len %llx rc %x rrc %x",
+		     uvcb.guest_handle, uvcb.guest_stor_len, *rc, *rrc);
+
+	/* Outputs */
+	kvm->arch.pv.handle = uvcb.guest_handle;
+
+	if (cc && (uvcb.header.rc & UVC_RC_NEED_DESTROY)) {
+		kvm_s390_pv_destroy_vm(kvm, &drc, &drrc);
+		return -EINVAL;
+	}
+	kvm->arch.gmap->guest_handle = uvcb.guest_handle;
+	atomic_set(&kvm->mm->context.is_protected, 1);
+	return cc;
+}
+
+int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
+			      u16 *rrc)
+{
+	struct uv_cb_ssc uvcb = {
+		.header.cmd = UVC_CMD_SET_SEC_CONF_PARAMS,
+		.header.len = sizeof(uvcb),
+		.sec_header_origin = (u64)hdr,
+		.sec_header_len = length,
+		.guest_handle = kvm_s390_pv_handle(kvm),
+	};
+	int cc;
+
+	if (!kvm_s390_pv_handle(kvm))
+		return -EINVAL;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT VM SET PARMS: rc %x rrc %x",
+		     uvcb.header.rc, uvcb.header.rrc);
+	if (cc)
+		return -EINVAL;
+	return 0;
+}
+
+static int unpack_one(struct kvm *kvm, unsigned long addr, u64 tweak[2],
+		      u16 *rc, u16 *rrc)
+{
+	struct uv_cb_unp uvcb = {
+		.header.cmd = UVC_CMD_UNPACK_IMG,
+		.header.len = sizeof(uvcb),
+		.guest_handle = kvm_s390_pv_handle(kvm),
+		.gaddr = addr,
+		.tweak[0] = tweak[0],
+		.tweak[1] = tweak[1],
+	};
+	int ret;
+
+	ret = gmap_make_secure(kvm->arch.gmap, addr, &uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+
+	if (ret && ret != -EAGAIN)
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: failed addr %llx with rc %x rrc %x",
+			     uvcb.gaddr, *rc, *rrc);
+	return ret;
+}
+
+int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
+		       unsigned long tweak, u16 *rc, u16 *rrc)
+{
+	u64 tw[2] = {tweak, 0};
+	int ret = 0;
+
+	if (addr & ~PAGE_MASK || !size || size & ~PAGE_MASK)
+		return -EINVAL;
+
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: start addr %lx size %lx",
+		     addr, size);
+
+	while (tw[1] < size) {
+		ret = unpack_one(kvm, addr, tw, rc, rrc);
+		if (ret == -EAGAIN) {
+			cond_resched();
+			if (fatal_signal_pending(current))
+				break;
+			continue;
+		}
+		if (ret)
+			break;
+		addr += PAGE_SIZE;
+		tw[1] += PAGE_SIZE;
+	}
+	if (!ret)
+		KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful");
+	return ret;
+}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4b95f9a31a2f..207915488502 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1010,6 +1010,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_NISV_TO_USER 177
 #define KVM_CAP_ARM_INJECT_EXT_DABT 178
 #define KVM_CAP_S390_VCPU_RESETS 179
+#define KVM_CAP_S390_PROTECTED 181
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1478,6 +1479,40 @@ struct kvm_enc_region {
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
+struct kvm_s390_pv_sec_parm {
+	__u64	origin;
+	__u64	length;
+};
+
+struct kvm_s390_pv_unp {
+	__u64 addr;
+	__u64 size;
+	__u64 tweak;
+};
+
+enum pv_cmd_id {
+	KVM_PV_VM_CREATE,
+	KVM_PV_VM_DESTROY,
+	KVM_PV_VM_SET_SEC_PARMS,
+	KVM_PV_VM_UNPACK,
+	KVM_PV_VM_VERIFY,
+	KVM_PV_VCPU_CREATE,
+	KVM_PV_VCPU_DESTROY,
+};
+
+struct kvm_pv_cmd {
+	__u32 cmd;	/* Command to be executed */
+	__u16 rc;	/* Ultravisor return code */
+	__u16 rrc;	/* Ultravisor return reason code */
+	__u64 data;	/* Data or address */
+	__u32 flags;    /* flags for future extensions. Must be 0 for now */
+	__u32 reserved[3];
+};
+
+/* Available with KVM_CAP_S390_PROTECTED */
+#define KVM_S390_PV_COMMAND		_IOWR(KVMIO, 0xc5, struct kvm_pv_cmd)
+#define KVM_S390_PV_COMMAND_VCPU	_IOWR(KVMIO, 0xc6, struct kvm_pv_cmd)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 10/42] KVM: s390: protvirt: Add KVM api documentation
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (8 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 11/42] KVM: s390: protvirt: Secure memory is not mergeable Christian Borntraeger
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Add documentation for KVM_CAP_S390_PROTECTED capability and the
KVM_S390_PV_COMMAND and KVM_S390_PV_COMMAND_VCPU ioctls.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/api.rst | 69 ++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 97a72a53fa4b..cb58714fe60d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4646,6 +4646,68 @@ the clear cpu reset definition in the POP. However, the cpu is not put
 into ESA mode. This reset is a superset of the initial reset.
 
 
+4.125 KVM_S390_PV_COMMAND
+-------------------------
+
+:Capability: KVM_CAP_S390_PROTECTED
+:Architectures: s390
+:Type: vm ioctl
+:Parameters: struct kvm_pv_cmd
+:Returns: 0 on success, < 0 on error
+
+::
+
+  struct kvm_pv_cmd {
+	__u32 cmd;	/* Command to be executed */
+	__u16 rc;	/* Ultravisor return code */
+	__u16 rrc;	/* Ultravisor return reason code */
+	__u64 data;	/* Data or address */
+	__u32 flags;    /* flags for future extensions. Must be 0 for now */
+	__u32 reserved[3];
+  };
+
+cmd values:
+
+KVM_PV_VM_CREATE
+  Allocate memory and register the VM with the Ultravisor, thereby
+  donating memory to the Ultravisor making it inaccessible to KVM.
+
+KVM_PV_VM_DESTROY
+  Deregisters the VM from the Ultravisor and frees memory that was
+  donated, so the kernel can use it again. All registered VCPUs have to
+  be unregistered beforehand and all memory has to be exported or
+  shared.
+
+KVM_PV_VM_SET_SEC_PARMS
+  Pass the image header from VM memory to the Ultravisor in
+  preparation of image unpacking and verification.
+
+KVM_PV_VM_UNPACK
+  Unpack (protect and decrypt) a page of the encrypted boot image.
+
+KVM_PV_VM_VERIFY
+  Verify the integrity of the unpacked image. Only if this succeeds,
+  KVM is allowed to start protected VCPUs.
+
+4.126 KVM_S390_PV_COMMAND_VCPU
+------------------------------
+
+:Capability: KVM_CAP_S390_PROTECTED
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_pv_cmd
+:Returns: 0 on success, < 0 on error
+
+cmd values:
+
+KVM_PV_VCPU_CREATE
+  Allocate memory and register a VCPU with the Ultravisor, thereby
+  donating memory to the Ultravisor making it inaccessible to KVM.
+
+KVM_PV_VCPU_DESTROY
+  Unregisters the VCPU from the Ultravisor and frees memory that was
+  donated, so the kernel can use it again.
+
 5. The kvm_run structure
 ========================
 
@@ -6024,3 +6086,10 @@ Architectures: s390
 
 This capability indicates that the KVM_S390_NORMAL_RESET and
 KVM_S390_CLEAR_RESET ioctls are available.
+
+8.23 KVM_CAP_S390_PROTECTED
+
+Architecture: s390
+
+This capability indicates that KVM can start protected VMs and the
+Ultravisor has therefore been initialized.
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 11/42] KVM: s390: protvirt: Secure memory is not mergeable
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (9 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 10/42] KVM: s390: protvirt: Add KVM api documentation Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 12/42] KVM: s390/mm: Make pages accessible before destroying the guest Christian Borntraeger
                   ` (30 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

KSM will not work on secure pages, because when the kernel reads a
secure page, it will be encrypted and hence no two pages will look the
same.

Let's mark the guest pages as unmergeable when we transition to secure
mode.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/gmap.h |  1 +
 arch/s390/kvm/kvm-s390.c     |  8 ++++++++
 arch/s390/mm/gmap.c          | 30 ++++++++++++++++++++----------
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 3c4926aa78f4..6f9ff7a69fa2 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -148,4 +148,5 @@ int gmap_mprotect_notify(struct gmap *, unsigned long start,
 
 void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
 			     unsigned long gaddr, unsigned long vmaddr);
+int gmap_mark_unmergeable(void);
 #endif /* _ASM_S390_GMAP_H */
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index f3cd469c2e7b..d7f0f3939847 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2186,6 +2186,14 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		if (r)
 			break;
 
+		down_write(&current->mm->mmap_sem);
+		r = gmap_mark_unmergeable();
+		up_write(&current->mm->mmap_sem);
+		if (r) {
+			kvm_s390_pv_dealloc_vm(kvm);
+			break;
+		}
+
 		mutex_lock(&kvm->lock);
 		kvm_s390_vcpu_block_all(kvm);
 		/* FMT 4 SIE needs esca */
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index edcdca97e85e..7291452fe5f0 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2548,6 +2548,22 @@ int s390_enable_sie(void)
 }
 EXPORT_SYMBOL_GPL(s390_enable_sie);
 
+int gmap_mark_unmergeable(void)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if (ksm_madvise(vma, vma->vm_start, vma->vm_end,
+				MADV_UNMERGEABLE, &vma->vm_flags)) {
+			return -ENOMEM;
+		}
+	}
+	mm->def_flags &= ~VM_MERGEABLE;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(gmap_mark_unmergeable);
+
 /*
  * Enable storage key handling from now on and initialize the storage
  * keys with the default key.
@@ -2593,7 +2609,6 @@ static const struct mm_walk_ops enable_skey_walk_ops = {
 int s390_enable_skey(void)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
 	int rc = 0;
 
 	down_write(&mm->mmap_sem);
@@ -2601,16 +2616,11 @@ int s390_enable_skey(void)
 		goto out_up;
 
 	mm->context.uses_skeys = 1;
-	for (vma = mm->mmap; vma; vma = vma->vm_next) {
-		if (ksm_madvise(vma, vma->vm_start, vma->vm_end,
-				MADV_UNMERGEABLE, &vma->vm_flags)) {
-			mm->context.uses_skeys = 0;
-			rc = -ENOMEM;
-			goto out_up;
-		}
+	rc = gmap_mark_unmergeable();
+	if (rc) {
+		mm->context.uses_skeys = 0;
+		goto out_up;
 	}
-	mm->def_flags &= ~VM_MERGEABLE;
-
 	walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, NULL);
 
 out_up:
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 12/42] KVM: s390/mm: Make pages accessible before destroying the guest
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (10 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 11/42] KVM: s390: protvirt: Secure memory is not mergeable Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 13/42] KVM: s390: protvirt: Handle SE notification interceptions Christian Borntraeger
                   ` (29 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

Before we destroy the secure configuration, we better make all
pages accessible again. This also happens during reboot, where we reboot
into a non-secure guest that then can go again into secure mode. As
this "new" secure guest will have a new ID we cannot reuse the old page
state.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/include/asm/gmap.h |  1 +
 arch/s390/kvm/pv.c           |  2 ++
 arch/s390/mm/gmap.c          | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 6f9ff7a69fa2..a816fb4734b8 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -149,4 +149,5 @@ int gmap_mprotect_notify(struct gmap *, unsigned long start,
 void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
 			     unsigned long gaddr, unsigned long vmaddr);
 int gmap_mark_unmergeable(void);
+void s390_reset_acc(struct mm_struct *mm);
 #endif /* _ASM_S390_GMAP_H */
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
index bf00cde1ead8..09573e36c329 100644
--- a/arch/s390/kvm/pv.c
+++ b/arch/s390/kvm/pv.c
@@ -66,6 +66,8 @@ int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
 {
 	int cc;
 
+	/* make all pages accessible before destroying the guest */
+	s390_reset_acc(kvm->mm);
 	cc = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
 			   UVC_CMD_DESTROY_SEC_CONF, rc, rrc);
 	WRITE_ONCE(kvm->arch.gmap->guest_handle, 0);
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 7291452fe5f0..27926a06df32 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2650,3 +2650,38 @@ void s390_reset_cmma(struct mm_struct *mm)
 	up_write(&mm->mmap_sem);
 }
 EXPORT_SYMBOL_GPL(s390_reset_cmma);
+
+/*
+ * make inaccessible pages accessible again
+ */
+static int __s390_reset_acc(pte_t *ptep, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
+{
+	pte_t pte = READ_ONCE(*ptep);
+
+	if (pte_present(pte))
+		WARN_ON_ONCE(uv_convert_from_secure(pte_val(pte) & PAGE_MASK));
+	return 0;
+}
+
+static const struct mm_walk_ops reset_acc_walk_ops = {
+	.pte_entry		= __s390_reset_acc,
+};
+
+#include <linux/sched/mm.h>
+void s390_reset_acc(struct mm_struct *mm)
+{
+	/*
+	 * we might be called during
+	 * reset:                             we walk the pages and clear
+	 * close of all kvm file descriptors: we walk the pages and clear
+	 * exit of process on fd closure:     vma already gone, do nothing
+	 */
+	if (!mmget_not_zero(mm))
+		return;
+	down_read(&mm->mmap_sem);
+	walk_page_range(mm, 0, TASK_SIZE, &reset_acc_walk_ops, NULL);
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+}
+EXPORT_SYMBOL_GPL(s390_reset_acc);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 13/42] KVM: s390: protvirt: Handle SE notification interceptions
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (11 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 12/42] KVM: s390/mm: Make pages accessible before destroying the guest Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 14/42] KVM: s390: protvirt: Instruction emulation Christian Borntraeger
                   ` (28 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Since there is no interception for load control and load psw
instruction in the protected mode, we need a new way to get notified
whenever we can inject an IRQ right after the guest has just enabled
the possibility for receiving them.

The new interception codes solve that problem by providing a
notification for changes to IRQ enablement relevant bits in CRs 0, 6
and 14, as well a the machine check mask bit in the PSW.

No special handling is needed for these interception codes, the KVM
pre-run code will consult all necessary CRs and PSW bits and inject
IRQs the guest is enabled for.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h | 2 ++
 arch/s390/kvm/intercept.c        | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 1aa2382fe363..fdc6ceff6397 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -215,6 +215,8 @@ struct kvm_s390_sie_block {
 #define ICPT_PARTEXEC	0x38
 #define ICPT_IOINST	0x40
 #define ICPT_KSS	0x5c
+#define ICPT_MCHKREQ	0x60
+#define ICPT_INT_ENABLE	0x64
 	__u8	icptcode;		/* 0x0050 */
 	__u8	icptstatus;		/* 0x0051 */
 	__u16	ihcpu;			/* 0x0052 */
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index a389fa85cca2..6aeb4b36042c 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -480,6 +480,15 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
 	case ICPT_KSS:
 		rc = kvm_s390_skey_check_enable(vcpu);
 		break;
+	case ICPT_MCHKREQ:
+	case ICPT_INT_ENABLE:
+		/*
+		 * PSW bit 13 or a CR (0, 6, 14) changed and we might
+		 * now be able to deliver interrupts. The pre-run code
+		 * will take care of this.
+		 */
+		rc = 0;
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 14/42] KVM: s390: protvirt: Instruction emulation
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (12 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 13/42] KVM: s390: protvirt: Handle SE notification interceptions Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls Christian Borntraeger
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

We have two new SIE exit codes dealing with instructions.
104 (0x68) for a secure instruction interception, on which the SIE needs
hypervisor action to complete the instruction. We can piggy-back on the
existing instruction handlers.

108 which is merely a notification and provides data for tracking and
management. For example this is used to tell the host about a new value
for the prefix register. As there will be several special case handlers
in later patches, we handle this in a separate function.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  2 ++
 arch/s390/kvm/intercept.c        | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index fdc6ceff6397..c6694f47b73b 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -217,6 +217,8 @@ struct kvm_s390_sie_block {
 #define ICPT_KSS	0x5c
 #define ICPT_MCHKREQ	0x60
 #define ICPT_INT_ENABLE	0x64
+#define ICPT_PV_INSTR	0x68
+#define ICPT_PV_NOTIFY	0x6c
 	__u8	icptcode;		/* 0x0050 */
 	__u8	icptstatus;		/* 0x0051 */
 	__u16	ihcpu;			/* 0x0052 */
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 6aeb4b36042c..6fdbac696f65 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -444,6 +444,11 @@ static int handle_operexc(struct kvm_vcpu *vcpu)
 	return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
 }
 
+static int handle_pv_notification(struct kvm_vcpu *vcpu)
+{
+	return handle_instruction(vcpu);
+}
+
 int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
 {
 	int rc, per_rc = 0;
@@ -489,6 +494,12 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
 		 */
 		rc = 0;
 		break;
+	case ICPT_PV_INSTR:
+		rc = handle_instruction(vcpu);
+		break;
+	case ICPT_PV_NOTIFY:
+		rc = handle_pv_notification(vcpu);
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (13 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 14/42] KVM: s390: protvirt: Instruction emulation Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 10:59   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 16/42] KVM: s390: protvirt: Implement interruption injection Christian Borntraeger
                   ` (26 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Michael Mueller <mimu@linux.ibm.com>

This defines the necessary data structures in the SIE control block to
inject machine checks,external and I/O interrupts. We first define the
the interrupt injection control, which defines the next interrupt to
inject. Then we define the fields that contain the payload for machine
checks,external and I/O interrupts

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h | 56 +++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index c6694f47b73b..834b3b7a6e23 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -222,7 +222,15 @@ struct kvm_s390_sie_block {
 	__u8	icptcode;		/* 0x0050 */
 	__u8	icptstatus;		/* 0x0051 */
 	__u16	ihcpu;			/* 0x0052 */
-	__u8	reserved54[2];		/* 0x0054 */
+	__u8	reserved54;		/* 0x0054 */
+#define IICTL_CODE_NONE		 0x00
+#define IICTL_CODE_MCHK		 0x01
+#define IICTL_CODE_EXT		 0x02
+#define IICTL_CODE_IO		 0x03
+#define IICTL_CODE_RESTART	 0x04
+#define IICTL_CODE_SPECIFICATION 0x10
+#define IICTL_CODE_OPERAND	 0x11
+	__u8	iictl;			/* 0x0055 */
 	__u16	ipa;			/* 0x0056 */
 	__u32	ipb;			/* 0x0058 */
 	__u32	scaoh;			/* 0x005c */
@@ -259,24 +267,48 @@ struct kvm_s390_sie_block {
 #define HPID_KVM	0x4
 #define HPID_VSIE	0x5
 	__u8	hpid;			/* 0x00b8 */
-	__u8	reservedb9[11];		/* 0x00b9 */
-	__u16	extcpuaddr;		/* 0x00c4 */
-	__u16	eic;			/* 0x00c6 */
+	__u8	reservedb9[7];		/* 0x00b9 */
+	union {
+		struct {
+			__u32	eiparams;	/* 0x00c0 */
+			__u16	extcpuaddr;	/* 0x00c4 */
+			__u16	eic;		/* 0x00c6 */
+		};
+		__u64	mcic;			/* 0x00c0 */
+	} __packed;
 	__u32	reservedc8;		/* 0x00c8 */
-	__u16	pgmilc;			/* 0x00cc */
-	__u16	iprcc;			/* 0x00ce */
-	__u32	dxc;			/* 0x00d0 */
-	__u16	mcn;			/* 0x00d4 */
-	__u8	perc;			/* 0x00d6 */
-	__u8	peratmid;		/* 0x00d7 */
+	union {
+		struct {
+			__u16	pgmilc;		/* 0x00cc */
+			__u16	iprcc;		/* 0x00ce */
+		};
+		__u32	edc;			/* 0x00cc */
+	} __packed;
+	union {
+		struct {
+			__u32	dxc;		/* 0x00d0 */
+			__u16	mcn;		/* 0x00d4 */
+			__u8	perc;		/* 0x00d6 */
+			__u8	peratmid;	/* 0x00d7 */
+		};
+		__u64	faddr;			/* 0x00d0 */
+	} __packed;
 	__u64	peraddr;		/* 0x00d8 */
 	__u8	eai;			/* 0x00e0 */
 	__u8	peraid;			/* 0x00e1 */
 	__u8	oai;			/* 0x00e2 */
 	__u8	armid;			/* 0x00e3 */
 	__u8	reservede4[4];		/* 0x00e4 */
-	__u64	tecmc;			/* 0x00e8 */
-	__u8	reservedf0[12];		/* 0x00f0 */
+	union {
+		__u64	tecmc;		/* 0x00e8 */
+		struct {
+			__u16	subchannel_id;	/* 0x00e8 */
+			__u16	subchannel_nr;	/* 0x00ea */
+			__u32	io_int_parm;	/* 0x00ec */
+			__u32	io_int_word;	/* 0x00f0 */
+		};
+	} __packed;
+	__u8	reservedf4[8];		/* 0x00f4 */
 #define CRYCB_FORMAT_MASK 0x00000003
 #define CRYCB_FORMAT0 0x00000000
 #define CRYCB_FORMAT1 0x00000001
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 16/42] KVM: s390: protvirt: Implement interruption injection
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (14 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 17/42] KVM: s390: protvirt: Add SCLP interrupt handling Christian Borntraeger
                   ` (25 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Michael Mueller <mimu@linux.ibm.com>

The patch implements interruption injection for the following
list of interruption types:

   - I/O (uses inject io interruption)
     __deliver_io

   - External (uses inject external interruption)
     __deliver_cpu_timer
     __deliver_ckc
     __deliver_emergency_signal
     __deliver_external_call

   - cpu restart (uses inject restart interruption)
     __deliver_restart

   - machine checks (uses mcic, failing address and external damage)
     __write_machine_check

Please note that posted interrupts (GISA) are not used for protected
guests as of today.

The service interrupt is handled in a followup patch.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |   6 ++
 arch/s390/kvm/interrupt.c        | 110 +++++++++++++++++++++++--------
 2 files changed, 89 insertions(+), 27 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 834b3b7a6e23..a13dc77f8b07 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -578,6 +578,12 @@ enum irq_types {
 #define IRQ_PEND_MCHK_MASK ((1UL << IRQ_PEND_MCHK_REP) | \
 			    (1UL << IRQ_PEND_MCHK_EX))
 
+#define IRQ_PEND_EXT_II_MASK ((1UL << IRQ_PEND_EXT_CPU_TIMER)  | \
+			      (1UL << IRQ_PEND_EXT_CLOCK_COMP) | \
+			      (1UL << IRQ_PEND_EXT_EMERGENCY)  | \
+			      (1UL << IRQ_PEND_EXT_EXTERNAL)   | \
+			      (1UL << IRQ_PEND_EXT_SERVICE))
+
 struct kvm_s390_interrupt_info {
 	struct list_head list;
 	u64	type;
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 0cebebf56515..3fc748c0d924 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -387,6 +387,12 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
 		__clear_bit(IRQ_PEND_EXT_SERVICE, &active_mask);
 	if (psw_mchk_disabled(vcpu))
 		active_mask &= ~IRQ_PEND_MCHK_MASK;
+	/* PV guest cpus can have a single interruption injected at a time. */
+	if (kvm_s390_pv_is_protected(vcpu->kvm) &&
+	    vcpu->arch.sie_block->iictl != IICTL_CODE_NONE)
+		active_mask &= ~(IRQ_PEND_EXT_II_MASK |
+				 IRQ_PEND_IO_MASK |
+				 IRQ_PEND_MCHK_MASK);
 	/*
 	 * Check both floating and local interrupt's cr14 because
 	 * bit IRQ_PEND_MCHK_REP could be set in both cases.
@@ -479,19 +485,23 @@ static void set_intercept_indicators(struct kvm_vcpu *vcpu)
 static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
-	int rc;
+	int rc = 0;
 
 	vcpu->stat.deliver_cputm++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER,
 					 0, 0);
-
-	rc  = put_guest_lc(vcpu, EXT_IRQ_CPU_TIMER,
-			   (u16 *)__LC_EXT_INT_CODE);
-	rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
-	rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
-			     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
-	rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
-			    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
+		vcpu->arch.sie_block->eic = EXT_IRQ_CPU_TIMER;
+	} else {
+		rc  = put_guest_lc(vcpu, EXT_IRQ_CPU_TIMER,
+				   (u16 *)__LC_EXT_INT_CODE);
+		rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
+		rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
+				     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	}
 	clear_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs);
 	return rc ? -EFAULT : 0;
 }
@@ -499,19 +509,23 @@ static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu)
 static int __must_check __deliver_ckc(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
-	int rc;
+	int rc = 0;
 
 	vcpu->stat.deliver_ckc++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP,
 					 0, 0);
-
-	rc  = put_guest_lc(vcpu, EXT_IRQ_CLK_COMP,
-			   (u16 __user *)__LC_EXT_INT_CODE);
-	rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
-	rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
-			     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
-	rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
-			    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
+		vcpu->arch.sie_block->eic = EXT_IRQ_CLK_COMP;
+	} else {
+		rc  = put_guest_lc(vcpu, EXT_IRQ_CLK_COMP,
+				   (u16 __user *)__LC_EXT_INT_CODE);
+		rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
+		rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
+				     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	}
 	clear_bit(IRQ_PEND_EXT_CLOCK_COMP, &li->pending_irqs);
 	return rc ? -EFAULT : 0;
 }
@@ -553,6 +567,20 @@ static int __write_machine_check(struct kvm_vcpu *vcpu,
 	union mci mci;
 	int rc;
 
+	/*
+	 * All other possible payload for a machine check (e.g. the register
+	 * contents in the save area) will be handled by the ultravisor, as
+	 * the hypervisor does not not have the needed information for
+	 * protected guests.
+	 */
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_MCHK;
+		vcpu->arch.sie_block->mcic = mchk->mcic;
+		vcpu->arch.sie_block->faddr = mchk->failing_storage_address;
+		vcpu->arch.sie_block->edc = mchk->ext_damage_code;
+		return 0;
+	}
+
 	mci.val = mchk->mcic;
 	/* take care of lazy register loading */
 	save_fpu_regs();
@@ -696,17 +724,21 @@ static int __must_check __deliver_machine_check(struct kvm_vcpu *vcpu)
 static int __must_check __deliver_restart(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
-	int rc;
+	int rc = 0;
 
 	VCPU_EVENT(vcpu, 3, "%s", "deliver: cpu restart");
 	vcpu->stat.deliver_restart_signal++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_RESTART, 0, 0);
 
-	rc  = write_guest_lc(vcpu,
-			     offsetof(struct lowcore, restart_old_psw),
-			     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
-	rc |= read_guest_lc(vcpu, offsetof(struct lowcore, restart_psw),
-			    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_RESTART;
+	} else {
+		rc  = write_guest_lc(vcpu,
+				     offsetof(struct lowcore, restart_old_psw),
+				     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= read_guest_lc(vcpu, offsetof(struct lowcore, restart_psw),
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	}
 	clear_bit(IRQ_PEND_RESTART, &li->pending_irqs);
 	return rc ? -EFAULT : 0;
 }
@@ -748,6 +780,12 @@ static int __must_check __deliver_emergency_signal(struct kvm_vcpu *vcpu)
 	vcpu->stat.deliver_emergency_signal++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_EMERGENCY,
 					 cpu_addr, 0);
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
+		vcpu->arch.sie_block->eic = EXT_IRQ_EMERGENCY_SIG;
+		vcpu->arch.sie_block->extcpuaddr = cpu_addr;
+		return 0;
+	}
 
 	rc  = put_guest_lc(vcpu, EXT_IRQ_EMERGENCY_SIG,
 			   (u16 *)__LC_EXT_INT_CODE);
@@ -776,6 +814,12 @@ static int __must_check __deliver_external_call(struct kvm_vcpu *vcpu)
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
 					 KVM_S390_INT_EXTERNAL_CALL,
 					 extcall.code, 0);
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
+		vcpu->arch.sie_block->eic = EXT_IRQ_EXTERNAL_CALL;
+		vcpu->arch.sie_block->extcpuaddr = extcall.code;
+		return 0;
+	}
 
 	rc  = put_guest_lc(vcpu, EXT_IRQ_EXTERNAL_CALL,
 			   (u16 *)__LC_EXT_INT_CODE);
@@ -1028,6 +1072,15 @@ static int __do_deliver_io(struct kvm_vcpu *vcpu, struct kvm_s390_io_info *io)
 {
 	int rc;
 
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_IO;
+		vcpu->arch.sie_block->subchannel_id = io->subchannel_id;
+		vcpu->arch.sie_block->subchannel_nr = io->subchannel_nr;
+		vcpu->arch.sie_block->io_int_parm = io->io_int_parm;
+		vcpu->arch.sie_block->io_int_word = io->io_int_word;
+		return 0;
+	}
+
 	rc  = put_guest_lc(vcpu, io->subchannel_id, (u16 *)__LC_SUBCHANNEL_ID);
 	rc |= put_guest_lc(vcpu, io->subchannel_nr, (u16 *)__LC_SUBCHANNEL_NR);
 	rc |= put_guest_lc(vcpu, io->io_int_parm, (u32 *)__LC_IO_INT_PARM);
@@ -1421,7 +1474,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
 	if (kvm_get_vcpu_by_id(vcpu->kvm, src_id) == NULL)
 		return -EINVAL;
 
-	if (sclp.has_sigpif)
+	if (sclp.has_sigpif && !kvm_s390_pv_handle_cpu(vcpu))
 		return sca_inject_ext_call(vcpu, src_id);
 
 	if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
@@ -1773,7 +1826,9 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
 	kvm->stat.inject_io++;
 	isc = int_word_to_isc(inti->io.io_int_word);
 
-	if (gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) {
+	/* do not make use of gisa in protected mode */
+	if (!kvm_s390_pv_is_protected(kvm) &&
+	    gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) {
 		VM_EVENT(kvm, 4, "%s isc %1u", "inject: I/O (AI/gisa)", isc);
 		gisa_set_ipm_gisc(gi->origin, isc);
 		kfree(inti);
@@ -1834,7 +1889,8 @@ static void __floating_irq_kick(struct kvm *kvm, u64 type)
 		break;
 	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
 		if (!(type & KVM_S390_INT_IO_AI_MASK &&
-		      kvm->arch.gisa_int.origin))
+		      kvm->arch.gisa_int.origin) ||
+		      kvm_s390_pv_handle_cpu(dst_vcpu))
 			kvm_s390_set_cpuflags(dst_vcpu, CPUSTAT_IO_INT);
 		break;
 	default:
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 17/42] KVM: s390: protvirt: Add SCLP interrupt handling
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (15 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 16/42] KVM: s390: protvirt: Implement interruption injection Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 18/42] KVM: s390: protvirt: Handle spec exception loops Christian Borntraeger
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

The sclp interrupt is kind of special. The ultravisor polices that we
do not inject an sclp interrupt with payload if no sccb is outstanding.
On the other hand we have "asynchronous" event interrupts, e.g. for
console input.
We separate both variants into sclp interrupt and sclp event interrupt.
The sclp interrupt is masked until a previous servc instruction has
finished (sie exit 108).

[frankja@linux.ibm.com: factoring out write_sclp]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  6 ++-
 arch/s390/kvm/intercept.c        | 27 ++++++++++
 arch/s390/kvm/interrupt.c        | 93 +++++++++++++++++++++++++-------
 arch/s390/kvm/kvm-s390.c         |  4 ++
 4 files changed, 111 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a13dc77f8b07..ba3364b37159 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -518,6 +518,7 @@ enum irq_types {
 	IRQ_PEND_PFAULT_INIT,
 	IRQ_PEND_EXT_HOST,
 	IRQ_PEND_EXT_SERVICE,
+	IRQ_PEND_EXT_SERVICE_EV,
 	IRQ_PEND_EXT_TIMING,
 	IRQ_PEND_EXT_CPU_TIMER,
 	IRQ_PEND_EXT_CLOCK_COMP,
@@ -562,6 +563,7 @@ enum irq_types {
 			   (1UL << IRQ_PEND_EXT_TIMING)     | \
 			   (1UL << IRQ_PEND_EXT_HOST)       | \
 			   (1UL << IRQ_PEND_EXT_SERVICE)    | \
+			   (1UL << IRQ_PEND_EXT_SERVICE_EV) | \
 			   (1UL << IRQ_PEND_VIRTIO)         | \
 			   (1UL << IRQ_PEND_PFAULT_INIT)    | \
 			   (1UL << IRQ_PEND_PFAULT_DONE))
@@ -582,7 +584,8 @@ enum irq_types {
 			      (1UL << IRQ_PEND_EXT_CLOCK_COMP) | \
 			      (1UL << IRQ_PEND_EXT_EMERGENCY)  | \
 			      (1UL << IRQ_PEND_EXT_EXTERNAL)   | \
-			      (1UL << IRQ_PEND_EXT_SERVICE))
+			      (1UL << IRQ_PEND_EXT_SERVICE)    | \
+			      (1UL << IRQ_PEND_EXT_SERVICE_EV))
 
 struct kvm_s390_interrupt_info {
 	struct list_head list;
@@ -642,6 +645,7 @@ struct kvm_s390_local_interrupt {
 
 struct kvm_s390_float_interrupt {
 	unsigned long pending_irqs;
+	unsigned long masked_irqs;
 	spinlock_t lock;
 	struct list_head lists[FIRQ_LIST_COUNT];
 	int counters[FIRQ_MAX_COUNT];
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 6fdbac696f65..d50a0214eba1 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -444,8 +444,35 @@ static int handle_operexc(struct kvm_vcpu *vcpu)
 	return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
 }
 
+static int handle_pv_sclp(struct kvm_vcpu *vcpu)
+{
+	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
+
+	spin_lock(&fi->lock);
+	/*
+	 * 2 cases:
+	 * a: an sccb answering interrupt was already pending or in flight.
+	 *    As the sccb value is not known we can simply set some value to
+	 *    trigger delivery of a saved SCCB. UV will then use its saved
+	 *    copy of the SCCB value.
+	 * b: an error SCCB interrupt needs to be injected so we also inject
+	 *    a fake SCCB address. Firmware will use the proper one.
+	 * This makes sure, that both errors and real sccb returns will only
+	 * be delivered after a notification intercept (instruction has
+	 * finished) but not after others.
+	 */
+	fi->srv_signal.ext_params |= 0x43000;
+	set_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs);
+	clear_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs);
+	spin_unlock(&fi->lock);
+	return 0;
+}
+
 static int handle_pv_notification(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->arch.sie_block->ipa == 0xb220)
+		return handle_pv_sclp(vcpu);
+
 	return handle_instruction(vcpu);
 }
 
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 3fc748c0d924..3e160d9a214f 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -324,8 +324,11 @@ static inline int gisa_tac_ipm_gisc(struct kvm_s390_gisa *gisa, u32 gisc)
 
 static inline unsigned long pending_irqs_no_gisa(struct kvm_vcpu *vcpu)
 {
-	return vcpu->kvm->arch.float_int.pending_irqs |
-		vcpu->arch.local_int.pending_irqs;
+	unsigned long pending = vcpu->kvm->arch.float_int.pending_irqs |
+				vcpu->arch.local_int.pending_irqs;
+
+	pending &= ~vcpu->kvm->arch.float_int.masked_irqs;
+	return pending;
 }
 
 static inline unsigned long pending_irqs(struct kvm_vcpu *vcpu)
@@ -383,8 +386,10 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
 		__clear_bit(IRQ_PEND_EXT_CLOCK_COMP, &active_mask);
 	if (!(vcpu->arch.sie_block->gcr[0] & CR0_CPU_TIMER_SUBMASK))
 		__clear_bit(IRQ_PEND_EXT_CPU_TIMER, &active_mask);
-	if (!(vcpu->arch.sie_block->gcr[0] & CR0_SERVICE_SIGNAL_SUBMASK))
+	if (!(vcpu->arch.sie_block->gcr[0] & CR0_SERVICE_SIGNAL_SUBMASK)) {
 		__clear_bit(IRQ_PEND_EXT_SERVICE, &active_mask);
+		__clear_bit(IRQ_PEND_EXT_SERVICE_EV, &active_mask);
+	}
 	if (psw_mchk_disabled(vcpu))
 		active_mask &= ~IRQ_PEND_MCHK_MASK;
 	/* PV guest cpus can have a single interruption injected at a time. */
@@ -946,20 +951,49 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
 	return rc ? -EFAULT : 0;
 }
 
+#define SCCB_MASK 0xFFFFFFF8
+#define SCCB_EVENT_PENDING 0x3
+
+static int write_sclp(struct kvm_vcpu *vcpu, u32 parm)
+{
+	int rc;
+
+	if (kvm_s390_pv_handle_cpu(vcpu)) {
+		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
+		vcpu->arch.sie_block->eic = EXT_IRQ_SERVICE_SIG;
+		vcpu->arch.sie_block->eiparams = parm;
+		return 0;
+	}
+
+	rc  = put_guest_lc(vcpu, EXT_IRQ_SERVICE_SIG, (u16 *)__LC_EXT_INT_CODE);
+	rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
+	rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
+			     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
+			    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+	rc |= put_guest_lc(vcpu, parm,
+			   (u32 *)__LC_EXT_PARAMS);
+
+	return rc ? -EFAULT : 0;
+}
+
 static int __must_check __deliver_service(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
 	struct kvm_s390_ext_info ext;
-	int rc = 0;
 
 	spin_lock(&fi->lock);
-	if (!(test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs))) {
+	if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs) ||
+	    !(test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs))) {
 		spin_unlock(&fi->lock);
 		return 0;
 	}
 	ext = fi->srv_signal;
 	memset(&fi->srv_signal, 0, sizeof(ext));
 	clear_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs);
+	clear_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs);
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		set_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs);
 	spin_unlock(&fi->lock);
 
 	VCPU_EVENT(vcpu, 4, "deliver: sclp parameter 0x%x",
@@ -968,16 +1002,31 @@ static int __must_check __deliver_service(struct kvm_vcpu *vcpu)
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_SERVICE,
 					 ext.ext_params, 0);
 
-	rc  = put_guest_lc(vcpu, EXT_IRQ_SERVICE_SIG, (u16 *)__LC_EXT_INT_CODE);
-	rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR);
-	rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW,
-			     &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
-	rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW,
-			    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
-	rc |= put_guest_lc(vcpu, ext.ext_params,
-			   (u32 *)__LC_EXT_PARAMS);
+	return write_sclp(vcpu, ext.ext_params);
+}
 
-	return rc ? -EFAULT : 0;
+static int __must_check __deliver_service_ev(struct kvm_vcpu *vcpu)
+{
+	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
+	struct kvm_s390_ext_info ext;
+
+	spin_lock(&fi->lock);
+	if (!(test_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs))) {
+		spin_unlock(&fi->lock);
+		return 0;
+	}
+	ext = fi->srv_signal;
+	/* only clear the event bit */
+	fi->srv_signal.ext_params &= ~SCCB_EVENT_PENDING;
+	clear_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs);
+	spin_unlock(&fi->lock);
+
+	VCPU_EVENT(vcpu, 4, "%s", "deliver: sclp parameter event");
+	vcpu->stat.deliver_service_signal++;
+	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_SERVICE,
+					 ext.ext_params, 0);
+
+	return write_sclp(vcpu, SCCB_EVENT_PENDING);
 }
 
 static int __must_check __deliver_pfault_done(struct kvm_vcpu *vcpu)
@@ -1382,6 +1431,9 @@ int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
 		case IRQ_PEND_EXT_SERVICE:
 			rc = __deliver_service(vcpu);
 			break;
+		case IRQ_PEND_EXT_SERVICE_EV:
+			rc = __deliver_service_ev(vcpu);
+			break;
 		case IRQ_PEND_PFAULT_DONE:
 			rc = __deliver_pfault_done(vcpu);
 			break;
@@ -1734,9 +1786,6 @@ struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
 	return inti;
 }
 
-#define SCCB_MASK 0xFFFFFFF8
-#define SCCB_EVENT_PENDING 0x3
-
 static int __inject_service(struct kvm *kvm,
 			     struct kvm_s390_interrupt_info *inti)
 {
@@ -1745,6 +1794,11 @@ static int __inject_service(struct kvm *kvm,
 	kvm->stat.inject_service_signal++;
 	spin_lock(&fi->lock);
 	fi->srv_signal.ext_params |= inti->ext.ext_params & SCCB_EVENT_PENDING;
+
+	/* We always allow events, track them separately from the sccb ints */
+	if (fi->srv_signal.ext_params & SCCB_EVENT_PENDING)
+		set_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs);
+
 	/*
 	 * Early versions of the QEMU s390 bios will inject several
 	 * service interrupts after another without handling a
@@ -2138,6 +2192,8 @@ void kvm_s390_clear_float_irqs(struct kvm *kvm)
 
 	spin_lock(&fi->lock);
 	fi->pending_irqs = 0;
+	if (!kvm_s390_pv_is_protected(kvm))
+		fi->masked_irqs = 0;
 	memset(&fi->srv_signal, 0, sizeof(fi->srv_signal));
 	memset(&fi->mchk, 0, sizeof(fi->mchk));
 	for (i = 0; i < FIRQ_LIST_COUNT; i++)
@@ -2202,7 +2258,8 @@ static int get_all_floating_irqs(struct kvm *kvm, u8 __user *usrbuf, u64 len)
 			n++;
 		}
 	}
-	if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs)) {
+	if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs) ||
+	    test_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs)) {
 		if (n == max_irqs) {
 			/* signal userspace to try again */
 			ret = -ENOMEM;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d7f0f3939847..a85e50075d99 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2206,6 +2206,8 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		}
 		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
 		kvm_s390_vcpu_unblock_all(kvm);
+		/* we need to block service interrupts from now on */
+		set_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
 		mutex_unlock(&kvm->lock);
 		break;
 	}
@@ -2221,6 +2223,8 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		if (!r)
 			kvm_s390_pv_dealloc_vm(kvm);
 		kvm_s390_vcpu_unblock_all(kvm);
+		/* no need to block service interrupts any more */
+		clear_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
 		mutex_unlock(&kvm->lock);
 		break;
 	}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 18/42] KVM: s390: protvirt: Handle spec exception loops
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (16 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 17/42] KVM: s390: protvirt: Add SCLP interrupt handling Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling Christian Borntraeger
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

SIE intercept code 8 is used only on exception loops for protected
guests. That means we need to stop the guest when we see it. This is
done by userspace.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/intercept.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index d50a0214eba1..db3dd5ee0b7a 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -231,6 +231,13 @@ static int handle_prog(struct kvm_vcpu *vcpu)
 
 	vcpu->stat.exit_program_interruption++;
 
+	/*
+	 * Intercept 8 indicates a loop of specification exceptions
+	 * for protected guests.
+	 */
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		return -EOPNOTSUPP;
+
 	if (guestdbg_enabled(vcpu) && per_event(vcpu)) {
 		rc = kvm_s390_handle_per_event(vcpu);
 		if (rc)
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (17 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 18/42] KVM: s390: protvirt: Handle spec exception loops Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 11:01   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer Christian Borntraeger
                   ` (22 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Guest registers for protected guests are stored at offset 0x380.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  4 +++-
 arch/s390/kvm/kvm-s390.c         | 11 +++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index ba3364b37159..4fcbb055a565 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -343,7 +343,9 @@ struct kvm_s390_itdb {
 struct sie_page {
 	struct kvm_s390_sie_block sie_block;
 	struct mcck_volatile_info mcck_info;	/* 0x0200 */
-	__u8 reserved218[1000];		/* 0x0218 */
+	__u8 reserved218[360];		/* 0x0218 */
+	__u64 pv_grregs[16];		/* 0x0380 */
+	__u8 reserved400[512];		/* 0x0400 */
 	struct kvm_s390_itdb itdb;	/* 0x0600 */
 	__u8 reserved700[2304];		/* 0x0700 */
 };
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a85e50075d99..6ebb0dae5a2e 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3999,6 +3999,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
 static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc, exit_reason;
+	struct sie_page *sie_page = (struct sie_page *)vcpu->arch.sie_block;
 
 	/*
 	 * We try to hold kvm->srcu during most of vcpu_run (except when run-
@@ -4020,8 +4021,18 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 		guest_enter_irqoff();
 		__disable_cpu_timer_accounting(vcpu);
 		local_irq_enable();
+		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+			memcpy(sie_page->pv_grregs,
+			       vcpu->run->s.regs.gprs,
+			       sizeof(sie_page->pv_grregs));
+		}
 		exit_reason = sie64a(vcpu->arch.sie_block,
 				     vcpu->run->s.regs.gprs);
+		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+			memcpy(vcpu->run->s.regs.gprs,
+			       sie_page->pv_grregs,
+			       sizeof(sie_page->pv_grregs));
+		}
 		local_irq_disable();
 		__enable_cpu_timer_accounting(vcpu);
 		guest_exit_irqoff();
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (18 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 11:08   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages Christian Borntraeger
                   ` (21 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Now that we can't access guest memory anymore, we have a dedicated
satellite block that's a bounce buffer for instruction data.

We re-use the memop interface to copy the instruction data to / from
userspace. This lets us re-use a lot of QEMU code which used that
interface to make logical guest memory accesses which are not possible
anymore in protected mode anyway.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h | 11 ++++++++-
 arch/s390/kvm/kvm-s390.c         | 42 ++++++++++++++++++++++++++++++++
 arch/s390/kvm/pv.c               |  9 +++++++
 include/uapi/linux/kvm.h         |  6 ++++-
 4 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 4fcbb055a565..aa945b101fff 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -127,6 +127,12 @@ struct mcck_volatile_info {
 #define CR14_INITIAL_MASK (CR14_UNUSED_32 | CR14_UNUSED_33 | \
 			   CR14_EXTERNAL_DAMAGE_SUBMASK)
 
+#define SIDAD_SIZE_MASK		0xff
+#define sida_origin(sie_block) \
+	((sie_block)->sidad & PAGE_MASK)
+#define sida_size(sie_block) \
+	((((sie_block)->sidad & SIDAD_SIZE_MASK) + 1) * PAGE_SIZE)
+
 #define CPUSTAT_STOPPED    0x80000000
 #define CPUSTAT_WAIT       0x10000000
 #define CPUSTAT_ECALL_PEND 0x08000000
@@ -315,7 +321,10 @@ struct kvm_s390_sie_block {
 #define CRYCB_FORMAT2 0x00000003
 	__u32	crycbd;			/* 0x00fc */
 	__u64	gcr[16];		/* 0x0100 */
-	__u64	gbea;			/* 0x0180 */
+	union {
+		__u64	gbea;		/* 0x0180 */
+		__u64	sidad;
+	};
 	__u8    reserved188[8];		/* 0x0188 */
 	__u64   sdnxo;			/* 0x0190 */
 	__u8    reserved198[8];		/* 0x0198 */
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6ebb0dae5a2e..8db82aaf1275 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4435,6 +4435,34 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	return r;
 }
 
+static long kvm_s390_guest_sida_op(struct kvm_vcpu *vcpu,
+				   struct kvm_s390_mem_op *mop)
+{
+	void __user *uaddr = (void __user *)mop->buf;
+	int r = 0;
+
+	if (mop->flags || !mop->size)
+		return -EINVAL;
+	if (mop->size + mop->sida_offset < mop->size)
+		return -EINVAL;
+	if (mop->size + mop->sida_offset > sida_size(vcpu->arch.sie_block))
+		return -E2BIG;
+
+	switch (mop->op) {
+	case KVM_S390_MEMOP_SIDA_READ:
+		if (copy_to_user(uaddr, (void *)(sida_origin(vcpu->arch.sie_block) +
+				 mop->sida_offset), mop->size))
+			r = -EFAULT;
+
+		break;
+	case KVM_S390_MEMOP_SIDA_WRITE:
+		if (copy_from_user((void *)(sida_origin(vcpu->arch.sie_block) +
+				   mop->sida_offset), uaddr, mop->size))
+			r = -EFAULT;
+		break;
+	}
+	return r;
+}
 static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 				  struct kvm_s390_mem_op *mop)
 {
@@ -4444,6 +4472,7 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 	const u64 supported_flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION
 				    | KVM_S390_MEMOP_F_CHECK_ONLY;
 
+	BUILD_BUG_ON(sizeof(*mop) != 64);
 	if (mop->flags & ~supported_flags || mop->ar >= NUM_ACRS || !mop->size)
 		return -EINVAL;
 
@@ -4460,6 +4489,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 
 	switch (mop->op) {
 	case KVM_S390_MEMOP_LOGICAL_READ:
+		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+			r = -EINVAL;
+			break;
+		}
 		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
 			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
 					    mop->size, GACC_FETCH);
@@ -4472,6 +4505,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 		}
 		break;
 	case KVM_S390_MEMOP_LOGICAL_WRITE:
+		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+			r = -EINVAL;
+			break;
+		}
 		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
 			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
 					    mop->size, GACC_STORE);
@@ -4483,6 +4520,11 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 		}
 		r = write_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size);
 		break;
+	case KVM_S390_MEMOP_SIDA_READ:
+	case KVM_S390_MEMOP_SIDA_WRITE:
+		/* we are locked against sida going away by the vcpu->mutex */
+		r = kvm_s390_guest_sida_op(vcpu, mop);
+		break;
 	default:
 		r = -EINVAL;
 	}
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
index 09573e36c329..80169a9b43ec 100644
--- a/arch/s390/kvm/pv.c
+++ b/arch/s390/kvm/pv.c
@@ -92,6 +92,7 @@ int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
 
 	free_pages(vcpu->arch.pv.stor_base,
 		   get_order(uv_info.guest_cpu_stor_len));
+	free_page(sida_origin(vcpu->arch.sie_block));
 	vcpu->arch.sie_block->pv_handle_cpu = 0;
 	vcpu->arch.sie_block->pv_handle_config = 0;
 	memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv));
@@ -121,6 +122,14 @@ int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
 	uvcb.state_origin = (u64)vcpu->arch.sie_block;
 	uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base;
 
+	/* Alloc Secure Instruction Data Area Designation */
+	vcpu->arch.sie_block->sidad = __get_free_page(GFP_KERNEL | __GFP_ZERO);
+	if (!vcpu->arch.sie_block->sidad) {
+		free_pages(vcpu->arch.pv.stor_base,
+			   get_order(uv_info.guest_cpu_stor_len));
+		return -ENOMEM;
+	}
+
 	cc = uv_call(0, (u64)&uvcb);
 	*rc = uvcb.header.rc;
 	*rrc = uvcb.header.rrc;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 207915488502..0fdee1bc3798 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -475,11 +475,15 @@ struct kvm_s390_mem_op {
 	__u32 op;		/* type of operation */
 	__u64 buf;		/* buffer in userspace */
 	__u8 ar;		/* the access register number */
-	__u8 reserved[31];	/* should be set to 0 */
+	__u8 reserved21[3];	/* should be set to 0 */
+	__u32 sida_offset;	/* offset into the sida */
+	__u8 reserved28[24];	/* should be set to 0 */
 };
 /* types for kvm_s390_mem_op->op */
 #define KVM_S390_MEMOP_LOGICAL_READ	0
 #define KVM_S390_MEMOP_LOGICAL_WRITE	1
+#define KVM_S390_MEMOP_SIDA_READ	2
+#define KVM_S390_MEMOP_SIDA_WRITE	3
 /* flags for kvm_s390_mem_op->flags */
 #define KVM_S390_MEMOP_F_CHECK_ONLY		(1ULL << 0)
 #define KVM_S390_MEMOP_F_INJECT_EXCEPTION	(1ULL << 1)
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (19 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 11:11   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events Christian Borntraeger
                   ` (20 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

The SPX instruction is handled by the ultravisor. We do get a
notification intercept, though. Let us update our internal view.

In addition to that, when the guest prefix page is not secure, an
intercept 112 (0x70) is indicated. Let us make the prefix pages
secure again.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  1 +
 arch/s390/kvm/intercept.c        | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index aa945b101fff..0ea82152d2f7 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -225,6 +225,7 @@ struct kvm_s390_sie_block {
 #define ICPT_INT_ENABLE	0x64
 #define ICPT_PV_INSTR	0x68
 #define ICPT_PV_NOTIFY	0x6c
+#define ICPT_PV_PREF	0x70
 	__u8	icptcode;		/* 0x0050 */
 	__u8	icptstatus;		/* 0x0051 */
 	__u16	ihcpu;			/* 0x0052 */
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index db3dd5ee0b7a..6c9db886381c 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -451,6 +451,15 @@ static int handle_operexc(struct kvm_vcpu *vcpu)
 	return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
 }
 
+static int handle_pv_spx(struct kvm_vcpu *vcpu)
+{
+	u32 pref = *(u32 *)vcpu->arch.sie_block->sidad;
+
+	kvm_s390_set_prefix(vcpu, pref);
+	trace_kvm_s390_handle_prefix(vcpu, 1, pref);
+	return 0;
+}
+
 static int handle_pv_sclp(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
@@ -477,6 +486,8 @@ static int handle_pv_sclp(struct kvm_vcpu *vcpu)
 
 static int handle_pv_notification(struct kvm_vcpu *vcpu)
 {
+	if (vcpu->arch.sie_block->ipa == 0xb210)
+		return handle_pv_spx(vcpu);
 	if (vcpu->arch.sie_block->ipa == 0xb220)
 		return handle_pv_sclp(vcpu);
 
@@ -534,6 +545,13 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
 	case ICPT_PV_NOTIFY:
 		rc = handle_pv_notification(vcpu);
 		break;
+	case ICPT_PV_PREF:
+		rc = 0;
+		gmap_convert_to_secure(vcpu->arch.gmap,
+				       kvm_s390_get_prefix(vcpu));
+		gmap_convert_to_secure(vcpu->arch.gmap,
+				       kvm_s390_get_prefix(vcpu) + PAGE_SIZE);
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (20 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 14:23   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area Christian Borntraeger
                   ` (19 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

The current code tries to first pin shared pages, if that fails (e.g.
because the page is not shared) it will export them. For shared pages
this means that we get a new intercept telling us that the guest is
unsharing that page. We will make the page secure at that point in time
and revoke the host access. This is synchronized with other host events,
e.g. the code will wait until host I/O has finished.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/intercept.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 6c9db886381c..1e231058e4b3 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -16,6 +16,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/irq.h>
 #include <asm/sysinfo.h>
+#include <asm/uv.h>
 
 #include "kvm-s390.h"
 #include "gaccess.h"
@@ -484,12 +485,35 @@ static int handle_pv_sclp(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int handle_pv_uvc(struct kvm_vcpu *vcpu)
+{
+	struct uv_cb_share *guest_uvcb = (void *)vcpu->arch.sie_block->sidad;
+	struct uv_cb_cts uvcb = {
+		.header.cmd	= UVC_CMD_UNPIN_PAGE_SHARED,
+		.header.len	= sizeof(uvcb),
+		.guest_handle	= kvm_s390_pv_handle(vcpu->kvm),
+		.gaddr		= guest_uvcb->paddr,
+	};
+	int rc;
+
+	if (guest_uvcb->header.cmd != UVC_CMD_REMOVE_SHARED_ACCESS) {
+		WARN_ONCE(1, "Unexpected UVC 0x%x!\n", guest_uvcb->header.cmd);
+		return 0;
+	}
+	rc = gmap_make_secure(vcpu->arch.gmap, uvcb.gaddr, &uvcb);
+	if (rc == -EINVAL)
+		return 0;
+	return rc;
+}
+
 static int handle_pv_notification(struct kvm_vcpu *vcpu)
 {
 	if (vcpu->arch.sie_block->ipa == 0xb210)
 		return handle_pv_spx(vcpu);
 	if (vcpu->arch.sie_block->ipa == 0xb220)
 		return handle_pv_sclp(vcpu);
+	if (vcpu->arch.sie_block->ipa == 0xb9a4)
+		return handle_pv_uvc(vcpu);
 
 	return handle_instruction(vcpu);
 }
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (21 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 14:24   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 24/42] KVM: s390: protvirt: STSI handling Christian Borntraeger
                   ` (18 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

STHYI data has to go through the bounce buffer.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/intercept.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 1e231058e4b3..cfabeecbb777 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -392,7 +392,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
 		goto out;
 	}
 
-	if (addr & ~PAGE_MASK)
+	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (addr & ~PAGE_MASK))
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
 	sctns = (void *)get_zeroed_page(GFP_KERNEL);
@@ -403,10 +403,15 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
 
 out:
 	if (!cc) {
-		r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);
-		if (r) {
-			free_page((unsigned long)sctns);
-			return kvm_s390_inject_prog_cond(vcpu, r);
+		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+			memcpy((void *)(sida_origin(vcpu->arch.sie_block)),
+			       sctns, PAGE_SIZE);
+		} else {
+			r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);
+			if (r) {
+				free_page((unsigned long)sctns);
+				return kvm_s390_inject_prog_cond(vcpu, r);
+			}
 		}
 	}
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (22 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  8:35   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg Christian Borntraeger
                   ` (17 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Save response to sidad and disable address checking for protected
guests.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/priv.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index ed52ffa8d5d4..b2de7dc5f58d 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 
 	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
 
-	if (operand2 & 0xfff)
+	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
 	switch (fc) {
@@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 		handle_stsi_3_2_2(vcpu, (void *) mem);
 		break;
 	}
-
-	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
+	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
+		       PAGE_SIZE);
+		rc = 0;
+	} else {
+		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
+	}
 	if (rc) {
 		rc = kvm_s390_inject_prog_cond(vcpu, rc);
 		goto out;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (23 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 24/42] KVM: s390: protvirt: STSI handling Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  8:40   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible Christian Borntraeger
                   ` (16 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

A lot of the registers are controlled by the Ultravisor and never
visible to KVM. Some fields in the sie control block are overlayed, like
gbea. As no known userspace uses the ONE_REG interface on s390 if sync
regs are available, no functionality is lost if it is disabled for
protected guests.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/api.rst | 6 ++++--
 arch/s390/kvm/kvm-s390.c       | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index cb58714fe60d..a82166e5f7d9 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2117,7 +2117,8 @@ Errors:
 
   ======   ============================================================
   ENOENT   no such register
-  EINVAL   invalid register ID, or no such register
+  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
+           for protected guests (s390)
   EPERM    (arm64) register access not allowed before vcpu finalization
   ======   ============================================================
 
@@ -2552,7 +2553,8 @@ Errors include:
 
   ======== ============================================================
   ENOENT   no such register
-  EINVAL   invalid register ID, or no such register
+  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
+           for protected guests (s390)
   EPERM    (arm64) register access not allowed before vcpu finalization
   ======== ============================================================
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 8db82aaf1275..d20a7fa9d480 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4638,6 +4638,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_SET_ONE_REG:
 	case KVM_GET_ONE_REG: {
 		struct kvm_one_reg reg;
+		r = -EINVAL;
+		if (kvm_s390_pv_is_protected(vcpu->kvm))
+			break;
 		r = -EFAULT;
 		if (copy_from_user(&reg, argp, sizeof(reg)))
 			break;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (24 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  8:42   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 27/42] KVM: s390: protvirt: Only sync fmt4 registers Christian Borntraeger
                   ` (15 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

For protected VMs the hypervisor can not access guest breaking event
address, program parameter, bpbc and todpr. Do not reset those fields
as the control block does not provide access to these fields.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d20a7fa9d480..5b551cc73540 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3442,14 +3442,16 @@ static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
 	kvm_s390_set_prefix(vcpu, 0);
 	kvm_s390_set_cpu_timer(vcpu, 0);
 	vcpu->arch.sie_block->ckc = 0;
-	vcpu->arch.sie_block->todpr = 0;
 	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
 	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
 	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
 	vcpu->run->s.regs.fpc = 0;
-	vcpu->arch.sie_block->gbea = 1;
-	vcpu->arch.sie_block->pp = 0;
-	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
+	if (!kvm_s390_pv_handle_cpu(vcpu)) {
+		vcpu->arch.sie_block->gbea = 1;
+		vcpu->arch.sie_block->pp = 0;
+		vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
+		vcpu->arch.sie_block->todpr = 0;
+	}
 }
 
 static void kvm_arch_vcpu_ioctl_clear_reset(struct kvm_vcpu *vcpu)
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 27/42] KVM: s390: protvirt: Only sync fmt4 registers
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (25 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection Christian Borntraeger
                   ` (14 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

A lot of the registers are controlled by the Ultravisor and never
visible to KVM. Also some registers are overlayed, like gbea is with
sidad, which might leak data to userspace.

Hence we sync a minimal set of registers for both SIE formats and then
check and sync format 2 registers if necessary.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 110 +++++++++++++++++++++++++--------------
 1 file changed, 70 insertions(+), 40 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 5b551cc73540..4a97d3b7840e 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4048,7 +4048,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 	return rc;
 }
 
-static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+static void sync_regs_fmt2(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	struct runtime_instr_cb *riccb;
 	struct gs_cb *gscb;
@@ -4057,16 +4057,7 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	gscb = (struct gs_cb *) &kvm_run->s.regs.gscb;
 	vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask;
 	vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr;
-	if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX)
-		kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix);
-	if (kvm_run->kvm_dirty_regs & KVM_SYNC_CRS) {
-		memcpy(&vcpu->arch.sie_block->gcr, &kvm_run->s.regs.crs, 128);
-		/* some control register changes require a tlb flush */
-		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
-	}
 	if (kvm_run->kvm_dirty_regs & KVM_SYNC_ARCH0) {
-		kvm_s390_set_cpu_timer(vcpu, kvm_run->s.regs.cputm);
-		vcpu->arch.sie_block->ckc = kvm_run->s.regs.ckc;
 		vcpu->arch.sie_block->todpr = kvm_run->s.regs.todpr;
 		vcpu->arch.sie_block->pp = kvm_run->s.regs.pp;
 		vcpu->arch.sie_block->gbea = kvm_run->s.regs.gbea;
@@ -4107,6 +4098,36 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
 		vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0;
 	}
+	if (MACHINE_HAS_GS) {
+		preempt_disable();
+		__ctl_set_bit(2, 4);
+		if (current->thread.gs_cb) {
+			vcpu->arch.host_gscb = current->thread.gs_cb;
+			save_gs_cb(vcpu->arch.host_gscb);
+		}
+		if (vcpu->arch.gs_enabled) {
+			current->thread.gs_cb = (struct gs_cb *)
+						&vcpu->run->s.regs.gscb;
+			restore_gs_cb(current->thread.gs_cb);
+		}
+		preempt_enable();
+	}
+	/* SIE will load etoken directly from SDNX and therefore kvm_run */
+}
+
+static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX)
+		kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix);
+	if (kvm_run->kvm_dirty_regs & KVM_SYNC_CRS) {
+		memcpy(&vcpu->arch.sie_block->gcr, &kvm_run->s.regs.crs, 128);
+		/* some control register changes require a tlb flush */
+		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
+	}
+	if (kvm_run->kvm_dirty_regs & KVM_SYNC_ARCH0) {
+		kvm_s390_set_cpu_timer(vcpu, kvm_run->s.regs.cputm);
+		vcpu->arch.sie_block->ckc = kvm_run->s.regs.ckc;
+	}
 	save_access_regs(vcpu->arch.host_acrs);
 	restore_access_regs(vcpu->run->s.regs.acrs);
 	/* save host (userspace) fprs/vrs */
@@ -4121,23 +4142,47 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	if (test_fp_ctl(current->thread.fpu.fpc))
 		/* User space provided an invalid FPC, let's clear it */
 		current->thread.fpu.fpc = 0;
+
+	/* Sync fmt2 only data */
+	if (likely(!kvm_s390_pv_is_protected(vcpu->kvm))) {
+		sync_regs_fmt2(vcpu, kvm_run);
+	} else {
+		/*
+		 * In several places we have to modify our internal view to
+		 * not do things that are disallowed by the ultravisor. For
+		 * example we must not inject interrupts after specific exits
+		 * (e.g. 112 prefix page not secure). We do this by turning
+		 * off the machine check, external and I/O interrupt bits
+		 * of our PSW copy. To avoid getting validity intercepts, we
+		 * do only accept the condition code from userspace.
+		 */
+		vcpu->arch.sie_block->gpsw.mask &= ~PSW_MASK_CC;
+		vcpu->arch.sie_block->gpsw.mask |= kvm_run->psw_mask &
+						   PSW_MASK_CC;
+	}
+
+	kvm_run->kvm_dirty_regs = 0;
+}
+
+static void store_regs_fmt2(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	kvm_run->s.regs.todpr = vcpu->arch.sie_block->todpr;
+	kvm_run->s.regs.pp = vcpu->arch.sie_block->pp;
+	kvm_run->s.regs.gbea = vcpu->arch.sie_block->gbea;
+	kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC;
 	if (MACHINE_HAS_GS) {
-		preempt_disable();
 		__ctl_set_bit(2, 4);
-		if (current->thread.gs_cb) {
-			vcpu->arch.host_gscb = current->thread.gs_cb;
-			save_gs_cb(vcpu->arch.host_gscb);
-		}
-		if (vcpu->arch.gs_enabled) {
-			current->thread.gs_cb = (struct gs_cb *)
-						&vcpu->run->s.regs.gscb;
-			restore_gs_cb(current->thread.gs_cb);
-		}
+		if (vcpu->arch.gs_enabled)
+			save_gs_cb(current->thread.gs_cb);
+		preempt_disable();
+		current->thread.gs_cb = vcpu->arch.host_gscb;
+		restore_gs_cb(vcpu->arch.host_gscb);
 		preempt_enable();
+		if (!vcpu->arch.host_gscb)
+			__ctl_clear_bit(2, 4);
+		vcpu->arch.host_gscb = NULL;
 	}
-	/* SIE will load etoken directly from SDNX and therefore kvm_run */
-
-	kvm_run->kvm_dirty_regs = 0;
+	/* SIE will save etoken directly into SDNX and therefore kvm_run */
 }
 
 static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
@@ -4148,13 +4193,9 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	memcpy(&kvm_run->s.regs.crs, &vcpu->arch.sie_block->gcr, 128);
 	kvm_run->s.regs.cputm = kvm_s390_get_cpu_timer(vcpu);
 	kvm_run->s.regs.ckc = vcpu->arch.sie_block->ckc;
-	kvm_run->s.regs.todpr = vcpu->arch.sie_block->todpr;
-	kvm_run->s.regs.pp = vcpu->arch.sie_block->pp;
-	kvm_run->s.regs.gbea = vcpu->arch.sie_block->gbea;
 	kvm_run->s.regs.pft = vcpu->arch.pfault_token;
 	kvm_run->s.regs.pfs = vcpu->arch.pfault_select;
 	kvm_run->s.regs.pfc = vcpu->arch.pfault_compare;
-	kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC;
 	save_access_regs(vcpu->run->s.regs.acrs);
 	restore_access_regs(vcpu->arch.host_acrs);
 	/* Save guest register state */
@@ -4163,19 +4204,8 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	/* Restore will be done lazily at return */
 	current->thread.fpu.fpc = vcpu->arch.host_fpregs.fpc;
 	current->thread.fpu.regs = vcpu->arch.host_fpregs.regs;
-	if (MACHINE_HAS_GS) {
-		__ctl_set_bit(2, 4);
-		if (vcpu->arch.gs_enabled)
-			save_gs_cb(current->thread.gs_cb);
-		preempt_disable();
-		current->thread.gs_cb = vcpu->arch.host_gscb;
-		restore_gs_cb(vcpu->arch.host_gscb);
-		preempt_enable();
-		if (!vcpu->arch.host_gscb)
-			__ctl_clear_bit(2, 4);
-		vcpu->arch.host_gscb = NULL;
-	}
-	/* SIE will save etoken directly into SDNX and therefore kvm_run */
+	if (likely(!kvm_s390_pv_is_protected(vcpu->kvm)))
+		store_regs_fmt2(vcpu, kvm_run);
 }
 
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (26 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 27/42] KVM: s390: protvirt: Only sync fmt4 registers Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:33   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling Christian Borntraeger
                   ` (13 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Only two program exceptions can be injected for a protected guest:
specification and operand.

For both, a code needs to be specified in the interrupt injection
control of the state description, as the guest prefix page is not
accessible to KVM for such guests.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/interrupt.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 3e160d9a214f..7a10096fa204 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -836,6 +836,21 @@ static int __must_check __deliver_external_call(struct kvm_vcpu *vcpu)
 	return rc ? -EFAULT : 0;
 }
 
+static int __deliver_prog_pv(struct kvm_vcpu *vcpu, u16 code)
+{
+	switch (code) {
+	case PGM_SPECIFICATION:
+		vcpu->arch.sie_block->iictl = IICTL_CODE_SPECIFICATION;
+		break;
+	case PGM_OPERAND:
+		vcpu->arch.sie_block->iictl = IICTL_CODE_OPERAND;
+		break;
+	default:
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
@@ -856,6 +871,9 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
 					 pgm_info.code, 0);
 
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		return __deliver_prog_pv(vcpu, pgm_info.code & ~PGM_PER);
+
 	switch (pgm_info.code & ~PGM_PER) {
 	case PGM_AFX_TRANSLATION:
 	case PGM_ASX_TRANSLATION:
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (27 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:38   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1 Christian Borntraeger
                   ` (12 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

If the host initialized the Ultravisor, we can set stfle bit 161
(protected virtual IPL enhancements facility), which indicates that
the IPL subcodes 8, 9, and 10 are valid. These subcodes are used by a
normal guest to set/retrieve an IPL information block of type 5 (for
protected virtual machines) and transition into protected mode.

Once in protected mode, the Ultravisor will conceal the facility bit.
Therefore each boot into protected mode has to go through
non-protected mode. There is no secure re-ipl with subcode 10 without
a previous subcode 3.

In protected mode, there is no subcode 4 available, as the VM has no
more access to its memory from non-protected mode. I.e., only a IPL
clear is possible.

The error cases will all be handled in userspace.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 4a97d3b7840e..f96c1f530cc2 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2621,6 +2621,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (css_general_characteristics.aiv && test_facility(65))
 		set_kvm_facility(kvm->arch.model.fac_mask, 65);
 
+	if (is_prot_virt_host()) {
+		set_kvm_facility(kvm->arch.model.fac_mask, 161);
+		set_kvm_facility(kvm->arch.model.fac_list, 161);
+	}
+
 	kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
 	kvm->arch.model.ibc = sclp.ibc & 0x0fff;
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (28 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:44   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor Christian Borntraeger
                   ` (11 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions.
Specific to these "soft" reboots are

* The "unshare all" UVC
* The "prepare for reset" UVC

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h |  4 ++++
 arch/s390/kvm/kvm-s390.c   | 22 ++++++++++++++++++++++
 include/uapi/linux/kvm.h   |  2 ++
 3 files changed, 28 insertions(+)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 839cb3a89986..254d5769d136 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -36,6 +36,8 @@
 #define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
 #define UVC_CMD_UNPACK_IMG		0x0301
 #define UVC_CMD_VERIFY_IMG		0x0302
+#define UVC_CMD_PREPARE_RESET		0x0320
+#define UVC_CMD_SET_UNSHARE_ALL		0x0340
 #define UVC_CMD_PIN_PAGE_SHARED		0x0341
 #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
 #define UVC_CMD_SET_SHARED_ACCESS	0x1000
@@ -56,6 +58,8 @@ enum uv_cmds_inst {
 	BIT_UVC_CMD_SET_SEC_PARMS = 11,
 	BIT_UVC_CMD_UNPACK_IMG = 13,
 	BIT_UVC_CMD_VERIFY_IMG = 14,
+	BIT_UVC_CMD_PREPARE_RESET = 18,
+	BIT_UVC_CMD_UNSHARE_ALL = 20,
 	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
 	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
 };
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index f96c1f530cc2..ad84c1144908 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2285,6 +2285,28 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 			     cmd->rrc);
 		break;
 	}
+	case KVM_PV_VM_PREP_RESET: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+				  UVC_CMD_PREPARE_RESET, &cmd->rc, &cmd->rrc);
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT PREP RESET: rc %x rrc %x",
+			     cmd->rc, cmd->rrc);
+		break;
+	}
+	case KVM_PV_VM_UNSHARE_ALL: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+				  UVC_CMD_SET_UNSHARE_ALL, &cmd->rc, &cmd->rrc);
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT UNSHARE: rc %x rrc %x",
+			     cmd->rc, cmd->rrc);
+		break;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0fdee1bc3798..b6ab4ad21b60 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1500,6 +1500,8 @@ enum pv_cmd_id {
 	KVM_PV_VM_SET_SEC_PARMS,
 	KVM_PV_VM_UNPACK,
 	KVM_PV_VM_VERIFY,
+	KVM_PV_VM_PREP_RESET,
+	KVM_PV_VM_UNSHARE_ALL,
 	KVM_PV_VCPU_CREATE,
 	KVM_PV_VCPU_DESTROY,
 };
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (29 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1 Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:48   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state Christian Borntraeger
                   ` (10 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

VCPU states have to be reported to the ultravisor for SIGP
interpretation, kdump, kexec and reboot.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h | 15 +++++++++++++++
 arch/s390/kvm/kvm-s390.c   |  7 ++++++-
 arch/s390/kvm/kvm-s390.h   |  2 ++
 arch/s390/kvm/pv.c         | 22 ++++++++++++++++++++++
 4 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 254d5769d136..7b82881ec3b4 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -37,6 +37,7 @@
 #define UVC_CMD_UNPACK_IMG		0x0301
 #define UVC_CMD_VERIFY_IMG		0x0302
 #define UVC_CMD_PREPARE_RESET		0x0320
+#define UVC_CMD_CPU_SET_STATE		0x0330
 #define UVC_CMD_SET_UNSHARE_ALL		0x0340
 #define UVC_CMD_PIN_PAGE_SHARED		0x0341
 #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
@@ -58,6 +59,7 @@ enum uv_cmds_inst {
 	BIT_UVC_CMD_SET_SEC_PARMS = 11,
 	BIT_UVC_CMD_UNPACK_IMG = 13,
 	BIT_UVC_CMD_VERIFY_IMG = 14,
+	BIT_UVC_CMD_CPU_SET_STATE = 17,
 	BIT_UVC_CMD_PREPARE_RESET = 18,
 	BIT_UVC_CMD_UNSHARE_ALL = 20,
 	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
@@ -164,6 +166,19 @@ struct uv_cb_unp {
 	u64 reserved38[3];
 } __packed __aligned(8);
 
+#define PV_CPU_STATE_OPR	1
+#define PV_CPU_STATE_STP	2
+#define PV_CPU_STATE_CHKSTP	3
+
+struct uv_cb_cpu_set_state {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 cpu_handle;
+	u8  reserved20[7];
+	u8  state;
+	u64 reserved28[5];
+};
+
 /*
  * A common UV call struct for calls that take no payload
  * Examples:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ad84c1144908..5426b01e3da1 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4396,6 +4396,7 @@ static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
 void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 {
 	int i, online_vcpus, started_vcpus = 0;
+	u16 rc, rrc;
 
 	if (!is_vcpu_stopped(vcpu))
 		return;
@@ -4421,7 +4422,8 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 		 */
 		__disable_ibs_on_all_vcpus(vcpu->kvm);
 	}
-
+	/* Let's tell the UV that we want to start again */
+	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
 	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
 	/*
 	 * Another VCPU might have used IBS while we were offline.
@@ -4436,6 +4438,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 {
 	int i, online_vcpus, started_vcpus = 0;
 	struct kvm_vcpu *started_vcpu = NULL;
+	u16 rc, rrc;
 
 	if (is_vcpu_stopped(vcpu))
 		return;
@@ -4449,6 +4452,8 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 	kvm_s390_clear_stop_irq(vcpu);
 
 	kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
+	/* Let's tell the UV that we successfully stopped the vcpu */
+	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP, &rc, &rrc);
 	__disable_ibs_on_vcpu(vcpu);
 
 	for (i = 0; i < online_vcpus; i++) {
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index d5503dd0d1e4..1af1e30beead 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -218,6 +218,8 @@ int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
 			      u16 *rrc);
 int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
 		       unsigned long tweak, u16 *rc, u16 *rrc);
+int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
+			      u16 *rrc);
 
 static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
 {
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
index 80169a9b43ec..b4bf6b6eb708 100644
--- a/arch/s390/kvm/pv.c
+++ b/arch/s390/kvm/pv.c
@@ -271,3 +271,25 @@ int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
 		KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful");
 	return ret;
 }
+
+int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
+			      u16 *rrc)
+{
+	struct uv_cb_cpu_set_state uvcb = {
+		.header.cmd	= UVC_CMD_CPU_SET_STATE,
+		.header.len	= sizeof(uvcb),
+		.cpu_handle	= kvm_s390_pv_handle_cpu(vcpu),
+		.state		= state,
+	};
+	int cc;
+
+	if (!kvm_s390_pv_handle_cpu(vcpu))
+		return -EINVAL;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	if (cc)
+		return -EINVAL;
+	return 0;
+}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (30 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:50   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112 Christian Borntraeger
                   ` (9 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Code 5 for the set cpu state UV call tells the UV to load a PSW from
the SE header (first IPL) or from guest location 0x0 (diag 308 subcode
0/1). Also it sets the cpu into operating state afterwards, so we can
start it.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h | 1 +
 arch/s390/kvm/kvm-s390.c   | 8 ++++++++
 include/uapi/linux/kvm.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index 7b82881ec3b4..d59825d95b9d 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -169,6 +169,7 @@ struct uv_cb_unp {
 #define PV_CPU_STATE_OPR	1
 #define PV_CPU_STATE_STP	2
 #define PV_CPU_STATE_CHKSTP	3
+#define PV_CPU_STATE_OPR_LOAD	5
 
 struct uv_cb_cpu_set_state {
 	struct uv_cb_header header;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 5426b01e3da1..b6113285f47f 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4656,6 +4656,14 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
 		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
 		break;
 	}
+	case KVM_PV_VCPU_SET_IPL_PSW: {
+		if (!kvm_s390_pv_handle_cpu(vcpu))
+			return -EINVAL;
+
+		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD,
+					      &cmd->rc, &cmd->rrc);
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6ab4ad21b60..887ecb3dbc79 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1504,6 +1504,7 @@ enum pv_cmd_id {
 	KVM_PV_VM_UNSHARE_ALL,
 	KVM_PV_VCPU_CREATE,
 	KVM_PV_VCPU_DESTROY,
+	KVM_PV_VCPU_SET_IPL_PSW,
 };
 
 struct kvm_pv_cmd {
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (31 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:53   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start Christian Borntraeger
                   ` (8 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

We're not allowed to inject interrupts on intercepts that leave the
guest state in an "in-between" state where the next SIE entry will do a
continuation, namely secure instruction interception (104) and secure
prefix interception (112).
As our PSW is just a copy of the real one that will be replaced on the
next exit, we can mask out the interrupt bits in the PSW to make sure
that we do not inject anything.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b6113285f47f..1b6963bbc96f 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4025,6 +4025,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
 	return vcpu_post_run_fault_in_sie(vcpu);
 }
 
+#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK)
 static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc, exit_reason;
@@ -4061,6 +4062,16 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 			memcpy(vcpu->run->s.regs.gprs,
 			       sie_page->pv_grregs,
 			       sizeof(sie_page->pv_grregs));
+			/*
+			 * We're not allowed to inject interrupts on intercepts
+			 * that leave the guest state in an "in-between" state
+			 * where the next SIE entry will do a continuation.
+			 * Fence interrupts in our "internal" PSW.
+			 */
+			if (vcpu->arch.sie_block->icptcode == ICPT_PV_INSTR ||
+			    vcpu->arch.sie_block->icptcode == ICPT_PV_PREF) {
+				vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
+			}
 		}
 		local_irq_disable();
 		__enable_cpu_timer_accounting(vcpu);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (32 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112 Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:53   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls Christian Borntraeger
                   ` (7 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

As PSW restart is handled by the ultravisor (and we only get a start
notification) we must re-check the PSW after a start before injecting
interrupts.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
 arch/s390/kvm/kvm-s390.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1b6963bbc96f..16af4d1a2c29 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4436,6 +4436,13 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 	/* Let's tell the UV that we want to start again */
 	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
 	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
+	/*
+	 * The real PSW might have changed due to a RESTART interpreted by the
+	 * ultravisor. We block all interrupts and let the next sie exit
+	 * refresh our view.
+	 */
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
 	/*
 	 * Another VCPU might have used IBS while we were offline.
 	 * Let's play safe and flush the VCPU at startup.
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (33 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18  9:54   ` David Hildenbrand
  2020-02-14 22:26 ` [PATCH v2 36/42] DOCUMENTATION: Protected virtual machine introduction and IPL Christian Borntraeger
                   ` (6 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

For protected VMs, the VCPU resets are done by the Ultravisor, as KVM
has no access to the VCPU registers.

Note that the ultravisor will only accept a call for the exact reset
that has been requested.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/uv.h |  6 ++++++
 arch/s390/kvm/kvm-s390.c   | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index d59825d95b9d..d4fb54231932 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -36,7 +36,10 @@
 #define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
 #define UVC_CMD_UNPACK_IMG		0x0301
 #define UVC_CMD_VERIFY_IMG		0x0302
+#define UVC_CMD_CPU_RESET		0x0310
+#define UVC_CMD_CPU_RESET_INITIAL	0x0311
 #define UVC_CMD_PREPARE_RESET		0x0320
+#define UVC_CMD_CPU_RESET_CLEAR		0x0321
 #define UVC_CMD_CPU_SET_STATE		0x0330
 #define UVC_CMD_SET_UNSHARE_ALL		0x0340
 #define UVC_CMD_PIN_PAGE_SHARED		0x0341
@@ -59,8 +62,11 @@ enum uv_cmds_inst {
 	BIT_UVC_CMD_SET_SEC_PARMS = 11,
 	BIT_UVC_CMD_UNPACK_IMG = 13,
 	BIT_UVC_CMD_VERIFY_IMG = 14,
+	BIT_UVC_CMD_CPU_RESET = 15,
+	BIT_UVC_CMD_CPU_RESET_INITIAL = 16,
 	BIT_UVC_CMD_CPU_SET_STATE = 17,
 	BIT_UVC_CMD_PREPARE_RESET = 18,
+	BIT_UVC_CMD_CPU_PERFORM_CLEAR_RESET = 19,
 	BIT_UVC_CMD_UNSHARE_ALL = 20,
 	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
 	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 16af4d1a2c29..932f7f32e82f 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4695,6 +4695,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	void __user *argp = (void __user *)arg;
 	int idx;
 	long r;
+	u16 rc, rrc;
 
 	vcpu_load(vcpu);
 
@@ -4716,14 +4717,33 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_S390_CLEAR_RESET:
 		r = 0;
 		kvm_arch_vcpu_ioctl_clear_reset(vcpu);
+		if (kvm_s390_pv_handle_cpu(vcpu)) {
+			r = uv_cmd_nodata(kvm_s390_pv_handle_cpu(vcpu),
+					  UVC_CMD_CPU_RESET_CLEAR, &rc, &rrc);
+			VCPU_EVENT(vcpu, 3, "PROTVIRT RESET CLEAR VCPU: rc %x rrc %x",
+				   rc, rrc);
+		}
 		break;
 	case KVM_S390_INITIAL_RESET:
 		r = 0;
 		kvm_arch_vcpu_ioctl_initial_reset(vcpu);
+		if (kvm_s390_pv_handle_cpu(vcpu)) {
+			r = uv_cmd_nodata(kvm_s390_pv_handle_cpu(vcpu),
+					  UVC_CMD_CPU_RESET_INITIAL,
+					  &rc, &rrc);
+			VCPU_EVENT(vcpu, 3, "PROTVIRT RESET INITIAL VCPU: rc %x rrc %x",
+				   rc, rrc);
+		}
 		break;
 	case KVM_S390_NORMAL_RESET:
 		r = 0;
 		kvm_arch_vcpu_ioctl_normal_reset(vcpu);
+		if (kvm_s390_pv_handle_cpu(vcpu)) {
+			r = uv_cmd_nodata(kvm_s390_pv_handle_cpu(vcpu),
+					  UVC_CMD_CPU_RESET, &rc, &rrc);
+			VCPU_EVENT(vcpu, 3, "PROTVIRT RESET NORMAL VCPU: rc %x rrc %x",
+				   rc, rrc);
+		}
 		break;
 	case KVM_SET_ONE_REG:
 	case KVM_GET_ONE_REG: {
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 36/42] DOCUMENTATION: Protected virtual machine introduction and IPL
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (34 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 37/42] s390/uv: Fix handling of length extensions (already in s390 tree) Christian Borntraeger
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

Add documentation about protected KVM guests and description of changes
that are necessary to move a KVM VM into Protected Virtualization mode.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: fixing and conversion to rst]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/index.rst        |   2 +
 Documentation/virt/kvm/s390-pv-boot.rst |  83 +++++++++++++++++
 Documentation/virt/kvm/s390-pv.rst      | 116 ++++++++++++++++++++++++
 MAINTAINERS                             |   1 +
 4 files changed, 202 insertions(+)
 create mode 100644 Documentation/virt/kvm/s390-pv-boot.rst
 create mode 100644 Documentation/virt/kvm/s390-pv.rst

diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst
index 774deaebf7fa..dcc252634cf9 100644
--- a/Documentation/virt/kvm/index.rst
+++ b/Documentation/virt/kvm/index.rst
@@ -18,6 +18,8 @@ KVM
    nested-vmx
    ppc-pv
    s390-diag
+   s390-pv
+   s390-pv-boot
    timekeeping
    vcpu-requests
 
diff --git a/Documentation/virt/kvm/s390-pv-boot.rst b/Documentation/virt/kvm/s390-pv-boot.rst
new file mode 100644
index 000000000000..b762df206ab7
--- /dev/null
+++ b/Documentation/virt/kvm/s390-pv-boot.rst
@@ -0,0 +1,83 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+s390 (IBM Z) Boot/IPL of Protected VMs
+======================================
+
+Summary
+-------
+The memory of Protected Virtual Machines (PVMs) is not accessible to
+I/O or the hypervisor. In those cases where the hypervisor needs to
+access the memory of a PVM, that memory must be made accessible.
+Memory made accessible to the hypervisor will be encrypted. See
+:doc:`s390-pv` for details."
+
+On IPL (boot) a small plaintext bootloader is started, which provides
+information about the encrypted components and necessary metadata to
+KVM to decrypt the protected virtual machine.
+
+Based on this data, KVM will make the protected virtual machine known
+to the Ultravisor(UV) and instruct it to secure the memory of the PVM,
+decrypt the components and verify the data and address list hashes, to
+ensure integrity. Afterwards KVM can run the PVM via the SIE
+instruction which the UV will intercept and execute on KVM's behalf.
+
+As the guest image is just like an opaque kernel image that does the
+switch into PV mode itself, the user can load encrypted guest
+executables and data via every available method (network, dasd, scsi,
+direct kernel, ...) without the need to change the boot process.
+
+
+Diag308
+-------
+This diagnose instruction is the basic mechanism to handle IPL and
+related operations for virtual machines. The VM can set and retrieve
+IPL information blocks, that specify the IPL method/devices and
+request VM memory and subsystem resets, as well as IPLs.
+
+For PVMs this concept has been extended with new subcodes:
+
+Subcode 8: Set an IPL Information Block of type 5 (information block
+for PVMs)
+Subcode 9: Store the saved block in guest memory
+Subcode 10: Move into Protected Virtualization mode
+
+The new PV load-device-specific-parameters field specifies all data
+that is necessary to move into PV mode.
+
+* PV Header origin
+* PV Header length
+* List of Components composed of
+   * AES-XTS Tweak prefix
+   * Origin
+   * Size
+
+The PV header contains the keys and hashes, which the UV will use to
+decrypt and verify the PV, as well as control flags and a start PSW.
+
+The components are for instance an encrypted kernel, kernel parameters
+and initrd. The components are decrypted by the UV.
+
+After the initial import of the encrypted data, all defined pages will
+contain the guest content. All non-specified pages will start out as
+zero pages on first access.
+
+
+When running in protected virtualization mode, some subcodes will result in
+exceptions or return error codes.
+
+Subcodes 4 and 7, which specify operations that do not clear the guest
+memory, will result in specification exceptions. This is because the
+UV will clear all memory when a secure VM is removed, and therefore
+non-clearing IPL subcodes are not allowed."
+
+Subcodes 8, 9, 10 will result in specification exceptions.
+Re-IPL into a protected mode is only possible via a detour into non
+protected mode.
+
+Keys
+----
+Every CEC will have a unique public key to enable tooling to build
+encrypted images.
+See  `s390-tools <https://github.com/ibm-s390-tools/s390-tools/>`_
+for the tooling.
diff --git a/Documentation/virt/kvm/s390-pv.rst b/Documentation/virt/kvm/s390-pv.rst
new file mode 100644
index 000000000000..27fe03eaeaad
--- /dev/null
+++ b/Documentation/virt/kvm/s390-pv.rst
@@ -0,0 +1,116 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+s390 (IBM Z) Ultravisor and Protected VMs
+=========================================
+
+Summary
+-------
+Protected virtual machines (PVM) are KVM VMs that do not allow KVM to
+access VM state like guest memory or guest registers. Instead, the
+PVMs are mostly managed by a new entity called Ultravisor (UV). The UV
+provides an API that can be used by PVMs and KVM to request management
+actions.
+
+Each guest starts in the non-protected mode and then may make a
+request to transition into protected mode. On transition, KVM
+registers the guest and its VCPUs with the Ultravisor and prepares
+everything for running it.
+
+The Ultravisor will secure and decrypt the guest's boot memory
+(i.e. kernel/initrd). It will safeguard state changes like VCPU
+starts/stops and injected interrupts while the guest is running.
+
+As access to the guest's state, such as the SIE state description, is
+normally needed to be able to run a VM, some changes have been made in
+the behavior of the SIE instruction. A new format 4 state description
+has been introduced, where some fields have different meanings for a
+PVM. SIE exits are minimized as much as possible to improve speed and
+reduce exposed guest state.
+
+
+Interrupt injection
+-------------------
+Interrupt injection is safeguarded by the Ultravisor. As KVM doesn't
+have access to the VCPUs' lowcores, injection is handled via the
+format 4 state description.
+
+Machine check, external, IO and restart interruptions each can be
+injected on SIE entry via a bit in the interrupt injection control
+field (offset 0x54). If the guest cpu is not enabled for the interrupt
+at the time of injection, a validity interception is recognized. The
+format 4 state description contains fields in the interception data
+block where data associated with the interrupt can be transported.
+
+Program and Service Call exceptions have another layer of
+safeguarding; they can only be injected for instructions that have
+been intercepted into KVM. The exceptions need to be a valid outcome
+of an instruction emulation by KVM, e.g. we can never inject a
+addressing exception as they are reported by SIE since KVM has no
+access to the guest memory.
+
+
+Mask notification interceptions
+-------------------------------
+In order to be notified when a PVM enables a certain class of
+interrupt, KVM cannot intercept lctl(g) and lpsw(e) anymore. As a
+replacement, two new interception codes have been introduced: One
+indicating that the contents of CRs 0, 6, or 14 have been changed,
+indicating different interruption subclasses; and one indicating that
+PSW bit 13 has been changed, indicating that a machine check
+intervention was requested and those are now enabled.
+
+Instruction emulation
+---------------------
+With the format 4 state description for PVMs, the SIE instruction already
+interprets more instructions than it does with format 2. It is not able
+to interpret every instruction, but needs to hand some tasks to KVM;
+therefore, the SIE and the ultravisor safeguard emulation inputs and outputs.
+
+The control structures associated with SIE provide the Secure
+Instruction Data Area (SIDA), the Interception Parameters (IP) and the
+Secure Interception General Register Save Area.  Guest GRs and most of
+the instruction data, such as I/O data structures, are filtered.
+Instruction data is copied to and from the Secure Instruction Data
+Area (SIDA) when needed.  Guest GRs are put into / retrieved from the
+Secure Interception General Register Save Area.
+
+Only GR values needed to emulate an instruction will be copied into this
+save area and the real register numbers will be hidden.
+
+The Interception Parameters state description field still contains the
+the bytes of the instruction text, but with pre-set register values
+instead of the actual ones. I.e. each instruction always uses the same
+instruction text, in order not to leak guest instruction text.
+This also implies that the register content that a guest had in r<n>
+may be in r<m> from the hypervisor's point of view.
+
+The Secure Instruction Data Area contains instruction storage
+data. Instruction data, i.e. data being referenced by an instruction
+like the SCCB for sclp, is moved via the SIDA. When an instruction is
+intercepted, the SIE will only allow data and program interrupts for
+this instruction to be moved to the guest via the two data areas
+discussed before. Other data is either ignored or results in validity
+interceptions.
+
+
+Instruction emulation interceptions
+-----------------------------------
+There are two types of SIE secure instruction intercepts: the normal
+and the notification type. Normal secure instruction intercepts will
+make the guest pending for instruction completion of the intercepted
+instruction type, i.e. on SIE entry it is attempted to complete
+emulation of the instruction with the data provided by KVM. That might
+be a program exception or instruction completion.
+
+The notification type intercepts inform KVM about guest environment
+changes due to guest instruction interpretation. Such an interception
+is recognized, for example, for the store prefix instruction to provide
+the new lowcore location. On SIE reentry, any KVM data in the data areas
+is ignored and execution continues as if the guest instruction had
+completed. For that reason KVM is not allowed to inject a program
+interrupt.
+
+Links
+-----
+`KVM Forum 2019 presentation <https://static.sched.com/hosted_files/kvmforum2019/3b/ibm_protected_vms_s390x.pdf>`_
diff --git a/MAINTAINERS b/MAINTAINERS
index 38fe2f3f7b6f..115956e9ac8f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9209,6 +9209,7 @@ L:	kvm@vger.kernel.org
 W:	http://www.ibm.com/developerworks/linux/linux390/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
 S:	Supported
+F:	Documentation/virt/kvm/s390*
 F:	arch/s390/include/uapi/asm/kvm*
 F:	arch/s390/include/asm/gmap.h
 F:	arch/s390/include/asm/kvm*
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 37/42] s390/uv: Fix handling of length extensions (already in s390 tree)
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (35 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 36/42] DOCUMENTATION: Protected virtual machine introduction and IPL Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 38/42] s390: protvirt: Add sysfs firmware interface for Ultravisor information Christian Borntraeger
                   ` (4 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, stable

The query parameter block might contain additional information and can
be extended in the future. If the size of the block does not suffice we
get an error code of rc=0x100.  The buffer will contain all information
up to the specified size and the hypervisor/guest simply do not need the
additional information as they do not know about the new data.  That
means that we can (and must) accept rc=0x100 as success.

Cc: stable@vger.kernel.org
Fixes: 5abb9351dfd9 ("s390/uv: introduce guest side ultravisor code")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/boot/uv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
index af9e1cc93c68..c003593664cd 100644
--- a/arch/s390/boot/uv.c
+++ b/arch/s390/boot/uv.c
@@ -21,7 +21,7 @@ void uv_query_info(void)
 	if (!test_facility(158))
 		return;
 
-	if (uv_call(0, (uint64_t)&uvcb))
+	if (uv_call(0, (uint64_t)&uvcb) && uvcb.header.rc != 0x100)
 		return;
 
 	if (IS_ENABLED(CONFIG_KVM)) {
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 38/42] s390: protvirt: Add sysfs firmware interface for Ultravisor information
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (36 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 37/42] s390/uv: Fix handling of length extensions (already in s390 tree) Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases Christian Borntraeger
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Janosch Frank

From: Janosch Frank <frankja@linux.ibm.com>

That information, e.g. the maximum number of guests or installed
Ultravisor facilities, is interesting for QEMU, Libvirt and
administrators.

Let's provide an easily parsable API to get that information.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
---
 arch/s390/kernel/uv.c | 86 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 9a6c309864a0..fb606b171f42 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -322,5 +322,91 @@ int arch_make_page_accessible(struct page *page)
 	return rc;
 }
 EXPORT_SYMBOL_GPL(arch_make_page_accessible);
+#endif
+
+#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM)
+static ssize_t uv_query_facilities(struct kobject *kobj,
+				   struct kobj_attribute *attr, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%lx\n%lx\n%lx\n%lx\n",
+			uv_info.inst_calls_list[0],
+			uv_info.inst_calls_list[1],
+			uv_info.inst_calls_list[2],
+			uv_info.inst_calls_list[3]);
+}
+
+static struct kobj_attribute uv_query_facilities_attr =
+	__ATTR(facilities, 0444, uv_query_facilities, NULL);
+
+static ssize_t uv_query_max_guest_cpus(struct kobject *kobj,
+				       struct kobj_attribute *attr, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%d\n",
+			uv_info.max_guest_cpus);
+}
+
+static struct kobj_attribute uv_query_max_guest_cpus_attr =
+	__ATTR(max_cpus, 0444, uv_query_max_guest_cpus, NULL);
+
+static ssize_t uv_query_max_guest_vms(struct kobject *kobj,
+				      struct kobj_attribute *attr, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%d\n",
+			uv_info.max_num_sec_conf);
+}
+
+static struct kobj_attribute uv_query_max_guest_vms_attr =
+	__ATTR(max_guests, 0444, uv_query_max_guest_vms, NULL);
+
+static ssize_t uv_query_max_guest_addr(struct kobject *kobj,
+				       struct kobj_attribute *attr, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%lx\n",
+			uv_info.max_sec_stor_addr);
+}
+
+static struct kobj_attribute uv_query_max_guest_addr_attr =
+	__ATTR(max_address, 0444, uv_query_max_guest_addr, NULL);
+
+static struct attribute *uv_query_attrs[] = {
+	&uv_query_facilities_attr.attr,
+	&uv_query_max_guest_cpus_attr.attr,
+	&uv_query_max_guest_vms_attr.attr,
+	&uv_query_max_guest_addr_attr.attr,
+	NULL,
+};
+
+static struct attribute_group uv_query_attr_group = {
+	.attrs = uv_query_attrs,
+};
 
+static struct kset *uv_query_kset;
+struct kobject *uv_kobj;
+
+static int __init uv_info_init(void)
+{
+	int rc = -ENOMEM;
+
+	if (!test_facility(158))
+		return 0;
+
+	uv_kobj = kobject_create_and_add("uv", firmware_kobj);
+	if (!uv_kobj)
+		return -ENOMEM;
+
+	uv_query_kset = kset_create_and_add("query", NULL, uv_kobj);
+	if (!uv_query_kset)
+		goto out_kobj;
+
+	rc = sysfs_create_group(&uv_query_kset->kobj, &uv_query_attr_group);
+	if (!rc)
+		return 0;
+
+	kset_unregister(uv_query_kset);
+out_kobj:
+	kobject_del(uv_kobj);
+	kobject_put(uv_kobj);
+	return rc;
+}
+device_initcall(uv_info_init);
 #endif
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (37 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 38/42] s390: protvirt: Add sysfs firmware interface for Ultravisor information Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-18 16:25   ` Will Deacon
  2020-02-14 22:26 ` [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication Christian Borntraeger
                   ` (2 subsequent siblings)
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm, Will Deacon,
	Sean Christopherson

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

This is a potential extension to do error handling if we fail to
make the page accessible if we know what others need.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 mm/gup.c            | 17 ++++++++++++-----
 mm/page-writeback.c |  6 +++++-
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a1c15d029f7c..354bcfbd844b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -193,6 +193,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	struct page *page;
 	spinlock_t *ptl;
 	pte_t *ptep, pte;
+	int ret;
 
 	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
 	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
@@ -250,8 +251,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 		if (is_zero_pfn(pte_pfn(pte))) {
 			page = pte_page(pte);
 		} else {
-			int ret;
-
 			ret = follow_pfn_pte(vma, address, ptep, flags);
 			page = ERR_PTR(ret);
 			goto out;
@@ -259,7 +258,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	}
 
 	if (flags & FOLL_SPLIT && PageTransCompound(page)) {
-		int ret;
 		get_page(page);
 		pte_unmap_unlock(ptep, ptl);
 		lock_page(page);
@@ -276,7 +274,12 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			page = ERR_PTR(-ENOMEM);
 			goto out;
 		}
-		arch_make_page_accessible(page);
+		ret = arch_make_page_accessible(page);
+		if (ret) {
+			put_page(page);
+			page = ERR_PTR(ret);
+			goto out;
+		}
 	}
 	if (flags & FOLL_TOUCH) {
 		if ((flags & FOLL_WRITE) &&
@@ -1920,7 +1923,11 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
-		arch_make_page_accessible(page);
+		ret = arch_make_page_accessible(page);
+		if (ret) {
+			put_page(head);
+			goto pte_unmap;
+		}
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4c020e4ae71c..558d7063c117 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2807,7 +2807,11 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
 	}
 	unlock_page_memcg(page);
-	arch_make_page_accessible(page);
+	/*
+	 * If writeback has been triggered on a page that cannot be made
+	 * accessible, it is too late.
+	 */
+	WARN_ON(arch_make_page_accessible(page));
 	return ret;
 
 }
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (38 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-17 14:15   ` Ulrich Weigand
  2020-02-14 22:26 ` [PATCH v2 41/42] potential fixup for "s390/mm: provide memory management functions for protected KVM guests" Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 42/42] KVM: s390: rstify new ioctls in api.rst Christian Borntraeger
  41 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik, Andrea Arcangeli, linux-mm, Will Deacon,
	Sean Christopherson

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

We might want to do different things depending on where we are coming
from.

Cc: Will Deacon <will@kernel.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[borntraeger@de.ibm.com: patch description]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 include/linux/gfp.h | 8 +++++++-
 mm/gup.c            | 4 ++--
 mm/page-writeback.c | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index be2754841369..a15fcb361e7c 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -485,8 +485,14 @@ static inline void arch_free_page(struct page *page, int order) { }
 #ifndef HAVE_ARCH_ALLOC_PAGE
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
+enum access_type {
+	MAKE_ACCESSIBLE_GENERIC,
+	MAKE_ACCESSIBLE_GET,
+	MAKE_ACCESSIBLE_GET_FAST,
+	MAKE_ACCESSIBLE_WRITEBACK
+};
 #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
-static inline int arch_make_page_accessible(struct page *page)
+static inline int arch_make_page_accessible(struct page *page, int where)
 {
 	return 0;
 }
diff --git a/mm/gup.c b/mm/gup.c
index 354bcfbd844b..ce962c155724 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -274,7 +274,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			page = ERR_PTR(-ENOMEM);
 			goto out;
 		}
-		ret = arch_make_page_accessible(page);
+		ret = arch_make_page_accessible(page, MAKE_ACCESSIBLE_GET);
 		if (ret) {
 			put_page(page);
 			page = ERR_PTR(ret);
@@ -1923,7 +1923,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
-		ret = arch_make_page_accessible(page);
+		ret = arch_make_page_accessible(page, MAKE_ACCESSIBLE_GET_FAST);
 		if (ret) {
 			put_page(head);
 			goto pte_unmap;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 558d7063c117..f85148e59800 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2811,7 +2811,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 	 * If writeback has been triggered on a page that cannot be made
 	 * accessible, it is too late.
 	 */
-	WARN_ON(arch_make_page_accessible(page));
+	WARN_ON(arch_make_page_accessible(page, MAKE_ACCESSIBLE_WRITEBACK));
 	return ret;
 
 }
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 41/42] potential fixup for "s390/mm: provide memory management functions for protected KVM guests"
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (39 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  2020-02-14 22:26 ` [PATCH v2 42/42] KVM: s390: rstify new ioctls in api.rst Christian Borntraeger
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

From: Claudio Imbrenda <imbrenda@linux.ibm.com>

Needed when "example for future extension: mm:gup/writeback: add
callbacks for inaccessible pages: source indication" is applied.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/page.h | 2 +-
 arch/s390/kernel/uv.c        | 2 +-
 arch/s390/mm/fault.c         | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index 4ebcf891ff3c..a658487fe8e7 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -154,7 +154,7 @@ static inline int devmem_is_allowed(unsigned long pfn)
 #define HAVE_ARCH_ALLOC_PAGE
 
 #if IS_ENABLED(CONFIG_PGSTE)
-int arch_make_page_accessible(struct page *page);
+int arch_make_page_accessible(struct page *page, int where);
 #define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
 #endif
 
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index fb606b171f42..5ace77694ed3 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -287,7 +287,7 @@ EXPORT_SYMBOL_GPL(gmap_convert_to_secure);
 /**
  * To be called with the page locked or with an extra reference!
  */
-int arch_make_page_accessible(struct page *page)
+int arch_make_page_accessible(struct page *page, int where)
 {
 	int rc = 0;
 
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 7bd86ebc882f..1f31025bc2cf 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -842,7 +842,7 @@ void do_secure_storage_access(struct pt_regs *regs)
 			up_read(&mm->mmap_sem);
 			break;
 		}
-		if (arch_make_page_accessible(page))
+		if (arch_make_page_accessible(page, MAKE_ACCESSIBLE_GENERIC))
 			send_sig(SIGSEGV, current, 0);
 		put_page(page);
 		up_read(&mm->mmap_sem);
@@ -851,7 +851,7 @@ void do_secure_storage_access(struct pt_regs *regs)
 		page = phys_to_page(addr);
 		if (unlikely(!try_get_page(page)))
 			break;
-		rc = arch_make_page_accessible(page);
+		rc = arch_make_page_accessible(page, MAKE_ACCESSIBLE_GENERIC);
 		put_page(page);
 		if (rc)
 			BUG();
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2 42/42] KVM: s390: rstify new ioctls in api.rst
  2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
                   ` (40 preceding siblings ...)
  2020-02-14 22:26 ` [PATCH v2 41/42] potential fixup for "s390/mm: provide memory management functions for protected KVM guests" Christian Borntraeger
@ 2020-02-14 22:26 ` Christian Borntraeger
  41 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-14 22:26 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth,
	Ulrich Weigand, Claudio Imbrenda, linux-s390, Michael Mueller,
	Vasily Gorbik

We also need to rstify the new ioctls that we added in parallel to the
rstification of the kvm docs.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/api.rst | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index a82166e5f7d9..89e9c34bc5d3 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4613,35 +4613,38 @@ unpins the VPA pages and releases all the device pages that are used to
 track the secure pages by hypervisor.
 
 4.122 KVM_S390_NORMAL_RESET
+---------------------------
 
-Capability: KVM_CAP_S390_VCPU_RESETS
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: KVM_CAP_S390_VCPU_RESETS
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
 
 This ioctl resets VCPU registers and control structures according to
 the cpu reset definition in the POP (Principles Of Operation).
 
 4.123 KVM_S390_INITIAL_RESET
+----------------------------
 
-Capability: none
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: none
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
 
 This ioctl resets VCPU registers and control structures according to
 the initial cpu reset definition in the POP. However, the cpu is not
 put into ESA mode. This reset is a superset of the normal reset.
 
 4.124 KVM_S390_CLEAR_RESET
+--------------------------
 
-Capability: KVM_CAP_S390_VCPU_RESETS
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: KVM_CAP_S390_VCPU_RESETS
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
 
 This ioctl resets VCPU registers and control structures according to
 the clear cpu reset definition in the POP. However, the cpu is not put
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-14 22:26 ` [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
@ 2020-02-17  9:14   ` David Hildenbrand
  2020-02-17 11:10     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17  9:14 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm, Will Deacon, Sean Christopherson

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
> 
> With the introduction of protected KVM guests on s390 there is now a
> concept of inaccessible pages. These pages need to be made accessible
> before the host can access them.
> 
> While cpu accesses will trigger a fault that can be resolved, I/O
> accesses will just fail.  We need to add a callback into architecture
> code for places that will do I/O, namely when writeback is started or
> when a page reference is taken.
> 
> This is not only to enable paging, file backing etc, it is also
> necessary to protect the host against a malicious user space. For
> example a bad QEMU could simply start direct I/O on such protected
> memory.  We do not want userspace to be able to trigger I/O errors and
> thus we the logic is "whenever somebody accesses that page (gup) or
> doing I/O, make sure that this page can be accessed. When the guest
> tries to access that page we will wait in the page fault handler for
> writeback to have finished and for the page_ref to be the expected
> value.
> 
> If wanted by others, the callbacks can be extended with error handlin
> and a parameter from where this is called.

s/handlin/handling/

One last question from my side:

Why is it OK to ignore errors here. IOW, why not squash "[PATCH v2
39/42] example for future extension: mm:gup/writeback: add callbacks for
inaccessible pages: error cases" into this patch.

I can see in patch "[PATCH v2 05/42] s390/mm: provide memory management
functions for protected KVM guests", that the call can fail for various
reasons. That puzzles me a bit - what would happen if any of that fails?
Or will it actually never fail for s390x (and all that error handling in
arch_make_page_accessible() is essentially dead code in real life?)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages
  2020-02-14 22:26 ` [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages Christian Borntraeger
@ 2020-02-17  9:43   ` David Hildenbrand
  2020-02-20 12:18     ` David Hildenbrand
  2020-02-20 13:31     ` Christian Borntraeger
  0 siblings, 2 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17  9:43 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
> 
> The adapter interrupt page containing the indicator bits is currently
> pinned. That means that a guest with many devices can pin a lot of
> memory pages in the host. This also complicates the reference tracking
> which is needed for memory management handling of protected virtual
> machines. It might also have some strange side effects for madvise
> MADV_DONTNEED and other things.
> 
> We can simply try to get the userspace page set the bits and free the
> page. By storing the userspace address in the irq routing entry instead
> of the guest address we can actually avoid many lookups and list walks
> so that this variant is very likely not slower.
> 
> If userspace messes around with the memory slots the worst thing that
> can happen is that we write to some other memory within that process.
> As we get the the page with FOLL_WRITE this can also not be used to
> write to shared read-only pages.
> 
> Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
> [borntraeger@de.ibm.com: patch simplification]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  Documentation/virt/kvm/devices/s390_flic.rst |  11 +-
>  arch/s390/include/asm/kvm_host.h             |   3 -
>  arch/s390/kvm/interrupt.c                    | 170 ++++++-------------
>  3 files changed, 53 insertions(+), 131 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst
> index 954190da7d04..ea96559ba501 100644
> --- a/Documentation/virt/kvm/devices/s390_flic.rst
> +++ b/Documentation/virt/kvm/devices/s390_flic.rst
> @@ -108,16 +108,9 @@ Groups:
>        mask or unmask the adapter, as specified in mask
>  
>      KVM_S390_IO_ADAPTER_MAP
> -      perform a gmap translation for the guest address provided in addr,
> -      pin a userspace page for the translated address and add it to the
> -      list of mappings
> -
> -      .. note:: A new mapping will be created unconditionally; therefore,
> -	        the calling code should avoid making duplicate mappings.
> -
> +      This is now a no-op. The mapping is purely done by the irq route.
>      KVM_S390_IO_ADAPTER_UNMAP
> -      release a userspace page for the translated address specified in addr
> -      from the list of mappings
> +      This is now a no-op. The mapping is purely done by the irq route.
>  

The interface should have accepted a hva from the very start and not
guest addresses ...

[...]

>  
>  static int modify_io_adapter(struct kvm_device *dev,
> @@ -2456,12 +2378,13 @@ static int modify_io_adapter(struct kvm_device *dev,
>  		if (ret > 0)
>  			ret = 0;
>  		break;
> +	/*
> +	 * We resolve the gpa to hva when setting the IRQ routing. the set_irq
> +	 * code uses get_user_pages_remote to do the actual write.

nit: "get_user_pages_remote()"

> +	 */
>  	case KVM_S390_IO_ADAPTER_MAP:
> -		ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr);
> -		break;
>  	case KVM_S390_IO_ADAPTER_UNMAP:
> -		ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr);
> -		break;
> +		return 0;
>  	default:
>  		ret = -EINVAL;
>  	}
> @@ -2699,19 +2622,21 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
>  	return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit;
>  }
>  
> -static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter,
> -					  u64 addr)
> +static struct page *get_map_page(struct kvm *kvm,
> +				 struct s390_io_adapter *adapter,
> +				 u64 uaddr)
>  {
> -	struct s390_map_info *map;
> +	struct page *page = NULL;
>  
>  	if (!adapter)
>  		return NULL;

AFAIKs, this check is not necessary.

> -
> -	list_for_each_entry(map, &adapter->maps, list) {
> -		if (map->guest_addr == addr)
> -			return map;
> -	}
> -	return NULL;
> +	if (!uaddr)
> +		return NULL;

I do wonder if that check is necessary. I don't think so but might be
missing something.

> +	down_read(&kvm->mm->mmap_sem);
> +	get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
> +			      &page, NULL, NULL);
> +	up_read(&kvm->mm->mmap_sem);
> +	return page;
>  }
>  
>  static int adapter_indicators_set(struct kvm *kvm,
> @@ -2720,30 +2645,35 @@ static int adapter_indicators_set(struct kvm *kvm,
>  {
>  	unsigned long bit;
>  	int summary_set, idx;
> -	struct s390_map_info *info;
> +	struct page *ind_page, *summary_page;
>  	void *map;
>  
> -	info = get_map_info(adapter, adapter_int->ind_addr);
> -	if (!info)
> +	ind_page = get_map_page(kvm, adapter, adapter_int->ind_addr);
> +	if (!ind_page)
>  		return -1;
> -	map = page_address(info->page);
> -	bit = get_ind_bit(info->addr, adapter_int->ind_offset, adapter->swap);
> -	set_bit(bit, map);
> -	idx = srcu_read_lock(&kvm->srcu);
> -	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
> -	set_page_dirty_lock(info->page);
> -	info = get_map_info(adapter, adapter_int->summary_addr);
> -	if (!info) {
> -		srcu_read_unlock(&kvm->srcu, idx);
> +	summary_page = get_map_page(kvm, adapter, adapter_int->summary_addr);
> +	if (!summary_page) {
> +		put_page(ind_page);
>  		return -1;
>  	}
> -	map = page_address(info->page);
> -	bit = get_ind_bit(info->addr, adapter_int->summary_offset,
> -			  adapter->swap);
> +
> +	idx = srcu_read_lock(&kvm->srcu);
> +	map = page_address(ind_page);
> +	bit = get_ind_bit(adapter_int->ind_addr,
> +			  adapter_int->ind_offset, adapter->swap);
> +	set_bit(bit, map);
> +	mark_page_dirty(kvm, adapter_int->ind_addr >> PAGE_SHIFT);
> +	set_page_dirty_lock(ind_page);
> +	map = page_address(summary_page);
> +	bit = get_ind_bit(adapter_int->summary_addr,
> +			  adapter_int->summary_offset, adapter->swap);
>  	summary_set = test_and_set_bit(bit, map);
> -	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
> -	set_page_dirty_lock(info->page);
> +	mark_page_dirty(kvm, adapter_int->summary_addr >> PAGE_SHIFT);
> +	set_page_dirty_lock(summary_page);
>  	srcu_read_unlock(&kvm->srcu, idx);
> +
> +	put_page(ind_page);
> +	put_page(summary_page);
>  	return summary_set ? 0 : 1;
>  }
>  
> @@ -2765,9 +2695,7 @@ static int set_adapter_int(struct kvm_kernel_irq_routing_entry *e,
>  	adapter = get_io_adapter(kvm, e->adapter.adapter_id);
>  	if (!adapter)
>  		return -1;
> -	down_read(&adapter->maps_lock);
>  	ret = adapter_indicators_set(kvm, adapter, &e->adapter);
> -	up_read(&adapter->maps_lock);
>  	if ((ret > 0) && !adapter->masked) {
>  		ret = kvm_s390_inject_airq(kvm, adapter);
>  		if (ret == 0)
> @@ -2818,23 +2746,27 @@ int kvm_set_routing_entry(struct kvm *kvm,
>  			  struct kvm_kernel_irq_routing_entry *e,
>  			  const struct kvm_irq_routing_entry *ue)
>  {
> -	int ret;
> +	u64 uaddr;
>  
>  	switch (ue->type) {
> +	/* we store the userspace addresses instead of the guest addresses */
>  	case KVM_IRQ_ROUTING_S390_ADAPTER:
>  		e->set = set_adapter_int;
> -		e->adapter.summary_addr = ue->u.adapter.summary_addr;
> -		e->adapter.ind_addr = ue->u.adapter.ind_addr;
> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);
> +		if (uaddr == -EFAULT)
> +			return -EFAULT;
> +		e->adapter.summary_addr = uaddr;
> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);
> +		if (uaddr == -EFAULT)
> +			return -EFAULT;

AFAIK, leaving e->adapter.summary_addr set is not an issue.

Interesting, in kvm_s390_adapter_map(), we didn't synchronize again slot
updates when doing the gmap_translate(), which looks wrong to me ...

It seems to be the same thing here. I do wonder if it is safe to do a
gmap_translate() here, looks like this can race with
kvm_arch_commit_memory_region().

I would have assumed we need e.g., the slots_lock while doing the
gmap_translate() - or a srcu_read_lock(&vcpu->kvm->srcu) or similar ...


Apart from that, looks good to me.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 03/42] s390/protvirt: introduce host side setup
  2020-02-14 22:26 ` [PATCH v2 03/42] s390/protvirt: introduce host side setup Christian Borntraeger
@ 2020-02-17  9:53   ` David Hildenbrand
  2020-02-17 11:11     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17  9:53 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Vasily Gorbik <gor@linux.ibm.com>
> 
> Add "prot_virt" command line option which controls if the kernel
> protected VMs support is enabled at early boot time. This has to be
> done early, because it needs large amounts of memory and will disable
> some features like STP time sync for the lpar.
> 
> Extend ultravisor info definitions and expose it via uv_info struct
> filled in during startup.
> 
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |  5 ++
>  arch/s390/boot/Makefile                       |  2 +-
>  arch/s390/boot/uv.c                           | 21 +++++++-
>  arch/s390/include/asm/uv.h                    | 45 +++++++++++++++-
>  arch/s390/kernel/Makefile                     |  1 +
>  arch/s390/kernel/setup.c                      |  4 --
>  arch/s390/kernel/uv.c                         | 52 +++++++++++++++++++
>  7 files changed, 122 insertions(+), 8 deletions(-)
>  create mode 100644 arch/s390/kernel/uv.c
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index dbc22d684627..b0beae9b9e36 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3795,6 +3795,11 @@
>  			before loading.
>  			See Documentation/admin-guide/blockdev/ramdisk.rst.
>  
> +	prot_virt=	[S390] enable hosting protected virtual machines
> +			isolated from the hypervisor (if hardware supports
> +			that).
> +			Format: <bool>
> +
>  	psi=		[KNL] Enable or disable pressure stall information
>  			tracking.
>  			Format: <bool>
> diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
> index e2c47d3a1c89..30f1811540c5 100644
> --- a/arch/s390/boot/Makefile
> +++ b/arch/s390/boot/Makefile
> @@ -37,7 +37,7 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
>  obj-y	:= head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o
>  obj-y	+= string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
>  obj-y	+= version.o pgm_check_info.o ctype.o text_dma.o
> -obj-$(CONFIG_PROTECTED_VIRTUALIZATION_GUEST)	+= uv.o
> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>  obj-$(CONFIG_RELOCATABLE)	+= machine_kexec_reloc.o
>  obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
>  targets	:= bzImage startup.a section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
> diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
> index ed007f4a6444..af9e1cc93c68 100644
> --- a/arch/s390/boot/uv.c
> +++ b/arch/s390/boot/uv.c
> @@ -3,7 +3,13 @@
>  #include <asm/facility.h>
>  #include <asm/sections.h>
>  
> +/* will be used in arch/s390/kernel/uv.c */
> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>  int __bootdata_preserved(prot_virt_guest);
> +#endif
> +#if IS_ENABLED(CONFIG_KVM)
> +struct uv_info __bootdata_preserved(uv_info);
> +#endif
>  
>  void uv_query_info(void)
>  {
> @@ -18,7 +24,20 @@ void uv_query_info(void)
>  	if (uv_call(0, (uint64_t)&uvcb))
>  		return;
>  
> -	if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
> +	if (IS_ENABLED(CONFIG_KVM)) {
> +		memcpy(uv_info.inst_calls_list, uvcb.inst_calls_list, sizeof(uv_info.inst_calls_list));
> +		uv_info.uv_base_stor_len = uvcb.uv_base_stor_len;
> +		uv_info.guest_base_stor_len = uvcb.conf_base_phys_stor_len;
> +		uv_info.guest_virt_base_stor_len = uvcb.conf_base_virt_stor_len;
> +		uv_info.guest_virt_var_stor_len = uvcb.conf_virt_var_stor_len;
> +		uv_info.guest_cpu_stor_len = uvcb.cpu_stor_len;
> +		uv_info.max_sec_stor_addr = ALIGN(uvcb.max_guest_stor_addr, PAGE_SIZE);
> +		uv_info.max_num_sec_conf = uvcb.max_num_sec_conf;
> +		uv_info.max_guest_cpus = uvcb.max_guest_cpus;
> +	}
> +
> +	if (IS_ENABLED(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) &&
> +	    test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
>  	    test_bit_inv(BIT_UVC_CMD_REMOVE_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list))
>  		prot_virt_guest = 1;
>  }
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 4093a2856929..34b1114dcc38 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -44,7 +44,19 @@ struct uv_cb_qui {
>  	struct uv_cb_header header;
>  	u64 reserved08;
>  	u64 inst_calls_list[4];
> -	u64 reserved30[15];
> +	u64 reserved30[2];
> +	u64 uv_base_stor_len;
> +	u64 reserved48;
> +	u64 conf_base_phys_stor_len;
> +	u64 conf_base_virt_stor_len;
> +	u64 conf_virt_var_stor_len;
> +	u64 cpu_stor_len;
> +	u32 reserved70[3];
> +	u32 max_num_sec_conf;
> +	u64 max_guest_stor_addr;
> +	u8  reserved88[158-136];
> +	u16 max_guest_cpus;
> +	u64 reserveda0;
>  } __packed __aligned(8);
>  
>  struct uv_cb_share {
> @@ -69,6 +81,19 @@ static inline int uv_call(unsigned long r1, unsigned long r2)
>  	return cc;
>  }
>  
> +struct uv_info {
> +	unsigned long inst_calls_list[4];
> +	unsigned long uv_base_stor_len;
> +	unsigned long guest_base_stor_len;
> +	unsigned long guest_virt_base_stor_len;
> +	unsigned long guest_virt_var_stor_len;
> +	unsigned long guest_cpu_stor_len;
> +	unsigned long max_sec_stor_addr;
> +	unsigned int max_num_sec_conf;
> +	unsigned short max_guest_cpus;
> +};
> +extern struct uv_info uv_info;
> +
>  #ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>  extern int prot_virt_guest;
>  
> @@ -121,11 +146,27 @@ static inline int uv_remove_shared(unsigned long addr)
>  	return share(addr, UVC_CMD_REMOVE_SHARED_ACCESS);
>  }
>  
> -void uv_query_info(void);
>  #else
>  #define is_prot_virt_guest() 0
>  static inline int uv_set_shared(unsigned long addr) { return 0; }
>  static inline int uv_remove_shared(unsigned long addr) { return 0; }
> +#endif
> +
> +#if IS_ENABLED(CONFIG_KVM)
> +extern int prot_virt_host;
> +
> +static inline int is_prot_virt_host(void)
> +{
> +	return prot_virt_host;
> +}
> +#else
> +#define is_prot_virt_host() 0
> +#endif
> +
> +#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
> +	IS_ENABLED(CONFIG_KVM)
> +void uv_query_info(void);
> +#else
>  static inline void uv_query_info(void) {}
>  #endif
>  
> diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
> index 2b1203cf7be6..22bfb8d5084e 100644
> --- a/arch/s390/kernel/Makefile
> +++ b/arch/s390/kernel/Makefile
> @@ -78,6 +78,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_events.o perf_regs.o
>  obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_diag.o
>  
>  obj-$(CONFIG_TRACEPOINTS)	+= trace.o
> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>  
>  # vdso
>  obj-y				+= vdso64/
> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
> index b2c2f75860e8..a2496382175e 100644
> --- a/arch/s390/kernel/setup.c
> +++ b/arch/s390/kernel/setup.c
> @@ -92,10 +92,6 @@ char elf_platform[ELF_PLATFORM_SIZE];
>  
>  unsigned long int_hwcap = 0;
>  
> -#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
> -int __bootdata_preserved(prot_virt_guest);
> -#endif
> -
>  int __bootdata(noexec_disabled);
>  int __bootdata(memory_end_set);
>  unsigned long __bootdata(memory_end);
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> new file mode 100644
> index 000000000000..b1f936710360
> --- /dev/null
> +++ b/arch/s390/kernel/uv.c
> @@ -0,0 +1,52 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Common Ultravisor functions and initialization
> + *
> + * Copyright IBM Corp. 2019, 2020
> + */
> +#define KMSG_COMPONENT "prot_virt"
> +#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/sizes.h>
> +#include <linux/bitmap.h>
> +#include <linux/memblock.h>
> +#include <asm/facility.h>
> +#include <asm/sections.h>
> +#include <asm/uv.h>
> +
> +/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */
> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
> +int __bootdata_preserved(prot_virt_guest);
> +#endif
> +
> +#if IS_ENABLED(CONFIG_KVM)


Why is that glued to CONFIG_KVM? Can't we glue all that stuff to
CONFIG_PROTECTED_VIRTUALIZATION_GUEST and make in kconfig
CONFIG_PROTECTED_VIRTUALIZATION_GUEST depend on CONFIG_KVM?

IOW, I'd prefer to only have CONFIG_PROTECTED_VIRTUALIZATION_GUEST in
!KVM code and enforce it via kconfig instead.


Anyhow, I remember Conny also having a comment about that previously, so

Acked-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 04/42] s390/protvirt: add ultravisor initialization
  2020-02-14 22:26 ` [PATCH v2 04/42] s390/protvirt: add ultravisor initialization Christian Borntraeger
@ 2020-02-17  9:57   ` David Hildenbrand
  2020-02-17 11:13     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17  9:57 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Vasily Gorbik <gor@linux.ibm.com>
> 
> Before being able to host protected virtual machines, donate some of
> the memory to the ultravisor. Besides that the ultravisor might impose
> addressing limitations for memory used to back protected VM storage. Treat
> that limit as protected virtualization host's virtual memory limit.
> 
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h | 15 +++++++++++
>  arch/s390/kernel/setup.c   |  5 ++++
>  arch/s390/kernel/uv.c      | 51 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 34b1114dcc38..f5b55e3972b3 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -23,12 +23,14 @@
>  #define UVC_RC_NO_RESUME	0x0007
>  
>  #define UVC_CMD_QUI			0x0001
> +#define UVC_CMD_INIT_UV			0x000f
>  #define UVC_CMD_SET_SHARED_ACCESS	0x1000
>  #define UVC_CMD_REMOVE_SHARED_ACCESS	0x1001
>  
>  /* Bits in installed uv calls */
>  enum uv_cmds_inst {
>  	BIT_UVC_CMD_QUI = 0,
> +	BIT_UVC_CMD_INIT_UV = 1,
>  	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
>  	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
>  };
> @@ -59,6 +61,14 @@ struct uv_cb_qui {
>  	u64 reserveda0;
>  } __packed __aligned(8);
>  
> +struct uv_cb_init {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 stor_origin;
> +	u64 stor_len;
> +	u64 reserved28[4];
> +} __packed __aligned(8);
> +
>  struct uv_cb_share {
>  	struct uv_cb_header header;
>  	u64 reserved08[3];
> @@ -159,8 +169,13 @@ static inline int is_prot_virt_host(void)
>  {
>  	return prot_virt_host;
>  }
> +
> +void setup_uv(void);
> +void adjust_to_uv_max(unsigned long *vmax);
>  #else
>  #define is_prot_virt_host() 0
> +static inline void setup_uv(void) {}
> +static inline void adjust_to_uv_max(unsigned long *vmax) {}
>  #endif
>  
>  #if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
> index a2496382175e..e02727812e67 100644
> --- a/arch/s390/kernel/setup.c
> +++ b/arch/s390/kernel/setup.c
> @@ -560,6 +560,9 @@ static void __init setup_memory_end(void)
>  			vmax = _REGION1_SIZE; /* 4-level kernel page table */
>  	}
>  
> +	if (prot_virt_host)
> +		adjust_to_uv_max(&vmax);
> +
>  	/* module area is at the end of the kernel address space. */
>  	MODULES_END = vmax;
>  	MODULES_VADDR = MODULES_END - MODULES_LEN;
> @@ -1134,6 +1137,8 @@ void __init setup_arch(char **cmdline_p)
>  	 */
>  	memblock_trim_memory(1UL << (MAX_ORDER - 1 + PAGE_SHIFT));
>  
> +	if (prot_virt_host)
> +		setup_uv();
>  	setup_memory_end();
>  	setup_memory();
>  	dma_contiguous_reserve(memory_end);
> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
> index b1f936710360..1424994f5489 100644
> --- a/arch/s390/kernel/uv.c
> +++ b/arch/s390/kernel/uv.c
> @@ -49,4 +49,55 @@ static int __init prot_virt_setup(char *val)
>  	return rc;
>  }
>  early_param("prot_virt", prot_virt_setup);
> +
> +static int __init uv_init(unsigned long stor_base, unsigned long stor_len)
> +{
> +	struct uv_cb_init uvcb = {
> +		.header.cmd = UVC_CMD_INIT_UV,
> +		.header.len = sizeof(uvcb),
> +		.stor_origin = stor_base,
> +		.stor_len = stor_len,
> +	};
> +
> +	if (uv_call(0, (uint64_t)&uvcb)) {
> +		pr_err("Ultravisor init failed with rc: 0x%x rrc: 0%x\n",
> +		       uvcb.header.rc, uvcb.header.rrc);
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +void __init setup_uv(void)
> +{
> +	unsigned long uv_stor_base;
> +
> +	if (!prot_virt_host)
> +		return;

That can go.

> +
> +	uv_stor_base = (unsigned long)memblock_alloc_try_nid(
> +		uv_info.uv_base_stor_len, SZ_1M, SZ_2G,
> +		MEMBLOCK_ALLOC_ACCESSIBLE, NUMA_NO_NODE);

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests
  2020-02-14 22:26 ` [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests Christian Borntraeger
@ 2020-02-17 10:21   ` David Hildenbrand
  2020-02-17 11:28     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 10:21 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm

> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index 85e944f04c70..4ebcf891ff3c 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -153,6 +153,11 @@ static inline int devmem_is_allowed(unsigned long pfn)
>  #define HAVE_ARCH_FREE_PAGE
>  #define HAVE_ARCH_ALLOC_PAGE
>  
> +#if IS_ENABLED(CONFIG_PGSTE)
> +int arch_make_page_accessible(struct page *page);
> +#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> +#endif
> +

Feels like this should have been one of the (CONFIG_)ARCH_HAVE_XXX
thingies defined via kconfig instead.

E.g., like (CONFIG_)HAVE_ARCH_TRANSPARENT_HUGEPAGE

[...]

> +
> +/*
> + * Requests the Ultravisor to encrypt a guest page and make it
> + * accessible to the host for paging (export).
> + *
> + * @paddr: Absolute host address of page to be exported
> + */
> +int uv_convert_from_secure(unsigned long paddr)
> +{
> +	struct uv_cb_cfs uvcb = {
> +		.header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,
> +		.header.len = sizeof(uvcb),
> +		.paddr = paddr
> +	};
> +
> +	if (uv_call(0, (u64)&uvcb))
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +/*
> + * Calculate the expected ref_count for a page that would otherwise have no
> + * further pins. This was cribbed from similar functions in other places in
> + * the kernel, but with some slight modifications. We know that a secure
> + * page can not be a huge page for example.

s/ca not cannot/

> + */
> +static int expected_page_refs(struct page *page)
> +{
> +	int res;
> +
> +	res = page_mapcount(page);
> +	if (PageSwapCache(page)) {
> +		res++;
> +	} else if (page_mapping(page)) {
> +		res++;
> +		if (page_has_private(page))
> +			res++;
> +	}
> +	return res;
> +}
> +
> +static int make_secure_pte(pte_t *ptep, unsigned long addr,
> +			   struct page *exp_page, struct uv_cb_header *uvcb)
> +{
> +	pte_t entry = READ_ONCE(*ptep);
> +	struct page *page;
> +	int expected, rc = 0;
> +
> +	if (!pte_present(entry))
> +		return -ENXIO;
> +	if (pte_val(entry) & _PAGE_INVALID)
> +		return -ENXIO;
> +
> +	page = pte_page(entry);
> +	if (page != exp_page)
> +		return -ENXIO;
> +	if (PageWriteback(page))
> +		return -EAGAIN;
> +	expected = expected_page_refs(page);
> +	if (!page_ref_freeze(page, expected))
> +		return -EBUSY;
> +	set_bit(PG_arch_1, &page->flags);
> +	rc = uv_call(0, (u64)uvcb);
> +	page_ref_unfreeze(page, expected);
> +	/* Return -ENXIO if the page was not mapped, -EINVAL otherwise */
> +	if (rc)
> +		rc = uvcb->rc == 0x10a ? -ENXIO : -EINVAL;
> +	return rc;
> +}
> +
> +/*
> + * Requests the Ultravisor to make a page accessible to a guest.
> + * If it's brought in the first time, it will be cleared. If
> + * it has been exported before, it will be decrypted and integrity
> + * checked.
> + */
> +int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
> +{
> +	struct vm_area_struct *vma;
> +	unsigned long uaddr;
> +	struct page *page;
> +	int rc, local_drain = 0;

local_drain could have been a bool.

> +	spinlock_t *ptelock;
> +	pte_t *ptep;
> +
> +again:
> +	rc = -EFAULT;
> +	down_read(&gmap->mm->mmap_sem);
> +
> +	uaddr = __gmap_translate(gmap, gaddr);
> +	if (IS_ERR_VALUE(uaddr))
> +		goto out;
> +	vma = find_vma(gmap->mm, uaddr);
> +	if (!vma)
> +		goto out;
> +	/*
> +	 * Secure pages cannot be huge and userspace should not combine both.
> +	 * In case userspace does it anyway this will result in an -EFAULT for
> +	 * the unpack. The guest is thus never reaching secure mode. If
> +	 * userspace is playing dirty tricky with mapping huge pages later
> +	 * on this will result in a segmenation fault.

s/segmenation/segmentation/

> +	 */
> +	if (is_vm_hugetlb_page(vma))
> +		goto out;
> +
> +	rc = -ENXIO;
> +	page = follow_page(vma, uaddr, FOLL_WRITE);
> +	if (IS_ERR_OR_NULL(page))
> +		goto out;
> +
> +	lock_page(page);
> +	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
> +	rc = make_secure_pte(ptep, uaddr, page, uvcb);
> +	pte_unmap_unlock(ptep, ptelock);
> +	unlock_page(page);
> +out:
> +	up_read(&gmap->mm->mmap_sem);
> +
> +	if (rc == -EAGAIN) {
> +		wait_on_page_writeback(page);
> +	} else if (rc == -EBUSY) {
> +		/*
> +		 * If we have tried a local drain and the page refcount
> +		 * still does not match our expected safe value, try with a
> +		 * system wide drain. This is needed if the pagevecs holding
> +		 * the page are on a different CPU.
> +		 */
> +		if (local_drain) {
> +			lru_add_drain_all();

I do wonder if that is valid to be called with all the locks at this point.

> +			/* We give up here, and let the caller try again */
> +			return -EAGAIN;
> +		}
> +		/*
> +		 * We are here if the page refcount does not match the
> +		 * expected safe value. The main culprits are usually
> +		 * pagevecs. With lru_add_drain() we drain the pagevecs
> +		 * on the local CPU so that hopefully the refcount will
> +		 * reach the expected safe value.
> +		 */
> +		lru_add_drain();

dito ...

> +		local_drain = 1;
> +		/* And now we try again immediately after draining */
> +		goto again;
> +	} else if (rc == -ENXIO) {
> +		if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
> +			return -EFAULT;
> +		return -EAGAIN;
> +	}
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(gmap_make_secure);
> +
> +int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)
> +{
> +	struct uv_cb_cts uvcb = {
> +		.header.cmd = UVC_CMD_CONV_TO_SEC_STOR,
> +		.header.len = sizeof(uvcb),
> +		.guest_handle = gmap->guest_handle,
> +		.gaddr = gaddr,
> +	};
> +
> +	return gmap_make_secure(gmap, gaddr, &uvcb);
> +}
> +EXPORT_SYMBOL_GPL(gmap_convert_to_secure);
> +
> +/**
> + * To be called with the page locked or with an extra reference!

Can we have races here? (IOW, two callers concurrently for the same page)

> + */
> +int arch_make_page_accessible(struct page *page)
> +{
> +	int rc = 0;
> +
> +	/* Hugepage cannot be protected, so nothing to do */
> +	if (PageHuge(page))
> +		return 0;
> +
> +	/*
> +	 * PG_arch_1 is used in 3 places:
> +	 * 1. for kernel page tables during early boot
> +	 * 2. for storage keys of huge pages and KVM
> +	 * 3. As an indication that this page might be secure. This can
> +	 *    overindicate, e.g. we set the bit before calling
> +	 *    convert_to_secure.
> +	 * As secure pages are never huge, all 3 variants can co-exists.
> +	 */
> +	if (!test_bit(PG_arch_1, &page->flags))
> +		return 0;
> +
> +	rc = uv_pin_shared(page_to_phys(page));
> +	if (!rc) {
> +		clear_bit(PG_arch_1, &page->flags);
> +		return 0;
> +	}

Overall, looks sane to me. (I am mostly concerned about possible races,
e.g., when two gmaps would be created for a single VM and nasty stuff be
done with them). But yeah, I guess you guys thought about this ;)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace
  2020-02-14 22:26 ` [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace Christian Borntraeger
@ 2020-02-17 10:41   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 10:41 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> Let's have some debug traces which stay around for longer than the
> guest.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/kvm-s390.c |  9 ++++++++-
>  arch/s390/kvm/kvm-s390.h | 11 +++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index d7ff30e45589..cc7793525a69 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -220,6 +220,7 @@ static struct kvm_s390_vm_cpu_subfunc kvm_s390_available_subfunc;
>  static struct gmap_notifier gmap_notifier;
>  static struct gmap_notifier vsie_gmap_notifier;
>  debug_info_t *kvm_s390_dbf;
> +debug_info_t *kvm_s390_dbf_uv;
>  
>  /* Section: not file related */
>  int kvm_arch_hardware_enable(void)
> @@ -460,7 +461,12 @@ int kvm_arch_init(void *opaque)
>  	if (!kvm_s390_dbf)
>  		return -ENOMEM;
>  
> -	if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view))
> +	kvm_s390_dbf_uv = debug_register("kvm-uv", 32, 1, 7 * sizeof(long));
> +	if (!kvm_s390_dbf_uv)
> +		goto out;


Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 08/42] KVM: s390: add new variants of UV CALL
  2020-02-14 22:26 ` [PATCH v2 08/42] KVM: s390: add new variants of UV CALL Christian Borntraeger
@ 2020-02-17 10:42   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 10:42 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> This adds two new helper functions for doing UV CALLs.
> 
> The first variant handles UV CALLs that might have longer busy
> conditions or just need longer when doing partial completion. We should
> schedule when necessary.
> 
> The second variant handles UV CALLs that only need the handle but have
> no payload (e.g. destroying a VM). We can provide a simple wrapper for
> those.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h | 65 +++++++++++++++++++++++++++++++++++---
>  1 file changed, 60 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index e45963cc7f40..bc452a15ac3f 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -14,6 +14,7 @@
>  #include <linux/types.h>
>  #include <linux/errno.h>
>  #include <linux/bug.h>
> +#include <linux/sched.h>
>  #include <asm/page.h>
>  #include <asm/gmap.h>
>  
> @@ -91,6 +92,19 @@ struct uv_cb_cfs {
>  	u64 paddr;
>  } __packed __aligned(8);
>  
> +/*
> + * A common UV call struct for calls that take no payload
> + * Examples:
> + * Destroy cpu/config
> + * Verify
> + */
> +struct uv_cb_nodata {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 handle;
> +	u64 reserved20[4];
> +} __packed __aligned(8);
> +
>  struct uv_cb_share {
>  	struct uv_cb_header header;
>  	u64 reserved08[3];
> @@ -98,21 +112,62 @@ struct uv_cb_share {
>  	u64 reserved28;
>  } __packed __aligned(8);
>  
> -static inline int uv_call(unsigned long r1, unsigned long r2)
> +static inline int __uv_call(unsigned long r1, unsigned long r2)
>  {
>  	int cc;
>  
>  	asm volatile(
> -		"0:	.insn rrf,0xB9A40000,%[r1],%[r2],0,0\n"
> -		"		brc	3,0b\n"
> -		"		ipm	%[cc]\n"
> -		"		srl	%[cc],28\n"
> +		"	.insn rrf,0xB9A40000,%[r1],%[r2],0,0\n"
> +		"	ipm	%[cc]\n"
> +		"	srl	%[cc],28\n"
>  		: [cc] "=d" (cc)
>  		: [r1] "a" (r1), [r2] "a" (r2)
>  		: "memory", "cc");
>  	return cc;
>  }
>  
> +static inline int uv_call(unsigned long r1, unsigned long r2)
> +{
> +	int cc;
> +
> +	do {
> +		cc = __uv_call(r1, r2);
> +	} while (cc > 1);
> +	return cc;
> +}
> +
> +/* Low level uv_call that avoids stalls for long running busy conditions  */
> +static inline int uv_call_sched(unsigned long r1, unsigned long r2)
> +{
> +	int cc;
> +
> +	do {
> +		cc = __uv_call(r1, r2);
> +		cond_resched();
> +	} while (cc > 1);
> +	return cc;
> +}
> +
> +/*
> + * special variant of uv_call that only transports the cpu or guest
> + * handle and the command, like destroy or verify.
> + */
> +static inline int uv_cmd_nodata(u64 handle, u16 cmd, u16 *rc, u16 *rrc)
> +{
> +	struct uv_cb_nodata uvcb = {
> +		.header.cmd = cmd,
> +		.header.len = sizeof(uvcb),
> +		.handle = handle,
> +	};
> +	int cc;
> +
> +	WARN(!handle, "No handle provided to Ultravisor call cmd %x\n", cmd);
> +	cc = uv_call_sched(0, (u64)&uvcb);
> +	*rc = uvcb.header.rc;
> +	*rrc = uvcb.header.rrc;
> +	return cc ? -EINVAL : 0;
> +}
> +
>  struct uv_info {
>  	unsigned long inst_calls_list[4];
>  	unsigned long uv_base_stor_len;
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-14 22:26 ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
@ 2020-02-17 10:56   ` David Hildenbrand
  2020-02-17 12:04     ` Christian Borntraeger
  2020-02-18  8:09     ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
  2020-02-18  8:39   ` [PATCH v2.1] " Christian Borntraeger
  1 sibling, 2 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 10:56 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

[...]
>  
> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
> +{
> +	int r = 0;
> +	void __user *argp = (void __user *)cmd->data;
> +
> +	switch (cmd->cmd) {
> +	case KVM_PV_VM_CREATE: {
> +		r = -EINVAL;
> +		if (kvm_s390_pv_is_protected(kvm))
> +			break;

Isn't this racy? I think there has to be a way to make sure the PV state
can't change. Is there any and I am missing something obvious? (is
suspect we need the kvm->lock)

> +
> +		r = kvm_s390_pv_alloc_vm(kvm);
> +		if (r)
> +			break;
> +
> +		mutex_lock(&kvm->lock);
> +		kvm_s390_vcpu_block_all(kvm);
> +		/* FMT 4 SIE needs esca */
> +		r = sca_switch_to_extended(kvm);
> +		if (r) {
> +			kvm_s390_pv_dealloc_vm(kvm);
> +			kvm_s390_vcpu_unblock_all(kvm);
> +			mutex_unlock(&kvm->lock);
> +			break;
> +		}
> +		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
> +		kvm_s390_vcpu_unblock_all(kvm);
> +		mutex_unlock(&kvm->lock);
> +		break;
> +	}
> +	case KVM_PV_VM_DESTROY: {
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +

dito

> +		/* All VCPUs have to be destroyed before this call. */
> +		mutex_lock(&kvm->lock);
> +		kvm_s390_vcpu_block_all(kvm);
> +		r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			kvm_s390_pv_dealloc_vm(kvm);
> +		kvm_s390_vcpu_unblock_all(kvm);
> +		mutex_unlock(&kvm->lock);
> +		break;
> +	}
> +	case KVM_PV_VM_SET_SEC_PARMS: {

I'd name this "KVM_PV_VM_SET_PARMS" instead.

> +		struct kvm_s390_pv_sec_parm parms = {};
> +		void *hdr;
> +
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +

dito

> +		r = -EFAULT;
> +		if (copy_from_user(&parms, argp, sizeof(parms)))
> +			break;
> +
> +		/* Currently restricted to 8KB */
> +		r = -EINVAL;
> +		if (parms.length > PAGE_SIZE * 2)
> +			break;
> +
> +		r = -ENOMEM;
> +		hdr = vmalloc(parms.length);
> +		if (!hdr)
> +			break;
> +
> +		r = -EFAULT;
> +		if (!copy_from_user(hdr, (void __user *)parms.origin,
> +				    parms.length))
> +			r = kvm_s390_pv_set_sec_parms(kvm, hdr, parms.length,
> +						      &cmd->rc, &cmd->rrc);
> +
> +		vfree(hdr);
> +		break;
> +	}
> +	case KVM_PV_VM_UNPACK: {
> +		struct kvm_s390_pv_unp unp = {};
> +
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +

dito

> +		r = -EFAULT;
> +		if (copy_from_user(&unp, argp, sizeof(unp)))
> +			break;
> +
> +		r = kvm_s390_pv_unpack(kvm, unp.addr, unp.size, unp.tweak,
> +				       &cmd->rc, &cmd->rrc);
> +		break;
> +	}
> +	case KVM_PV_VM_VERIFY: {
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))> +			break;

dito

> +
> +		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
> +				  UVC_CMD_VERIFY_IMG, &cmd->rc, &cmd->rrc);
> +		KVM_UV_EVENT(kvm, 3, "PROTVIRT VERIFY: rc %x rrc %x", cmd->rc,
> +			     cmd->rrc);
> +		break;
> +	}
> +	default:
> +		return -ENOTTY;
> +	}
> +	return r;
> +}
> +
>  long kvm_arch_vm_ioctl(struct file *filp,
>  		       unsigned int ioctl, unsigned long arg)
>  {
> @@ -2262,6 +2376,25 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		mutex_unlock(&kvm->slots_lock);
>  		break;
>  	}
> +	case KVM_S390_PV_COMMAND: {
> +		struct kvm_pv_cmd args;
> +
> +		r = 0;
> +		if (!is_prot_virt_host()) {
> +			r = -EINVAL;
> +			break;
> +		}
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +		r = kvm_s390_handle_pv(kvm, &args);
> +		if (copy_to_user(argp, &args, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +		break;
> +	}
>  	default:
>  		r = -ENOTTY;
>  	}
> @@ -2525,6 +2658,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
> +	u16 rc, rrc;
> +
>  	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>  	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
>  	kvm_s390_clear_local_irqs(vcpu);
> @@ -2537,6 +2672,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  
>  	if (vcpu->kvm->arch.use_cmma)
>  		kvm_s390_vcpu_unsetup_cmma(vcpu);
> +	if (kvm_s390_pv_handle_cpu(vcpu))
> +		kvm_s390_pv_destroy_cpu(vcpu, &rc, &rrc);
>  	free_page((unsigned long)(vcpu->arch.sie_block));
>  }
>  
> @@ -2558,10 +2695,15 @@ static void kvm_free_vcpus(struct kvm *kvm)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> +	u16 rc, rrc;
>  	kvm_free_vcpus(kvm);
>  	sca_dispose(kvm);
> -	debug_unregister(kvm->arch.dbf);
>  	kvm_s390_gisa_destroy(kvm);
> +	if (kvm_s390_pv_is_protected(kvm)) {
> +		kvm_s390_pv_destroy_vm(kvm, &rc, &rrc);
> +		kvm_s390_pv_dealloc_vm(kvm);
> +	}
> +	debug_unregister(kvm->arch.dbf);
>  	free_page((unsigned long)kvm->arch.sie_page2);
>  	if (!kvm_is_ucontrol(kvm))
>  		gmap_remove(kvm->arch.gmap);
> @@ -2657,6 +2799,9 @@ static int sca_switch_to_extended(struct kvm *kvm)
>  	unsigned int vcpu_idx;
>  	u32 scaol, scaoh;
>  
> +	if (kvm->arch.use_esca)
> +		return 0;
> +
>  	new_sca = alloc_pages_exact(sizeof(*new_sca), GFP_KERNEL|__GFP_ZERO);
>  	if (!new_sca)
>  		return -ENOMEM;
> @@ -2908,6 +3053,7 @@ static void kvm_s390_vcpu_setup_model(struct kvm_vcpu *vcpu)
>  static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>  {
>  	int rc = 0;
> +	u16 uvrc, uvrrc;
>  
>  	atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH |
>  						    CPUSTAT_SM |
> @@ -2975,6 +3121,9 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>  
>  	kvm_s390_vcpu_crypto_setup(vcpu);
>  
> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
> +		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);

With an explicit KVM_PV_VCPU_CREATE, this does not belong here. When
hotplugging CPUs, user space has to do that manually. But as I said
already, this user space API could be improved. (below)

> +
>  	return rc;
>  }
>  
> @@ -4352,6 +4501,38 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
>  	return -ENOIOCTLCMD;
>  }
>  
> +static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
> +				   struct kvm_pv_cmd *cmd)
> +{
> +	int r = 0;
> +
> +	if (!kvm_s390_pv_is_protected(vcpu->kvm))
> +		return -EINVAL;
> +
> +	if (cmd->flags)
> +		return -EINVAL;
> +
> +	switch (cmd->cmd) {
> +	case KVM_PV_VCPU_CREATE: {
> +		if (kvm_s390_pv_handle_cpu(vcpu))
> +			return -EINVAL;
> +
> +		r = kvm_s390_pv_create_cpu(vcpu, &cmd->rc, &cmd->rrc);
> +		break;
> +	}
> +	case KVM_PV_VCPU_DESTROY: {
> +		if (!kvm_s390_pv_handle_cpu(vcpu))
> +			return -EINVAL;
> +
> +		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
> +		break;
> +	}
> +	default:
> +		r = -ENOTTY;
> +	}
> +	return r;
> +}
> +
>  long kvm_arch_vcpu_ioctl(struct file *filp,
>  			 unsigned int ioctl, unsigned long arg)
>  {
> @@ -4493,6 +4674,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  					   irq_state.len);
>  		break;
>  	}
> +	case KVM_S390_PV_COMMAND_VCPU: {
> +		struct kvm_pv_cmd args;
> +
> +		r = 0;
> +		if (!is_prot_virt_host()) {
> +			r = -EINVAL;
> +			break;
> +		}
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +		r = kvm_s390_handle_pv_vcpu(vcpu, &args);
> +		if (copy_to_user(argp, &args, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +		break;
> +	}
>  	default:
>  		r = -ENOTTY;


Can we please discuss why we can't

- Get rid of KVM_S390_PV_COMMAND_VCPU
- Do the allocation in KVM_PV_VM_CREATE
- Rename KVM_PV_VM_CREATE -> KVM_PV_ENABLE
- Rename KVM_PV_VM_DESTROY -> KVM_PV_DISABLE

This user space API is unnecessary complicated and confusing.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls
  2020-02-14 22:26 ` [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls Christian Borntraeger
@ 2020-02-17 10:59   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 10:59 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Michael Mueller <mimu@linux.ibm.com>
> 
> This defines the necessary data structures in the SIE control block to
> inject machine checks,external and I/O interrupts. We first define the
> the interrupt injection control, which defines the next interrupt to
> inject. Then we define the fields that contain the payload for machine
> checks,external and I/O interrupts
> 
> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h | 56 +++++++++++++++++++++++++-------
>  1 file changed, 44 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index c6694f47b73b..834b3b7a6e23 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -222,7 +222,15 @@ struct kvm_s390_sie_block {
>  	__u8	icptcode;		/* 0x0050 */
>  	__u8	icptstatus;		/* 0x0051 */
>  	__u16	ihcpu;			/* 0x0052 */
> -	__u8	reserved54[2];		/* 0x0054 */
> +	__u8	reserved54;		/* 0x0054 */
> +#define IICTL_CODE_NONE		 0x00
> +#define IICTL_CODE_MCHK		 0x01
> +#define IICTL_CODE_EXT		 0x02
> +#define IICTL_CODE_IO		 0x03
> +#define IICTL_CODE_RESTART	 0x04
> +#define IICTL_CODE_SPECIFICATION 0x10
> +#define IICTL_CODE_OPERAND	 0x11
> +	__u8	iictl;			/* 0x0055 */
>  	__u16	ipa;			/* 0x0056 */
>  	__u32	ipb;			/* 0x0058 */
>  	__u32	scaoh;			/* 0x005c */
> @@ -259,24 +267,48 @@ struct kvm_s390_sie_block {
>  #define HPID_KVM	0x4
>  #define HPID_VSIE	0x5
>  	__u8	hpid;			/* 0x00b8 */
> -	__u8	reservedb9[11];		/* 0x00b9 */
> -	__u16	extcpuaddr;		/* 0x00c4 */
> -	__u16	eic;			/* 0x00c6 */
> +	__u8	reservedb9[7];		/* 0x00b9 */
> +	union {
> +		struct {
> +			__u32	eiparams;	/* 0x00c0 */
> +			__u16	extcpuaddr;	/* 0x00c4 */
> +			__u16	eic;		/* 0x00c6 */
> +		};
> +		__u64	mcic;			/* 0x00c0 */
> +	} __packed;
>  	__u32	reservedc8;		/* 0x00c8 */
> -	__u16	pgmilc;			/* 0x00cc */
> -	__u16	iprcc;			/* 0x00ce */
> -	__u32	dxc;			/* 0x00d0 */
> -	__u16	mcn;			/* 0x00d4 */
> -	__u8	perc;			/* 0x00d6 */
> -	__u8	peratmid;		/* 0x00d7 */
> +	union {
> +		struct {
> +			__u16	pgmilc;		/* 0x00cc */
> +			__u16	iprcc;		/* 0x00ce */
> +		};
> +		__u32	edc;			/* 0x00cc */
> +	} __packed;
> +	union {
> +		struct {
> +			__u32	dxc;		/* 0x00d0 */
> +			__u16	mcn;		/* 0x00d4 */
> +			__u8	perc;		/* 0x00d6 */
> +			__u8	peratmid;	/* 0x00d7 */
> +		};
> +		__u64	faddr;			/* 0x00d0 */
> +	} __packed;
>  	__u64	peraddr;		/* 0x00d8 */
>  	__u8	eai;			/* 0x00e0 */
>  	__u8	peraid;			/* 0x00e1 */
>  	__u8	oai;			/* 0x00e2 */
>  	__u8	armid;			/* 0x00e3 */
>  	__u8	reservede4[4];		/* 0x00e4 */
> -	__u64	tecmc;			/* 0x00e8 */
> -	__u8	reservedf0[12];		/* 0x00f0 */
> +	union {
> +		__u64	tecmc;		/* 0x00e8 */
> +		struct {
> +			__u16	subchannel_id;	/* 0x00e8 */
> +			__u16	subchannel_nr;	/* 0x00ea */
> +			__u32	io_int_parm;	/* 0x00ec */
> +			__u32	io_int_word;	/* 0x00f0 */
> +		};
> +	} __packed;
> +	__u8	reservedf4[8];		/* 0x00f4 */
>  #define CRYCB_FORMAT_MASK 0x00000003
>  #define CRYCB_FORMAT0 0x00000000
>  #define CRYCB_FORMAT1 0x00000001
> 

I'd just squash that into the patch where it is actually used. (IOW, the
next one)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling
  2020-02-14 22:26 ` [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling Christian Borntraeger
@ 2020-02-17 11:01   ` David Hildenbrand
  2020-02-17 11:33     ` Christian Borntraeger
  2020-02-17 14:37     ` Janosch Frank
  0 siblings, 2 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 11:01 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> Guest registers for protected guests are stored at offset 0x380.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h |  4 +++-
>  arch/s390/kvm/kvm-s390.c         | 11 +++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index ba3364b37159..4fcbb055a565 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -343,7 +343,9 @@ struct kvm_s390_itdb {
>  struct sie_page {
>  	struct kvm_s390_sie_block sie_block;
>  	struct mcck_volatile_info mcck_info;	/* 0x0200 */
> -	__u8 reserved218[1000];		/* 0x0218 */
> +	__u8 reserved218[360];		/* 0x0218 */
> +	__u64 pv_grregs[16];		/* 0x0380 */
> +	__u8 reserved400[512];		/* 0x0400 */
>  	struct kvm_s390_itdb itdb;	/* 0x0600 */
>  	__u8 reserved700[2304];		/* 0x0700 */
>  };
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index a85e50075d99..6ebb0dae5a2e 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3999,6 +3999,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	int rc, exit_reason;
> +	struct sie_page *sie_page = (struct sie_page *)vcpu->arch.sie_block;
>  
>  	/*
>  	 * We try to hold kvm->srcu during most of vcpu_run (except when run-
> @@ -4020,8 +4021,18 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  		guest_enter_irqoff();
>  		__disable_cpu_timer_accounting(vcpu);
>  		local_irq_enable();
> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> +			memcpy(sie_page->pv_grregs,
> +			       vcpu->run->s.regs.gprs,
> +			       sizeof(sie_page->pv_grregs));
> +		}
>  		exit_reason = sie64a(vcpu->arch.sie_block,
>  				     vcpu->run->s.regs.gprs);
> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> +			memcpy(vcpu->run->s.regs.gprs,
> +			       sie_page->pv_grregs,
> +			       sizeof(sie_page->pv_grregs));
> +		}
>  		local_irq_disable();
>  		__enable_cpu_timer_accounting(vcpu);
>  		guest_exit_irqoff();
> 

As discussed, I think there is room for improvement in the future (which
we could have documented in the patch description), because this is
obviously sub-optimal.

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-14 22:26 ` [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer Christian Borntraeger
@ 2020-02-17 11:08   ` David Hildenbrand
  2020-02-17 14:47     ` Janosch Frank
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 11:08 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

> @@ -4460,6 +4489,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>  
>  	switch (mop->op) {
>  	case KVM_S390_MEMOP_LOGICAL_READ:
> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> +			r = -EINVAL;
> +			break;
> +		}

Could we have a possible race with disabling code, especially while
concurrently freeing? (sorry if I ask again, there was just a flood of
emails)

>  		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
>  			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
>  					    mop->size, GACC_FETCH);
> @@ -4472,6 +4505,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>  		}
>  		break;
>  	case KVM_S390_MEMOP_LOGICAL_WRITE:
> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> +			r = -EINVAL;
> +			break;
> +		}

dito

>  		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
>  			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
>  					    mop->size, GACC_STORE);
> @@ -4483,6 +4520,11 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>  		}
>  		r = write_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size);
>  		break;
> +	case KVM_S390_MEMOP_SIDA_READ:
> +	case KVM_S390_MEMOP_SIDA_WRITE:
> +		/* we are locked against sida going away by the vcpu->mutex */
> +		r = kvm_s390_guest_sida_op(vcpu, mop);
> +		break;
>  	default:
>  		r = -EINVAL;
>  	}
> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
> index 09573e36c329..80169a9b43ec 100644
> --- a/arch/s390/kvm/pv.c
> +++ b/arch/s390/kvm/pv.c
> @@ -92,6 +92,7 @@ int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
>  
>  	free_pages(vcpu->arch.pv.stor_base,
>  		   get_order(uv_info.guest_cpu_stor_len));
> +	free_page(sida_origin(vcpu->arch.sie_block));
>  	vcpu->arch.sie_block->pv_handle_cpu = 0;
>  	vcpu->arch.sie_block->pv_handle_config = 0;
>  	memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv));
> @@ -121,6 +122,14 @@ int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
>  	uvcb.state_origin = (u64)vcpu->arch.sie_block;
>  	uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base;
>  
> +	/* Alloc Secure Instruction Data Area Designation */
> +	vcpu->arch.sie_block->sidad = __get_free_page(GFP_KERNEL | __GFP_ZERO);
> +	if (!vcpu->arch.sie_block->sidad) {
> +		free_pages(vcpu->arch.pv.stor_base,
> +			   get_order(uv_info.guest_cpu_stor_len));
> +		return -ENOMEM;
> +	}
> +
>  	cc = uv_call(0, (u64)&uvcb);
>  	*rc = uvcb.header.rc;
>  	*rrc = uvcb.header.rrc;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 207915488502..0fdee1bc3798 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -475,11 +475,15 @@ struct kvm_s390_mem_op {
>  	__u32 op;		/* type of operation */
>  	__u64 buf;		/* buffer in userspace */
>  	__u8 ar;		/* the access register number */
> -	__u8 reserved[31];	/* should be set to 0 */
> +	__u8 reserved21[3];	/* should be set to 0 */
> +	__u32 sida_offset;	/* offset into the sida */
> +	__u8 reserved28[24];	/* should be set to 0 */
>  };

As discussed, I'd prefer an overlaying layout for the sida, as the ar
does not make any sense (correct me if I'm wrong :) )

__u32 op;		/* type of operation */
__u64 buf;		/* buffer in userspace */
uinon {
	__u8 ar;		/* the access register number */
	__u32 sida_offset;	/* offset into the sida */
	__u8 reserved[32];	/* should be set to 0 */
};

With something like that

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-17  9:14   ` David Hildenbrand
@ 2020-02-17 11:10     ` Christian Borntraeger
  2020-02-18  8:27       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 11:10 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm, Will Deacon, Sean Christopherson



On 17.02.20 10:14, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>
>> With the introduction of protected KVM guests on s390 there is now a
>> concept of inaccessible pages. These pages need to be made accessible
>> before the host can access them.
>>
>> While cpu accesses will trigger a fault that can be resolved, I/O
>> accesses will just fail.  We need to add a callback into architecture
>> code for places that will do I/O, namely when writeback is started or
>> when a page reference is taken.
>>
>> This is not only to enable paging, file backing etc, it is also
>> necessary to protect the host against a malicious user space. For
>> example a bad QEMU could simply start direct I/O on such protected
>> memory.  We do not want userspace to be able to trigger I/O errors and
>> thus we the logic is "whenever somebody accesses that page (gup) or
>> doing I/O, make sure that this page can be accessed. When the guest
>> tries to access that page we will wait in the page fault handler for
>> writeback to have finished and for the page_ref to be the expected
>> value.
>>
>> If wanted by others, the callbacks can be extended with error handlin
>> and a parameter from where this is called.
> 
> s/handlin/handling/

ack. 
> 
> One last question from my side:
> 
> Why is it OK to ignore errors here. IOW, why not squash "[PATCH v2
> 39/42] example for future extension: mm:gup/writeback: add callbacks for
> inaccessible pages: error cases" into this patch.

I can certainly squash this in. But I have never heard feedback from the 
memory management people if they would prefer the patch as-is (no overhead
at all for !s390) or already with the error handling. 
> 
> I can see in patch "[PATCH v2 05/42] s390/mm: provide memory management
> functions for protected KVM guests", that the call can fail for various
> reasons. That puzzles me a bit - what would happen if any of that fails?

The convert _to_ secure has error handling. uv_convert_from_secure (the
one that is used for arch_make_page_accessible can only fail if we mess up,
e.g. an "export" on memory that we have donated to the ultravisor.


> Or will it actually never fail for s390x (and all that error handling in
> arch_make_page_accessible() is essentially dead code in real life?)

So yes, if everything is setup properly this should not fail in real life
and only we have a kernel (or firmware) bug.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 03/42] s390/protvirt: introduce host side setup
  2020-02-17  9:53   ` David Hildenbrand
@ 2020-02-17 11:11     ` Christian Borntraeger
  2020-02-17 11:13       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 11:11 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik



On 17.02.20 10:53, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Vasily Gorbik <gor@linux.ibm.com>
>>
>> Add "prot_virt" command line option which controls if the kernel
>> protected VMs support is enabled at early boot time. This has to be
>> done early, because it needs large amounts of memory and will disable
>> some features like STP time sync for the lpar.
>>
>> Extend ultravisor info definitions and expose it via uv_info struct
>> filled in during startup.
>>
>> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  .../admin-guide/kernel-parameters.txt         |  5 ++
>>  arch/s390/boot/Makefile                       |  2 +-
>>  arch/s390/boot/uv.c                           | 21 +++++++-
>>  arch/s390/include/asm/uv.h                    | 45 +++++++++++++++-
>>  arch/s390/kernel/Makefile                     |  1 +
>>  arch/s390/kernel/setup.c                      |  4 --
>>  arch/s390/kernel/uv.c                         | 52 +++++++++++++++++++
>>  7 files changed, 122 insertions(+), 8 deletions(-)
>>  create mode 100644 arch/s390/kernel/uv.c
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index dbc22d684627..b0beae9b9e36 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -3795,6 +3795,11 @@
>>  			before loading.
>>  			See Documentation/admin-guide/blockdev/ramdisk.rst.
>>  
>> +	prot_virt=	[S390] enable hosting protected virtual machines
>> +			isolated from the hypervisor (if hardware supports
>> +			that).
>> +			Format: <bool>
>> +
>>  	psi=		[KNL] Enable or disable pressure stall information
>>  			tracking.
>>  			Format: <bool>
>> diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
>> index e2c47d3a1c89..30f1811540c5 100644
>> --- a/arch/s390/boot/Makefile
>> +++ b/arch/s390/boot/Makefile
>> @@ -37,7 +37,7 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
>>  obj-y	:= head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o
>>  obj-y	+= string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
>>  obj-y	+= version.o pgm_check_info.o ctype.o text_dma.o
>> -obj-$(CONFIG_PROTECTED_VIRTUALIZATION_GUEST)	+= uv.o
>> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>>  obj-$(CONFIG_RELOCATABLE)	+= machine_kexec_reloc.o
>>  obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
>>  targets	:= bzImage startup.a section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
>> diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
>> index ed007f4a6444..af9e1cc93c68 100644
>> --- a/arch/s390/boot/uv.c
>> +++ b/arch/s390/boot/uv.c
>> @@ -3,7 +3,13 @@
>>  #include <asm/facility.h>
>>  #include <asm/sections.h>
>>  
>> +/* will be used in arch/s390/kernel/uv.c */
>> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>  int __bootdata_preserved(prot_virt_guest);
>> +#endif
>> +#if IS_ENABLED(CONFIG_KVM)
>> +struct uv_info __bootdata_preserved(uv_info);
>> +#endif
>>  
>>  void uv_query_info(void)
>>  {
>> @@ -18,7 +24,20 @@ void uv_query_info(void)
>>  	if (uv_call(0, (uint64_t)&uvcb))
>>  		return;
>>  
>> -	if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
>> +	if (IS_ENABLED(CONFIG_KVM)) {
>> +		memcpy(uv_info.inst_calls_list, uvcb.inst_calls_list, sizeof(uv_info.inst_calls_list));
>> +		uv_info.uv_base_stor_len = uvcb.uv_base_stor_len;
>> +		uv_info.guest_base_stor_len = uvcb.conf_base_phys_stor_len;
>> +		uv_info.guest_virt_base_stor_len = uvcb.conf_base_virt_stor_len;
>> +		uv_info.guest_virt_var_stor_len = uvcb.conf_virt_var_stor_len;
>> +		uv_info.guest_cpu_stor_len = uvcb.cpu_stor_len;
>> +		uv_info.max_sec_stor_addr = ALIGN(uvcb.max_guest_stor_addr, PAGE_SIZE);
>> +		uv_info.max_num_sec_conf = uvcb.max_num_sec_conf;
>> +		uv_info.max_guest_cpus = uvcb.max_guest_cpus;
>> +	}
>> +
>> +	if (IS_ENABLED(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) &&
>> +	    test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
>>  	    test_bit_inv(BIT_UVC_CMD_REMOVE_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list))
>>  		prot_virt_guest = 1;
>>  }
>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>> index 4093a2856929..34b1114dcc38 100644
>> --- a/arch/s390/include/asm/uv.h
>> +++ b/arch/s390/include/asm/uv.h
>> @@ -44,7 +44,19 @@ struct uv_cb_qui {
>>  	struct uv_cb_header header;
>>  	u64 reserved08;
>>  	u64 inst_calls_list[4];
>> -	u64 reserved30[15];
>> +	u64 reserved30[2];
>> +	u64 uv_base_stor_len;
>> +	u64 reserved48;
>> +	u64 conf_base_phys_stor_len;
>> +	u64 conf_base_virt_stor_len;
>> +	u64 conf_virt_var_stor_len;
>> +	u64 cpu_stor_len;
>> +	u32 reserved70[3];
>> +	u32 max_num_sec_conf;
>> +	u64 max_guest_stor_addr;
>> +	u8  reserved88[158-136];
>> +	u16 max_guest_cpus;
>> +	u64 reserveda0;
>>  } __packed __aligned(8);
>>  
>>  struct uv_cb_share {
>> @@ -69,6 +81,19 @@ static inline int uv_call(unsigned long r1, unsigned long r2)
>>  	return cc;
>>  }
>>  
>> +struct uv_info {
>> +	unsigned long inst_calls_list[4];
>> +	unsigned long uv_base_stor_len;
>> +	unsigned long guest_base_stor_len;
>> +	unsigned long guest_virt_base_stor_len;
>> +	unsigned long guest_virt_var_stor_len;
>> +	unsigned long guest_cpu_stor_len;
>> +	unsigned long max_sec_stor_addr;
>> +	unsigned int max_num_sec_conf;
>> +	unsigned short max_guest_cpus;
>> +};
>> +extern struct uv_info uv_info;
>> +
>>  #ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>  extern int prot_virt_guest;
>>  
>> @@ -121,11 +146,27 @@ static inline int uv_remove_shared(unsigned long addr)
>>  	return share(addr, UVC_CMD_REMOVE_SHARED_ACCESS);
>>  }
>>  
>> -void uv_query_info(void);
>>  #else
>>  #define is_prot_virt_guest() 0
>>  static inline int uv_set_shared(unsigned long addr) { return 0; }
>>  static inline int uv_remove_shared(unsigned long addr) { return 0; }
>> +#endif
>> +
>> +#if IS_ENABLED(CONFIG_KVM)
>> +extern int prot_virt_host;
>> +
>> +static inline int is_prot_virt_host(void)
>> +{
>> +	return prot_virt_host;
>> +}
>> +#else
>> +#define is_prot_virt_host() 0
>> +#endif
>> +
>> +#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
>> +	IS_ENABLED(CONFIG_KVM)
>> +void uv_query_info(void);
>> +#else
>>  static inline void uv_query_info(void) {}
>>  #endif
>>  
>> diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
>> index 2b1203cf7be6..22bfb8d5084e 100644
>> --- a/arch/s390/kernel/Makefile
>> +++ b/arch/s390/kernel/Makefile
>> @@ -78,6 +78,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_events.o perf_regs.o
>>  obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_diag.o
>>  
>>  obj-$(CONFIG_TRACEPOINTS)	+= trace.o
>> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>>  
>>  # vdso
>>  obj-y				+= vdso64/
>> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
>> index b2c2f75860e8..a2496382175e 100644
>> --- a/arch/s390/kernel/setup.c
>> +++ b/arch/s390/kernel/setup.c
>> @@ -92,10 +92,6 @@ char elf_platform[ELF_PLATFORM_SIZE];
>>  
>>  unsigned long int_hwcap = 0;
>>  
>> -#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>> -int __bootdata_preserved(prot_virt_guest);
>> -#endif
>> -
>>  int __bootdata(noexec_disabled);
>>  int __bootdata(memory_end_set);
>>  unsigned long __bootdata(memory_end);
>> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
>> new file mode 100644
>> index 000000000000..b1f936710360
>> --- /dev/null
>> +++ b/arch/s390/kernel/uv.c
>> @@ -0,0 +1,52 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Common Ultravisor functions and initialization
>> + *
>> + * Copyright IBM Corp. 2019, 2020
>> + */
>> +#define KMSG_COMPONENT "prot_virt"
>> +#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/types.h>
>> +#include <linux/sizes.h>
>> +#include <linux/bitmap.h>
>> +#include <linux/memblock.h>
>> +#include <asm/facility.h>
>> +#include <asm/sections.h>
>> +#include <asm/uv.h>
>> +
>> +/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */
>> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>> +int __bootdata_preserved(prot_virt_guest);
>> +#endif
>> +
>> +#if IS_ENABLED(CONFIG_KVM)
> 
> 
> Why is that glued to CONFIG_KVM? Can't we glue all that stuff to
> CONFIG_PROTECTED_VIRTUALIZATION_GUEST and make in kconfig
> CONFIG_PROTECTED_VIRTUALIZATION_GUEST depend on CONFIG_KVM?


CONFIG_PROTECTED_VIRTUALIZATION_GUEST is for the GUEST part (the virtio-ccw changes
inside the guest). The other thing is for the host and here we bind it
to CONFIG_KVM (or PGSTE for mm code).

> 
> IOW, I'd prefer to only have CONFIG_PROTECTED_VIRTUALIZATION_GUEST in
> !KVM code and enforce it via kconfig instead.
> 
> 
> Anyhow, I remember Conny also having a comment about that previously, so
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages
  2020-02-14 22:26 ` [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages Christian Borntraeger
@ 2020-02-17 11:11   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 11:11 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> The SPX instruction is handled by the ultravisor. We do get a
> notification intercept, though. Let us update our internal view.
> 
> In addition to that, when the guest prefix page is not secure, an
> intercept 112 (0x70) is indicated. Let us make the prefix pages
> secure again.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h |  1 +
>  arch/s390/kvm/intercept.c        | 18 ++++++++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index aa945b101fff..0ea82152d2f7 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -225,6 +225,7 @@ struct kvm_s390_sie_block {
>  #define ICPT_INT_ENABLE	0x64
>  #define ICPT_PV_INSTR	0x68
>  #define ICPT_PV_NOTIFY	0x6c
> +#define ICPT_PV_PREF	0x70
>  	__u8	icptcode;		/* 0x0050 */
>  	__u8	icptstatus;		/* 0x0051 */
>  	__u16	ihcpu;			/* 0x0052 */
> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
> index db3dd5ee0b7a..6c9db886381c 100644
> --- a/arch/s390/kvm/intercept.c
> +++ b/arch/s390/kvm/intercept.c
> @@ -451,6 +451,15 @@ static int handle_operexc(struct kvm_vcpu *vcpu)
>  	return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
>  }
>  
> +static int handle_pv_spx(struct kvm_vcpu *vcpu)
> +{
> +	u32 pref = *(u32 *)vcpu->arch.sie_block->sidad;
> +
> +	kvm_s390_set_prefix(vcpu, pref);
> +	trace_kvm_s390_handle_prefix(vcpu, 1, pref);
> +	return 0;
> +}
> +
>  static int handle_pv_sclp(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
> @@ -477,6 +486,8 @@ static int handle_pv_sclp(struct kvm_vcpu *vcpu)
>  
>  static int handle_pv_notification(struct kvm_vcpu *vcpu)
>  {
> +	if (vcpu->arch.sie_block->ipa == 0xb210)
> +		return handle_pv_spx(vcpu);
>  	if (vcpu->arch.sie_block->ipa == 0xb220)
>  		return handle_pv_sclp(vcpu);
>  
> @@ -534,6 +545,13 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
>  	case ICPT_PV_NOTIFY:
>  		rc = handle_pv_notification(vcpu);
>  		break;
> +	case ICPT_PV_PREF:
> +		rc = 0;
> +		gmap_convert_to_secure(vcpu->arch.gmap,
> +				       kvm_s390_get_prefix(vcpu));
> +		gmap_convert_to_secure(vcpu->arch.gmap,
> +				       kvm_s390_get_prefix(vcpu) + PAGE_SIZE);

So, no need to go via KVM_REQ_MMU_RELOAD anymore, right? Good.

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 04/42] s390/protvirt: add ultravisor initialization
  2020-02-17  9:57   ` David Hildenbrand
@ 2020-02-17 11:13     ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 11:13 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik



On 17.02.20 10:57, David Hildenbrand wrote:
t setup_uv(void)
>> +{
>> +	unsigned long uv_stor_base;
>> +
>> +	if (!prot_virt_host)
>> +		return;
> 
> That can go.

ack.
> 
>> +
>> +	uv_stor_base = (unsigned long)memblock_alloc_try_nid(
>> +		uv_info.uv_base_stor_len, SZ_1M, SZ_2G,
>> +		MEMBLOCK_ALLOC_ACCESSIBLE, NUMA_NO_NODE);
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 03/42] s390/protvirt: introduce host side setup
  2020-02-17 11:11     ` Christian Borntraeger
@ 2020-02-17 11:13       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 11:13 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 17.02.20 12:11, Christian Borntraeger wrote:
> 
> 
> On 17.02.20 10:53, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Vasily Gorbik <gor@linux.ibm.com>
>>>
>>> Add "prot_virt" command line option which controls if the kernel
>>> protected VMs support is enabled at early boot time. This has to be
>>> done early, because it needs large amounts of memory and will disable
>>> some features like STP time sync for the lpar.
>>>
>>> Extend ultravisor info definitions and expose it via uv_info struct
>>> filled in during startup.
>>>
>>> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
>>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  .../admin-guide/kernel-parameters.txt         |  5 ++
>>>  arch/s390/boot/Makefile                       |  2 +-
>>>  arch/s390/boot/uv.c                           | 21 +++++++-
>>>  arch/s390/include/asm/uv.h                    | 45 +++++++++++++++-
>>>  arch/s390/kernel/Makefile                     |  1 +
>>>  arch/s390/kernel/setup.c                      |  4 --
>>>  arch/s390/kernel/uv.c                         | 52 +++++++++++++++++++
>>>  7 files changed, 122 insertions(+), 8 deletions(-)
>>>  create mode 100644 arch/s390/kernel/uv.c
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index dbc22d684627..b0beae9b9e36 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -3795,6 +3795,11 @@
>>>  			before loading.
>>>  			See Documentation/admin-guide/blockdev/ramdisk.rst.
>>>  
>>> +	prot_virt=	[S390] enable hosting protected virtual machines
>>> +			isolated from the hypervisor (if hardware supports
>>> +			that).
>>> +			Format: <bool>
>>> +
>>>  	psi=		[KNL] Enable or disable pressure stall information
>>>  			tracking.
>>>  			Format: <bool>
>>> diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
>>> index e2c47d3a1c89..30f1811540c5 100644
>>> --- a/arch/s390/boot/Makefile
>>> +++ b/arch/s390/boot/Makefile
>>> @@ -37,7 +37,7 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
>>>  obj-y	:= head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o
>>>  obj-y	+= string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
>>>  obj-y	+= version.o pgm_check_info.o ctype.o text_dma.o
>>> -obj-$(CONFIG_PROTECTED_VIRTUALIZATION_GUEST)	+= uv.o
>>> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>>>  obj-$(CONFIG_RELOCATABLE)	+= machine_kexec_reloc.o
>>>  obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
>>>  targets	:= bzImage startup.a section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
>>> diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
>>> index ed007f4a6444..af9e1cc93c68 100644
>>> --- a/arch/s390/boot/uv.c
>>> +++ b/arch/s390/boot/uv.c
>>> @@ -3,7 +3,13 @@
>>>  #include <asm/facility.h>
>>>  #include <asm/sections.h>
>>>  
>>> +/* will be used in arch/s390/kernel/uv.c */
>>> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>>  int __bootdata_preserved(prot_virt_guest);
>>> +#endif
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> +struct uv_info __bootdata_preserved(uv_info);
>>> +#endif
>>>  
>>>  void uv_query_info(void)
>>>  {
>>> @@ -18,7 +24,20 @@ void uv_query_info(void)
>>>  	if (uv_call(0, (uint64_t)&uvcb))
>>>  		return;
>>>  
>>> -	if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
>>> +	if (IS_ENABLED(CONFIG_KVM)) {
>>> +		memcpy(uv_info.inst_calls_list, uvcb.inst_calls_list, sizeof(uv_info.inst_calls_list));
>>> +		uv_info.uv_base_stor_len = uvcb.uv_base_stor_len;
>>> +		uv_info.guest_base_stor_len = uvcb.conf_base_phys_stor_len;
>>> +		uv_info.guest_virt_base_stor_len = uvcb.conf_base_virt_stor_len;
>>> +		uv_info.guest_virt_var_stor_len = uvcb.conf_virt_var_stor_len;
>>> +		uv_info.guest_cpu_stor_len = uvcb.cpu_stor_len;
>>> +		uv_info.max_sec_stor_addr = ALIGN(uvcb.max_guest_stor_addr, PAGE_SIZE);
>>> +		uv_info.max_num_sec_conf = uvcb.max_num_sec_conf;
>>> +		uv_info.max_guest_cpus = uvcb.max_guest_cpus;
>>> +	}
>>> +
>>> +	if (IS_ENABLED(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) &&
>>> +	    test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
>>>  	    test_bit_inv(BIT_UVC_CMD_REMOVE_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list))
>>>  		prot_virt_guest = 1;
>>>  }
>>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>>> index 4093a2856929..34b1114dcc38 100644
>>> --- a/arch/s390/include/asm/uv.h
>>> +++ b/arch/s390/include/asm/uv.h
>>> @@ -44,7 +44,19 @@ struct uv_cb_qui {
>>>  	struct uv_cb_header header;
>>>  	u64 reserved08;
>>>  	u64 inst_calls_list[4];
>>> -	u64 reserved30[15];
>>> +	u64 reserved30[2];
>>> +	u64 uv_base_stor_len;
>>> +	u64 reserved48;
>>> +	u64 conf_base_phys_stor_len;
>>> +	u64 conf_base_virt_stor_len;
>>> +	u64 conf_virt_var_stor_len;
>>> +	u64 cpu_stor_len;
>>> +	u32 reserved70[3];
>>> +	u32 max_num_sec_conf;
>>> +	u64 max_guest_stor_addr;
>>> +	u8  reserved88[158-136];
>>> +	u16 max_guest_cpus;
>>> +	u64 reserveda0;
>>>  } __packed __aligned(8);
>>>  
>>>  struct uv_cb_share {
>>> @@ -69,6 +81,19 @@ static inline int uv_call(unsigned long r1, unsigned long r2)
>>>  	return cc;
>>>  }
>>>  
>>> +struct uv_info {
>>> +	unsigned long inst_calls_list[4];
>>> +	unsigned long uv_base_stor_len;
>>> +	unsigned long guest_base_stor_len;
>>> +	unsigned long guest_virt_base_stor_len;
>>> +	unsigned long guest_virt_var_stor_len;
>>> +	unsigned long guest_cpu_stor_len;
>>> +	unsigned long max_sec_stor_addr;
>>> +	unsigned int max_num_sec_conf;
>>> +	unsigned short max_guest_cpus;
>>> +};
>>> +extern struct uv_info uv_info;
>>> +
>>>  #ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>>  extern int prot_virt_guest;
>>>  
>>> @@ -121,11 +146,27 @@ static inline int uv_remove_shared(unsigned long addr)
>>>  	return share(addr, UVC_CMD_REMOVE_SHARED_ACCESS);
>>>  }
>>>  
>>> -void uv_query_info(void);
>>>  #else
>>>  #define is_prot_virt_guest() 0
>>>  static inline int uv_set_shared(unsigned long addr) { return 0; }
>>>  static inline int uv_remove_shared(unsigned long addr) { return 0; }
>>> +#endif
>>> +
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> +extern int prot_virt_host;
>>> +
>>> +static inline int is_prot_virt_host(void)
>>> +{
>>> +	return prot_virt_host;
>>> +}
>>> +#else
>>> +#define is_prot_virt_host() 0
>>> +#endif
>>> +
>>> +#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) ||                          \
>>> +	IS_ENABLED(CONFIG_KVM)
>>> +void uv_query_info(void);
>>> +#else
>>>  static inline void uv_query_info(void) {}
>>>  #endif
>>>  
>>> diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile
>>> index 2b1203cf7be6..22bfb8d5084e 100644
>>> --- a/arch/s390/kernel/Makefile
>>> +++ b/arch/s390/kernel/Makefile
>>> @@ -78,6 +78,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_events.o perf_regs.o
>>>  obj-$(CONFIG_PERF_EVENTS)	+= perf_cpum_cf_diag.o
>>>  
>>>  obj-$(CONFIG_TRACEPOINTS)	+= trace.o
>>> +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE))	+= uv.o
>>>  
>>>  # vdso
>>>  obj-y				+= vdso64/
>>> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
>>> index b2c2f75860e8..a2496382175e 100644
>>> --- a/arch/s390/kernel/setup.c
>>> +++ b/arch/s390/kernel/setup.c
>>> @@ -92,10 +92,6 @@ char elf_platform[ELF_PLATFORM_SIZE];
>>>  
>>>  unsigned long int_hwcap = 0;
>>>  
>>> -#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>> -int __bootdata_preserved(prot_virt_guest);
>>> -#endif
>>> -
>>>  int __bootdata(noexec_disabled);
>>>  int __bootdata(memory_end_set);
>>>  unsigned long __bootdata(memory_end);
>>> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
>>> new file mode 100644
>>> index 000000000000..b1f936710360
>>> --- /dev/null
>>> +++ b/arch/s390/kernel/uv.c
>>> @@ -0,0 +1,52 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Common Ultravisor functions and initialization
>>> + *
>>> + * Copyright IBM Corp. 2019, 2020
>>> + */
>>> +#define KMSG_COMPONENT "prot_virt"
>>> +#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
>>> +
>>> +#include <linux/kernel.h>
>>> +#include <linux/types.h>
>>> +#include <linux/sizes.h>
>>> +#include <linux/bitmap.h>
>>> +#include <linux/memblock.h>
>>> +#include <asm/facility.h>
>>> +#include <asm/sections.h>
>>> +#include <asm/uv.h>
>>> +
>>> +/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */
>>> +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
>>> +int __bootdata_preserved(prot_virt_guest);
>>> +#endif
>>> +
>>> +#if IS_ENABLED(CONFIG_KVM)
>>
>>
>> Why is that glued to CONFIG_KVM? Can't we glue all that stuff to
>> CONFIG_PROTECTED_VIRTUALIZATION_GUEST and make in kconfig
>> CONFIG_PROTECTED_VIRTUALIZATION_GUEST depend on CONFIG_KVM?
> 
> 
> CONFIG_PROTECTED_VIRTUALIZATION_GUEST is for the GUEST part (the virtio-ccw changes
> inside the guest). The other thing is for the host and here we bind it
> to CONFIG_KVM (or PGSTE for mm code).

Ahh, sorry messed that up. Think I had another (in previous versions?)
config option in mind.

Makes sense

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests
  2020-02-17 10:21   ` David Hildenbrand
@ 2020-02-17 11:28     ` Christian Borntraeger
  2020-02-17 12:07       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 11:28 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm



On 17.02.20 11:21, David Hildenbrand wrote:
>> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
>> index 85e944f04c70..4ebcf891ff3c 100644
>> --- a/arch/s390/include/asm/page.h
>> +++ b/arch/s390/include/asm/page.h
>> @@ -153,6 +153,11 @@ static inline int devmem_is_allowed(unsigned long pfn)
>>  #define HAVE_ARCH_FREE_PAGE
>>  #define HAVE_ARCH_ALLOC_PAGE
>>  
>> +#if IS_ENABLED(CONFIG_PGSTE)
>> +int arch_make_page_accessible(struct page *page);
>> +#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
>> +#endif
>> +
> 
> Feels like this should have been one of the (CONFIG_)ARCH_HAVE_XXX
> thingies defined via kconfig instead.
> 
> E.g., like (CONFIG_)HAVE_ARCH_TRANSPARENT_HUGEPAGE
> 
> [...]

This looks more or less like HAVE_ARCH_ALLOC_PAGE. You will find both
variants.
I think I will leave it that way for now until we need it to be a config
or the mm maintainers have a preference.


> 
>> +
>> +/*
>> + * Requests the Ultravisor to encrypt a guest page and make it
>> + * accessible to the host for paging (export).
>> + *
>> + * @paddr: Absolute host address of page to be exported
>> + */
>> +int uv_convert_from_secure(unsigned long paddr)
>> +{
>> +	struct uv_cb_cfs uvcb = {
>> +		.header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,
>> +		.header.len = sizeof(uvcb),
>> +		.paddr = paddr
>> +	};
>> +
>> +	if (uv_call(0, (u64)&uvcb))
>> +		return -EINVAL;
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Calculate the expected ref_count for a page that would otherwise have no
>> + * further pins. This was cribbed from similar functions in other places in
>> + * the kernel, but with some slight modifications. We know that a secure
>> + * page can not be a huge page for example.
> 
> s/ca not cannot/

ack.


> 
>> + */
>> +static int expected_page_refs(struct page *page)
>> +{
>> +	int res;
>> +
>> +	res = page_mapcount(page);
>> +	if (PageSwapCache(page)) {
>> +		res++;
>> +	} else if (page_mapping(page)) {
>> +		res++;
>> +		if (page_has_private(page))
>> +			res++;
>> +	}
>> +	return res;
>> +}
>> +
>> +static int make_secure_pte(pte_t *ptep, unsigned long addr,
>> +			   struct page *exp_page, struct uv_cb_header *uvcb)
>> +{
>> +	pte_t entry = READ_ONCE(*ptep);
>> +	struct page *page;
>> +	int expected, rc = 0;
>> +
>> +	if (!pte_present(entry))
>> +		return -ENXIO;
>> +	if (pte_val(entry) & _PAGE_INVALID)
>> +		return -ENXIO;
>> +
>> +	page = pte_page(entry);
>> +	if (page != exp_page)
>> +		return -ENXIO;
>> +	if (PageWriteback(page))
>> +		return -EAGAIN;
>> +	expected = expected_page_refs(page);
>> +	if (!page_ref_freeze(page, expected))
>> +		return -EBUSY;
>> +	set_bit(PG_arch_1, &page->flags);
>> +	rc = uv_call(0, (u64)uvcb);
>> +	page_ref_unfreeze(page, expected);
>> +	/* Return -ENXIO if the page was not mapped, -EINVAL otherwise */
>> +	if (rc)
>> +		rc = uvcb->rc == 0x10a ? -ENXIO : -EINVAL;
>> +	return rc;
>> +}
>> +
>> +/*
>> + * Requests the Ultravisor to make a page accessible to a guest.
>> + * If it's brought in the first time, it will be cleared. If
>> + * it has been exported before, it will be decrypted and integrity
>> + * checked.
>> + */
>> +int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
>> +{
>> +	struct vm_area_struct *vma;
>> +	unsigned long uaddr;
>> +	struct page *page;
>> +	int rc, local_drain = 0;
> 
> local_drain could have been a bool.

ack
> 
>> +	spinlock_t *ptelock;
>> +	pte_t *ptep;
>> +
>> +again:
>> +	rc = -EFAULT;
>> +	down_read(&gmap->mm->mmap_sem);
>> +
>> +	uaddr = __gmap_translate(gmap, gaddr);
>> +	if (IS_ERR_VALUE(uaddr))
>> +		goto out;
>> +	vma = find_vma(gmap->mm, uaddr);
>> +	if (!vma)
>> +		goto out;
>> +	/*
>> +	 * Secure pages cannot be huge and userspace should not combine both.
>> +	 * In case userspace does it anyway this will result in an -EFAULT for
>> +	 * the unpack. The guest is thus never reaching secure mode. If
>> +	 * userspace is playing dirty tricky with mapping huge pages later
>> +	 * on this will result in a segmenation fault.
> 
> s/segmenation/segmentation/

ack.
> 
>> +	 */
>> +	if (is_vm_hugetlb_page(vma))
>> +		goto out;
>> +
>> +	rc = -ENXIO;
>> +	page = follow_page(vma, uaddr, FOLL_WRITE);
>> +	if (IS_ERR_OR_NULL(page))
>> +		goto out;
>> +
>> +	lock_page(page);
>> +	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
>> +	rc = make_secure_pte(ptep, uaddr, page, uvcb);
>> +	pte_unmap_unlock(ptep, ptelock);
>> +	unlock_page(page);
>> +out:
>> +	up_read(&gmap->mm->mmap_sem);
>> +
>> +	if (rc == -EAGAIN) {
>> +		wait_on_page_writeback(page);
>> +	} else if (rc == -EBUSY) {
>> +		/*
>> +		 * If we have tried a local drain and the page refcount
>> +		 * still does not match our expected safe value, try with a
>> +		 * system wide drain. This is needed if the pagevecs holding
>> +		 * the page are on a different CPU.
>> +		 */
>> +		if (local_drain) {
>> +			lru_add_drain_all();
> 
> I do wonder if that is valid to be called with all the locks at this point.

This function uses per cpu workers and needs no other locks. Also verified 
with lockdep. 

> 
>> +			/* We give up here, and let the caller try again */
>> +			return -EAGAIN;
>> +		}
>> +		/*
>> +		 * We are here if the page refcount does not match the
>> +		 * expected safe value. The main culprits are usually
>> +		 * pagevecs. With lru_add_drain() we drain the pagevecs
>> +		 * on the local CPU so that hopefully the refcount will
>> +		 * reach the expected safe value.
>> +		 */
>> +		lru_add_drain();
> 
> dito ...

dito. 

> 
>> +		local_drain = 1;
>> +		/* And now we try again immediately after draining */
>> +		goto again;
>> +	} else if (rc == -ENXIO) {
>> +		if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
>> +			return -EFAULT;
>> +		return -EAGAIN;
>> +	}
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(gmap_make_secure);
>> +
>> +int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)
>> +{
>> +	struct uv_cb_cts uvcb = {
>> +		.header.cmd = UVC_CMD_CONV_TO_SEC_STOR,
>> +		.header.len = sizeof(uvcb),
>> +		.guest_handle = gmap->guest_handle,
>> +		.gaddr = gaddr,
>> +	};
>> +
>> +	return gmap_make_secure(gmap, gaddr, &uvcb);
>> +}
>> +EXPORT_SYMBOL_GPL(gmap_convert_to_secure);
>> +
>> +/**
>> + * To be called with the page locked or with an extra reference!
> 
> Can we have races here? (IOW, two callers concurrently for the same page)

That would be fine and is part of the design. The ultravisor calls will
either make the page accessible or will be a (mostly) no-op.
In fact, we allow for slight over-indication of "needs to be exported"

What about:

/*
 * To be called with the page locked or with an extra reference! This will
 * prevent gmap_make_secure from touching the page concurrently. Having 2
 * parallel make_page_accessible is fine, as the UV calls will become a 
 * no-op if the page is already exported.
 */


> 
>> + */
>> +int arch_make_page_accessible(struct page *page)
>> +{
>> +	int rc = 0;
>> +
>> +	/* Hugepage cannot be protected, so nothing to do */
>> +	if (PageHuge(page))
>> +		return 0;
>> +
>> +	/*
>> +	 * PG_arch_1 is used in 3 places:
>> +	 * 1. for kernel page tables during early boot
>> +	 * 2. for storage keys of huge pages and KVM
>> +	 * 3. As an indication that this page might be secure. This can
>> +	 *    overindicate, e.g. we set the bit before calling
>> +	 *    convert_to_secure.
>> +	 * As secure pages are never huge, all 3 variants can co-exists.
>> +	 */
>> +	if (!test_bit(PG_arch_1, &page->flags))
>> +		return 0;
>> +
>> +	rc = uv_pin_shared(page_to_phys(page));
>> +	if (!rc) {
>> +		clear_bit(PG_arch_1, &page->flags);
>> +		return 0;
>> +	}
> 
> Overall, looks sane to me. (I am mostly concerned about possible races,
> e.g., when two gmaps would be created for a single VM and nasty stuff be
> done with them). But yeah, I guess you guys thought about this ;)



^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling
  2020-02-17 11:01   ` David Hildenbrand
@ 2020-02-17 11:33     ` Christian Borntraeger
  2020-02-17 14:37     ` Janosch Frank
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 11:33 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 17.02.20 12:01, David Hildenbrand wrote:

> As discussed, I think there is room for improvement in the future (which
> we could have documented in the patch description), because this is
> obviously sub-optimal.
> 

Will use the following as patch description
  
Guest registers for protected guests are stored at offset 0x380.  We
will copy those to the usual places.  Long term we could refactor this
or use register access functions.


> Reviewed-by: David Hildenbrand <david@redhat.com>
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-17 10:56   ` David Hildenbrand
@ 2020-02-17 12:04     ` Christian Borntraeger
  2020-02-17 12:09       ` David Hildenbrand
  2020-02-18  8:09     ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
  1 sibling, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 12:04 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 17.02.20 11:56, David Hildenbrand wrote:
> [...]
>>  
>> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>> +{
>> +	int r = 0;
>> +	void __user *argp = (void __user *)cmd->data;
>> +
>> +	switch (cmd->cmd) {
>> +	case KVM_PV_VM_CREATE: {
>> +		r = -EINVAL;
>> +		if (kvm_s390_pv_is_protected(kvm))
>> +			break;
> 
> Isn't this racy? I think there has to be a way to make sure the PV state
> can't change. Is there any and I am missing something obvious? (is
> suspect we need the kvm->lock)


Yes, kvm->lock around kvm_s390_handle_pv is safer.

Something like


diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 932f7f32e82f..87dc6caa2181 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2422,7 +2422,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
                        r = -EFAULT;
                        break;
                }
+               mutex_lock(&kvm->lock);
                r = kvm_s390_handle_pv(kvm, &args);
+               mutex_unlock(&kvm->lock);
                if (copy_to_user(argp, &args, sizeof(args))) {
                        r = -EFAULT;
                        break;



[...]


>> +	case KVM_PV_VM_SET_SEC_PARMS: {
> 
> I'd name this "KVM_PV_VM_SET_PARMS" instead.

[...]

>> @@ -2975,6 +3121,9 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>>  
>>  	kvm_s390_vcpu_crypto_setup(vcpu);
>>  
>> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
>> +		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
> 
> With an explicit KVM_PV_VCPU_CREATE, this does not belong here. When
> hotplugging CPUs, user space has to do that manually. But as I said
> already, this user space API could be improved. (below)

With your proposed API this would stay.

[...]

>> @@ -4493,6 +4674,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>  					   irq_state.len);
>>  		break;
>>  	}
>> +	case KVM_S390_PV_COMMAND_VCPU: {
>> +		struct kvm_pv_cmd args;
>> +
>> +		r = 0;
>> +		if (!is_prot_virt_host()) {
>> +			r = -EINVAL;
>> +			break;
>> +		}
>> +		if (copy_from_user(&args, argp, sizeof(args))) {
>> +			r = -EFAULT;
>> +			break;
>> +		}
>> +		r = kvm_s390_handle_pv_vcpu(vcpu, &args);
>> +		if (copy_to_user(argp, &args, sizeof(args))) {
>> +			r = -EFAULT;
>> +			break;
>> +		}
>> +		break;
>> +	}
>>  	default:
>>  		r = -ENOTTY;
> 
> 
> Can we please discuss why we can't
> 
> - Get rid of KVM_S390_PV_COMMAND_VCPU
> - Do the allocation in KVM_PV_VM_CREATE
> - Rename KVM_PV_VM_CREATE -> KVM_PV_ENABLE
> - Rename KVM_PV_VM_DESTROY -> KVM_PV_DISABLE
> 
> This user space API is unnecessary complicated and confusing.

I will have a look if this is feasible.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests
  2020-02-17 11:28     ` Christian Borntraeger
@ 2020-02-17 12:07       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 12:07 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm


>>> +		if (local_drain) {
>>> +			lru_add_drain_all();
>>
>> I do wonder if that is valid to be called with all the locks at this point.
> 
> This function uses per cpu workers and needs no other locks. Also verified 
> with lockdep. 

Okay, perfect.

>>> +/**
>>> + * To be called with the page locked or with an extra reference!
>>
>> Can we have races here? (IOW, two callers concurrently for the same page)
> 
> That would be fine and is part of the design. The ultravisor calls will
> either make the page accessible or will be a (mostly) no-op.
> In fact, we allow for slight over-indication of "needs to be exported"
> 
> What about:
> 
> /*
>  * To be called with the page locked or with an extra reference! This will
>  * prevent gmap_make_secure from touching the page concurrently. Having 2
>  * parallel make_page_accessible is fine, as the UV calls will become a 
>  * no-op if the page is already exported.
>  */

Yes, much clearer, thanks!



-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-17 12:04     ` Christian Borntraeger
@ 2020-02-17 12:09       ` David Hildenbrand
  2020-02-17 14:53         ` [PATCH 0/2] example changes Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 12:09 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 17.02.20 13:04, Christian Borntraeger wrote:
> 
> 
> On 17.02.20 11:56, David Hildenbrand wrote:
>> [...]
>>>  
>>> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>>> +{
>>> +	int r = 0;
>>> +	void __user *argp = (void __user *)cmd->data;
>>> +
>>> +	switch (cmd->cmd) {
>>> +	case KVM_PV_VM_CREATE: {
>>> +		r = -EINVAL;
>>> +		if (kvm_s390_pv_is_protected(kvm))
>>> +			break;
>>
>> Isn't this racy? I think there has to be a way to make sure the PV state
>> can't change. Is there any and I am missing something obvious? (is
>> suspect we need the kvm->lock)
> 
> 
> Yes, kvm->lock around kvm_s390_handle_pv is safer.
> 
> Something like

Yes. Probably a lockdep_assert_held() inside kvm_s390_pv_is_protected()
would make sense to catch other races.
[...]

>>
>> With an explicit KVM_PV_VCPU_CREATE, this does not belong here. When
>> hotplugging CPUs, user space has to do that manually. But as I said
>> already, this user space API could be improved. (below)
> 
> With your proposed API this would stay.

Yes

[...]
>>
>>
>> Can we please discuss why we can't
>>
>> - Get rid of KVM_S390_PV_COMMAND_VCPU
>> - Do the allocation in KVM_PV_VM_CREATE
>> - Rename KVM_PV_VM_CREATE -> KVM_PV_ENABLE
>> - Rename KVM_PV_VM_DESTROY -> KVM_PV_DISABLE
>>
>> This user space API is unnecessary complicated and confusing.
> 
> I will have a look if this is feasible.
> 

Okay, thanks!

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication
  2020-02-14 22:26 ` [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication Christian Borntraeger
@ 2020-02-17 14:15   ` Ulrich Weigand
  2020-02-17 14:38     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Ulrich Weigand @ 2020-02-17 14:15 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	David Hildenbrand, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Will Deacon, Sean Christopherson

On Fri, Feb 14, 2020 at 05:26:56PM -0500, Christian Borntraeger wrote:
> +enum access_type {
> +	MAKE_ACCESSIBLE_GENERIC,
> +	MAKE_ACCESSIBLE_GET,
> +	MAKE_ACCESSIBLE_GET_FAST,
> +	MAKE_ACCESSIBLE_WRITEBACK
> +};
>  #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> -static inline int arch_make_page_accessible(struct page *page)
> +static inline int arch_make_page_accessible(struct page *page, int where)

If we want to make this distinction, wouldn't it be simpler to just
use different function names, like
  arch_make_page_accessible_for_writeback
  arch_make_page_accessible_for_gup
etc.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events
  2020-02-14 22:26 ` [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events Christian Borntraeger
@ 2020-02-17 14:23   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 14:23 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
> 
> The current code tries to first pin shared pages, if that fails (e.g.
> because the page is not shared) it will export them. For shared pages
> this means that we get a new intercept telling us that the guest is
> unsharing that page. We will make the page secure at that point in time
> and revoke the host access. This is synchronized with other host events,
> e.g. the code will wait until host I/O has finished.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/intercept.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
> index 6c9db886381c..1e231058e4b3 100644
> --- a/arch/s390/kvm/intercept.c
> +++ b/arch/s390/kvm/intercept.c
> @@ -16,6 +16,7 @@
>  #include <asm/asm-offsets.h>
>  #include <asm/irq.h>
>  #include <asm/sysinfo.h>
> +#include <asm/uv.h>
>  
>  #include "kvm-s390.h"
>  #include "gaccess.h"
> @@ -484,12 +485,35 @@ static int handle_pv_sclp(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static int handle_pv_uvc(struct kvm_vcpu *vcpu)
> +{
> +	struct uv_cb_share *guest_uvcb = (void *)vcpu->arch.sie_block->sidad;
> +	struct uv_cb_cts uvcb = {
> +		.header.cmd	= UVC_CMD_UNPIN_PAGE_SHARED,
> +		.header.len	= sizeof(uvcb),
> +		.guest_handle	= kvm_s390_pv_handle(vcpu->kvm),
> +		.gaddr		= guest_uvcb->paddr,
> +	};
> +	int rc;
> +
> +	if (guest_uvcb->header.cmd != UVC_CMD_REMOVE_SHARED_ACCESS) {
> +		WARN_ONCE(1, "Unexpected UVC 0x%x!\n", guest_uvcb->header.cmd);
> +		return 0;
> +	}
> +	rc = gmap_make_secure(vcpu->arch.gmap, uvcb.gaddr, &uvcb);
> +	if (rc == -EINVAL)
> +		return 0;
> +	return rc;
> +}
> +
>  static int handle_pv_notification(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu->arch.sie_block->ipa == 0xb210)
>  		return handle_pv_spx(vcpu);
>  	if (vcpu->arch.sie_block->ipa == 0xb220)
>  		return handle_pv_sclp(vcpu);
> +	if (vcpu->arch.sie_block->ipa == 0xb9a4)
> +		return handle_pv_uvc(vcpu);
>  
>  	return handle_instruction(vcpu);
>  }
> 

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area
  2020-02-14 22:26 ` [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area Christian Borntraeger
@ 2020-02-17 14:24   ` David Hildenbrand
  2020-02-17 18:40     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 14:24 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> STHYI data has to go through the bounce buffer.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/intercept.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
> index 1e231058e4b3..cfabeecbb777 100644
> --- a/arch/s390/kvm/intercept.c
> +++ b/arch/s390/kvm/intercept.c
> @@ -392,7 +392,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>  		goto out;
>  	}
>  
> -	if (addr & ~PAGE_MASK)
> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (addr & ~PAGE_MASK))
>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>  
>  	sctns = (void *)get_zeroed_page(GFP_KERNEL);
> @@ -403,10 +403,15 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>  
>  out:
>  	if (!cc) {
> -		r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);
> -		if (r) {
> -			free_page((unsigned long)sctns);
> -			return kvm_s390_inject_prog_cond(vcpu, r);
> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {

I have the feeling that we might have to think about proper locking for
kvm_s390_pv_is_protected(). We have to make sure there cannot be any
races with user space. Smells like a new r/w lock maybe.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling
  2020-02-17 11:01   ` David Hildenbrand
  2020-02-17 11:33     ` Christian Borntraeger
@ 2020-02-17 14:37     ` Janosch Frank
  1 sibling, 0 replies; 132+ messages in thread
From: Janosch Frank @ 2020-02-17 14:37 UTC (permalink / raw)
  To: David Hildenbrand, Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

[-- Attachment #1.1: Type: text/plain, Size: 2911 bytes --]

On 2/17/20 12:01 PM, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> Guest registers for protected guests are stored at offset 0x380.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/include/asm/kvm_host.h |  4 +++-
>>  arch/s390/kvm/kvm-s390.c         | 11 +++++++++++
>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index ba3364b37159..4fcbb055a565 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -343,7 +343,9 @@ struct kvm_s390_itdb {
>>  struct sie_page {
>>  	struct kvm_s390_sie_block sie_block;
>>  	struct mcck_volatile_info mcck_info;	/* 0x0200 */
>> -	__u8 reserved218[1000];		/* 0x0218 */
>> +	__u8 reserved218[360];		/* 0x0218 */
>> +	__u64 pv_grregs[16];		/* 0x0380 */
>> +	__u8 reserved400[512];		/* 0x0400 */
>>  	struct kvm_s390_itdb itdb;	/* 0x0600 */
>>  	__u8 reserved700[2304];		/* 0x0700 */
>>  };
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index a85e50075d99..6ebb0dae5a2e 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -3999,6 +3999,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
>>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>  	int rc, exit_reason;
>> +	struct sie_page *sie_page = (struct sie_page *)vcpu->arch.sie_block;
>>  
>>  	/*
>>  	 * We try to hold kvm->srcu during most of vcpu_run (except when run-
>> @@ -4020,8 +4021,18 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  		guest_enter_irqoff();
>>  		__disable_cpu_timer_accounting(vcpu);
>>  		local_irq_enable();
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> +			memcpy(sie_page->pv_grregs,
>> +			       vcpu->run->s.regs.gprs,
>> +			       sizeof(sie_page->pv_grregs));
>> +		}
>>  		exit_reason = sie64a(vcpu->arch.sie_block,
>>  				     vcpu->run->s.regs.gprs);
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> +			memcpy(vcpu->run->s.regs.gprs,
>> +			       sie_page->pv_grregs,
>> +			       sizeof(sie_page->pv_grregs));
>> +		}
>>  		local_irq_disable();
>>  		__enable_cpu_timer_accounting(vcpu);
>>  		guest_exit_irqoff();
>>
> 
> As discussed, I think there is room for improvement in the future (which
> we could have documented in the patch description), because this is
> obviously sub-optimal.

I added it to my KVM TODO list.

> 
> Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks for reviewing this patch and all the others :-)



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication
  2020-02-17 14:15   ` Ulrich Weigand
@ 2020-02-17 14:38     ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 14:38 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	David Hildenbrand, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Will Deacon, Sean Christopherson



On 17.02.20 15:15, Ulrich Weigand wrote:
> On Fri, Feb 14, 2020 at 05:26:56PM -0500, Christian Borntraeger wrote:
>> +enum access_type {
>> +	MAKE_ACCESSIBLE_GENERIC,
>> +	MAKE_ACCESSIBLE_GET,
>> +	MAKE_ACCESSIBLE_GET_FAST,
>> +	MAKE_ACCESSIBLE_WRITEBACK
>> +};
>>  #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
>> -static inline int arch_make_page_accessible(struct page *page)
>> +static inline int arch_make_page_accessible(struct page *page, int where)
> 
> If we want to make this distinction, wouldn't it be simpler to just
> use different function names, like
>   arch_make_page_accessible_for_writeback
>   arch_make_page_accessible_for_gup
> etc.

Agreed.
I would suggest to do these changes when somebody needs them, though.
On the other hand, Patch 39 (the error handling) is something that we
could merge now.



^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-17 11:08   ` David Hildenbrand
@ 2020-02-17 14:47     ` Janosch Frank
  2020-02-17 15:00       ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Janosch Frank @ 2020-02-17 14:47 UTC (permalink / raw)
  To: David Hildenbrand, Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

[-- Attachment #1.1: Type: text/plain, Size: 3886 bytes --]

On 2/17/20 12:08 PM, David Hildenbrand wrote:
>> @@ -4460,6 +4489,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>>  
>>  	switch (mop->op) {
>>  	case KVM_S390_MEMOP_LOGICAL_READ:
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> +			r = -EINVAL;
>> +			break;
>> +		}
> 
> Could we have a possible race with disabling code, especially while
> concurrently freeing? (sorry if I ask again, there was just a flood of
> emails)
> 
>>  		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
>>  			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
>>  					    mop->size, GACC_FETCH);
>> @@ -4472,6 +4505,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>>  		}
>>  		break;
>>  	case KVM_S390_MEMOP_LOGICAL_WRITE:
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> +			r = -EINVAL;
>> +			break;
>> +		}
> 
> dito
> 
>>  		if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
>>  			r = check_gva_range(vcpu, mop->gaddr, mop->ar,
>>  					    mop->size, GACC_STORE);
>> @@ -4483,6 +4520,11 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>>  		}
>>  		r = write_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size);
>>  		break;
>> +	case KVM_S390_MEMOP_SIDA_READ:
>> +	case KVM_S390_MEMOP_SIDA_WRITE:
>> +		/* we are locked against sida going away by the vcpu->mutex */
>> +		r = kvm_s390_guest_sida_op(vcpu, mop);
>> +		break;
>>  	default:
>>  		r = -EINVAL;
>>  	}
>> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
>> index 09573e36c329..80169a9b43ec 100644
>> --- a/arch/s390/kvm/pv.c
>> +++ b/arch/s390/kvm/pv.c
>> @@ -92,6 +92,7 @@ int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
>>  
>>  	free_pages(vcpu->arch.pv.stor_base,
>>  		   get_order(uv_info.guest_cpu_stor_len));
>> +	free_page(sida_origin(vcpu->arch.sie_block));
>>  	vcpu->arch.sie_block->pv_handle_cpu = 0;
>>  	vcpu->arch.sie_block->pv_handle_config = 0;
>>  	memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv));
>> @@ -121,6 +122,14 @@ int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
>>  	uvcb.state_origin = (u64)vcpu->arch.sie_block;
>>  	uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base;
>>  
>> +	/* Alloc Secure Instruction Data Area Designation */
>> +	vcpu->arch.sie_block->sidad = __get_free_page(GFP_KERNEL | __GFP_ZERO);
>> +	if (!vcpu->arch.sie_block->sidad) {
>> +		free_pages(vcpu->arch.pv.stor_base,
>> +			   get_order(uv_info.guest_cpu_stor_len));
>> +		return -ENOMEM;
>> +	}
>> +
>>  	cc = uv_call(0, (u64)&uvcb);
>>  	*rc = uvcb.header.rc;
>>  	*rrc = uvcb.header.rrc;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 207915488502..0fdee1bc3798 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -475,11 +475,15 @@ struct kvm_s390_mem_op {
>>  	__u32 op;		/* type of operation */
>>  	__u64 buf;		/* buffer in userspace */
>>  	__u8 ar;		/* the access register number */
>> -	__u8 reserved[31];	/* should be set to 0 */
>> +	__u8 reserved21[3];	/* should be set to 0 */
>> +	__u32 sida_offset;	/* offset into the sida */
>> +	__u8 reserved28[24];	/* should be set to 0 */
>>  };
> 
> As discussed, I'd prefer an overlaying layout for the sida, as the ar
> does not make any sense (correct me if I'm wrong :) )

That wouldn't work, because we still check mop->ar < 16 in
kvm_s390_guest_mem_op(). Also we currently check mop contents twice
because we overload mem_op() with the SIDA operations.

Using a separate IOCTL is cleaner...

> 
> __u32 op;		/* type of operation */
> __u64 buf;		/* buffer in userspace */
> uinon {
> 	__u8 ar;		/* the access register number */
> 	__u32 sida_offset;	/* offset into the sida */
> 	__u8 reserved[32];	/* should be set to 0 */
> };
> 
> With something like that
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH 0/2] example changes
  2020-02-17 12:09       ` David Hildenbrand
@ 2020-02-17 14:53         ` Christian Borntraeger
  2020-02-17 14:53           ` [PATCH 1/2] lock changes Christian Borntraeger
                             ` (2 more replies)
  0 siblings, 3 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 14:53 UTC (permalink / raw)
  To: david
  Cc: Ulrich.Weigand, borntraeger, cohuck, frankja, frankja, gor,
	imbrenda, kvm, linux-s390, mimu, thuth

I also need to rename the ioctls and do more testing,
but this is probably what you want?

Christian Borntraeger (2):
  lock changes
  merge vm/cpu create

 arch/s390/kvm/intercept.c |  6 +--
 arch/s390/kvm/interrupt.c | 35 ++++++++++-------
 arch/s390/kvm/kvm-s390.c  | 83 +++++++++++++++++++++++++--------------
 arch/s390/kvm/kvm-s390.h  | 18 ++++++---
 arch/s390/kvm/priv.c      |  4 +-
 5 files changed, 92 insertions(+), 54 deletions(-)

-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH 1/2] lock changes
  2020-02-17 14:53         ` [PATCH 0/2] example changes Christian Borntraeger
@ 2020-02-17 14:53           ` Christian Borntraeger
  2020-02-17 14:53           ` [PATCH 2/2] merge vm/cpu create Christian Borntraeger
  2020-02-17 19:18           ` [PATCH 0/2] example changes David Hildenbrand
  2 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 14:53 UTC (permalink / raw)
  To: david
  Cc: Ulrich.Weigand, borntraeger, cohuck, frankja, frankja, gor,
	imbrenda, kvm, linux-s390, mimu, thuth

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/intercept.c |  6 +++---
 arch/s390/kvm/interrupt.c | 35 +++++++++++++++++++++--------------
 arch/s390/kvm/kvm-s390.c  | 28 +++++++++++++---------------
 arch/s390/kvm/kvm-s390.h  | 18 +++++++++++++-----
 arch/s390/kvm/priv.c      |  4 ++--
 5 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index cfabeecbb777..a5ae12c4c139 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -236,7 +236,7 @@ static int handle_prog(struct kvm_vcpu *vcpu)
 	 * Intercept 8 indicates a loop of specification exceptions
 	 * for protected guests.
 	 */
-	if (kvm_s390_pv_is_protected(vcpu->kvm))
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
 		return -EOPNOTSUPP;
 
 	if (guestdbg_enabled(vcpu) && per_event(vcpu)) {
@@ -392,7 +392,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
 		goto out;
 	}
 
-	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (addr & ~PAGE_MASK))
+	if (!kvm_s390_pv_cpu_is_protected(vcpu) && (addr & ~PAGE_MASK))
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
 	sctns = (void *)get_zeroed_page(GFP_KERNEL);
@@ -403,7 +403,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
 
 out:
 	if (!cc) {
-		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 			memcpy((void *)(sida_origin(vcpu->arch.sie_block)),
 			       sctns, PAGE_SIZE);
 		} else {
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 7a10096fa204..140f04a2b547 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -393,7 +393,7 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
 	if (psw_mchk_disabled(vcpu))
 		active_mask &= ~IRQ_PEND_MCHK_MASK;
 	/* PV guest cpus can have a single interruption injected at a time. */
-	if (kvm_s390_pv_is_protected(vcpu->kvm) &&
+	if (kvm_s390_pv_cpu_is_protected(vcpu) &&
 	    vcpu->arch.sie_block->iictl != IICTL_CODE_NONE)
 		active_mask &= ~(IRQ_PEND_EXT_II_MASK |
 				 IRQ_PEND_IO_MASK |
@@ -495,7 +495,7 @@ static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu)
 	vcpu->stat.deliver_cputm++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER,
 					 0, 0);
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
 		vcpu->arch.sie_block->eic = EXT_IRQ_CPU_TIMER;
 	} else {
@@ -519,7 +519,7 @@ static int __must_check __deliver_ckc(struct kvm_vcpu *vcpu)
 	vcpu->stat.deliver_ckc++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP,
 					 0, 0);
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
 		vcpu->arch.sie_block->eic = EXT_IRQ_CLK_COMP;
 	} else {
@@ -578,7 +578,7 @@ static int __write_machine_check(struct kvm_vcpu *vcpu,
 	 * the hypervisor does not not have the needed information for
 	 * protected guests.
 	 */
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_MCHK;
 		vcpu->arch.sie_block->mcic = mchk->mcic;
 		vcpu->arch.sie_block->faddr = mchk->failing_storage_address;
@@ -735,7 +735,7 @@ static int __must_check __deliver_restart(struct kvm_vcpu *vcpu)
 	vcpu->stat.deliver_restart_signal++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_RESTART, 0, 0);
 
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_RESTART;
 	} else {
 		rc  = write_guest_lc(vcpu,
@@ -785,7 +785,7 @@ static int __must_check __deliver_emergency_signal(struct kvm_vcpu *vcpu)
 	vcpu->stat.deliver_emergency_signal++;
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_EMERGENCY,
 					 cpu_addr, 0);
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
 		vcpu->arch.sie_block->eic = EXT_IRQ_EMERGENCY_SIG;
 		vcpu->arch.sie_block->extcpuaddr = cpu_addr;
@@ -819,7 +819,7 @@ static int __must_check __deliver_external_call(struct kvm_vcpu *vcpu)
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
 					 KVM_S390_INT_EXTERNAL_CALL,
 					 extcall.code, 0);
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_EXT;
 		vcpu->arch.sie_block->eic = EXT_IRQ_EXTERNAL_CALL;
 		vcpu->arch.sie_block->extcpuaddr = extcall.code;
@@ -871,7 +871,7 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
 	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
 					 pgm_info.code, 0);
 
-	if (kvm_s390_pv_is_protected(vcpu->kvm))
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
 		return __deliver_prog_pv(vcpu, pgm_info.code & ~PGM_PER);
 
 	switch (pgm_info.code & ~PGM_PER) {
@@ -1010,7 +1010,7 @@ static int __must_check __deliver_service(struct kvm_vcpu *vcpu)
 	memset(&fi->srv_signal, 0, sizeof(ext));
 	clear_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs);
 	clear_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs);
-	if (kvm_s390_pv_is_protected(vcpu->kvm))
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
 		set_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs);
 	spin_unlock(&fi->lock);
 
@@ -1139,7 +1139,7 @@ static int __do_deliver_io(struct kvm_vcpu *vcpu, struct kvm_s390_io_info *io)
 {
 	int rc;
 
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		vcpu->arch.sie_block->iictl = IICTL_CODE_IO;
 		vcpu->arch.sie_block->subchannel_id = io->subchannel_id;
 		vcpu->arch.sie_block->subchannel_nr = io->subchannel_nr;
@@ -1898,8 +1898,13 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
 	kvm->stat.inject_io++;
 	isc = int_word_to_isc(inti->io.io_int_word);
 
-	/* do not make use of gisa in protected mode */
-	if (!kvm_s390_pv_is_protected(kvm) &&
+	/*
+	 * Do not make use of gisa in protected mode. We do not use the lock
+	 * checking variant as this is just a performance optimization and we
+	 * do not hold the lock here. This is ok as the code will pick
+	 * interrupts from both "lists" for delivery.
+	 */
+	if (!kvm_s390_pv_handle(kvm) &&
 	    gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) {
 		VM_EVENT(kvm, 4, "%s isc %1u", "inject: I/O (AI/gisa)", isc);
 		gisa_set_ipm_gisc(gi->origin, isc);
@@ -2208,10 +2213,12 @@ void kvm_s390_clear_float_irqs(struct kvm *kvm)
 	struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
 	int i;
 
-	spin_lock(&fi->lock);
-	fi->pending_irqs = 0;
+	mutex_lock(&kvm->lock);
 	if (!kvm_s390_pv_is_protected(kvm))
 		fi->masked_irqs = 0;
+	mutex_unlock(&kvm->lock);
+	spin_lock(&fi->lock);
+	fi->pending_irqs = 0;
 	memset(&fi->srv_signal, 0, sizeof(fi->srv_signal));
 	memset(&fi->mchk, 0, sizeof(fi->mchk));
 	for (i = 0; i < FIRQ_LIST_COUNT; i++)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 87dc6caa2181..a095d9695f18 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2194,7 +2194,6 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 			break;
 		}
 
-		mutex_lock(&kvm->lock);
 		kvm_s390_vcpu_block_all(kvm);
 		/* FMT 4 SIE needs esca */
 		r = sca_switch_to_extended(kvm);
@@ -2208,7 +2207,6 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		kvm_s390_vcpu_unblock_all(kvm);
 		/* we need to block service interrupts from now on */
 		set_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
-		mutex_unlock(&kvm->lock);
 		break;
 	}
 	case KVM_PV_VM_DESTROY: {
@@ -2216,8 +2214,6 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		if (!kvm_s390_pv_is_protected(kvm))
 			break;
 
-		/* All VCPUs have to be destroyed before this call. */
-		mutex_lock(&kvm->lock);
 		kvm_s390_vcpu_block_all(kvm);
 		r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
 		if (!r)
@@ -2225,7 +2221,6 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 		kvm_s390_vcpu_unblock_all(kvm);
 		/* no need to block service interrupts any more */
 		clear_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
-		mutex_unlock(&kvm->lock);
 		break;
 	}
 	case KVM_PV_VM_SET_SEC_PARMS: {
@@ -2740,7 +2735,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_free_vcpus(kvm);
 	sca_dispose(kvm);
 	kvm_s390_gisa_destroy(kvm);
-	if (kvm_s390_pv_is_protected(kvm)) {
+	/* do not use the lock checking variant at tear-down */
+	if (kvm_s390_pv_handle(kvm)) {
 		kvm_s390_pv_destroy_vm(kvm, &rc, &rrc);
 		kvm_s390_pv_dealloc_vm(kvm);
 	}
@@ -3162,8 +3158,10 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 
 	kvm_s390_vcpu_crypto_setup(vcpu);
 
+	mutex_lock(&vcpu->kvm->lock);
 	if (kvm_s390_pv_is_protected(vcpu->kvm))
 		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
+	mutex_unlock(&vcpu->kvm->lock);
 
 	return rc;
 }
@@ -4053,14 +4051,14 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 		guest_enter_irqoff();
 		__disable_cpu_timer_accounting(vcpu);
 		local_irq_enable();
-		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 			memcpy(sie_page->pv_grregs,
 			       vcpu->run->s.regs.gprs,
 			       sizeof(sie_page->pv_grregs));
 		}
 		exit_reason = sie64a(vcpu->arch.sie_block,
 				     vcpu->run->s.regs.gprs);
-		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 			memcpy(vcpu->run->s.regs.gprs,
 			       sie_page->pv_grregs,
 			       sizeof(sie_page->pv_grregs));
@@ -4184,7 +4182,7 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		current->thread.fpu.fpc = 0;
 
 	/* Sync fmt2 only data */
-	if (likely(!kvm_s390_pv_is_protected(vcpu->kvm))) {
+	if (likely(!kvm_s390_pv_cpu_is_protected(vcpu))) {
 		sync_regs_fmt2(vcpu, kvm_run);
 	} else {
 		/*
@@ -4244,7 +4242,7 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	/* Restore will be done lazily at return */
 	current->thread.fpu.fpc = vcpu->arch.host_fpregs.fpc;
 	current->thread.fpu.regs = vcpu->arch.host_fpregs.regs;
-	if (likely(!kvm_s390_pv_is_protected(vcpu->kvm)))
+	if (likely(!kvm_s390_pv_cpu_is_protected(vcpu)))
 		store_regs_fmt2(vcpu, kvm_run);
 }
 
@@ -4443,7 +4441,7 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 	 * ultravisor. We block all interrupts and let the next sie exit
 	 * refresh our view.
 	 */
-	if (kvm_s390_pv_is_protected(vcpu->kvm))
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
 		vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
 	/*
 	 * Another VCPU might have used IBS while we were offline.
@@ -4573,7 +4571,7 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 
 	switch (mop->op) {
 	case KVM_S390_MEMOP_LOGICAL_READ:
-		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 			r = -EINVAL;
 			break;
 		}
@@ -4589,7 +4587,7 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
 		}
 		break;
 	case KVM_S390_MEMOP_LOGICAL_WRITE:
-		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+		if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 			r = -EINVAL;
 			break;
 		}
@@ -4655,7 +4653,7 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
 {
 	int r = 0;
 
-	if (!kvm_s390_pv_is_protected(vcpu->kvm))
+	if (!kvm_s390_pv_cpu_is_protected(vcpu))
 		return -EINVAL;
 
 	if (cmd->flags)
@@ -4751,7 +4749,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_GET_ONE_REG: {
 		struct kvm_one_reg reg;
 		r = -EINVAL;
-		if (kvm_s390_pv_is_protected(vcpu->kvm))
+		if (kvm_s390_pv_cpu_is_protected(vcpu))
 			break;
 		r = -EFAULT;
 		if (copy_from_user(&reg, argp, sizeof(reg)))
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 1af1e30beead..f00a99957f79 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -15,6 +15,7 @@
 #include <linux/hrtimer.h>
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
+#include <linux/lockdep.h>
 #include <asm/facility.h>
 #include <asm/processor.h>
 #include <asm/sclp.h>
@@ -221,11 +222,6 @@ int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
 int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
 			      u16 *rrc);
 
-static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
-{
-	return !!kvm->arch.pv.handle;
-}
-
 static inline u64 kvm_s390_pv_handle(struct kvm *kvm)
 {
 	return kvm->arch.pv.handle;
@@ -236,6 +232,18 @@ static inline u64 kvm_s390_pv_handle_cpu(struct kvm_vcpu *vcpu)
 	return vcpu->arch.pv.handle;
 }
 
+static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
+{
+	lockdep_assert_held(&kvm->lock);
+	return !!kvm_s390_pv_handle(kvm);
+}
+
+static inline bool kvm_s390_pv_cpu_is_protected(struct kvm_vcpu *vcpu)
+{
+	lockdep_assert_held(&vcpu->mutex);
+	return !!kvm_s390_pv_handle_cpu(vcpu);
+}
+
 /* implemented in interrupt.c */
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu);
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index b2de7dc5f58d..394b57090ff6 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 
 	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
 
-	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
+	if (!kvm_s390_pv_cpu_is_protected(vcpu) && (operand2 & 0xfff))
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
 
 	switch (fc) {
@@ -893,7 +893,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 		handle_stsi_3_2_2(vcpu, (void *) mem);
 		break;
 	}
-	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
+	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
 		       PAGE_SIZE);
 		rc = 0;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH 2/2] merge vm/cpu create
  2020-02-17 14:53         ` [PATCH 0/2] example changes Christian Borntraeger
  2020-02-17 14:53           ` [PATCH 1/2] lock changes Christian Borntraeger
@ 2020-02-17 14:53           ` Christian Borntraeger
  2020-02-17 15:00             ` Janosch Frank
  2020-02-17 19:18           ` [PATCH 0/2] example changes David Hildenbrand
  2 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 14:53 UTC (permalink / raw)
  To: david
  Cc: Ulrich.Weigand, borntraeger, cohuck, frankja, frankja, gor,
	imbrenda, kvm, linux-s390, mimu, thuth

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 55 +++++++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a095d9695f18..10b20e17a7fe 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2171,9 +2171,41 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
 	return r;
 }
 
+static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int i, r = 0;
+
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
+		if (r)
+			break;
+	}
+	return r;
+}
+
+static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int i, r = 0;
+	u16 dummy;
+
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
+		if (r)
+			break;
+	}
+	if (r)
+		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
+	return r;
+}
+
 static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 {
 	int r = 0;
+	u16 dummy;
 	void __user *argp = (void __user *)cmd->data;
 
 	switch (cmd->cmd) {
@@ -2204,6 +2236,11 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 			break;
 		}
 		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			r = kvm_s390_switch_to_pv(kvm, &cmd->rc, &cmd->rrc);
+		if (r)
+			kvm_s390_pv_destroy_vm(kvm, &dummy, &dummy);
+
 		kvm_s390_vcpu_unblock_all(kvm);
 		/* we need to block service interrupts from now on */
 		set_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
@@ -2215,7 +2252,9 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
 			break;
 
 		kvm_s390_vcpu_block_all(kvm);
-		r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
+		r = kvm_s390_switch_from_pv(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
 		if (!r)
 			kvm_s390_pv_dealloc_vm(kvm);
 		kvm_s390_vcpu_unblock_all(kvm);
@@ -4660,20 +4699,6 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	switch (cmd->cmd) {
-	case KVM_PV_VCPU_CREATE: {
-		if (kvm_s390_pv_handle_cpu(vcpu))
-			return -EINVAL;
-
-		r = kvm_s390_pv_create_cpu(vcpu, &cmd->rc, &cmd->rrc);
-		break;
-	}
-	case KVM_PV_VCPU_DESTROY: {
-		if (!kvm_s390_pv_handle_cpu(vcpu))
-			return -EINVAL;
-
-		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
-		break;
-	}
 	case KVM_PV_VCPU_SET_IPL_PSW: {
 		if (!kvm_s390_pv_handle_cpu(vcpu))
 			return -EINVAL;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH 2/2] merge vm/cpu create
  2020-02-17 14:53           ` [PATCH 2/2] merge vm/cpu create Christian Borntraeger
@ 2020-02-17 15:00             ` Janosch Frank
  2020-02-17 15:02               ` Christian Borntraeger
  2020-02-19 11:02               ` Christian Borntraeger
  0 siblings, 2 replies; 132+ messages in thread
From: Janosch Frank @ 2020-02-17 15:00 UTC (permalink / raw)
  To: Christian Borntraeger, david
  Cc: Ulrich.Weigand, cohuck, frankja, gor, imbrenda, kvm, linux-s390,
	mimu, thuth

[-- Attachment #1.1: Type: text/plain, Size: 3019 bytes --]

On 2/17/20 3:53 PM, Christian Borntraeger wrote:
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/kvm-s390.c | 55 +++++++++++++++++++++++++++++-----------
>  1 file changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index a095d9695f18..10b20e17a7fe 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2171,9 +2171,41 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
>  	return r;
>  }
>  
> +static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
> +{
> +	int i, r = 0;
> +
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
> +		if (r)
> +			break;
> +	}
> +	return r;
> +}
> +
> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
> +{
> +	int i, r = 0;
> +	u16 dummy;
> +
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
> +		if (r)
> +			break;
> +	}
> +	if (r)
> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
> +	return r;
> +}

Why does that only affect the cpus?
If we have a switch function it should do VM and VCPUs, no?

> +
>  static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>  {
>  	int r = 0;
> +	u16 dummy;
>  	void __user *argp = (void __user *)cmd->data;
>  
>  	switch (cmd->cmd) {
> @@ -2204,6 +2236,11 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>  			break;
>  		}
>  		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			r = kvm_s390_switch_to_pv(kvm, &cmd->rc, &cmd->rrc);
> +		if (r)
> +			kvm_s390_pv_destroy_vm(kvm, &dummy, &dummy);
> +
>  		kvm_s390_vcpu_unblock_all(kvm);
>  		/* we need to block service interrupts from now on */
>  		set_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs);
> @@ -2215,7 +2252,9 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>  			break;
>  
>  		kvm_s390_vcpu_block_all(kvm);
> -		r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
> +		r = kvm_s390_switch_from_pv(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
>  		if (!r)
>  			kvm_s390_pv_dealloc_vm(kvm);
>  		kvm_s390_vcpu_unblock_all(kvm);
> @@ -4660,20 +4699,6 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
>  		return -EINVAL;
>  
>  	switch (cmd->cmd) {
> -	case KVM_PV_VCPU_CREATE: {
> -		if (kvm_s390_pv_handle_cpu(vcpu))
> -			return -EINVAL;
> -
> -		r = kvm_s390_pv_create_cpu(vcpu, &cmd->rc, &cmd->rrc);
> -		break;
> -	}
> -	case KVM_PV_VCPU_DESTROY: {
> -		if (!kvm_s390_pv_handle_cpu(vcpu))
> -			return -EINVAL;
> -
> -		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
> -		break;
> -	}
>  	case KVM_PV_VCPU_SET_IPL_PSW: {
>  		if (!kvm_s390_pv_handle_cpu(vcpu))
>  			return -EINVAL;
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-17 14:47     ` Janosch Frank
@ 2020-02-17 15:00       ` Christian Borntraeger
  2020-02-17 15:38         ` Janosch Frank
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 15:00 UTC (permalink / raw)
  To: Janosch Frank, David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik



On 17.02.20 15:47, Janosch Frank wrote:
> On 2/17/20 12:08 PM, David Hildenbrand wrote:
>>> @@ -4460,6 +4489,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>>>  
>>>  	switch (mop->op) {
>>>  	case KVM_S390_MEMOP_LOGICAL_READ:
>>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>> +			r = -EINVAL;
>>> +			break;
>>> +		}
>>
>> Could we have a possible race with disabling code, especially while
>> concurrently freeing? (sorry if I ask again, there was just a flood of
>> emails)

see my other reply. Hopefully fixed soon.[...]

>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index 207915488502..0fdee1bc3798 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -475,11 +475,15 @@ struct kvm_s390_mem_op {
>>>  	__u32 op;		/* type of operation */
>>>  	__u64 buf;		/* buffer in userspace */
>>>  	__u8 ar;		/* the access register number */
>>> -	__u8 reserved[31];	/* should be set to 0 */
>>> +	__u8 reserved21[3];	/* should be set to 0 */
>>> +	__u32 sida_offset;	/* offset into the sida */
>>> +	__u8 reserved28[24];	/* should be set to 0 */
>>>  };
>>
>> As discussed, I'd prefer an overlaying layout for the sida, as the ar
>> does not make any sense (correct me if I'm wrong :) )
> 
> That wouldn't work, because we still check mop->ar < 16 in
> kvm_s390_guest_mem_op(). Also we currently check mop contents twice
> because we overload mem_op() with the SIDA operations.
> 
> Using a separate IOCTL is cleaner...

I would rather use the current patch instead of adding a new ioctl.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH 2/2] merge vm/cpu create
  2020-02-17 15:00             ` Janosch Frank
@ 2020-02-17 15:02               ` Christian Borntraeger
  2020-02-19 11:02               ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 15:02 UTC (permalink / raw)
  To: Janosch Frank, david
  Cc: Ulrich.Weigand, cohuck, frankja, gor, imbrenda, kvm, linux-s390,
	mimu, thuth



On 17.02.20 16:00, Janosch Frank wrote:
> On 2/17/20 3:53 PM, Christian Borntraeger wrote:
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/kvm-s390.c | 55 +++++++++++++++++++++++++++++-----------
>>  1 file changed, 40 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index a095d9695f18..10b20e17a7fe 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2171,9 +2171,41 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
>>  	return r;
>>  }
>>  
>> +static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	int i, r = 0;
>> +
>> +	struct kvm_vcpu *vcpu;
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
>> +		if (r)
>> +			break;
>> +	}
>> +	return r;
>> +}
>> +
>> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	int i, r = 0;
>> +	u16 dummy;
>> +
>> +	struct kvm_vcpu *vcpu;
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
>> +		if (r)
>> +			break;
>> +	}
>> +	if (r)
>> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
>> +	return r;
>> +}
> 
> Why does that only affect the cpus?
> If we have a switch function it should do VM and VCPUs, no?

It is a helper function for the function below. 

FWIW, it also needs to take the cpu->mutex for each cpu.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-17 15:00       ` Christian Borntraeger
@ 2020-02-17 15:38         ` Janosch Frank
  2020-02-17 16:58           ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Janosch Frank @ 2020-02-17 15:38 UTC (permalink / raw)
  To: Christian Borntraeger, David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

[-- Attachment #1.1: Type: text/plain, Size: 1827 bytes --]

On 2/17/20 4:00 PM, Christian Borntraeger wrote:
> 
> 
> On 17.02.20 15:47, Janosch Frank wrote:
>> On 2/17/20 12:08 PM, David Hildenbrand wrote:
>>>> @@ -4460,6 +4489,10 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
>>>>  
>>>>  	switch (mop->op) {
>>>>  	case KVM_S390_MEMOP_LOGICAL_READ:
>>>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>>> +			r = -EINVAL;
>>>> +			break;
>>>> +		}
>>>
>>> Could we have a possible race with disabling code, especially while
>>> concurrently freeing? (sorry if I ask again, there was just a flood of
>>> emails)
> 
> see my other reply. Hopefully fixed soon.[...]
> 
>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>> index 207915488502..0fdee1bc3798 100644
>>>> --- a/include/uapi/linux/kvm.h
>>>> +++ b/include/uapi/linux/kvm.h
>>>> @@ -475,11 +475,15 @@ struct kvm_s390_mem_op {
>>>>  	__u32 op;		/* type of operation */
>>>>  	__u64 buf;		/* buffer in userspace */
>>>>  	__u8 ar;		/* the access register number */
>>>> -	__u8 reserved[31];	/* should be set to 0 */
>>>> +	__u8 reserved21[3];	/* should be set to 0 */
>>>> +	__u32 sida_offset;	/* offset into the sida */
>>>> +	__u8 reserved28[24];	/* should be set to 0 */
>>>>  };
>>>
>>> As discussed, I'd prefer an overlaying layout for the sida, as the ar
>>> does not make any sense (correct me if I'm wrong :) )
>>
>> That wouldn't work, because we still check mop->ar < 16 in
>> kvm_s390_guest_mem_op(). Also we currently check mop contents twice
>> because we overload mem_op() with the SIDA operations.
>>
>> Using a separate IOCTL is cleaner...
> 
> I would rather use the current patch instead of adding a new ioctl.
> 

Well, then I'd suggest moving the normal memop ops into an own function
and also move the ar check into there.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer
  2020-02-17 15:38         ` Janosch Frank
@ 2020-02-17 16:58           ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 16:58 UTC (permalink / raw)
  To: Janosch Frank, David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik



On 17.02.20 16:38, Janosch Frank wrote:

>> I would rather use the current patch instead of adding a new ioctl.
>>
> 
> Well, then I'd suggest moving the normal memop ops into an own function
> and also move the ar check into there.

Yes, I will refactor that and change the kvm_s390_mem_op structure.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area
  2020-02-17 14:24   ` David Hildenbrand
@ 2020-02-17 18:40     ` Christian Borntraeger
  2020-02-17 19:16       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-17 18:40 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 17.02.20 15:24, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> STHYI data has to go through the bounce buffer.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/intercept.c | 15 ++++++++++-----
>>  1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
>> index 1e231058e4b3..cfabeecbb777 100644
>> --- a/arch/s390/kvm/intercept.c
>> +++ b/arch/s390/kvm/intercept.c
>> @@ -392,7 +392,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>>  		goto out;
>>  	}
>>  
>> -	if (addr & ~PAGE_MASK)
>> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (addr & ~PAGE_MASK))
>>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>  
>>  	sctns = (void *)get_zeroed_page(GFP_KERNEL);
>> @@ -403,10 +403,15 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>>  
>>  out:
>>  	if (!cc) {
>> -		r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);
>> -		if (r) {
>> -			free_page((unsigned long)sctns);
>> -			return kvm_s390_inject_prog_cond(vcpu, r);
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> 
> I have the feeling that we might have to think about proper locking for
> kvm_s390_pv_is_protected(). We have to make sure there cannot be any
> races with user space. Smells like a new r/w lock maybe.

I think we can keep that within kvm->lock (for the global changes) and vcpu->mutex. 
All the intercepts only happen within the vcpu run ioctl and that takes the
vcpu->mutex. See my other patch (with  mutex_lock(&vcpu->mutex) and mutex_unlock(&vcpu->mutex)
around the create/destroy functions). lockdep_assert_help is happy and as long as the
pv_handle reflects the state of the control block of that CPU we are good.

I pushed out the patch/rfcs to pv_worktree on kvms390.git.




^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area
  2020-02-17 18:40     ` Christian Borntraeger
@ 2020-02-17 19:16       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 19:16 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 17.02.20 19:40, Christian Borntraeger wrote:
> 
> 
> On 17.02.20 15:24, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Janosch Frank <frankja@linux.ibm.com>
>>>
>>> STHYI data has to go through the bounce buffer.
>>>
>>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  arch/s390/kvm/intercept.c | 15 ++++++++++-----
>>>  1 file changed, 10 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
>>> index 1e231058e4b3..cfabeecbb777 100644
>>> --- a/arch/s390/kvm/intercept.c
>>> +++ b/arch/s390/kvm/intercept.c
>>> @@ -392,7 +392,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>>>  		goto out;
>>>  	}
>>>  
>>> -	if (addr & ~PAGE_MASK)
>>> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (addr & ~PAGE_MASK))
>>>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>  
>>>  	sctns = (void *)get_zeroed_page(GFP_KERNEL);
>>> @@ -403,10 +403,15 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
>>>  
>>>  out:
>>>  	if (!cc) {
>>> -		r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);
>>> -		if (r) {
>>> -			free_page((unsigned long)sctns);
>>> -			return kvm_s390_inject_prog_cond(vcpu, r);
>>> +		if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>
>> I have the feeling that we might have to think about proper locking for
>> kvm_s390_pv_is_protected(). We have to make sure there cannot be any
>> races with user space. Smells like a new r/w lock maybe.
> 
> I think we can keep that within kvm->lock (for the global changes) and vcpu->mutex. 
> All the intercepts only happen within the vcpu run ioctl and that takes the
> vcpu->mutex. See my other patch (with  mutex_lock(&vcpu->mutex) and mutex_unlock(&vcpu->mutex)
> around the create/destroy functions). lockdep_assert_help is happy and as long as the
> pv_handle reflects the state of the control block of that CPU we are good.
> 
> I pushed out the patch/rfcs to pv_worktree on kvms390.git.

Fine with me!


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH 0/2] example changes
  2020-02-17 14:53         ` [PATCH 0/2] example changes Christian Borntraeger
  2020-02-17 14:53           ` [PATCH 1/2] lock changes Christian Borntraeger
  2020-02-17 14:53           ` [PATCH 2/2] merge vm/cpu create Christian Borntraeger
@ 2020-02-17 19:18           ` David Hildenbrand
  2 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-17 19:18 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth

On 17.02.20 15:53, Christian Borntraeger wrote:
> I also need to rename the ioctls and do more testing,
> but this is probably what you want?

Yes, looks like that should be it! Will have a look at the remaining
patches tomorrow.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-17 10:56   ` David Hildenbrand
  2020-02-17 12:04     ` Christian Borntraeger
@ 2020-02-18  8:09     ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  8:09 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 17.02.20 11:56, David Hildenbrand wrote:
>> +	}
>> +	case KVM_PV_VM_SET_SEC_PARMS: {
> 
> I'd name this "KVM_PV_VM_SET_PARMS" instead.

I would stick with the SEC name as this is more similar to the UV call name.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-17 11:10     ` Christian Borntraeger
@ 2020-02-18  8:27       ` David Hildenbrand
  2020-02-18 15:46         ` Sean Christopherson
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  8:27 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, Andrew Morton
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm, Will Deacon, Sean Christopherson

On 17.02.20 12:10, Christian Borntraeger wrote:
> 
> 
> On 17.02.20 10:14, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>>
>>> With the introduction of protected KVM guests on s390 there is now a
>>> concept of inaccessible pages. These pages need to be made accessible
>>> before the host can access them.
>>>
>>> While cpu accesses will trigger a fault that can be resolved, I/O
>>> accesses will just fail.  We need to add a callback into architecture
>>> code for places that will do I/O, namely when writeback is started or
>>> when a page reference is taken.
>>>
>>> This is not only to enable paging, file backing etc, it is also
>>> necessary to protect the host against a malicious user space. For
>>> example a bad QEMU could simply start direct I/O on such protected
>>> memory.  We do not want userspace to be able to trigger I/O errors and
>>> thus we the logic is "whenever somebody accesses that page (gup) or
>>> doing I/O, make sure that this page can be accessed. When the guest
>>> tries to access that page we will wait in the page fault handler for
>>> writeback to have finished and for the page_ref to be the expected
>>> value.
>>>
>>> If wanted by others, the callbacks can be extended with error handlin
>>> and a parameter from where this is called.
>>
>> s/handlin/handling/
> 
> ack. 
>>
>> One last question from my side:
>>
>> Why is it OK to ignore errors here. IOW, why not squash "[PATCH v2
>> 39/42] example for future extension: mm:gup/writeback: add callbacks for
>> inaccessible pages: error cases" into this patch.
> 
> I can certainly squash this in. But I have never heard feedback from the 
> memory management people if they would prefer the patch as-is (no overhead
> at all for !s390) or already with the error handling. 
>>
>> I can see in patch "[PATCH v2 05/42] s390/mm: provide memory management
>> functions for protected KVM guests", that the call can fail for various
>> reasons. That puzzles me a bit - what would happen if any of that fails?
> 
> The convert _to_ secure has error handling. uv_convert_from_secure (the
> one that is used for arch_make_page_accessible can only fail if we mess up,
> e.g. an "export" on memory that we have donated to the ultravisor.

Okay, so more like a BUG_ON() in case it really fails.

> 
> 
>> Or will it actually never fail for s390x (and all that error handling in
>> arch_make_page_accessible() is essentially dead code in real life?)
> 
> So yes, if everything is setup properly this should not fail in real life
> and only we have a kernel (or firmware) bug.
> 

Then, without feedback from other possible users, this should be a void
function. So either introduce error handling or convert it to a void for
now (and add e.g., BUG_ON and a comment inside the s390x implementation).

As discussed in the other patch, I'd prefer a
CONFIG_HAVE_ARCH_MAKE_PAGE_ACCESSIBLE and handle it via kconfig, but not
sure what other people here prefer.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-14 22:26 ` [PATCH v2 24/42] KVM: s390: protvirt: STSI handling Christian Borntraeger
@ 2020-02-18  8:35   ` David Hildenbrand
  2020-02-18  8:44     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  8:35 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> Save response to sidad and disable address checking for protected
> guests.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/priv.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index ed52ffa8d5d4..b2de7dc5f58d 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>  
>  	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
>  
> -	if (operand2 & 0xfff)
> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);

Why is that needed? I'd assume the hardware handles this for us and this
case can never happen for PV? (IOW, change is not necessary)

>  
>  	switch (fc) {
> @@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>  		handle_stsi_3_2_2(vcpu, (void *) mem);
>  		break;
>  	}
> -
> -	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
> +	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> +		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
> +		       PAGE_SIZE);
> +		rc = 0;
> +	} else {
> +		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
> +	}
>  	if (rc) {
>  		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>  		goto out;
> 

I'd pull the interrupt injection into the else case, makes things clearer.

What about user_stsi? Will that still work? (I assume user space will
write to the sida and it should work just fine)

(I assume races regarding kvm_s390_pv_is_protected() will be dealt with
in your next series)

In general, looks good to me.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-14 22:26 ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
  2020-02-17 10:56   ` David Hildenbrand
@ 2020-02-18  8:39   ` " Christian Borntraeger
  2020-02-18  9:12     ` David Hildenbrand
  2020-02-18  9:56     ` David Hildenbrand
  1 sibling, 2 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  8:39 UTC (permalink / raw)
  To: borntraeger
  Cc: Ulrich.Weigand, cohuck, david, frankja, frankja, gor, imbrenda,
	kvm, linux-s390, mimu, thuth

From: Janosch Frank <frankja@linux.ibm.com>

This contains 3 main changes:
1. changes in SIE control block handling for secure guests
2. helper functions for create/destroy/unpack secure guests
3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
machines

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
2->2.1  - combine CREATE/DESTROY CPU/VM into ENABLE DISABLE
	- rework locking and check locks with lockdep
	- I still have the PV_COMMAND_CPU in here for later use in
	  the SET_IPL_PSW ioctl. If wanted I can move
	- change CAP number

 arch/s390/include/asm/kvm_host.h |  24 ++-
 arch/s390/include/asm/uv.h       |  69 ++++++++
 arch/s390/kvm/Makefile           |   2 +-
 arch/s390/kvm/kvm-s390.c         | 231 ++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         |  35 +++++
 arch/s390/kvm/pv.c               | 262 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h         |  35 +++++
 7 files changed, 654 insertions(+), 4 deletions(-)
 create mode 100644 arch/s390/kvm/pv.c

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index d058289385a5..1aa2382fe363 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -160,7 +160,13 @@ struct kvm_s390_sie_block {
 	__u8	reserved08[4];		/* 0x0008 */
 #define PROG_IN_SIE (1<<0)
 	__u32	prog0c;			/* 0x000c */
-	__u8	reserved10[16];		/* 0x0010 */
+	union {
+		__u8	reserved10[16];		/* 0x0010 */
+		struct {
+			__u64	pv_handle_cpu;
+			__u64	pv_handle_config;
+		};
+	};
 #define PROG_BLOCK_SIE	(1<<0)
 #define PROG_REQUEST	(1<<1)
 	atomic_t prog20;		/* 0x0020 */
@@ -233,7 +239,7 @@ struct kvm_s390_sie_block {
 #define ECB3_RI  0x01
 	__u8    ecb3;			/* 0x0063 */
 	__u32	scaol;			/* 0x0064 */
-	__u8	reserved68;		/* 0x0068 */
+	__u8	sdf;			/* 0x0068 */
 	__u8    epdx;			/* 0x0069 */
 	__u8    reserved6a[2];		/* 0x006a */
 	__u32	todpr;			/* 0x006c */
@@ -645,6 +651,11 @@ struct kvm_guestdbg_info_arch {
 	unsigned long last_bp;
 };
 
+struct kvm_s390_pv_vcpu {
+	u64 handle;
+	unsigned long stor_base;
+};
+
 struct kvm_vcpu_arch {
 	struct kvm_s390_sie_block *sie_block;
 	/* if vsie is active, currently executed shadow sie control block */
@@ -673,6 +684,7 @@ struct kvm_vcpu_arch {
 	__u64 cputm_start;
 	bool gs_enabled;
 	bool skey_enabled;
+	struct kvm_s390_pv_vcpu pv;
 };
 
 struct kvm_vm_stat {
@@ -843,6 +855,13 @@ struct kvm_s390_gisa_interrupt {
 	DECLARE_BITMAP(kicked_mask, KVM_MAX_VCPUS);
 };
 
+struct kvm_s390_pv {
+	u64 handle;
+	u64 guest_len;
+	unsigned long stor_base;
+	void *stor_var;
+};
+
 struct kvm_arch{
 	void *sca;
 	int use_esca;
@@ -878,6 +897,7 @@ struct kvm_arch{
 	DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
 	DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS);
 	struct kvm_s390_gisa_interrupt gisa_int;
+	struct kvm_s390_pv pv;
 };
 
 #define KVM_HVA_ERR_BAD		(-1UL)
diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
index bc452a15ac3f..839cb3a89986 100644
--- a/arch/s390/include/asm/uv.h
+++ b/arch/s390/include/asm/uv.h
@@ -23,11 +23,19 @@
 #define UVC_RC_INV_STATE	0x0003
 #define UVC_RC_INV_LEN		0x0005
 #define UVC_RC_NO_RESUME	0x0007
+#define UVC_RC_NEED_DESTROY	0x8000
 
 #define UVC_CMD_QUI			0x0001
 #define UVC_CMD_INIT_UV			0x000f
+#define UVC_CMD_CREATE_SEC_CONF		0x0100
+#define UVC_CMD_DESTROY_SEC_CONF	0x0101
+#define UVC_CMD_CREATE_SEC_CPU		0x0120
+#define UVC_CMD_DESTROY_SEC_CPU		0x0121
 #define UVC_CMD_CONV_TO_SEC_STOR	0x0200
 #define UVC_CMD_CONV_FROM_SEC_STOR	0x0201
+#define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
+#define UVC_CMD_UNPACK_IMG		0x0301
+#define UVC_CMD_VERIFY_IMG		0x0302
 #define UVC_CMD_PIN_PAGE_SHARED		0x0341
 #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
 #define UVC_CMD_SET_SHARED_ACCESS	0x1000
@@ -37,10 +45,17 @@
 enum uv_cmds_inst {
 	BIT_UVC_CMD_QUI = 0,
 	BIT_UVC_CMD_INIT_UV = 1,
+	BIT_UVC_CMD_CREATE_SEC_CONF = 2,
+	BIT_UVC_CMD_DESTROY_SEC_CONF = 3,
+	BIT_UVC_CMD_CREATE_SEC_CPU = 4,
+	BIT_UVC_CMD_DESTROY_SEC_CPU = 5,
 	BIT_UVC_CMD_CONV_TO_SEC_STOR = 6,
 	BIT_UVC_CMD_CONV_FROM_SEC_STOR = 7,
 	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
 	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
+	BIT_UVC_CMD_SET_SEC_PARMS = 11,
+	BIT_UVC_CMD_UNPACK_IMG = 13,
+	BIT_UVC_CMD_VERIFY_IMG = 14,
 	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
 	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
 };
@@ -52,6 +67,7 @@ struct uv_cb_header {
 	u16 rrc;	/* Return Reason Code */
 } __packed __aligned(8);
 
+/* Query Ultravisor Information */
 struct uv_cb_qui {
 	struct uv_cb_header header;
 	u64 reserved08;
@@ -71,6 +87,7 @@ struct uv_cb_qui {
 	u64 reserveda0;
 } __packed __aligned(8);
 
+/* Initialize Ultravisor */
 struct uv_cb_init {
 	struct uv_cb_header header;
 	u64 reserved08[2];
@@ -79,6 +96,35 @@ struct uv_cb_init {
 	u64 reserved28[4];
 } __packed __aligned(8);
 
+/* Create Guest Configuration */
+struct uv_cb_cgc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 conf_base_stor_origin;
+	u64 conf_virt_stor_origin;
+	u64 reserved30;
+	u64 guest_stor_origin;
+	u64 guest_stor_len;
+	u64 guest_sca;
+	u64 guest_asce;
+	u64 reserved58[5];
+} __packed __aligned(8);
+
+/* Create Secure CPU */
+struct uv_cb_csc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 cpu_handle;
+	u64 guest_handle;
+	u64 stor_origin;
+	u8  reserved30[6];
+	u16 num;
+	u64 state_origin;
+	u64 reserved40[4];
+} __packed __aligned(8);
+
+/* Convert to Secure */
 struct uv_cb_cts {
 	struct uv_cb_header header;
 	u64 reserved08[2];
@@ -86,12 +132,34 @@ struct uv_cb_cts {
 	u64 gaddr;
 } __packed __aligned(8);
 
+/* Convert from Secure / Pin Page Shared */
 struct uv_cb_cfs {
 	struct uv_cb_header header;
 	u64 reserved08[2];
 	u64 paddr;
 } __packed __aligned(8);
 
+/* Set Secure Config Parameter */
+struct uv_cb_ssc {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 sec_header_origin;
+	u32 sec_header_len;
+	u32 reserved2c;
+	u64 reserved30[4];
+} __packed __aligned(8);
+
+/* Unpack */
+struct uv_cb_unp {
+	struct uv_cb_header header;
+	u64 reserved08[2];
+	u64 guest_handle;
+	u64 gaddr;
+	u64 tweak[2];
+	u64 reserved38[3];
+} __packed __aligned(8);
+
 /*
  * A common UV call struct for calls that take no payload
  * Examples:
@@ -105,6 +173,7 @@ struct uv_cb_nodata {
 	u64 reserved20[4];
 } __packed __aligned(8);
 
+/* Set Shared Access */
 struct uv_cb_share {
 	struct uv_cb_header header;
 	u64 reserved08[3];
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 05ee90a5ea08..12decca22e7c 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o  $(KVM)/async_pf.o $(KVM)/irqch
 ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
 
 kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
-kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
+kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
 
 obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index cc7793525a69..1a7bb08f5c26 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -44,6 +44,7 @@
 #include <asm/cpacf.h>
 #include <asm/timex.h>
 #include <asm/ap.h>
+#include <asm/uv.h>
 #include "kvm-s390.h"
 #include "gaccess.h"
 
@@ -234,8 +235,10 @@ int kvm_arch_check_processor_compat(void)
 	return 0;
 }
 
+/* forward declarations */
 static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
 			      unsigned long end);
+static int sca_switch_to_extended(struct kvm *kvm);
 
 static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
 {
@@ -571,6 +574,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_BPB:
 		r = test_facility(82);
 		break;
+	case KVM_CAP_S390_PROTECTED:
+		r = is_prot_virt_host();
+		break;
 	default:
 		r = 0;
 	}
@@ -2165,6 +2171,152 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
 	return r;
 }
 
+static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int i, r = 0;
+
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		mutex_lock(&vcpu->mutex);
+		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
+		mutex_unlock(&vcpu->mutex);
+		if (r)
+			break;
+	}
+	return r;
+}
+
+static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int i, r = 0;
+	u16 dummy;
+
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		mutex_lock(&vcpu->mutex);
+		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
+		mutex_unlock(&vcpu->mutex);
+		if (r)
+			break;
+	}
+	if (r)
+		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
+	return r;
+}
+
+static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
+{
+	int r = 0;
+	u16 dummy;
+	void __user *argp = (void __user *)cmd->data;
+
+	switch (cmd->cmd) {
+	case KVM_PV_ENABLE: {
+		r = -EINVAL;
+		if (kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = kvm_s390_pv_alloc_vm(kvm);
+		if (r)
+			break;
+
+		kvm_s390_vcpu_block_all(kvm);
+		/* FMT 4 SIE needs esca */
+		r = sca_switch_to_extended(kvm);
+		if (r) {
+			kvm_s390_pv_dealloc_vm(kvm);
+			kvm_s390_vcpu_unblock_all(kvm);
+			mutex_unlock(&kvm->lock);
+			break;
+		}
+		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			r = kvm_s390_switch_to_pv(kvm, &cmd->rc, &cmd->rrc);
+		if (r)
+			kvm_s390_pv_destroy_vm(kvm, &dummy, &dummy);
+
+		kvm_s390_vcpu_unblock_all(kvm);
+		break;
+	}
+	case KVM_PV_DISABLE: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		kvm_s390_vcpu_block_all(kvm);
+		r = kvm_s390_switch_from_pv(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
+		if (!r)
+			kvm_s390_pv_dealloc_vm(kvm);
+		kvm_s390_vcpu_unblock_all(kvm);
+		break;
+	}
+	case KVM_PV_VM_SET_SEC_PARMS: {
+		struct kvm_s390_pv_sec_parm parms = {};
+		void *hdr;
+
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&parms, argp, sizeof(parms)))
+			break;
+
+		/* Currently restricted to 8KB */
+		r = -EINVAL;
+		if (parms.length > PAGE_SIZE * 2)
+			break;
+
+		r = -ENOMEM;
+		hdr = vmalloc(parms.length);
+		if (!hdr)
+			break;
+
+		r = -EFAULT;
+		if (!copy_from_user(hdr, (void __user *)parms.origin,
+				    parms.length))
+			r = kvm_s390_pv_set_sec_parms(kvm, hdr, parms.length,
+						      &cmd->rc, &cmd->rrc);
+
+		vfree(hdr);
+		break;
+	}
+	case KVM_PV_VM_UNPACK: {
+		struct kvm_s390_pv_unp unp = {};
+
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&unp, argp, sizeof(unp)))
+			break;
+
+		r = kvm_s390_pv_unpack(kvm, unp.addr, unp.size, unp.tweak,
+				       &cmd->rc, &cmd->rrc);
+		break;
+	}
+	case KVM_PV_VM_VERIFY: {
+		r = -EINVAL;
+		if (!kvm_s390_pv_is_protected(kvm))
+			break;
+
+		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+				  UVC_CMD_VERIFY_IMG, &cmd->rc, &cmd->rrc);
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT VERIFY: rc %x rrc %x", cmd->rc,
+			     cmd->rrc);
+		break;
+	}
+	default:
+		return -ENOTTY;
+	}
+	return r;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -2262,6 +2414,27 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		mutex_unlock(&kvm->slots_lock);
 		break;
 	}
+	case KVM_S390_PV_COMMAND: {
+		struct kvm_pv_cmd args;
+
+		r = 0;
+		if (!is_prot_virt_host()) {
+			r = -EINVAL;
+			break;
+		}
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		mutex_lock(&kvm->lock);
+		r = kvm_s390_handle_pv(kvm, &args);
+		mutex_unlock(&kvm->lock);
+		if (copy_to_user(argp, &args, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
@@ -2525,6 +2698,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
+	u16 rc, rrc;
+
 	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
 	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
 	kvm_s390_clear_local_irqs(vcpu);
@@ -2537,6 +2712,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 	if (vcpu->kvm->arch.use_cmma)
 		kvm_s390_vcpu_unsetup_cmma(vcpu);
+	if (kvm_s390_pv_handle_cpu(vcpu))
+		kvm_s390_pv_destroy_cpu(vcpu, &rc, &rrc);
 	free_page((unsigned long)(vcpu->arch.sie_block));
 }
 
@@ -2558,10 +2735,16 @@ static void kvm_free_vcpus(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	u16 rc, rrc;
 	kvm_free_vcpus(kvm);
 	sca_dispose(kvm);
-	debug_unregister(kvm->arch.dbf);
 	kvm_s390_gisa_destroy(kvm);
+	/* do not use the lock checking variant at tear-down */
+	if (kvm_s390_pv_handle(kvm)) {
+		kvm_s390_pv_destroy_vm(kvm, &rc, &rrc);
+		kvm_s390_pv_dealloc_vm(kvm);
+	}
+	debug_unregister(kvm->arch.dbf);
 	free_page((unsigned long)kvm->arch.sie_page2);
 	if (!kvm_is_ucontrol(kvm))
 		gmap_remove(kvm->arch.gmap);
@@ -2657,6 +2840,9 @@ static int sca_switch_to_extended(struct kvm *kvm)
 	unsigned int vcpu_idx;
 	u32 scaol, scaoh;
 
+	if (kvm->arch.use_esca)
+		return 0;
+
 	new_sca = alloc_pages_exact(sizeof(*new_sca), GFP_KERNEL|__GFP_ZERO);
 	if (!new_sca)
 		return -ENOMEM;
@@ -2908,6 +3094,7 @@ static void kvm_s390_vcpu_setup_model(struct kvm_vcpu *vcpu)
 static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	int rc = 0;
+	u16 uvrc, uvrrc;
 
 	atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH |
 						    CPUSTAT_SM |
@@ -2975,6 +3162,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 
 	kvm_s390_vcpu_crypto_setup(vcpu);
 
+	mutex_lock(&vcpu->kvm->lock);
+	if (kvm_s390_pv_is_protected(vcpu->kvm))
+		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
+	mutex_unlock(&vcpu->kvm->lock);
+
 	return rc;
 }
 
@@ -4352,6 +4544,24 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
 	return -ENOIOCTLCMD;
 }
 
+static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
+				   struct kvm_pv_cmd *cmd)
+{
+	int r = 0;
+
+	if (!kvm_s390_pv_cpu_is_protected(vcpu))
+		return -EINVAL;
+
+	if (cmd->flags)
+		return -EINVAL;
+
+	switch (cmd->cmd) {
+	default:
+		r = -ENOTTY;
+	}
+	return r;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -4493,6 +4703,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 					   irq_state.len);
 		break;
 	}
+	case KVM_S390_PV_COMMAND_VCPU: {
+		struct kvm_pv_cmd args;
+
+		r = 0;
+		if (!is_prot_virt_host()) {
+			r = -EINVAL;
+			break;
+		}
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		r = kvm_s390_handle_pv_vcpu(vcpu, &args);
+		if (copy_to_user(argp, &args, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 83dabb18e4d9..b3cc0b49cb42 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -15,6 +15,7 @@
 #include <linux/hrtimer.h>
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
+#include <linux/lockdep.h>
 #include <asm/facility.h>
 #include <asm/processor.h>
 #include <asm/sclp.h>
@@ -207,6 +208,40 @@ static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm)
 	return kvm->arch.user_cpu_state_ctrl != 0;
 }
 
+/* implemented in pv.c */
+void kvm_s390_pv_dealloc_vm(struct kvm *kvm);
+int kvm_s390_pv_alloc_vm(struct kvm *kvm);
+int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
+int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
+int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
+int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
+int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
+			      u16 *rrc);
+int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
+		       unsigned long tweak, u16 *rc, u16 *rrc);
+
+static inline u64 kvm_s390_pv_handle(struct kvm *kvm)
+{
+	return kvm->arch.pv.handle;
+}
+
+static inline u64 kvm_s390_pv_handle_cpu(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.pv.handle;
+}
+
+static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
+{
+	lockdep_assert_held(&kvm->lock);
+	return !!kvm_s390_pv_handle(kvm);
+}
+
+static inline bool kvm_s390_pv_cpu_is_protected(struct kvm_vcpu *vcpu)
+{
+	lockdep_assert_held(&vcpu->mutex);
+	return !!kvm_s390_pv_handle_cpu(vcpu);
+}
+
 /* implemented in interrupt.c */
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu);
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
new file mode 100644
index 000000000000..bf00cde1ead8
--- /dev/null
+++ b/arch/s390/kvm/pv.c
@@ -0,0 +1,262 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hosting Secure Execution virtual machines
+ *
+ * Copyright IBM Corp. 2019
+ *    Author(s): Janosch Frank <frankja@linux.ibm.com>
+ */
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/pagemap.h>
+#include <linux/sched/signal.h>
+#include <asm/pgalloc.h>
+#include <asm/gmap.h>
+#include <asm/uv.h>
+#include <asm/gmap.h>
+#include <asm/mman.h>
+#include "kvm-s390.h"
+
+void kvm_s390_pv_dealloc_vm(struct kvm *kvm)
+{
+	vfree(kvm->arch.pv.stor_var);
+	free_pages(kvm->arch.pv.stor_base,
+		   get_order(uv_info.guest_base_stor_len));
+	memset(&kvm->arch.pv, 0, sizeof(kvm->arch.pv));
+}
+
+int kvm_s390_pv_alloc_vm(struct kvm *kvm)
+{
+	unsigned long base = uv_info.guest_base_stor_len;
+	unsigned long virt = uv_info.guest_virt_var_stor_len;
+	unsigned long npages = 0, vlen = 0;
+	struct kvm_memory_slot *memslot;
+
+	kvm->arch.pv.stor_var = NULL;
+	kvm->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, get_order(base));
+	if (!kvm->arch.pv.stor_base)
+		return -ENOMEM;
+
+	/*
+	 * Calculate current guest storage for allocation of the
+	 * variable storage, which is based on the length in MB.
+	 *
+	 * Slots are sorted by GFN
+	 */
+	mutex_lock(&kvm->slots_lock);
+	memslot = kvm_memslots(kvm)->memslots;
+	npages = memslot->base_gfn + memslot->npages;
+	mutex_unlock(&kvm->slots_lock);
+
+	kvm->arch.pv.guest_len = npages * PAGE_SIZE;
+
+	/* Allocate variable storage */
+	vlen = ALIGN(virt * ((npages * PAGE_SIZE) / HPAGE_SIZE), PAGE_SIZE);
+	vlen += uv_info.guest_virt_base_stor_len;
+	kvm->arch.pv.stor_var = vzalloc(vlen);
+	if (!kvm->arch.pv.stor_var)
+		goto out_err;
+	return 0;
+
+out_err:
+	kvm_s390_pv_dealloc_vm(kvm);
+	return -ENOMEM;
+}
+
+int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	int cc;
+
+	cc = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
+			   UVC_CMD_DESTROY_SEC_CONF, rc, rrc);
+	WRITE_ONCE(kvm->arch.gmap->guest_handle, 0);
+	atomic_set(&kvm->mm->context.is_protected, 0);
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
+	return cc;
+}
+
+int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
+{
+	int cc = 0;
+
+	if (kvm_s390_pv_handle_cpu(vcpu)) {
+		cc = uv_cmd_nodata(kvm_s390_pv_handle_cpu(vcpu),
+				   UVC_CMD_DESTROY_SEC_CPU, rc, rrc);
+
+		VCPU_EVENT(vcpu, 3, "PROTVIRT DESTROY VCPU: rc %x rrc %x",
+			   *rc, *rrc);
+		KVM_UV_EVENT(vcpu->kvm, 3, "PROTVIRT DESTROY VCPU: rc %x rrc %x",
+			     *rc, *rrc);
+	}
+
+	free_pages(vcpu->arch.pv.stor_base,
+		   get_order(uv_info.guest_cpu_stor_len));
+	vcpu->arch.sie_block->pv_handle_cpu = 0;
+	vcpu->arch.sie_block->pv_handle_config = 0;
+	memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv));
+	vcpu->arch.sie_block->sdf = 0;
+	return cc;
+}
+
+int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc)
+{
+	struct uv_cb_csc uvcb = {
+		.header.cmd = UVC_CMD_CREATE_SEC_CPU,
+		.header.len = sizeof(uvcb),
+	};
+	int cc;
+
+	if (kvm_s390_pv_handle_cpu(vcpu))
+		return -EINVAL;
+
+	vcpu->arch.pv.stor_base = __get_free_pages(GFP_KERNEL,
+						   get_order(uv_info.guest_cpu_stor_len));
+	if (!vcpu->arch.pv.stor_base)
+		return -ENOMEM;
+
+	/* Input */
+	uvcb.guest_handle = kvm_s390_pv_handle(vcpu->kvm);
+	uvcb.num = vcpu->arch.sie_block->icpua;
+	uvcb.state_origin = (u64)vcpu->arch.sie_block;
+	uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	VCPU_EVENT(vcpu, 3, "PROTVIRT CREATE VCPU: handle %llx rc %x rrc %x",
+		   uvcb.cpu_handle, uvcb.header.rc, uvcb.header.rrc);
+	KVM_UV_EVENT(vcpu->kvm, 3,
+		     "PROTVIRT CREATE VCPU: cpu %d handle %llx rc %x rrc %x",
+		     vcpu->vcpu_id, uvcb.cpu_handle, uvcb.header.rc,
+		     uvcb.header.rrc);
+
+	if (cc) {
+		u16 dummy;
+
+		kvm_s390_pv_destroy_cpu(vcpu, &dummy, &dummy);
+		return -EINVAL;
+	}
+
+	/* Output */
+	vcpu->arch.pv.handle = uvcb.cpu_handle;
+	vcpu->arch.sie_block->pv_handle_cpu = uvcb.cpu_handle;
+	vcpu->arch.sie_block->pv_handle_config = kvm_s390_pv_handle(vcpu->kvm);
+	vcpu->arch.sie_block->sdf = 2;
+	return 0;
+}
+
+int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+{
+	u16 drc, drrc;
+	int cc;
+
+	struct uv_cb_cgc uvcb = {
+		.header.cmd = UVC_CMD_CREATE_SEC_CONF,
+		.header.len = sizeof(uvcb)
+	};
+
+	if (kvm_s390_pv_handle(kvm))
+		return -EINVAL;
+
+	/* Inputs */
+	uvcb.guest_stor_origin = 0; /* MSO is 0 for KVM */
+	uvcb.guest_stor_len = kvm->arch.pv.guest_len;
+	uvcb.guest_asce = kvm->arch.gmap->asce;
+	uvcb.guest_sca = (unsigned long)kvm->arch.sca;
+	uvcb.conf_base_stor_origin = (u64)kvm->arch.pv.stor_base;
+	uvcb.conf_virt_stor_origin = (u64)kvm->arch.pv.stor_var;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT CREATE VM: handle %llx len %llx rc %x rrc %x",
+		     uvcb.guest_handle, uvcb.guest_stor_len, *rc, *rrc);
+
+	/* Outputs */
+	kvm->arch.pv.handle = uvcb.guest_handle;
+
+	if (cc && (uvcb.header.rc & UVC_RC_NEED_DESTROY)) {
+		kvm_s390_pv_destroy_vm(kvm, &drc, &drrc);
+		return -EINVAL;
+	}
+	kvm->arch.gmap->guest_handle = uvcb.guest_handle;
+	atomic_set(&kvm->mm->context.is_protected, 1);
+	return cc;
+}
+
+int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
+			      u16 *rrc)
+{
+	struct uv_cb_ssc uvcb = {
+		.header.cmd = UVC_CMD_SET_SEC_CONF_PARAMS,
+		.header.len = sizeof(uvcb),
+		.sec_header_origin = (u64)hdr,
+		.sec_header_len = length,
+		.guest_handle = kvm_s390_pv_handle(kvm),
+	};
+	int cc;
+
+	if (!kvm_s390_pv_handle(kvm))
+		return -EINVAL;
+
+	cc = uv_call(0, (u64)&uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT VM SET PARMS: rc %x rrc %x",
+		     uvcb.header.rc, uvcb.header.rrc);
+	if (cc)
+		return -EINVAL;
+	return 0;
+}
+
+static int unpack_one(struct kvm *kvm, unsigned long addr, u64 tweak[2],
+		      u16 *rc, u16 *rrc)
+{
+	struct uv_cb_unp uvcb = {
+		.header.cmd = UVC_CMD_UNPACK_IMG,
+		.header.len = sizeof(uvcb),
+		.guest_handle = kvm_s390_pv_handle(kvm),
+		.gaddr = addr,
+		.tweak[0] = tweak[0],
+		.tweak[1] = tweak[1],
+	};
+	int ret;
+
+	ret = gmap_make_secure(kvm->arch.gmap, addr, &uvcb);
+	*rc = uvcb.header.rc;
+	*rrc = uvcb.header.rrc;
+
+	if (ret && ret != -EAGAIN)
+		KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: failed addr %llx with rc %x rrc %x",
+			     uvcb.gaddr, *rc, *rrc);
+	return ret;
+}
+
+int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
+		       unsigned long tweak, u16 *rc, u16 *rrc)
+{
+	u64 tw[2] = {tweak, 0};
+	int ret = 0;
+
+	if (addr & ~PAGE_MASK || !size || size & ~PAGE_MASK)
+		return -EINVAL;
+
+	KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: start addr %lx size %lx",
+		     addr, size);
+
+	while (tw[1] < size) {
+		ret = unpack_one(kvm, addr, tw, rc, rrc);
+		if (ret == -EAGAIN) {
+			cond_resched();
+			if (fatal_signal_pending(current))
+				break;
+			continue;
+		}
+		if (ret)
+			break;
+		addr += PAGE_SIZE;
+		tw[1] += PAGE_SIZE;
+	}
+	if (!ret)
+		KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful");
+	return ret;
+}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4b95f9a31a2f..50d393a618a4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1010,6 +1010,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_NISV_TO_USER 177
 #define KVM_CAP_ARM_INJECT_EXT_DABT 178
 #define KVM_CAP_S390_VCPU_RESETS 179
+#define KVM_CAP_S390_PROTECTED 180
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1478,6 +1479,40 @@ struct kvm_enc_region {
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
+struct kvm_s390_pv_sec_parm {
+	__u64	origin;
+	__u64	length;
+};
+
+struct kvm_s390_pv_unp {
+	__u64 addr;
+	__u64 size;
+	__u64 tweak;
+};
+
+enum pv_cmd_id {
+	KVM_PV_ENABLE,
+	KVM_PV_DISABLE,
+	KVM_PV_VM_SET_SEC_PARMS,
+	KVM_PV_VM_UNPACK,
+	KVM_PV_VM_VERIFY,
+	KVM_PV_VCPU_CREATE,
+	KVM_PV_VCPU_DESTROY,
+};
+
+struct kvm_pv_cmd {
+	__u32 cmd;	/* Command to be executed */
+	__u16 rc;	/* Ultravisor return code */
+	__u16 rrc;	/* Ultravisor return reason code */
+	__u64 data;	/* Data or address */
+	__u32 flags;    /* flags for future extensions. Must be 0 for now */
+	__u32 reserved[3];
+};
+
+/* Available with KVM_CAP_S390_PROTECTED */
+#define KVM_S390_PV_COMMAND		_IOWR(KVMIO, 0xc5, struct kvm_pv_cmd)
+#define KVM_S390_PV_COMMAND_VCPU	_IOWR(KVMIO, 0xc6, struct kvm_pv_cmd)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
2.25.0


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg
  2020-02-14 22:26 ` [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg Christian Borntraeger
@ 2020-02-18  8:40   ` David Hildenbrand
  2020-02-18  8:57     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  8:40 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>

"KVM: s390: protvirt: disallow KVM_GET_ONE_REG/KVM_SET_ONE_REG"

> 
> A lot of the registers are controlled by the Ultravisor and never
> visible to KVM. Some fields in the sie control block are overlayed, like
> gbea. As no known userspace uses the ONE_REG interface on s390 if sync
> regs are available, no functionality is lost if it is disabled for
> protected guests.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  Documentation/virt/kvm/api.rst | 6 ++++--
>  arch/s390/kvm/kvm-s390.c       | 3 +++
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index cb58714fe60d..a82166e5f7d9 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -2117,7 +2117,8 @@ Errors:
>  
>    ======   ============================================================
>    ENOENT   no such register
> -  EINVAL   invalid register ID, or no such register
> +  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
> +           for protected guests (s390)

"invalid register ID, no such register, or used with VMs in protected
virtualization mode on s390" ?

>    EPERM    (arm64) register access not allowed before vcpu finalization
>    ======   ============================================================
>  
> @@ -2552,7 +2553,8 @@ Errors include:
>  
>    ======== ============================================================
>    ENOENT   no such register
> -  EINVAL   invalid register ID, or no such register
> +  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
> +           for protected guests (s390)

dito

>    EPERM    (arm64) register access not allowed before vcpu finalization
>    ======== ============================================================
>  
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 8db82aaf1275..d20a7fa9d480 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4638,6 +4638,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_SET_ONE_REG:
>  	case KVM_GET_ONE_REG: {
>  		struct kvm_one_reg reg;
> +		r = -EINVAL;
> +		if (kvm_s390_pv_is_protected(vcpu->kvm))
> +			break;

I assume races will be dealt with in your next series.

>  		r = -EFAULT;
>  		if (copy_from_user(&reg, argp, sizeof(reg)))
>  			break;
> 

With the two nits fixed

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible
  2020-02-14 22:26 ` [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible Christian Borntraeger
@ 2020-02-18  8:42   ` David Hildenbrand
  2020-02-18  9:20     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  8:42 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> For protected VMs the hypervisor can not access guest breaking event
> address, program parameter, bpbc and todpr. Do not reset those fields
> as the control block does not provide access to these fields.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/kvm-s390.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index d20a7fa9d480..5b551cc73540 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3442,14 +3442,16 @@ static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
>  	kvm_s390_set_prefix(vcpu, 0);
>  	kvm_s390_set_cpu_timer(vcpu, 0);
>  	vcpu->arch.sie_block->ckc = 0;
> -	vcpu->arch.sie_block->todpr = 0;
>  	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
>  	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
>  	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
>  	vcpu->run->s.regs.fpc = 0;
> -	vcpu->arch.sie_block->gbea = 1;
> -	vcpu->arch.sie_block->pp = 0;
> -	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
> +	if (!kvm_s390_pv_handle_cpu(vcpu)) {

Shouldn't we instead check if the VM is in PV mode? (with changed
lifecycle handling). Easier to understand.

(side not: the name kvm_s390_pv_handle_cpu() is very confusing. I'd
suggest kvm_s390_pv_cpu_get_handle(). Will reply to the other patch)

> +		vcpu->arch.sie_block->gbea = 1;
> +		vcpu->arch.sie_block->pp = 0;
> +		vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
> +		vcpu->arch.sie_block->todpr = 0;
> +	}
>  }
>  
>  static void kvm_arch_vcpu_ioctl_clear_reset(struct kvm_vcpu *vcpu)
> 


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-18  8:35   ` David Hildenbrand
@ 2020-02-18  8:44     ` Christian Borntraeger
  2020-02-18  9:08       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  8:44 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 09:35, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> Save response to sidad and disable address checking for protected
>> guests.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/priv.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index ed52ffa8d5d4..b2de7dc5f58d 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>  
>>  	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
>>  
>> -	if (operand2 & 0xfff)
>> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
>>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> 
> Why is that needed? I'd assume the hardware handles this for us and this
> case can never happen for PV? (IOW, change is not necessary)

Hardware is handling this for us AND we are not allowed to inject a specification
exception. The ultravisor guards the program checks that we are allowed to inject.

> 
>>  
>>  	switch (fc) {
>> @@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>  		handle_stsi_3_2_2(vcpu, (void *) mem);
>>  		break;
>>  	}
>> -
>> -	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>> +	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> +		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>> +		       PAGE_SIZE);
>> +		rc = 0;
>> +	} else {
>> +		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>> +	}
>>  	if (rc) {
>>  		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>>  		goto out;
>>
> 
> I'd pull the interrupt injection into the else case, makes things clearer.

Well, no. Thhe else case could set rc to 0.

> 
> What about user_stsi? Will that still work? (I assume user space will
> write to the sida and it should work just fine)

will still work. 
> 
> (I assume races regarding kvm_s390_pv_is_protected() will be dealt with
> in your next series)
> 
> In general, looks good to me.
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg
  2020-02-18  8:40   ` David Hildenbrand
@ 2020-02-18  8:57     ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  8:57 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 09:40, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
> 
> "KVM: s390: protvirt: disallow KVM_GET_ONE_REG/KVM_SET_ONE_REG"
> 
>>
>> A lot of the registers are controlled by the Ultravisor and never
>> visible to KVM. Some fields in the sie control block are overlayed, like
>> gbea. As no known userspace uses the ONE_REG interface on s390 if sync
>> regs are available, no functionality is lost if it is disabled for
>> protected guests.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  Documentation/virt/kvm/api.rst | 6 ++++--
>>  arch/s390/kvm/kvm-s390.c       | 3 +++
>>  2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index cb58714fe60d..a82166e5f7d9 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -2117,7 +2117,8 @@ Errors:
>>  
>>    ======   ============================================================
>>    ENOENT   no such register
>> -  EINVAL   invalid register ID, or no such register
>> +  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
>> +           for protected guests (s390)
> 
> "invalid register ID, no such register, or used with VMs in protected
> virtualization mode on s390" ?

ack.
> 
>>    EPERM    (arm64) register access not allowed before vcpu finalization
>>    ======   ============================================================
>>  
>> @@ -2552,7 +2553,8 @@ Errors include:
>>  
>>    ======== ============================================================
>>    ENOENT   no such register
>> -  EINVAL   invalid register ID, or no such register
>> +  EINVAL   invalid register ID, or no such register, ONE_REG forbidden
>> +           for protected guests (s390)
> 
> dito

ack
> 
>>    EPERM    (arm64) register access not allowed before vcpu finalization
>>    ======== ============================================================
>>  
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 8db82aaf1275..d20a7fa9d480 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4638,6 +4638,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>  	case KVM_SET_ONE_REG:
>>  	case KVM_GET_ONE_REG: {
>>  		struct kvm_one_reg reg;
>> +		r = -EINVAL;
>> +		if (kvm_s390_pv_is_protected(vcpu->kvm))
>> +			break;
> 
> I assume races will be dealt with in your next series.

yes. This is running  under vcpu_mutex and we will hold that lock when doing 
the gear shift.

> 
>>  		r = -EFAULT;
>>  		if (copy_from_user(&reg, argp, sizeof(reg)))
>>  			break;
>>
> 
> With the two nits fixed
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-18  8:44     ` Christian Borntraeger
@ 2020-02-18  9:08       ` David Hildenbrand
  2020-02-18  9:11         ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:08 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 18.02.20 09:44, Christian Borntraeger wrote:
> 
> 
> On 18.02.20 09:35, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Janosch Frank <frankja@linux.ibm.com>
>>>
>>> Save response to sidad and disable address checking for protected
>>> guests.
>>>
>>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  arch/s390/kvm/priv.c | 11 ++++++++---
>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>> index ed52ffa8d5d4..b2de7dc5f58d 100644
>>> --- a/arch/s390/kvm/priv.c
>>> +++ b/arch/s390/kvm/priv.c
>>> @@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>  
>>>  	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
>>>  
>>> -	if (operand2 & 0xfff)
>>> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
>>>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>
>> Why is that needed? I'd assume the hardware handles this for us and this
>> case can never happen for PV? (IOW, change is not necessary)
> 
> Hardware is handling this for us AND we are not allowed to inject a specification
> exception. The ultravisor guards the program checks that we are allowed to inject.
> 

Yeah, but can this ever trigger without the check? AFAIKs, no. So why
add it?

(rather add a BUG_ON in kvm_s390_inject_program_int() in case we are in
PV mode)

>>
>>>  
>>>  	switch (fc) {
>>> @@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>  		handle_stsi_3_2_2(vcpu, (void *) mem);
>>>  		break;
>>>  	}
>>> -
>>> -	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>> +	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>> +		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>> +		       PAGE_SIZE);
>>> +		rc = 0;
>>> +	} else {
>>> +		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>> +	}
>>>  	if (rc) {
>>>  		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>>>  		goto out;
>>>
>>
>> I'd pull the interrupt injection into the else case, makes things clearer.
> 
> Well, no. Thhe else case could set rc to 0.

Huh?!

if (kvm_s390_pv_is_protected(vcpu->kvm)) {
	memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
	rc = 0;
} else {
	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
	if (rc) {
		rc = kvm_s390_inject_prog_cond(vcpu, rc);
		goto out;
	}
}

What am I missing?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-18  9:08       ` David Hildenbrand
@ 2020-02-18  9:11         ` Christian Borntraeger
  2020-02-18  9:13           ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  9:11 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:08, David Hildenbrand wrote:
> On 18.02.20 09:44, Christian Borntraeger wrote:
>>
>>
>> On 18.02.20 09:35, David Hildenbrand wrote:
>>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>>> From: Janosch Frank <frankja@linux.ibm.com>
>>>>
>>>> Save response to sidad and disable address checking for protected
>>>> guests.
>>>>
>>>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>>>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>>> ---
>>>>  arch/s390/kvm/priv.c | 11 ++++++++---
>>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>>> index ed52ffa8d5d4..b2de7dc5f58d 100644
>>>> --- a/arch/s390/kvm/priv.c
>>>> +++ b/arch/s390/kvm/priv.c
>>>> @@ -872,7 +872,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>>  
>>>>  	operand2 = kvm_s390_get_base_disp_s(vcpu, &ar);
>>>>  
>>>> -	if (operand2 & 0xfff)
>>>> +	if (!kvm_s390_pv_is_protected(vcpu->kvm) && (operand2 & 0xfff))
>>>>  		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>
>>> Why is that needed? I'd assume the hardware handles this for us and this
>>> case can never happen for PV? (IOW, change is not necessary)
>>
>> Hardware is handling this for us AND we are not allowed to inject a specification
>> exception. The ultravisor guards the program checks that we are allowed to inject.
>>
> 
> Yeah, but can this ever trigger without the check? AFAIKs, no. So why
> add it?

It can. the GPRS can contain stale data and so can operand2.

> 
> (rather add a BUG_ON in kvm_s390_inject_program_int() in case we are in
> PV mode)
> 
>>>
>>>>  
>>>>  	switch (fc) {
>>>> @@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>>  		handle_stsi_3_2_2(vcpu, (void *) mem);
>>>>  		break;
>>>>  	}
>>>> -
>>>> -	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>>> +	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>>> +		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>>> +		       PAGE_SIZE);
>>>> +		rc = 0;
>>>> +	} else {
>>>> +		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>>> +	}
>>>>  	if (rc) {
>>>>  		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>>>>  		goto out;
>>>>
>>>
>>> I'd pull the interrupt injection into the else case, makes things clearer.
>>
>> Well, no. Thhe else case could set rc to 0.
> 
> Huh?!
> 
> if (kvm_s390_pv_is_protected(vcpu->kvm)) {
> 	memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
> 	rc = 0;
> } else {
> 	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
> 	if (rc) {
> 		rc = kvm_s390_inject_prog_cond(vcpu, rc);
> 		goto out;
> 	}
> }
> 

Hmm, I find that one harder to read.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18  8:39   ` [PATCH v2.1] " Christian Borntraeger
@ 2020-02-18  9:12     ` David Hildenbrand
  2020-02-18 21:18       ` Christian Borntraeger
  2020-02-19 11:01       ` Christian Borntraeger
  2020-02-18  9:56     ` David Hildenbrand
  1 sibling, 2 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:12 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth

On 18.02.20 09:39, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> This contains 3 main changes:
> 1. changes in SIE control block handling for secure guests
> 2. helper functions for create/destroy/unpack secure guests
> 3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
> machines
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> 2->2.1  - combine CREATE/DESTROY CPU/VM into ENABLE DISABLE
> 	- rework locking and check locks with lockdep
> 	- I still have the PV_COMMAND_CPU in here for later use in
> 	  the SET_IPL_PSW ioctl. If wanted I can move

I'd prefer to move, and eventually just turn this into a clean, separate
ioctl without subcommands (e.g., if we'll only need a single subcommand
in the near future). And it makes this patch a alittle easier to review
... :)

[...]

>  obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index cc7793525a69..1a7bb08f5c26 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -44,6 +44,7 @@
>  #include <asm/cpacf.h>
>  #include <asm/timex.h>
>  #include <asm/ap.h>
> +#include <asm/uv.h>
>  #include "kvm-s390.h"
>  #include "gaccess.h"
>  
> @@ -234,8 +235,10 @@ int kvm_arch_check_processor_compat(void)
>  	return 0;
>  }
>  
> +/* forward declarations */
>  static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
>  			      unsigned long end);
> +static int sca_switch_to_extended(struct kvm *kvm);
>  
>  static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
>  {
> @@ -571,6 +574,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_S390_BPB:
>  		r = test_facility(82);
>  		break;
> +	case KVM_CAP_S390_PROTECTED:
> +		r = is_prot_virt_host();
> +		break;
>  	default:
>  		r = 0;
>  	}
> @@ -2165,6 +2171,152 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
>  	return r;
>  }
>  
> +static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
> +{
> +	int i, r = 0;
> +
> +	struct kvm_vcpu *vcpu;
> +

Once we lock the VCPU, it cannot be running, right?

> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		mutex_lock(&vcpu->mutex);
> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
> +		mutex_unlock(&vcpu->mutex);
> +		if (r)
> +			break;
> +	}

Can this actually ever fail? If so, you would leave half-initialized
state around. Warn and continue?

Especially, kvm_arch_vcpu_destroy() ignores any error from
kvm_s390_pv_destroy_cpu() as well ...

IMHO, we should make kvm_s390_switch_from_pv() and
kvm_s390_pv_destroy_cpu() never fail.

> +	return r;
> +}
> +
> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
> +{
> +	int i, r = 0;
> +	u16 dummy;
> +
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		mutex_lock(&vcpu->mutex);
> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
> +		mutex_unlock(&vcpu->mutex);
> +		if (r)
> +			break;
> +	}
> +	if (r)
> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
> +	return r;
> +}
> +
> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
> +{
> +	int r = 0;
> +	u16 dummy;
> +	void __user *argp = (void __user *)cmd->data;
> +
> +	switch (cmd->cmd) {
> +	case KVM_PV_ENABLE: {
> +		r = -EINVAL;
> +		if (kvm_s390_pv_is_protected(kvm))
> +			break;

Why not factor out this check, it's common for all sucommands.

> +
> +		r = kvm_s390_pv_alloc_vm(kvm);
> +		if (r)
> +			break;
> +
> +		kvm_s390_vcpu_block_all(kvm);

As kvm_s390_vcpu_block_all() does not support nesting, this will not
work as expected - sca_switch_to_extended() already blocks. Are the
vcpu->locks not enough?

> +		/* FMT 4 SIE needs esca */
> +		r = sca_switch_to_extended(kvm);
> +		if (r) {
> +			kvm_s390_pv_dealloc_vm(kvm);
> +			kvm_s390_vcpu_unblock_all(kvm);
> +			mutex_unlock(&kvm->lock);
> +			break;
> +		}
> +		r = kvm_s390_pv_create_vm(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			r = kvm_s390_switch_to_pv(kvm, &cmd->rc, &cmd->rrc);
> +		if (r)
> +			kvm_s390_pv_destroy_vm(kvm, &dummy, &dummy);
> +
> +		kvm_s390_vcpu_unblock_all(kvm);
> +		break;
> +	}
> +	case KVM_PV_DISABLE: {
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +
> +		kvm_s390_vcpu_block_all(kvm);

Won't taking the vcpu lock achieve a similar goal (VCPU can't be running).

> +		r = kvm_s390_switch_from_pv(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			r = kvm_s390_pv_destroy_vm(kvm, &cmd->rc, &cmd->rrc);
> +		if (!r)
> +			kvm_s390_pv_dealloc_vm(kvm);
> +		kvm_s390_vcpu_unblock_all(kvm);
> +		break;
> +	}

[...]

> @@ -2558,10 +2735,16 @@ static void kvm_free_vcpus(struct kvm *kvm)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> +	u16 rc, rrc;
>  	kvm_free_vcpus(kvm);
>  	sca_dispose(kvm);
> -	debug_unregister(kvm->arch.dbf);
>  	kvm_s390_gisa_destroy(kvm);
> +	/* do not use the lock checking variant at tear-down */
> +	if (kvm_s390_pv_handle(kvm)) {

kvm_s390_pv_is_protected ? I dislike using kvm_s390_pv_handle() when
we're not interested in the handle.

> +		kvm_s390_pv_destroy_vm(kvm, &rc, &rrc);
> +		kvm_s390_pv_dealloc_vm(kvm);
> +	}
> +	debug_unregister(kvm->arch.dbf);
>  	free_page((unsigned long)kvm->arch.sie_page2);
>  	if (!kvm_is_ucontrol(kvm))
>  		gmap_remove(kvm->arch.gmap);

[...]

> +/* implemented in pv.c */
> +void kvm_s390_pv_dealloc_vm(struct kvm *kvm);
> +int kvm_s390_pv_alloc_vm(struct kvm *kvm);
> +int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
> +int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
> +int kvm_s390_pv_destroy_vm(struct kvm *kvm, u16 *rc, u16 *rrc);
> +int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc);
> +int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
> +			      u16 *rrc);
> +int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
> +		       unsigned long tweak, u16 *rc, u16 *rrc);
> +
> +static inline u64 kvm_s390_pv_handle(struct kvm *kvm)
> +{
> +	return kvm->arch.pv.handle;
> +}

Can we rename this to

kvm_s390_pv_get_handle()

> +
> +static inline u64 kvm_s390_pv_handle_cpu(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.pv.handle;
> +}

Can we rename this to kvm_s390_pv_cpu_get_handle() ? (so it doesn't look
like the function will handle something)

> +
> +static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
> +{
> +	lockdep_assert_held(&kvm->lock);
> +	return !!kvm_s390_pv_handle(kvm);
> +}
> +
> +static inline bool kvm_s390_pv_cpu_is_protected(struct kvm_vcpu *vcpu)
> +{
> +	lockdep_assert_held(&vcpu->mutex);
> +	return !!kvm_s390_pv_handle_cpu(vcpu);
> +}
> +
>  /* implemented in interrupt.c */
>  int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
>  void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu);
> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
> new file mode 100644
> index 000000000000..bf00cde1ead8
> --- /dev/null
> +++ b/arch/s390/kvm/pv.c
> @@ -0,0 +1,262 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hosting Secure Execution virtual machines
> + *
> + * Copyright IBM Corp. 2019
> + *    Author(s): Janosch Frank <frankja@linux.ibm.com>
> + */
> +#include <linux/kvm.h>
> +#include <linux/kvm_host.h>
> +#include <linux/pagemap.h>
> +#include <linux/sched/signal.h>
> +#include <asm/pgalloc.h>
> +#include <asm/gmap.h>
> +#include <asm/uv.h>
> +#include <asm/gmap.h>
> +#include <asm/mman.h>
> +#include "kvm-s390.h"
> +
> +void kvm_s390_pv_dealloc_vm(struct kvm *kvm)
> +{
> +	vfree(kvm->arch.pv.stor_var);
> +	free_pages(kvm->arch.pv.stor_base,
> +		   get_order(uv_info.guest_base_stor_len));
> +	memset(&kvm->arch.pv, 0, sizeof(kvm->arch.pv));
> +}
> +
> +int kvm_s390_pv_alloc_vm(struct kvm *kvm)
> +{
> +	unsigned long base = uv_info.guest_base_stor_len;
> +	unsigned long virt = uv_info.guest_virt_var_stor_len;
> +	unsigned long npages = 0, vlen = 0;
> +	struct kvm_memory_slot *memslot;
> +
> +	kvm->arch.pv.stor_var = NULL;
> +	kvm->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, get_order(base));
> +	if (!kvm->arch.pv.stor_base)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Calculate current guest storage for allocation of the
> +	 * variable storage, which is based on the length in MB.
> +	 *
> +	 * Slots are sorted by GFN
> +	 */
> +	mutex_lock(&kvm->slots_lock);
> +	memslot = kvm_memslots(kvm)->memslots;
> +	npages = memslot->base_gfn + memslot->npages;
> +	mutex_unlock(&kvm->slots_lock);

Are you blocking the addition of new memslots somehow?

> +int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
> +{
> +	u16 drc, drrc;
> +	int cc;
> +
> +	struct uv_cb_cgc uvcb = {
> +		.header.cmd = UVC_CMD_CREATE_SEC_CONF,
> +		.header.len = sizeof(uvcb)
> +	};
> +
> +	if (kvm_s390_pv_handle(kvm))

Why is that necessary? We should only be called in PV mode.

> +		return -EINVAL;
> +
> +	/* Inputs */
> +	uvcb.guest_stor_origin = 0; /* MSO is 0 for KVM */
> +	uvcb.guest_stor_len = kvm->arch.pv.guest_len;
> +	uvcb.guest_asce = kvm->arch.gmap->asce;
> +	uvcb.guest_sca = (unsigned long)kvm->arch.sca;
> +	uvcb.conf_base_stor_origin = (u64)kvm->arch.pv.stor_base;
> +	uvcb.conf_virt_stor_origin = (u64)kvm->arch.pv.stor_var;
> +
> +	cc = uv_call(0, (u64)&uvcb);
> +	*rc = uvcb.header.rc;
> +	*rrc = uvcb.header.rrc;
> +	KVM_UV_EVENT(kvm, 3, "PROTVIRT CREATE VM: handle %llx len %llx rc %x rrc %x",
> +		     uvcb.guest_handle, uvcb.guest_stor_len, *rc, *rrc);
> +
> +	/* Outputs */
> +	kvm->arch.pv.handle = uvcb.guest_handle;
> +
> +	if (cc && (uvcb.header.rc & UVC_RC_NEED_DESTROY)) {
> +		kvm_s390_pv_destroy_vm(kvm, &drc, &drrc);
> +		return -EINVAL;
> +	}
> +	kvm->arch.gmap->guest_handle = uvcb.guest_handle;
> +	atomic_set(&kvm->mm->context.is_protected, 1);
> +	return cc;
> +}
> +
> +int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
> +			      u16 *rrc)
> +{
> +	struct uv_cb_ssc uvcb = {
> +		.header.cmd = UVC_CMD_SET_SEC_CONF_PARAMS,
> +		.header.len = sizeof(uvcb),
> +		.sec_header_origin = (u64)hdr,
> +		.sec_header_len = length,
> +		.guest_handle = kvm_s390_pv_handle(kvm),
> +	};
> +	int cc;
> +
> +	if (!kvm_s390_pv_handle(kvm))

Why is that necessary? We should only be called in PV mode.

> +		return -EINVAL;
> +
> +	cc = uv_call(0, (u64)&uvcb);
> +	*rc = uvcb.header.rc;
> +	*rrc = uvcb.header.rrc;
> +	KVM_UV_EVENT(kvm, 3, "PROTVIRT VM SET PARMS: rc %x rrc %x",
> +		     uvcb.header.rc, uvcb.header.rrc);
> +	if (cc)
> +		return -EINVAL;
> +	return 0;
> +}

[...]

> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 4b95f9a31a2f..50d393a618a4 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1010,6 +1010,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_ARM_NISV_TO_USER 177
>  #define KVM_CAP_ARM_INJECT_EXT_DABT 178
>  #define KVM_CAP_S390_VCPU_RESETS 179
> +#define KVM_CAP_S390_PROTECTED 180
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> @@ -1478,6 +1479,40 @@ struct kvm_enc_region {
>  #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
>  #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
>  
> +struct kvm_s390_pv_sec_parm {
> +	__u64	origin;
> +	__u64	length;

tabs vs. spaces. (I'd use a single space like in kvm_s390_pv_unp below)

> +};
> +
> +struct kvm_s390_pv_unp {
> +	__u64 addr;
> +	__u64 size;
> +	__u64 tweak;
> +};
> +
> +enum pv_cmd_id {
> +	KVM_PV_ENABLE,
> +	KVM_PV_DISABLE,
> +	KVM_PV_VM_SET_SEC_PARMS,
> +	KVM_PV_VM_UNPACK,
> +	KVM_PV_VM_VERIFY,
> +	KVM_PV_VCPU_CREATE,
> +	KVM_PV_VCPU_DESTROY,
> +};
> +
> +struct kvm_pv_cmd {
> +	__u32 cmd;	/* Command to be executed */
> +	__u16 rc;	/* Ultravisor return code */
> +	__u16 rrc;	/* Ultravisor return reason code */
> +	__u64 data;	/* Data or address */
> +	__u32 flags;    /* flags for future extensions. Must be 0 for now */
> +	__u32 reserved[3];
> +};
> +
> +/* Available with KVM_CAP_S390_PROTECTED */
> +#define KVM_S390_PV_COMMAND		_IOWR(KVMIO, 0xc5, struct kvm_pv_cmd)
> +#define KVM_S390_PV_COMMAND_VCPU	_IOWR(KVMIO, 0xc6, struct kvm_pv_cmd)
> +
>  /* Secure Encrypted Virtualization command */
>  enum sev_cmd_id {
>  	/* Guest initialization commands */
> 


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 24/42] KVM: s390: protvirt: STSI handling
  2020-02-18  9:11         ` Christian Borntraeger
@ 2020-02-18  9:13           ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:13 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

>> Yeah, but can this ever trigger without the check? AFAIKs, no. So why
>> add it?
> 
> It can. the GPRS can contain stale data and so can operand2.
> 

Oh, ok. Makes sense.

>>
>> (rather add a BUG_ON in kvm_s390_inject_program_int() in case we are in
>> PV mode)
>>
>>>>
>>>>>  
>>>>>  	switch (fc) {
>>>>> @@ -893,8 +893,13 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>>>  		handle_stsi_3_2_2(vcpu, (void *) mem);
>>>>>  		break;
>>>>>  	}
>>>>> -
>>>>> -	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>>>> +	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>>>>> +		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>>>> +		       PAGE_SIZE);
>>>>> +		rc = 0;
>>>>> +	} else {
>>>>> +		rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>>>>> +	}
>>>>>  	if (rc) {
>>>>>  		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>>>>>  		goto out;
>>>>>
>>>>
>>>> I'd pull the interrupt injection into the else case, makes things clearer.
>>>
>>> Well, no. Thhe else case could set rc to 0.
>>
>> Huh?!
>>
>> if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>> 	memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>> 	rc = 0;
>> } else {
>> 	rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>> 	if (rc) {
>> 		rc = kvm_s390_inject_prog_cond(vcpu, rc);
>> 		goto out;
>> 	}
>> }
>>
> 
> Hmm, I find that one harder to read.

It's clearer that you only inject a program check when not in pv mode
... ;) No strong feelings.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible
  2020-02-18  8:42   ` David Hildenbrand
@ 2020-02-18  9:20     ` Christian Borntraeger
  2020-02-18  9:28       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  9:20 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 09:42, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> For protected VMs the hypervisor can not access guest breaking event
>> address, program parameter, bpbc and todpr. Do not reset those fields
>> as the control block does not provide access to these fields.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/kvm-s390.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index d20a7fa9d480..5b551cc73540 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -3442,14 +3442,16 @@ static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
>>  	kvm_s390_set_prefix(vcpu, 0);
>>  	kvm_s390_set_cpu_timer(vcpu, 0);
>>  	vcpu->arch.sie_block->ckc = 0;
>> -	vcpu->arch.sie_block->todpr = 0;
>>  	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
>>  	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
>>  	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
>>  	vcpu->run->s.regs.fpc = 0;
>> -	vcpu->arch.sie_block->gbea = 1;
>> -	vcpu->arch.sie_block->pp = 0;
>> -	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
>> +	if (!kvm_s390_pv_handle_cpu(vcpu)) {
> 
> Shouldn't we instead check if the VM is in PV mode? (with changed
> lifecycle handling). Easier to understand.

No. these ioctls are under the vcpu->mutex, so I am going to compare
against the per cpu variant.

I will use kvm_s390_pv_cpu_is_protected instead to have the lockdep assertion.


> 
> (side not: the name kvm_s390_pv_handle_cpu() is very confusing. I'd
> suggest kvm_s390_pv_cpu_get_handle(). Will reply to the other patch)
> 
>> +		vcpu->arch.sie_block->gbea = 1;
>> +		vcpu->arch.sie_block->pp = 0;
>> +		vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
>> +		vcpu->arch.sie_block->todpr = 0;
>> +	}
>>  }
>>  
>>  static void kvm_arch_vcpu_ioctl_clear_reset(struct kvm_vcpu *vcpu)
>>
> 
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible
  2020-02-18  9:20     ` Christian Borntraeger
@ 2020-02-18  9:28       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:28 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 18.02.20 10:20, Christian Borntraeger wrote:
> 
> 
> On 18.02.20 09:42, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Janosch Frank <frankja@linux.ibm.com>
>>>
>>> For protected VMs the hypervisor can not access guest breaking event
>>> address, program parameter, bpbc and todpr. Do not reset those fields
>>> as the control block does not provide access to these fields.
>>>
>>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  arch/s390/kvm/kvm-s390.c | 10 ++++++----
>>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index d20a7fa9d480..5b551cc73540 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -3442,14 +3442,16 @@ static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
>>>  	kvm_s390_set_prefix(vcpu, 0);
>>>  	kvm_s390_set_cpu_timer(vcpu, 0);
>>>  	vcpu->arch.sie_block->ckc = 0;
>>> -	vcpu->arch.sie_block->todpr = 0;
>>>  	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
>>>  	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
>>>  	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
>>>  	vcpu->run->s.regs.fpc = 0;
>>> -	vcpu->arch.sie_block->gbea = 1;
>>> -	vcpu->arch.sie_block->pp = 0;
>>> -	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
>>> +	if (!kvm_s390_pv_handle_cpu(vcpu)) {
>>
>> Shouldn't we instead check if the VM is in PV mode? (with changed
>> lifecycle handling). Easier to understand.
> 
> No. these ioctls are under the vcpu->mutex, so I am going to compare
> against the per cpu variant.
> 
> I will use kvm_s390_pv_cpu_is_protected instead to have the lockdep assertion.

That's what I meant!


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection
  2020-02-14 22:26 ` [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection Christian Borntraeger
@ 2020-02-18  9:33   ` David Hildenbrand
  2020-02-18  9:37     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:33 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> Only two program exceptions can be injected for a protected guest:
> specification and operand.
> 
> For both, a code needs to be specified in the interrupt injection
> control of the state description, as the guest prefix page is not
> accessible to KVM for such guests.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/interrupt.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 3e160d9a214f..7a10096fa204 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -836,6 +836,21 @@ static int __must_check __deliver_external_call(struct kvm_vcpu *vcpu)
>  	return rc ? -EFAULT : 0;
>  }
>  
> +static int __deliver_prog_pv(struct kvm_vcpu *vcpu, u16 code)
> +{
> +	switch (code) {
> +	case PGM_SPECIFICATION:
> +		vcpu->arch.sie_block->iictl = IICTL_CODE_SPECIFICATION;
> +		break;
> +	case PGM_OPERAND:
> +		vcpu->arch.sie_block->iictl = IICTL_CODE_OPERAND;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
>  static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
> @@ -856,6 +871,9 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
>  	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
>  					 pgm_info.code, 0);
>  
> +	if (kvm_s390_pv_is_protected(vcpu->kvm))

Can we actually ever have PER set, and what would happen if so?
Shouldn't we also return -EINVAL?

> +		return __deliver_prog_pv(vcpu, pgm_info.code & ~PGM_PER);
> +
>  	switch (pgm_info.code & ~PGM_PER) {
>  	case PGM_AFX_TRANSLATION:
>  	case PGM_ASX_TRANSLATION:
> 


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection
  2020-02-18  9:33   ` David Hildenbrand
@ 2020-02-18  9:37     ` Christian Borntraeger
  2020-02-18  9:39       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18  9:37 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:33, David Hildenbrand wrote:
eliver_prog(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>> @@ -856,6 +871,9 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
>>  	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
>>  					 pgm_info.code, 0);
>>  
>> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
> 
> Can we actually ever have PER set, and what would happen if so?
> Shouldn't we also return -EINVAL?

The ultravisor would add a concurrent PER event if appropriate.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling
  2020-02-14 22:26 ` [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling Christian Borntraeger
@ 2020-02-18  9:38   ` David Hildenbrand
  2020-02-19 12:45     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:38 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> If the host initialized the Ultravisor, we can set stfle bit 161
> (protected virtual IPL enhancements facility), which indicates that
> the IPL subcodes 8, 9, and 10 are valid. These subcodes are used by a
> normal guest to set/retrieve an IPL information block of type 5 (for
> protected virtual machines) and transition into protected mode.
> 
> Once in protected mode, the Ultravisor will conceal the facility bit.
> Therefore each boot into protected mode has to go through
> non-protected mode. There is no secure re-ipl with subcode 10 without
> a previous subcode 3.
> 
> In protected mode, there is no subcode 4 available, as the VM has no
> more access to its memory from non-protected mode. I.e., only a IPL
> clear is possible.
> 
> The error cases will all be handled in userspace.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/kvm-s390.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 4a97d3b7840e..f96c1f530cc2 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2621,6 +2621,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (css_general_characteristics.aiv && test_facility(65))
>  		set_kvm_facility(kvm->arch.model.fac_mask, 65);
>  
> +	if (is_prot_virt_host()) {
> +		set_kvm_facility(kvm->arch.model.fac_mask, 161);
> +		set_kvm_facility(kvm->arch.model.fac_list, 161);
> +	}
> +

Aren't these IPL subcodes completely emulated in QEMU? If so, rather
QEMU with support should enable them when the kernel capability for PV
(=== is_prot_virt_host()) is in place.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection
  2020-02-18  9:37     ` Christian Borntraeger
@ 2020-02-18  9:39       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:39 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 18.02.20 10:37, Christian Borntraeger wrote:
> 
> 
> On 18.02.20 10:33, David Hildenbrand wrote:
> eliver_prog(struct kvm_vcpu *vcpu)
>>>  {
>>>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>>> @@ -856,6 +871,9 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
>>>  	trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
>>>  					 pgm_info.code, 0);
>>>  
>>> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
>>
>> Can we actually ever have PER set, and what would happen if so?
>> Shouldn't we also return -EINVAL?
> 
> The ultravisor would add a concurrent PER event if appropriate.
> 

Please add that to the patch description.

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1
  2020-02-14 22:26 ` [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1 Christian Borntraeger
@ 2020-02-18  9:44   ` David Hildenbrand
  2020-02-19 11:53     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:44 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions.
> Specific to these "soft" reboots are
> 
> * The "unshare all" UVC
> * The "prepare for reset" UVC
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h |  4 ++++
>  arch/s390/kvm/kvm-s390.c   | 22 ++++++++++++++++++++++
>  include/uapi/linux/kvm.h   |  2 ++
>  3 files changed, 28 insertions(+)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 839cb3a89986..254d5769d136 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -36,6 +36,8 @@
>  #define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
>  #define UVC_CMD_UNPACK_IMG		0x0301
>  #define UVC_CMD_VERIFY_IMG		0x0302
> +#define UVC_CMD_PREPARE_RESET		0x0320
> +#define UVC_CMD_SET_UNSHARE_ALL		0x0340
>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
>  #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
>  #define UVC_CMD_SET_SHARED_ACCESS	0x1000
> @@ -56,6 +58,8 @@ enum uv_cmds_inst {
>  	BIT_UVC_CMD_SET_SEC_PARMS = 11,
>  	BIT_UVC_CMD_UNPACK_IMG = 13,
>  	BIT_UVC_CMD_VERIFY_IMG = 14,
> +	BIT_UVC_CMD_PREPARE_RESET = 18,
> +	BIT_UVC_CMD_UNSHARE_ALL = 20,
>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
>  	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
>  };
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index f96c1f530cc2..ad84c1144908 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2285,6 +2285,28 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>  			     cmd->rrc);
>  		break;
>  	}
> +	case KVM_PV_VM_PREP_RESET: {
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +
> +		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
> +				  UVC_CMD_PREPARE_RESET, &cmd->rc, &cmd->rrc);
> +		KVM_UV_EVENT(kvm, 3, "PROTVIRT PREP RESET: rc %x rrc %x",
> +			     cmd->rc, cmd->rrc);
> +		break;
> +	}
> +	case KVM_PV_VM_UNSHARE_ALL: {
> +		r = -EINVAL;
> +		if (!kvm_s390_pv_is_protected(kvm))
> +			break;
> +
> +		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
> +				  UVC_CMD_SET_UNSHARE_ALL, &cmd->rc, &cmd->rrc);
> +		KVM_UV_EVENT(kvm, 3, "PROTVIRT UNSHARE: rc %x rrc %x",
> +			     cmd->rc, cmd->rrc);

I do wonder if that has any possible races with CPUs currently running,
which try to mark pages shared. That no other VCPUs are running is not
enforced here AFAIKs.

Apart from that, looks good to me.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor
  2020-02-14 22:26 ` [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor Christian Borntraeger
@ 2020-02-18  9:48   ` David Hildenbrand
  2020-02-19 19:36     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:48 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> VCPU states have to be reported to the ultravisor for SIGP
> interpretation, kdump, kexec and reboot.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h | 15 +++++++++++++++
>  arch/s390/kvm/kvm-s390.c   |  7 ++++++-
>  arch/s390/kvm/kvm-s390.h   |  2 ++
>  arch/s390/kvm/pv.c         | 22 ++++++++++++++++++++++
>  4 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 254d5769d136..7b82881ec3b4 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -37,6 +37,7 @@
>  #define UVC_CMD_UNPACK_IMG		0x0301
>  #define UVC_CMD_VERIFY_IMG		0x0302
>  #define UVC_CMD_PREPARE_RESET		0x0320
> +#define UVC_CMD_CPU_SET_STATE		0x0330
>  #define UVC_CMD_SET_UNSHARE_ALL		0x0340
>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
>  #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
> @@ -58,6 +59,7 @@ enum uv_cmds_inst {
>  	BIT_UVC_CMD_SET_SEC_PARMS = 11,
>  	BIT_UVC_CMD_UNPACK_IMG = 13,
>  	BIT_UVC_CMD_VERIFY_IMG = 14,
> +	BIT_UVC_CMD_CPU_SET_STATE = 17,
>  	BIT_UVC_CMD_PREPARE_RESET = 18,
>  	BIT_UVC_CMD_UNSHARE_ALL = 20,
>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
> @@ -164,6 +166,19 @@ struct uv_cb_unp {
>  	u64 reserved38[3];
>  } __packed __aligned(8);
>  
> +#define PV_CPU_STATE_OPR	1
> +#define PV_CPU_STATE_STP	2
> +#define PV_CPU_STATE_CHKSTP	3
> +
> +struct uv_cb_cpu_set_state {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 cpu_handle;
> +	u8  reserved20[7];
> +	u8  state;
> +	u64 reserved28[5];
> +};
> +
>  /*
>   * A common UV call struct for calls that take no payload
>   * Examples:
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index ad84c1144908..5426b01e3da1 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4396,6 +4396,7 @@ static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
>  void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>  {
>  	int i, online_vcpus, started_vcpus = 0;
> +	u16 rc, rrc;
>  
>  	if (!is_vcpu_stopped(vcpu))
>  		return;
> @@ -4421,7 +4422,8 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>  		 */
>  		__disable_ibs_on_all_vcpus(vcpu->kvm);
>  	}
> -
> +	/* Let's tell the UV that we want to start again */
> +	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
>  	/*
>  	 * Another VCPU might have used IBS while we were offline.
> @@ -4436,6 +4438,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
>  {
>  	int i, online_vcpus, started_vcpus = 0;
>  	struct kvm_vcpu *started_vcpu = NULL;
> +	u16 rc, rrc;
>  
>  	if (is_vcpu_stopped(vcpu))
>  		return;
> @@ -4449,6 +4452,8 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
>  	kvm_s390_clear_stop_irq(vcpu);
>  
>  	kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
> +	/* Let's tell the UV that we successfully stopped the vcpu */
> +	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP, &rc, &rrc);
>  	__disable_ibs_on_vcpu(vcpu);
>  
>  	for (i = 0; i < online_vcpus; i++) {
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index d5503dd0d1e4..1af1e30beead 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -218,6 +218,8 @@ int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
>  			      u16 *rrc);
>  int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
>  		       unsigned long tweak, u16 *rc, u16 *rrc);
> +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
> +			      u16 *rrc);
>  
>  static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
>  {
> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
> index 80169a9b43ec..b4bf6b6eb708 100644
> --- a/arch/s390/kvm/pv.c
> +++ b/arch/s390/kvm/pv.c
> @@ -271,3 +271,25 @@ int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
>  		KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful");
>  	return ret;
>  }
> +
> +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
> +			      u16 *rrc)
> +{
> +	struct uv_cb_cpu_set_state uvcb = {
> +		.header.cmd	= UVC_CMD_CPU_SET_STATE,
> +		.header.len	= sizeof(uvcb),
> +		.cpu_handle	= kvm_s390_pv_handle_cpu(vcpu),
> +		.state		= state,
> +	};
> +	int cc;
> +
> +	if (!kvm_s390_pv_handle_cpu(vcpu))

I'd actually prefer to move this to the caller. (and sue the _protected
variant)

> +		return -EINVAL;
> +
> +	cc = uv_call(0, (u64)&uvcb);
> +	*rc = uvcb.header.rc;
> +	*rrc = uvcb.header.rrc;
> +	if (cc)
> +		return -EINVAL;

All return values are ignored. warn instead and make this a void function?

> +	return 0;
> +}
> 



-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state
  2020-02-14 22:26 ` [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state Christian Borntraeger
@ 2020-02-18  9:50   ` David Hildenbrand
  2020-02-19 11:06     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:50 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> Code 5 for the set cpu state UV call tells the UV to load a PSW from
> the SE header (first IPL) or from guest location 0x0 (diag 308 subcode
> 0/1). Also it sets the cpu into operating state afterwards, so we can
> start it.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h | 1 +
>  arch/s390/kvm/kvm-s390.c   | 8 ++++++++
>  include/uapi/linux/kvm.h   | 1 +
>  3 files changed, 10 insertions(+)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index 7b82881ec3b4..d59825d95b9d 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -169,6 +169,7 @@ struct uv_cb_unp {
>  #define PV_CPU_STATE_OPR	1
>  #define PV_CPU_STATE_STP	2
>  #define PV_CPU_STATE_CHKSTP	3
> +#define PV_CPU_STATE_OPR_LOAD	5
>  
>  struct uv_cb_cpu_set_state {
>  	struct uv_cb_header header;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 5426b01e3da1..b6113285f47f 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4656,6 +4656,14 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
>  		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
>  		break;
>  	}
> +	case KVM_PV_VCPU_SET_IPL_PSW: {
> +		if (!kvm_s390_pv_handle_cpu(vcpu))
> +			return -EINVAL;
> +
> +		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD,
> +					      &cmd->rc, &cmd->rrc);

Can we squeeze that into kvm_arch_vcpu_ioctl_set_mpstate() instead? The
interface seems to do exactly what you want it to do.

KVM_MP_STATE_OPERATING_LOAD

Allow it only when in PV.



-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112
  2020-02-14 22:26 ` [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112 Christian Borntraeger
@ 2020-02-18  9:53   ` David Hildenbrand
  2020-02-18 10:02     ` David Hildenbrand
  2020-02-18 10:05     ` Christian Borntraeger
  0 siblings, 2 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:53 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> We're not allowed to inject interrupts on intercepts that leave the
> guest state in an "in-between" state where the next SIE entry will do a
> continuation, namely secure instruction interception (104) and secure
> prefix interception (112).
> As our PSW is just a copy of the real one that will be replaced on the
> next exit, we can mask out the interrupt bits in the PSW to make sure
> that we do not inject anything.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/kvm/kvm-s390.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index b6113285f47f..1b6963bbc96f 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4025,6 +4025,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
>  	return vcpu_post_run_fault_in_sie(vcpu);
>  }
>  
> +#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK)
>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	int rc, exit_reason;
> @@ -4061,6 +4062,16 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  			memcpy(vcpu->run->s.regs.gprs,
>  			       sie_page->pv_grregs,
>  			       sizeof(sie_page->pv_grregs));
> +			/*
> +			 * We're not allowed to inject interrupts on intercepts
> +			 * that leave the guest state in an "in-between" state
> +			 * where the next SIE entry will do a continuation.
> +			 * Fence interrupts in our "internal" PSW.
> +			 */
> +			if (vcpu->arch.sie_block->icptcode == ICPT_PV_INSTR ||
> +			    vcpu->arch.sie_block->icptcode == ICPT_PV_PREF) {
> +				vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
> +			}

Is this actually the right approach? I mean, you lose PSW masks, but you
only want to teach the interrupt delivery code to skip delivering
interrupts until the intercept has been handled. Does not look clean to
me TBH.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start
  2020-02-14 22:26 ` [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start Christian Borntraeger
@ 2020-02-18  9:53   ` David Hildenbrand
  2020-02-18 10:02     ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:53 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 14.02.20 23:26, Christian Borntraeger wrote:
> As PSW restart is handled by the ultravisor (and we only get a start
> notification) we must re-check the PSW after a start before injecting
> interrupts.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> ---
>  arch/s390/kvm/kvm-s390.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 1b6963bbc96f..16af4d1a2c29 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4436,6 +4436,13 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>  	/* Let's tell the UV that we want to start again */
>  	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
> +	/*
> +	 * The real PSW might have changed due to a RESTART interpreted by the
> +	 * ultravisor. We block all interrupts and let the next sie exit
> +	 * refresh our view.
> +	 */
> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
> +		vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;

Same comment as to the previous patch.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls
  2020-02-14 22:26 ` [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls Christian Borntraeger
@ 2020-02-18  9:54   ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:54 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 14.02.20 23:26, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> For protected VMs, the VCPU resets are done by the Ultravisor, as KVM
> has no access to the VCPU registers.
> 
> Note that the ultravisor will only accept a call for the exact reset
> that has been requested.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/include/asm/uv.h |  6 ++++++
>  arch/s390/kvm/kvm-s390.c   | 20 ++++++++++++++++++++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index d59825d95b9d..d4fb54231932 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -36,7 +36,10 @@
>  #define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
>  #define UVC_CMD_UNPACK_IMG		0x0301
>  #define UVC_CMD_VERIFY_IMG		0x0302
> +#define UVC_CMD_CPU_RESET		0x0310
> +#define UVC_CMD_CPU_RESET_INITIAL	0x0311
>  #define UVC_CMD_PREPARE_RESET		0x0320
> +#define UVC_CMD_CPU_RESET_CLEAR		0x0321
>  #define UVC_CMD_CPU_SET_STATE		0x0330
>  #define UVC_CMD_SET_UNSHARE_ALL		0x0340
>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
> @@ -59,8 +62,11 @@ enum uv_cmds_inst {
>  	BIT_UVC_CMD_SET_SEC_PARMS = 11,
>  	BIT_UVC_CMD_UNPACK_IMG = 13,
>  	BIT_UVC_CMD_VERIFY_IMG = 14,
> +	BIT_UVC_CMD_CPU_RESET = 15,
> +	BIT_UVC_CMD_CPU_RESET_INITIAL = 16,
>  	BIT_UVC_CMD_CPU_SET_STATE = 17,
>  	BIT_UVC_CMD_PREPARE_RESET = 18,
> +	BIT_UVC_CMD_CPU_PERFORM_CLEAR_RESET = 19,
>  	BIT_UVC_CMD_UNSHARE_ALL = 20,
>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
>  	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 16af4d1a2c29..932f7f32e82f 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4695,6 +4695,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	void __user *argp = (void __user *)arg;
>  	int idx;
>  	long r;
> +	u16 rc, rrc;
>  
>  	vcpu_load(vcpu);
>  
> @@ -4716,14 +4717,33 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_S390_CLEAR_RESET:
>  		r = 0;
>  		kvm_arch_vcpu_ioctl_clear_reset(vcpu);
> +		if (kvm_s390_pv_handle_cpu(vcpu)) {

_protected checks please. (if not already converted in your tree :) )


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18  8:39   ` [PATCH v2.1] " Christian Borntraeger
  2020-02-18  9:12     ` David Hildenbrand
@ 2020-02-18  9:56     ` David Hildenbrand
  2020-02-18 20:26       ` Christian Borntraeger
  1 sibling, 1 reply; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18  9:56 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth

On 18.02.20 09:39, Christian Borntraeger wrote:
> From: Janosch Frank <frankja@linux.ibm.com>
> 
> This contains 3 main changes:
> 1. changes in SIE control block handling for secure guests
> 2. helper functions for create/destroy/unpack secure guests
> 3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
> machines
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> 2->2.1  - combine CREATE/DESTROY CPU/VM into ENABLE DISABLE
> 	- rework locking and check locks with lockdep
> 	- I still have the PV_COMMAND_CPU in here for later use in
> 	  the SET_IPL_PSW ioctl. If wanted I can move
> 	- change CAP number
> 
>  arch/s390/include/asm/kvm_host.h |  24 ++-
>  arch/s390/include/asm/uv.h       |  69 ++++++++
>  arch/s390/kvm/Makefile           |   2 +-
>  arch/s390/kvm/kvm-s390.c         | 231 ++++++++++++++++++++++++++-
>  arch/s390/kvm/kvm-s390.h         |  35 +++++
>  arch/s390/kvm/pv.c               | 262 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h         |  35 +++++
>  7 files changed, 654 insertions(+), 4 deletions(-)
>  create mode 100644 arch/s390/kvm/pv.c
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index d058289385a5..1aa2382fe363 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -160,7 +160,13 @@ struct kvm_s390_sie_block {
>  	__u8	reserved08[4];		/* 0x0008 */
>  #define PROG_IN_SIE (1<<0)
>  	__u32	prog0c;			/* 0x000c */
> -	__u8	reserved10[16];		/* 0x0010 */
> +	union {
> +		__u8	reserved10[16];		/* 0x0010 */
> +		struct {
> +			__u64	pv_handle_cpu;
> +			__u64	pv_handle_config;
> +		};
> +	};
>  #define PROG_BLOCK_SIE	(1<<0)
>  #define PROG_REQUEST	(1<<1)
>  	atomic_t prog20;		/* 0x0020 */
> @@ -233,7 +239,7 @@ struct kvm_s390_sie_block {
>  #define ECB3_RI  0x01
>  	__u8    ecb3;			/* 0x0063 */
>  	__u32	scaol;			/* 0x0064 */
> -	__u8	reserved68;		/* 0x0068 */
> +	__u8	sdf;			/* 0x0068 */
>  	__u8    epdx;			/* 0x0069 */
>  	__u8    reserved6a[2];		/* 0x006a */
>  	__u32	todpr;			/* 0x006c */
> @@ -645,6 +651,11 @@ struct kvm_guestdbg_info_arch {
>  	unsigned long last_bp;
>  };
>  
> +struct kvm_s390_pv_vcpu {
> +	u64 handle;
> +	unsigned long stor_base;
> +};
> +
>  struct kvm_vcpu_arch {
>  	struct kvm_s390_sie_block *sie_block;
>  	/* if vsie is active, currently executed shadow sie control block */
> @@ -673,6 +684,7 @@ struct kvm_vcpu_arch {
>  	__u64 cputm_start;
>  	bool gs_enabled;
>  	bool skey_enabled;
> +	struct kvm_s390_pv_vcpu pv;
>  };
>  
>  struct kvm_vm_stat {
> @@ -843,6 +855,13 @@ struct kvm_s390_gisa_interrupt {
>  	DECLARE_BITMAP(kicked_mask, KVM_MAX_VCPUS);
>  };
>  
> +struct kvm_s390_pv {
> +	u64 handle;
> +	u64 guest_len;
> +	unsigned long stor_base;
> +	void *stor_var;
> +};
> +
>  struct kvm_arch{
>  	void *sca;
>  	int use_esca;
> @@ -878,6 +897,7 @@ struct kvm_arch{
>  	DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
>  	DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS);
>  	struct kvm_s390_gisa_interrupt gisa_int;
> +	struct kvm_s390_pv pv;
>  };
>  
>  #define KVM_HVA_ERR_BAD		(-1UL)
> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
> index bc452a15ac3f..839cb3a89986 100644
> --- a/arch/s390/include/asm/uv.h
> +++ b/arch/s390/include/asm/uv.h
> @@ -23,11 +23,19 @@
>  #define UVC_RC_INV_STATE	0x0003
>  #define UVC_RC_INV_LEN		0x0005
>  #define UVC_RC_NO_RESUME	0x0007
> +#define UVC_RC_NEED_DESTROY	0x8000
>  
>  #define UVC_CMD_QUI			0x0001
>  #define UVC_CMD_INIT_UV			0x000f
> +#define UVC_CMD_CREATE_SEC_CONF		0x0100
> +#define UVC_CMD_DESTROY_SEC_CONF	0x0101
> +#define UVC_CMD_CREATE_SEC_CPU		0x0120
> +#define UVC_CMD_DESTROY_SEC_CPU		0x0121
>  #define UVC_CMD_CONV_TO_SEC_STOR	0x0200
>  #define UVC_CMD_CONV_FROM_SEC_STOR	0x0201
> +#define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
> +#define UVC_CMD_UNPACK_IMG		0x0301
> +#define UVC_CMD_VERIFY_IMG		0x0302
>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
>  #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
>  #define UVC_CMD_SET_SHARED_ACCESS	0x1000
> @@ -37,10 +45,17 @@
>  enum uv_cmds_inst {
>  	BIT_UVC_CMD_QUI = 0,
>  	BIT_UVC_CMD_INIT_UV = 1,
> +	BIT_UVC_CMD_CREATE_SEC_CONF = 2,
> +	BIT_UVC_CMD_DESTROY_SEC_CONF = 3,
> +	BIT_UVC_CMD_CREATE_SEC_CPU = 4,
> +	BIT_UVC_CMD_DESTROY_SEC_CPU = 5,
>  	BIT_UVC_CMD_CONV_TO_SEC_STOR = 6,
>  	BIT_UVC_CMD_CONV_FROM_SEC_STOR = 7,
>  	BIT_UVC_CMD_SET_SHARED_ACCESS = 8,
>  	BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9,
> +	BIT_UVC_CMD_SET_SEC_PARMS = 11,
> +	BIT_UVC_CMD_UNPACK_IMG = 13,
> +	BIT_UVC_CMD_VERIFY_IMG = 14,
>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
>  	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
>  };
> @@ -52,6 +67,7 @@ struct uv_cb_header {
>  	u16 rrc;	/* Return Reason Code */
>  } __packed __aligned(8);
>  
> +/* Query Ultravisor Information */
>  struct uv_cb_qui {
>  	struct uv_cb_header header;
>  	u64 reserved08;
> @@ -71,6 +87,7 @@ struct uv_cb_qui {
>  	u64 reserveda0;
>  } __packed __aligned(8);
>  
> +/* Initialize Ultravisor */
>  struct uv_cb_init {
>  	struct uv_cb_header header;
>  	u64 reserved08[2];
> @@ -79,6 +96,35 @@ struct uv_cb_init {
>  	u64 reserved28[4];
>  } __packed __aligned(8);
>  
> +/* Create Guest Configuration */
> +struct uv_cb_cgc {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 guest_handle;
> +	u64 conf_base_stor_origin;
> +	u64 conf_virt_stor_origin;
> +	u64 reserved30;
> +	u64 guest_stor_origin;
> +	u64 guest_stor_len;
> +	u64 guest_sca;
> +	u64 guest_asce;
> +	u64 reserved58[5];
> +} __packed __aligned(8);
> +
> +/* Create Secure CPU */
> +struct uv_cb_csc {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 cpu_handle;
> +	u64 guest_handle;
> +	u64 stor_origin;
> +	u8  reserved30[6];
> +	u16 num;
> +	u64 state_origin;
> +	u64 reserved40[4];
> +} __packed __aligned(8);
> +
> +/* Convert to Secure */
>  struct uv_cb_cts {
>  	struct uv_cb_header header;
>  	u64 reserved08[2];
> @@ -86,12 +132,34 @@ struct uv_cb_cts {
>  	u64 gaddr;
>  } __packed __aligned(8);
>  
> +/* Convert from Secure / Pin Page Shared */
>  struct uv_cb_cfs {
>  	struct uv_cb_header header;
>  	u64 reserved08[2];
>  	u64 paddr;
>  } __packed __aligned(8);
>  
> +/* Set Secure Config Parameter */
> +struct uv_cb_ssc {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 guest_handle;
> +	u64 sec_header_origin;
> +	u32 sec_header_len;
> +	u32 reserved2c;
> +	u64 reserved30[4];
> +} __packed __aligned(8);
> +
> +/* Unpack */
> +struct uv_cb_unp {
> +	struct uv_cb_header header;
> +	u64 reserved08[2];
> +	u64 guest_handle;
> +	u64 gaddr;
> +	u64 tweak[2];
> +	u64 reserved38[3];
> +} __packed __aligned(8);
> +
>  /*
>   * A common UV call struct for calls that take no payload
>   * Examples:
> @@ -105,6 +173,7 @@ struct uv_cb_nodata {
>  	u64 reserved20[4];
>  } __packed __aligned(8);
>  
> +/* Set Shared Access */
>  struct uv_cb_share {
>  	struct uv_cb_header header;
>  	u64 reserved08[3];
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 05ee90a5ea08..12decca22e7c 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -9,6 +9,6 @@ common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o  $(KVM)/async_pf.o $(KVM)/irqch
>  ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>  
>  kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
> -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o
> +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
>  
>  obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index cc7793525a69..1a7bb08f5c26 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -44,6 +44,7 @@
>  #include <asm/cpacf.h>
>  #include <asm/timex.h>
>  #include <asm/ap.h>
> +#include <asm/uv.h>
>  #include "kvm-s390.h"
>  #include "gaccess.h"
>  
> @@ -234,8 +235,10 @@ int kvm_arch_check_processor_compat(void)
>  	return 0;
>  }
>  
> +/* forward declarations */
>  static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
>  			      unsigned long end);
> +static int sca_switch_to_extended(struct kvm *kvm);
>  
>  static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
>  {
> @@ -571,6 +574,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_S390_BPB:
>  		r = test_facility(82);
>  		break;
> +	case KVM_CAP_S390_PROTECTED:
> +		r = is_prot_virt_host();
> +		break;

FWIW, the clean thing to do is to enable the capability only after all
features have been implemented, so as the very last patch.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112
  2020-02-18  9:53   ` David Hildenbrand
@ 2020-02-18 10:02     ` David Hildenbrand
  2020-02-18 10:05     ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18 10:02 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 18.02.20 10:53, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> We're not allowed to inject interrupts on intercepts that leave the
>> guest state in an "in-between" state where the next SIE entry will do a
>> continuation, namely secure instruction interception (104) and secure
>> prefix interception (112).
>> As our PSW is just a copy of the real one that will be replaced on the
>> next exit, we can mask out the interrupt bits in the PSW to make sure
>> that we do not inject anything.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/kvm-s390.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b6113285f47f..1b6963bbc96f 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4025,6 +4025,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
>>  	return vcpu_post_run_fault_in_sie(vcpu);
>>  }
>>  
>> +#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK)
>>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>  	int rc, exit_reason;
>> @@ -4061,6 +4062,16 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  			memcpy(vcpu->run->s.regs.gprs,
>>  			       sie_page->pv_grregs,
>>  			       sizeof(sie_page->pv_grregs));
>> +			/*
>> +			 * We're not allowed to inject interrupts on intercepts
>> +			 * that leave the guest state in an "in-between" state
>> +			 * where the next SIE entry will do a continuation.
>> +			 * Fence interrupts in our "internal" PSW.
>> +			 */
>> +			if (vcpu->arch.sie_block->icptcode == ICPT_PV_INSTR ||
>> +			    vcpu->arch.sie_block->icptcode == ICPT_PV_PREF) {
>> +				vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
>> +			}
> 
> Is this actually the right approach? I mean, you lose PSW masks, but you
> only want to teach the interrupt delivery code to skip delivering
> interrupts until the intercept has been handled. Does not look clean to
> me TBH.

After reading the patch description again :)

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start
  2020-02-18  9:53   ` David Hildenbrand
@ 2020-02-18 10:02     ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-18 10:02 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 18.02.20 10:53, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> As PSW restart is handled by the ultravisor (and we only get a start
>> notification) we must re-check the PSW after a start before injecting
>> interrupts.
>>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  arch/s390/kvm/kvm-s390.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 1b6963bbc96f..16af4d1a2c29 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4436,6 +4436,13 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>>  	/* Let's tell the UV that we want to start again */
>>  	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
>>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
>> +	/*
>> +	 * The real PSW might have changed due to a RESTART interpreted by the
>> +	 * ultravisor. We block all interrupts and let the next sie exit
>> +	 * refresh our view.
>> +	 */
>> +	if (kvm_s390_pv_is_protected(vcpu->kvm))
>> +		vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
> 
> Same comment as to the previous patch.

As I understood how it works

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112
  2020-02-18  9:53   ` David Hildenbrand
  2020-02-18 10:02     ` David Hildenbrand
@ 2020-02-18 10:05     ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18 10:05 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank


On 18.02.20 10:53, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> We're not allowed to inject interrupts on intercepts that leave the
>> guest state in an "in-between" state where the next SIE entry will do a
>> continuation, namely secure instruction interception (104) and secure
>> prefix interception (112).
>> As our PSW is just a copy of the real one that will be replaced on the
>> next exit, we can mask out the interrupt bits in the PSW to make sure
>> that we do not inject anything.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/kvm/kvm-s390.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b6113285f47f..1b6963bbc96f 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4025,6 +4025,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
>>  	return vcpu_post_run_fault_in_sie(vcpu);
>>  }
>>  
>> +#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK)
>>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>  	int rc, exit_reason;
>> @@ -4061,6 +4062,16 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  			memcpy(vcpu->run->s.regs.gprs,
>>  			       sie_page->pv_grregs,
>>  			       sizeof(sie_page->pv_grregs));
>> +			/*
>> +			 * We're not allowed to inject interrupts on intercepts
>> +			 * that leave the guest state in an "in-between" state
>> +			 * where the next SIE entry will do a continuation.
>> +			 * Fence interrupts in our "internal" PSW.
>> +			 */
>> +			if (vcpu->arch.sie_block->icptcode == ICPT_PV_INSTR ||
>> +			    vcpu->arch.sie_block->icptcode == ICPT_PV_PREF) {
>> +				vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
>> +			}
> 
> Is this actually the right approach? I mean, you lose PSW masks, but you
> only want to teach the interrupt delivery code to skip delivering
> interrupts until the intercept has been handled. Does not look clean to
> me TBH.

The only purpose of the PSW mask in PV is to tell the hypervisor what interrupts
are deliverable or if we are in wait state.  In fact those bits are the only
ones that we see. So I think its fair to use those.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-18  8:27       ` David Hildenbrand
@ 2020-02-18 15:46         ` Sean Christopherson
  2020-02-18 16:02           ` Will Deacon
  0 siblings, 1 reply; 132+ messages in thread
From: Sean Christopherson @ 2020-02-18 15:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Christian Borntraeger, Janosch Frank, Andrew Morton, KVM,
	Cornelia Huck, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Will Deacon

On Tue, Feb 18, 2020 at 09:27:20AM +0100, David Hildenbrand wrote:
> On 17.02.20 12:10, Christian Borntraeger wrote:
> > So yes, if everything is setup properly this should not fail in real life
> > and only we have a kernel (or firmware) bug.
> > 
> 
> Then, without feedback from other possible users, this should be a void
> function. So either introduce error handling or convert it to a void for
> now (and add e.g., BUG_ON and a comment inside the s390x implementation).

My preference would also be for a void function (versus ignoring an int
return).

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-18 15:46         ` Sean Christopherson
@ 2020-02-18 16:02           ` Will Deacon
  2020-02-18 16:15             ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Will Deacon @ 2020-02-18 16:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: David Hildenbrand, Christian Borntraeger, Janosch Frank,
	Andrew Morton, KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm

On Tue, Feb 18, 2020 at 07:46:10AM -0800, Sean Christopherson wrote:
> On Tue, Feb 18, 2020 at 09:27:20AM +0100, David Hildenbrand wrote:
> > On 17.02.20 12:10, Christian Borntraeger wrote:
> > > So yes, if everything is setup properly this should not fail in real life
> > > and only we have a kernel (or firmware) bug.
> > > 
> > 
> > Then, without feedback from other possible users, this should be a void
> > function. So either introduce error handling or convert it to a void for
> > now (and add e.g., BUG_ON and a comment inside the s390x implementation).
> 
> My preference would also be for a void function (versus ignoring an int
> return).

The gup code could certainly handle the error value, although the writeback
is a lot less clear (so a BUG_ON() would seem to be sufficient for now).

Will

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-18 16:02           ` Will Deacon
@ 2020-02-18 16:15             ` Christian Borntraeger
  2020-02-18 21:35               ` Sean Christopherson
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18 16:15 UTC (permalink / raw)
  To: Will Deacon, Sean Christopherson
  Cc: David Hildenbrand, Janosch Frank, Andrew Morton, KVM,
	Cornelia Huck, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm



On 18.02.20 17:02, Will Deacon wrote:
> On Tue, Feb 18, 2020 at 07:46:10AM -0800, Sean Christopherson wrote:
>> On Tue, Feb 18, 2020 at 09:27:20AM +0100, David Hildenbrand wrote:
>>> On 17.02.20 12:10, Christian Borntraeger wrote:
>>>> So yes, if everything is setup properly this should not fail in real life
>>>> and only we have a kernel (or firmware) bug.
>>>>
>>>
>>> Then, without feedback from other possible users, this should be a void
>>> function. So either introduce error handling or convert it to a void for
>>> now (and add e.g., BUG_ON and a comment inside the s390x implementation).
>>
>> My preference would also be for a void function (versus ignoring an int
>> return).
> 
> The gup code could certainly handle the error value, although the writeback
> is a lot less clear (so a BUG_ON() would seem to be sufficient for now).

Sean, David. Can we agree on merging patch 39?


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases
  2020-02-14 22:26 ` [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases Christian Borntraeger
@ 2020-02-18 16:25   ` Will Deacon
  2020-02-18 16:30     ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Will Deacon @ 2020-02-18 16:25 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	David Hildenbrand, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Sean Christopherson

On Fri, Feb 14, 2020 at 05:26:55PM -0500, Christian Borntraeger wrote:
> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
> 
> This is a potential extension to do error handling if we fail to
> make the page accessible if we know what others need.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  mm/gup.c            | 17 ++++++++++++-----
>  mm/page-writeback.c |  6 +++++-
>  2 files changed, 17 insertions(+), 6 deletions(-)

Sorry, I missed this when replying elsewhere in the thread!
Anyway, looks good to me:

Acked-by: Will Deacon <will@kernel.org>

Thanks,

Will

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases
  2020-02-18 16:25   ` Will Deacon
@ 2020-02-18 16:30     ` Christian Borntraeger
  2020-02-18 16:33       ` Will Deacon
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18 16:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	David Hildenbrand, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Sean Christopherson



On 18.02.20 17:25, Will Deacon wrote:
> On Fri, Feb 14, 2020 at 05:26:55PM -0500, Christian Borntraeger wrote:
>> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>
>> This is a potential extension to do error handling if we fail to
>> make the page accessible if we know what others need.
>>
>> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
>> ---
>>  mm/gup.c            | 17 ++++++++++++-----
>>  mm/page-writeback.c |  6 +++++-
>>  2 files changed, 17 insertions(+), 6 deletions(-)
> 
> Sorry, I missed this when replying elsewhere in the thread!
> Anyway, looks good to me:
> 
> Acked-by: Will Deacon <will@kernel.org>

I can use that for a combined patch (this one merged into the first) ? Correct?


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases
  2020-02-18 16:30     ` Christian Borntraeger
@ 2020-02-18 16:33       ` Will Deacon
  0 siblings, 0 replies; 132+ messages in thread
From: Will Deacon @ 2020-02-18 16:33 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	David Hildenbrand, Thomas Huth, Ulrich Weigand, Claudio Imbrenda,
	linux-s390, Michael Mueller, Vasily Gorbik, Andrea Arcangeli,
	linux-mm, Sean Christopherson

On Tue, Feb 18, 2020 at 05:30:40PM +0100, Christian Borntraeger wrote:
> On 18.02.20 17:25, Will Deacon wrote:
> > On Fri, Feb 14, 2020 at 05:26:55PM -0500, Christian Borntraeger wrote:
> >> From: Claudio Imbrenda <imbrenda@linux.ibm.com>
> >>
> >> This is a potential extension to do error handling if we fail to
> >> make the page accessible if we know what others need.
> >>
> >> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> >> ---
> >>  mm/gup.c            | 17 ++++++++++++-----
> >>  mm/page-writeback.c |  6 +++++-
> >>  2 files changed, 17 insertions(+), 6 deletions(-)
> > 
> > Sorry, I missed this when replying elsewhere in the thread!
> > Anyway, looks good to me:
> > 
> > Acked-by: Will Deacon <will@kernel.org>
> 
> I can use that for a combined patch (this one merged into the first) ? Correct?

Fine by me!

Will

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18  9:56     ` David Hildenbrand
@ 2020-02-18 20:26       ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18 20:26 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth



On 18.02.20 10:56, David Hildenbrand wrote:
> On 18.02.20 09:39, Christian Borntraeger wrote:

>> @@ -571,6 +574,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_S390_BPB:
>>  		r = test_facility(82);
>>  		break;
>> +	case KVM_CAP_S390_PROTECTED:
>> +		r = is_prot_virt_host();
>> +		break;
> 
> FWIW, the clean thing to do is to enable the capability only after all
> features have been implemented, so as the very last patch.
> 

Ack.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18  9:12     ` David Hildenbrand
@ 2020-02-18 21:18       ` Christian Borntraeger
  2020-02-19  8:32         ` David Hildenbrand
  2020-02-19 11:01       ` Christian Borntraeger
  1 sibling, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-18 21:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth



On 18.02.20 10:12, David Hildenbrand wrote:
> On 18.02.20 09:39, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> This contains 3 main changes:
>> 1. changes in SIE control block handling for secure guests
>> 2. helper functions for create/destroy/unpack secure guests
>> 3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
>> machines
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>> 2->2.1  - combine CREATE/DESTROY CPU/VM into ENABLE DISABLE
>> 	- rework locking and check locks with lockdep
>> 	- I still have the PV_COMMAND_CPU in here for later use in
>> 	  the SET_IPL_PSW ioctl. If wanted I can move
> 
> I'd prefer to move, and eventually just turn this into a clean, separate
> ioctl without subcommands (e.g., if we'll only need a single subcommand
> in the near future). And it makes this patch a alittle easier to review
> ... :)
> 

Maybe the MP_STATE solution will work out. Then we can get rid of this.
will look into that when dealing with the other patch.

> [...]

> 
> Once we lock the VCPU, it cannot be running, right?

Right, while holding the vcpu->mutex, KVM_RUN is blocked. 

> 
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		mutex_lock(&vcpu->mutex);
>> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
>> +		mutex_unlock(&vcpu->mutex);
>> +		if (r)
>> +			break;
>> +	}
> 
> Can this actually ever fail? If so, you would leave half-initialized
> state around. Warn and continue?
> 
> Especially, kvm_arch_vcpu_destroy() ignores any error from
> kvm_s390_pv_destroy_cpu() as well ...
> 
> IMHO, we should make kvm_s390_switch_from_pv() and
> kvm_s390_pv_destroy_cpu() never fail.

will answer this part tomorrow.

> 
>> +	return r;
>> +}
>> +
>> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	int i, r = 0;
>> +	u16 dummy;
>> +
>> +	struct kvm_vcpu *vcpu;
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		mutex_lock(&vcpu->mutex);
>> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
>> +		mutex_unlock(&vcpu->mutex);
>> +		if (r)
>> +			break;
>> +	}
>> +	if (r)
>> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
>> +	return r;
>> +}
>> +
>> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>> +{
>> +	int r = 0;
>> +	u16 dummy;
>> +	void __user *argp = (void __user *)cmd->data;
>> +
>> +	switch (cmd->cmd) {
>> +	case KVM_PV_ENABLE: {
>> +		r = -EINVAL;
>> +		if (kvm_s390_pv_is_protected(kvm))
>> +			break;
> 
> Why not factor out this check, it's common for all sucommands.

Unfortunately it is not common. Sometimes it has an "!" sometimes not.

> 
>> +
>> +		r = kvm_s390_pv_alloc_vm(kvm);
>> +		if (r)
>> +			break;
>> +
>> +		kvm_s390_vcpu_block_all(kvm);
> 
> As kvm_s390_vcpu_block_all() does not support nesting, this will not
> work as expected - sca_switch_to_extended() already blocks. Are the
> vcpu->locks not enough?

Right, we would need to move the sca_switch out of this lock. On the other
hand the vcpu->locks should be sufficent for the host integrity (and also
for the secure guest integrity))
So I will just remove the block_all here and below.
> 
> [...]
> 
>> @@ -2558,10 +2735,16 @@ static void kvm_free_vcpus(struct kvm *kvm)
>>  
>>  void kvm_arch_destroy_vm(struct kvm *kvm)
>>  {
>> +	u16 rc, rrc;
>>  	kvm_free_vcpus(kvm);
>>  	sca_dispose(kvm);
>> -	debug_unregister(kvm->arch.dbf);
>>  	kvm_s390_gisa_destroy(kvm);
>> +	/* do not use the lock checking variant at tear-down */
>> +	if (kvm_s390_pv_handle(kvm)) {
> 
> kvm_s390_pv_is_protected ? I dislike using kvm_s390_pv_handle() when
> we're not interested in the handle.

Then I would need to take the kvm->mutex here. This is why I added
the comment. We are already at the last steps of killing the kvm
and the lock is not held. So the lockdep check would complain.
On the other hand grabbing kvm->lock is pointless as the kvm file
descriptor is already gone so the ioctls that we would protect against
cannot happen.


What about Improving the comment:

        /*
         * We are already at the end of life and kvm->lock is not taken.
         * This is ok as the file descriptor is closed by now and nobody
         * can mess with the pv state. To avoid lockdep_assert_held from
         * complaining we do not use kvm_s390_pv_is_protected.
         */

On the other hand This looks a bit too verbose.




[...]

>> +static inline u64 kvm_s390_pv_handle(struct kvm *kvm)
>> +{
>> +	return kvm->arch.pv.handle;
>> +}
> 
> Can we rename this to
> 
> kvm_s390_pv_get_handle()

ack.
> 
>> +
>> +static inline u64 kvm_s390_pv_handle_cpu(struct kvm_vcpu *vcpu)
>> +{
>> +	return vcpu->arch.pv.handle;
>> +}
> 
> Can we rename this to kvm_s390_pv_cpu_get_handle() ? (so it doesn't look
> like the function will handle something)

ack. 

[...]
>> +int kvm_s390_pv_alloc_vm(struct kvm *kvm)
>> +{
>> +	unsigned long base = uv_info.guest_base_stor_len;
>> +	unsigned long virt = uv_info.guest_virt_var_stor_len;
>> +	unsigned long npages = 0, vlen = 0;
>> +	struct kvm_memory_slot *memslot;
>> +
>> +	kvm->arch.pv.stor_var = NULL;
>> +	kvm->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, get_order(base));
>> +	if (!kvm->arch.pv.stor_base)
>> +		return -ENOMEM;
>> +
>> +	/*
>> +	 * Calculate current guest storage for allocation of the
>> +	 * variable storage, which is based on the length in MB.
>> +	 *
>> +	 * Slots are sorted by GFN
>> +	 */
>> +	mutex_lock(&kvm->slots_lock);
>> +	memslot = kvm_memslots(kvm)->memslots;
>> +	npages = memslot->base_gfn + memslot->npages;
>> +	mutex_unlock(&kvm->slots_lock);
> 
> Are you blocking the addition of new memslots somehow?
> 

>> +int kvm_s390_pv_create_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	u16 drc, drrc;
>> +	int cc;
>> +
>> +	struct uv_cb_cgc uvcb = {
>> +		.header.cmd = UVC_CMD_CREATE_SEC_CONF,
>> +		.header.len = sizeof(uvcb)
>> +	};
>> +
>> +	if (kvm_s390_pv_handle(kvm))
> 
> Why is that necessary? We should only be called in PV mode.

It is not because the caller already checks. It was necessary in a previous version I guess.

[...]
>> +int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
>> +			      u16 *rrc)
>> +{
>> +	struct uv_cb_ssc uvcb = {
>> +		.header.cmd = UVC_CMD_SET_SEC_CONF_PARAMS,
>> +		.header.len = sizeof(uvcb),
>> +		.sec_header_origin = (u64)hdr,
>> +		.sec_header_len = length,
>> +		.guest_handle = kvm_s390_pv_handle(kvm),
>> +	};
>> +	int cc;
>> +
>> +	if (!kvm_s390_pv_handle(kvm))
> 
> Why is that necessary? We should only be called in PV mode.

ack


[...]

>> +struct kvm_s390_pv_sec_parm {
>> +	__u64	origin;
>> +	__u64	length;
> 
> tabs vs. spaces. (I'd use a single space like in kvm_s390_pv_unp below)

ack.
[...]


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-18 16:15             ` Christian Borntraeger
@ 2020-02-18 21:35               ` Sean Christopherson
  2020-02-19  8:31                 ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Sean Christopherson @ 2020-02-18 21:35 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Will Deacon, David Hildenbrand, Janosch Frank, Andrew Morton,
	KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Andrea Arcangeli, linux-mm

On Tue, Feb 18, 2020 at 05:15:57PM +0100, Christian Borntraeger wrote:
> 
> 
> On 18.02.20 17:02, Will Deacon wrote:
> > On Tue, Feb 18, 2020 at 07:46:10AM -0800, Sean Christopherson wrote:
> >> On Tue, Feb 18, 2020 at 09:27:20AM +0100, David Hildenbrand wrote:
> >>> On 17.02.20 12:10, Christian Borntraeger wrote:
> >>>> So yes, if everything is setup properly this should not fail in real life
> >>>> and only we have a kernel (or firmware) bug.
> >>>>
> >>>
> >>> Then, without feedback from other possible users, this should be a void
> >>> function. So either introduce error handling or convert it to a void for
> >>> now (and add e.g., BUG_ON and a comment inside the s390x implementation).
> >>
> >> My preference would also be for a void function (versus ignoring an int
> >> return).
> > 
> > The gup code could certainly handle the error value, although the writeback
> > is a lot less clear (so a BUG_ON() would seem to be sufficient for now).
> 
> Sean, David. Can we agree on merging patch 39?

I'm a-ok with adding error checking, ignoring the return value is the only
option I don't like :-)

^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages
  2020-02-18 21:35               ` Sean Christopherson
@ 2020-02-19  8:31                 ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-19  8:31 UTC (permalink / raw)
  To: Sean Christopherson, Christian Borntraeger
  Cc: Will Deacon, Janosch Frank, Andrew Morton, KVM, Cornelia Huck,
	Thomas Huth, Ulrich Weigand, Claudio Imbrenda, linux-s390,
	Michael Mueller, Vasily Gorbik, Andrea Arcangeli, linux-mm

On 18.02.20 22:35, Sean Christopherson wrote:
> On Tue, Feb 18, 2020 at 05:15:57PM +0100, Christian Borntraeger wrote:
>>
>>
>> On 18.02.20 17:02, Will Deacon wrote:
>>> On Tue, Feb 18, 2020 at 07:46:10AM -0800, Sean Christopherson wrote:
>>>> On Tue, Feb 18, 2020 at 09:27:20AM +0100, David Hildenbrand wrote:
>>>>> On 17.02.20 12:10, Christian Borntraeger wrote:
>>>>>> So yes, if everything is setup properly this should not fail in real life
>>>>>> and only we have a kernel (or firmware) bug.
>>>>>>
>>>>>
>>>>> Then, without feedback from other possible users, this should be a void
>>>>> function. So either introduce error handling or convert it to a void for
>>>>> now (and add e.g., BUG_ON and a comment inside the s390x implementation).
>>>>
>>>> My preference would also be for a void function (versus ignoring an int
>>>> return).
>>>
>>> The gup code could certainly handle the error value, although the writeback
>>> is a lot less clear (so a BUG_ON() would seem to be sufficient for now).
>>
>> Sean, David. Can we agree on merging patch 39?
> 
> I'm a-ok with adding error checking, ignoring the return value is the only
> option I don't like :-)

Same over here :)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18 21:18       ` Christian Borntraeger
@ 2020-02-19  8:32         ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-19  8:32 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth

>>
>>> +	return r;
>>> +}
>>> +
>>> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>>> +{
>>> +	int i, r = 0;
>>> +	u16 dummy;
>>> +
>>> +	struct kvm_vcpu *vcpu;
>>> +
>>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>>> +		mutex_lock(&vcpu->mutex);
>>> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
>>> +		mutex_unlock(&vcpu->mutex);
>>> +		if (r)
>>> +			break;
>>> +	}
>>> +	if (r)
>>> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
>>> +	return r;
>>> +}
>>> +
>>> +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>>> +{
>>> +	int r = 0;
>>> +	u16 dummy;
>>> +	void __user *argp = (void __user *)cmd->data;
>>> +
>>> +	switch (cmd->cmd) {
>>> +	case KVM_PV_ENABLE: {
>>> +		r = -EINVAL;
>>> +		if (kvm_s390_pv_is_protected(kvm))
>>> +			break;
>>
>> Why not factor out this check, it's common for all sucommands.
> 
> Unfortunately it is not common. Sometimes it has an "!" sometimes not.

Right, makes sense.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2.1] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
  2020-02-18  9:12     ` David Hildenbrand
  2020-02-18 21:18       ` Christian Borntraeger
@ 2020-02-19 11:01       ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 11:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Ulrich.Weigand, cohuck, frankja, frankja, gor, imbrenda, kvm,
	linux-s390, mimu, thuth



On 18.02.20 10:12, David Hildenbrand wrote:
ock the VCPU, it cannot be running, right?
> 
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		mutex_lock(&vcpu->mutex);
>> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
>> +		mutex_unlock(&vcpu->mutex);
>> +		if (r)
>> +			break;
>> +	}
> 
> Can this actually ever fail? 

Every reason for failure seems to be wrong usage by KVM. So it should not fail and if if does
we should probably warn. 

If so, you would leave half-initialized
> state around. Warn and continue?

So yes, maybe warn and try to continue to destroy the remaining CPUs.

> 
> Especially, kvm_arch_vcpu_destroy() ignores any error from
> kvm_s390_pv_destroy_cpu() as well ...
> 
> IMHO, we should make kvm_s390_switch_from_pv() and
> kvm_s390_pv_destroy_cpu() never fail.

ack.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH 2/2] merge vm/cpu create
  2020-02-17 15:00             ` Janosch Frank
  2020-02-17 15:02               ` Christian Borntraeger
@ 2020-02-19 11:02               ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 11:02 UTC (permalink / raw)
  To: Janosch Frank, david
  Cc: Ulrich.Weigand, cohuck, frankja, gor, imbrenda, kvm, linux-s390,
	mimu, thuth



On 17.02.20 16:00, Janosch Frank wrote:

>> +static int kvm_s390_switch_from_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	int i, r = 0;
>> +
>> +	struct kvm_vcpu *vcpu;
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		r = kvm_s390_pv_destroy_cpu(vcpu, rc, rrc);
>> +		if (r)
>> +			break;
>> +	}
>> +	return r;
>> +}
>> +
>> +static int kvm_s390_switch_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc)
>> +{
>> +	int i, r = 0;
>> +	u16 dummy;
>> +
>> +	struct kvm_vcpu *vcpu;
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		r = kvm_s390_pv_create_cpu(vcpu, rc, rrc);
>> +		if (r)
>> +			break;
>> +	}
>> +	if (r)
>> +		kvm_s390_switch_from_pv(kvm,&dummy, &dummy);
>> +	return r;
>> +}
> 
> Why does that only affect the cpus?
> If we have a switch function it should do VM and VCPUs, no?

I will rename that to kvm_s390_switch_cpus_to/from.


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state
  2020-02-18  9:50   ` David Hildenbrand
@ 2020-02-19 11:06     ` Christian Borntraeger
  2020-02-19 11:08       ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 11:06 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:50, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> Code 5 for the set cpu state UV call tells the UV to load a PSW from
>> the SE header (first IPL) or from guest location 0x0 (diag 308 subcode
>> 0/1). Also it sets the cpu into operating state afterwards, so we can
>> start it.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/include/asm/uv.h | 1 +
>>  arch/s390/kvm/kvm-s390.c   | 8 ++++++++
>>  include/uapi/linux/kvm.h   | 1 +
>>  3 files changed, 10 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>> index 7b82881ec3b4..d59825d95b9d 100644
>> --- a/arch/s390/include/asm/uv.h
>> +++ b/arch/s390/include/asm/uv.h
>> @@ -169,6 +169,7 @@ struct uv_cb_unp {
>>  #define PV_CPU_STATE_OPR	1
>>  #define PV_CPU_STATE_STP	2
>>  #define PV_CPU_STATE_CHKSTP	3
>> +#define PV_CPU_STATE_OPR_LOAD	5
>>  
>>  struct uv_cb_cpu_set_state {
>>  	struct uv_cb_header header;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 5426b01e3da1..b6113285f47f 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4656,6 +4656,14 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
>>  		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
>>  		break;
>>  	}
>> +	case KVM_PV_VCPU_SET_IPL_PSW: {
>> +		if (!kvm_s390_pv_handle_cpu(vcpu))
>> +			return -EINVAL;
>> +
>> +		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD,
>> +					      &cmd->rc, &cmd->rrc);
> 
> Can we squeeze that into kvm_arch_vcpu_ioctl_set_mpstate() instead? The
> interface seems to do exactly what you want it to do.
> 
> KVM_MP_STATE_OPERATING_LOAD
> 
> Allow it only when in PV.

Ack, something like this seems to work.


diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a95c82089386..9496616cccc7 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3710,6 +3710,12 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
                kvm_s390_vcpu_start(vcpu);
                break;
        case KVM_MP_STATE_LOAD:
+               if (!kvm_s390_pv_cpu_is_protected(vcpu)) {
+                       rc = -ENXIO;
+                       break;
+               }
+               kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD);
+               break;
        case KVM_MP_STATE_CHECK_STOP:
                /* fall through - CHECK_STOP and LOAD are not supported yet */
        default:


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state
  2020-02-19 11:06     ` Christian Borntraeger
@ 2020-02-19 11:08       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-19 11:08 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 19.02.20 12:06, Christian Borntraeger wrote:
> 
> 
> On 18.02.20 10:50, David Hildenbrand wrote:
>> On 14.02.20 23:26, Christian Borntraeger wrote:
>>> From: Janosch Frank <frankja@linux.ibm.com>
>>>
>>> Code 5 for the set cpu state UV call tells the UV to load a PSW from
>>> the SE header (first IPL) or from guest location 0x0 (diag 308 subcode
>>> 0/1). Also it sets the cpu into operating state afterwards, so we can
>>> start it.
>>>
>>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  arch/s390/include/asm/uv.h | 1 +
>>>  arch/s390/kvm/kvm-s390.c   | 8 ++++++++
>>>  include/uapi/linux/kvm.h   | 1 +
>>>  3 files changed, 10 insertions(+)
>>>
>>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>>> index 7b82881ec3b4..d59825d95b9d 100644
>>> --- a/arch/s390/include/asm/uv.h
>>> +++ b/arch/s390/include/asm/uv.h
>>> @@ -169,6 +169,7 @@ struct uv_cb_unp {
>>>  #define PV_CPU_STATE_OPR	1
>>>  #define PV_CPU_STATE_STP	2
>>>  #define PV_CPU_STATE_CHKSTP	3
>>> +#define PV_CPU_STATE_OPR_LOAD	5
>>>  
>>>  struct uv_cb_cpu_set_state {
>>>  	struct uv_cb_header header;
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 5426b01e3da1..b6113285f47f 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -4656,6 +4656,14 @@ static int kvm_s390_handle_pv_vcpu(struct kvm_vcpu *vcpu,
>>>  		r = kvm_s390_pv_destroy_cpu(vcpu, &cmd->rc, &cmd->rrc);
>>>  		break;
>>>  	}
>>> +	case KVM_PV_VCPU_SET_IPL_PSW: {
>>> +		if (!kvm_s390_pv_handle_cpu(vcpu))
>>> +			return -EINVAL;
>>> +
>>> +		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD,
>>> +					      &cmd->rc, &cmd->rrc);
>>
>> Can we squeeze that into kvm_arch_vcpu_ioctl_set_mpstate() instead? The
>> interface seems to do exactly what you want it to do.
>>
>> KVM_MP_STATE_OPERATING_LOAD
>>
>> Allow it only when in PV.
> 
> Ack, something like this seems to work.
> 
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index a95c82089386..9496616cccc7 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3710,6 +3710,12 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>                 kvm_s390_vcpu_start(vcpu);
>                 break;
>         case KVM_MP_STATE_LOAD:
> +               if (!kvm_s390_pv_cpu_is_protected(vcpu)) {
> +                       rc = -ENXIO;
> +                       break;
> +               }
> +               kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD);
> +               break;
>         case KVM_MP_STATE_CHECK_STOP:
>                 /* fall through - CHECK_STOP and LOAD are not supported yet */
>         default:
> 

We might want to allow KVM_MP_STATE_LOAD always. Without PV, it's
currently just a NOP. Whatever you think is best.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1
  2020-02-18  9:44   ` David Hildenbrand
@ 2020-02-19 11:53     ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 11:53 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:44, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions.
>> Specific to these "soft" reboots are
>>
>> * The "unshare all" UVC
>> * The "prepare for reset" UVC
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/include/asm/uv.h |  4 ++++
>>  arch/s390/kvm/kvm-s390.c   | 22 ++++++++++++++++++++++
>>  include/uapi/linux/kvm.h   |  2 ++
>>  3 files changed, 28 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>> index 839cb3a89986..254d5769d136 100644
>> --- a/arch/s390/include/asm/uv.h
>> +++ b/arch/s390/include/asm/uv.h
>> @@ -36,6 +36,8 @@
>>  #define UVC_CMD_SET_SEC_CONF_PARAMS	0x0300
>>  #define UVC_CMD_UNPACK_IMG		0x0301
>>  #define UVC_CMD_VERIFY_IMG		0x0302
>> +#define UVC_CMD_PREPARE_RESET		0x0320
>> +#define UVC_CMD_SET_UNSHARE_ALL		0x0340
>>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
>>  #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
>>  #define UVC_CMD_SET_SHARED_ACCESS	0x1000
>> @@ -56,6 +58,8 @@ enum uv_cmds_inst {
>>  	BIT_UVC_CMD_SET_SEC_PARMS = 11,
>>  	BIT_UVC_CMD_UNPACK_IMG = 13,
>>  	BIT_UVC_CMD_VERIFY_IMG = 14,
>> +	BIT_UVC_CMD_PREPARE_RESET = 18,
>> +	BIT_UVC_CMD_UNSHARE_ALL = 20,
>>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
>>  	BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22,
>>  };
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index f96c1f530cc2..ad84c1144908 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2285,6 +2285,28 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
>>  			     cmd->rrc);
>>  		break;
>>  	}
>> +	case KVM_PV_VM_PREP_RESET: {
>> +		r = -EINVAL;
>> +		if (!kvm_s390_pv_is_protected(kvm))
>> +			break;
>> +
>> +		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
>> +				  UVC_CMD_PREPARE_RESET, &cmd->rc, &cmd->rrc);
>> +		KVM_UV_EVENT(kvm, 3, "PROTVIRT PREP RESET: rc %x rrc %x",
>> +			     cmd->rc, cmd->rrc);
>> +		break;
>> +	}
>> +	case KVM_PV_VM_UNSHARE_ALL: {
>> +		r = -EINVAL;
>> +		if (!kvm_s390_pv_is_protected(kvm))
>> +			break;
>> +
>> +		r = uv_cmd_nodata(kvm_s390_pv_handle(kvm),
>> +				  UVC_CMD_SET_UNSHARE_ALL, &cmd->rc, &cmd->rrc);
>> +		KVM_UV_EVENT(kvm, 3, "PROTVIRT UNSHARE: rc %x rrc %x",
>> +			     cmd->rc, cmd->rrc);
> 
> I do wonder if that has any possible races with CPUs currently running,
> which try to mark pages shared. That no other VCPUs are running is not
> enforced here AFAIKs.

The only concern would be that the newly bootet guest (e.g. via kexec/kdump) will
have pages shared that it dont want to share. As it is the responsiblity of the
ultravisor to prevent data leakage, it is enforced that we do an UNSHARE all on 
diag 308 subcode 0 and 1. At the same time the ultravisor will enforce that no
CPU will run before we have finished  the full dance of reset activity.  (The 
ultravisor must guarantee guest secrecy and integrity no matter what KVM does). 

> 
> Apart from that, looks good to me.
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling
  2020-02-18  9:38   ` David Hildenbrand
@ 2020-02-19 12:45     ` Christian Borntraeger
  0 siblings, 0 replies; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 12:45 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:38, David Hildenbrand wrote:
[...]

set_kvm_facility(kvm->arch.model.fac_mask, 65);
>>  
>> +	if (is_prot_virt_host()) {
>> +		set_kvm_facility(kvm->arch.model.fac_mask, 161);
>> +		set_kvm_facility(kvm->arch.model.fac_list, 161);
>> +	}
>> +
> 
> Aren't these IPL subcodes completely emulated in QEMU? If so, rather
> QEMU with support should enable them when the kernel capability for PV
> (=== is_prot_virt_host()) is in place.

ack. will drop this patch.



^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor
  2020-02-18  9:48   ` David Hildenbrand
@ 2020-02-19 19:36     ` Christian Borntraeger
  2020-02-19 19:46       ` Christian Borntraeger
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 19:36 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank



On 18.02.20 10:48, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Janosch Frank <frankja@linux.ibm.com>
>>
>> VCPU states have to be reported to the ultravisor for SIGP
>> interpretation, kdump, kexec and reboot.
>>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  arch/s390/include/asm/uv.h | 15 +++++++++++++++
>>  arch/s390/kvm/kvm-s390.c   |  7 ++++++-
>>  arch/s390/kvm/kvm-s390.h   |  2 ++
>>  arch/s390/kvm/pv.c         | 22 ++++++++++++++++++++++
>>  4 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>> index 254d5769d136..7b82881ec3b4 100644
>> --- a/arch/s390/include/asm/uv.h
>> +++ b/arch/s390/include/asm/uv.h
>> @@ -37,6 +37,7 @@
>>  #define UVC_CMD_UNPACK_IMG		0x0301
>>  #define UVC_CMD_VERIFY_IMG		0x0302
>>  #define UVC_CMD_PREPARE_RESET		0x0320
>> +#define UVC_CMD_CPU_SET_STATE		0x0330
>>  #define UVC_CMD_SET_UNSHARE_ALL		0x0340
>>  #define UVC_CMD_PIN_PAGE_SHARED		0x0341
>>  #define UVC_CMD_UNPIN_PAGE_SHARED	0x0342
>> @@ -58,6 +59,7 @@ enum uv_cmds_inst {
>>  	BIT_UVC_CMD_SET_SEC_PARMS = 11,
>>  	BIT_UVC_CMD_UNPACK_IMG = 13,
>>  	BIT_UVC_CMD_VERIFY_IMG = 14,
>> +	BIT_UVC_CMD_CPU_SET_STATE = 17,
>>  	BIT_UVC_CMD_PREPARE_RESET = 18,
>>  	BIT_UVC_CMD_UNSHARE_ALL = 20,
>>  	BIT_UVC_CMD_PIN_PAGE_SHARED = 21,
>> @@ -164,6 +166,19 @@ struct uv_cb_unp {
>>  	u64 reserved38[3];
>>  } __packed __aligned(8);
>>  
>> +#define PV_CPU_STATE_OPR	1
>> +#define PV_CPU_STATE_STP	2
>> +#define PV_CPU_STATE_CHKSTP	3
>> +
>> +struct uv_cb_cpu_set_state {
>> +	struct uv_cb_header header;
>> +	u64 reserved08[2];
>> +	u64 cpu_handle;
>> +	u8  reserved20[7];
>> +	u8  state;
>> +	u64 reserved28[5];
>> +};
>> +
>>  /*
>>   * A common UV call struct for calls that take no payload
>>   * Examples:
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index ad84c1144908..5426b01e3da1 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4396,6 +4396,7 @@ static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
>>  void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>>  {
>>  	int i, online_vcpus, started_vcpus = 0;
>> +	u16 rc, rrc;
>>  
>>  	if (!is_vcpu_stopped(vcpu))
>>  		return;
>> @@ -4421,7 +4422,8 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>>  		 */
>>  		__disable_ibs_on_all_vcpus(vcpu->kvm);
>>  	}
>> -
>> +	/* Let's tell the UV that we want to start again */
>> +	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR, &rc, &rrc);
>>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
>>  	/*
>>  	 * Another VCPU might have used IBS while we were offline.
>> @@ -4436,6 +4438,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
>>  {
>>  	int i, online_vcpus, started_vcpus = 0;
>>  	struct kvm_vcpu *started_vcpu = NULL;
>> +	u16 rc, rrc;
>>  
>>  	if (is_vcpu_stopped(vcpu))
>>  		return;
>> @@ -4449,6 +4452,8 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
>>  	kvm_s390_clear_stop_irq(vcpu);
>>  
>>  	kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
>> +	/* Let's tell the UV that we successfully stopped the vcpu */
>> +	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP, &rc, &rrc);
>>  	__disable_ibs_on_vcpu(vcpu);
>>  
>>  	for (i = 0; i < online_vcpus; i++) {
>> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
>> index d5503dd0d1e4..1af1e30beead 100644
>> --- a/arch/s390/kvm/kvm-s390.h
>> +++ b/arch/s390/kvm/kvm-s390.h
>> @@ -218,6 +218,8 @@ int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc,
>>  			      u16 *rrc);
>>  int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
>>  		       unsigned long tweak, u16 *rc, u16 *rrc);
>> +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
>> +			      u16 *rrc);
>>  
>>  static inline bool kvm_s390_pv_is_protected(struct kvm *kvm)
>>  {
>> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
>> index 80169a9b43ec..b4bf6b6eb708 100644
>> --- a/arch/s390/kvm/pv.c
>> +++ b/arch/s390/kvm/pv.c
>> @@ -271,3 +271,25 @@ int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size,
>>  		KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful");
>>  	return ret;
>>  }
>> +
>> +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state, u16 *rc,
>> +			      u16 *rrc)
>> +{
>> +	struct uv_cb_cpu_set_state uvcb = {
>> +		.header.cmd	= UVC_CMD_CPU_SET_STATE,
>> +		.header.len	= sizeof(uvcb),
>> +		.cpu_handle	= kvm_s390_pv_handle_cpu(vcpu),
>> +		.state		= state,
>> +	};
>> +	int cc;
>> +
>> +	if (!kvm_s390_pv_handle_cpu(vcpu))
> 
> I'd actually prefer to move this to the caller. (and sue the _protected
> variant)

ack this makes sense.
> 
>> +		return -EINVAL;
>> +
>> +	cc = uv_call(0, (u64)&uvcb);
>> +	*rc = uvcb.header.rc;
>> +	*rrc = uvcb.header.rrc;
>> +	if (cc)
>> +		return -EINVAL;
> 
> All return values are ignored. warn instead and make this a void function?

Hmm, I think userspace can trigger errors (e.g. invalid state or invalid point in time)
So Warning is not good.  We should actually return an error to userspace I guess.

by requiring user sigp for protvirt we can minimize the changes.
Something like this on top of the current tree


diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7bc63e0ed740..5c99a0441f70 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2441,6 +2441,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	case KVM_S390_PV_COMMAND: {
 		struct kvm_pv_cmd args;
 
+		/* protvirt means user sigp */
+		kvm->arch.user_cpu_state_ctrl = 1;
 		r = 0;
 		if (!is_prot_virt_host()) {
 			r = -EINVAL;
@@ -3702,17 +3704,17 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 
 	switch (mp_state->mp_state) {
 	case KVM_MP_STATE_STOPPED:
-		kvm_s390_vcpu_stop(vcpu);
+		rc = kvm_s390_vcpu_stop(vcpu);
 		break;
 	case KVM_MP_STATE_OPERATING:
-		kvm_s390_vcpu_start(vcpu);
+		rc = kvm_s390_vcpu_start(vcpu);
 		break;
 	case KVM_MP_STATE_LOAD:
 		if (!kvm_s390_pv_cpu_is_protected(vcpu)) {
 			rc = -ENXIO;
 			break;
 		}
-		kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD);
+		rc = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD);
 		break;
 	case KVM_MP_STATE_CHECK_STOP:
 		/* fall through - CHECK_STOP and LOAD are not supported yet */
@@ -4444,12 +4446,12 @@ static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
 	kvm_s390_sync_request(KVM_REQ_ENABLE_IBS, vcpu);
 }
 
-void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
+int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 {
-	int i, online_vcpus, started_vcpus = 0;
+	int i, online_vcpus, r= 0, started_vcpus = 0;
 
 	if (!is_vcpu_stopped(vcpu))
-		return;
+		return 0;
 
 	trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 1);
 	/* Only one cpu at a time may enter/leave the STOPPED state. */
@@ -4473,7 +4475,8 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 		__disable_ibs_on_all_vcpus(vcpu->kvm);
 	}
 	/* Let's tell the UV that we want to start again */
-	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR);
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
+		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR);
 	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
 	/*
 	 * The real PSW might have changed due to a RESTART interpreted by the
@@ -4488,16 +4491,16 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 	 */
 	kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
 	spin_unlock(&vcpu->kvm->arch.start_stop_lock);
-	return;
+	return r;
 }
 
-void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
+int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 {
-	int i, online_vcpus, started_vcpus = 0;
+	int i, online_vcpus, r = 0, started_vcpus = 0;
 	struct kvm_vcpu *started_vcpu = NULL;
 
 	if (is_vcpu_stopped(vcpu))
-		return;
+		return 0;
 
 	trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 0);
 	/* Only one cpu at a time may enter/leave the STOPPED state. */
@@ -4509,7 +4512,8 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 
 	kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
 	/* Let's tell the UV that we successfully stopped the vcpu */
-	kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP);
+	if (kvm_s390_pv_cpu_is_protected(vcpu))
+		r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP);
 	__disable_ibs_on_vcpu(vcpu);
 
 	for (i = 0; i < online_vcpus; i++) {
@@ -4528,7 +4532,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 	}
 
 	spin_unlock(&vcpu->kvm->arch.start_stop_lock);
-	return;
+	return r;
 }
 
 static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index cd05c1bbda0e..e9e1996d643b 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -333,8 +333,8 @@ void kvm_s390_set_tod_clock(struct kvm *kvm,
 long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
 int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
 int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
-void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu);
-void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu);
+int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu);
+int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu);
 void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu);
 void kvm_s390_vcpu_unblock(struct kvm_vcpu *vcpu);
 bool kvm_s390_vcpu_sie_inhibited(struct kvm_vcpu *vcpu);


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor
  2020-02-19 19:36     ` Christian Borntraeger
@ 2020-02-19 19:46       ` Christian Borntraeger
  2020-02-20 10:52         ` David Hildenbrand
  0 siblings, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-19 19:46 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

I could add a comment to all other users of kvm_s390_vcpu_start/stop like


        /*
         * no need to check the return value of vcpu_stop as it can only have
         * an error for protvirt, but protvirt means user cpu state
         */
        if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
                kvm_s390_vcpu_stop(vcpu);


to make clear why we do not check the return everywhere


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor
  2020-02-19 19:46       ` Christian Borntraeger
@ 2020-02-20 10:52         ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-20 10:52 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
	Janosch Frank

On 19.02.20 20:46, Christian Borntraeger wrote:
> I could add a comment to all other users of kvm_s390_vcpu_start/stop like
> 
> 
>         /*
>          * no need to check the return value of vcpu_stop as it can only have
>          * an error for protvirt, but protvirt means user cpu state
>          */
>         if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
>                 kvm_s390_vcpu_stop(vcpu);
> 
> 
> to make clear why we do not check the return everywhere
> 

Makes sense!

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages
  2020-02-17  9:43   ` David Hildenbrand
@ 2020-02-20 12:18     ` David Hildenbrand
  2020-02-20 13:31     ` Christian Borntraeger
  1 sibling, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-20 12:18 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

On 17.02.20 10:43, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>>
>> The adapter interrupt page containing the indicator bits is currently
>> pinned. That means that a guest with many devices can pin a lot of
>> memory pages in the host. This also complicates the reference tracking
>> which is needed for memory management handling of protected virtual
>> machines. It might also have some strange side effects for madvise
>> MADV_DONTNEED and other things.
>>
>> We can simply try to get the userspace page set the bits and free the
>> page. By storing the userspace address in the irq routing entry instead
>> of the guest address we can actually avoid many lookups and list walks
>> so that this variant is very likely not slower.
>>
>> If userspace messes around with the memory slots the worst thing that
>> can happen is that we write to some other memory within that process.
>> As we get the the page with FOLL_WRITE this can also not be used to
>> write to shared read-only pages.
>>
>> Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>> [borntraeger@de.ibm.com: patch simplification]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  Documentation/virt/kvm/devices/s390_flic.rst |  11 +-
>>  arch/s390/include/asm/kvm_host.h             |   3 -
>>  arch/s390/kvm/interrupt.c                    | 170 ++++++-------------
>>  3 files changed, 53 insertions(+), 131 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst
>> index 954190da7d04..ea96559ba501 100644
>> --- a/Documentation/virt/kvm/devices/s390_flic.rst
>> +++ b/Documentation/virt/kvm/devices/s390_flic.rst
>> @@ -108,16 +108,9 @@ Groups:
>>        mask or unmask the adapter, as specified in mask
>>  
>>      KVM_S390_IO_ADAPTER_MAP
>> -      perform a gmap translation for the guest address provided in addr,
>> -      pin a userspace page for the translated address and add it to the
>> -      list of mappings
>> -
>> -      .. note:: A new mapping will be created unconditionally; therefore,
>> -	        the calling code should avoid making duplicate mappings.
>> -
>> +      This is now a no-op. The mapping is purely done by the irq route.
>>      KVM_S390_IO_ADAPTER_UNMAP
>> -      release a userspace page for the translated address specified in addr
>> -      from the list of mappings
>> +      This is now a no-op. The mapping is purely done by the irq route.
>>  
> 
> The interface should have accepted a hva from the very start and not
> guest addresses ...
> 
> [...]
> 
>>  
>>  static int modify_io_adapter(struct kvm_device *dev,
>> @@ -2456,12 +2378,13 @@ static int modify_io_adapter(struct kvm_device *dev,
>>  		if (ret > 0)
>>  			ret = 0;
>>  		break;
>> +	/*
>> +	 * We resolve the gpa to hva when setting the IRQ routing. the set_irq
>> +	 * code uses get_user_pages_remote to do the actual write.
> 
> nit: "get_user_pages_remote()"
> 
>> +	 */
>>  	case KVM_S390_IO_ADAPTER_MAP:
>> -		ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr);
>> -		break;
>>  	case KVM_S390_IO_ADAPTER_UNMAP:
>> -		ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr);
>> -		break;
>> +		return 0;
>>  	default:
>>  		ret = -EINVAL;
>>  	}
>> @@ -2699,19 +2622,21 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
>>  	return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit;
>>  }
>>  
>> -static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter,
>> -					  u64 addr)
>> +static struct page *get_map_page(struct kvm *kvm,
>> +				 struct s390_io_adapter *adapter,
>> +				 u64 uaddr)
>>  {
>> -	struct s390_map_info *map;
>> +	struct page *page = NULL;
>>  
>>  	if (!adapter)
>>  		return NULL;
> 
> AFAIKs, this check is not necessary.
> 
>> -
>> -	list_for_each_entry(map, &adapter->maps, list) {
>> -		if (map->guest_addr == addr)
>> -			return map;
>> -	}
>> -	return NULL;
>> +	if (!uaddr)
>> +		return NULL;
> 
> I do wonder if that check is necessary. I don't think so but might be
> missing something.
> 
>> +	down_read(&kvm->mm->mmap_sem);
>> +	get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
>> +			      &page, NULL, NULL);
>> +	up_read(&kvm->mm->mmap_sem);
>> +	return page;
>>  }
>>  
>>  static int adapter_indicators_set(struct kvm *kvm,
>> @@ -2720,30 +2645,35 @@ static int adapter_indicators_set(struct kvm *kvm,
>>  {
>>  	unsigned long bit;
>>  	int summary_set, idx;
>> -	struct s390_map_info *info;
>> +	struct page *ind_page, *summary_page;
>>  	void *map;
>>  
>> -	info = get_map_info(adapter, adapter_int->ind_addr);
>> -	if (!info)
>> +	ind_page = get_map_page(kvm, adapter, adapter_int->ind_addr);
>> +	if (!ind_page)
>>  		return -1;
>> -	map = page_address(info->page);
>> -	bit = get_ind_bit(info->addr, adapter_int->ind_offset, adapter->swap);
>> -	set_bit(bit, map);
>> -	idx = srcu_read_lock(&kvm->srcu);
>> -	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
>> -	set_page_dirty_lock(info->page);
>> -	info = get_map_info(adapter, adapter_int->summary_addr);
>> -	if (!info) {
>> -		srcu_read_unlock(&kvm->srcu, idx);
>> +	summary_page = get_map_page(kvm, adapter, adapter_int->summary_addr);
>> +	if (!summary_page) {
>> +		put_page(ind_page);
>>  		return -1;
>>  	}
>> -	map = page_address(info->page);
>> -	bit = get_ind_bit(info->addr, adapter_int->summary_offset,
>> -			  adapter->swap);
>> +
>> +	idx = srcu_read_lock(&kvm->srcu);
>> +	map = page_address(ind_page);
>> +	bit = get_ind_bit(adapter_int->ind_addr,
>> +			  adapter_int->ind_offset, adapter->swap);
>> +	set_bit(bit, map);
>> +	mark_page_dirty(kvm, adapter_int->ind_addr >> PAGE_SHIFT);
>> +	set_page_dirty_lock(ind_page);
>> +	map = page_address(summary_page);
>> +	bit = get_ind_bit(adapter_int->summary_addr,
>> +			  adapter_int->summary_offset, adapter->swap);
>>  	summary_set = test_and_set_bit(bit, map);
>> -	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
>> -	set_page_dirty_lock(info->page);
>> +	mark_page_dirty(kvm, adapter_int->summary_addr >> PAGE_SHIFT);
>> +	set_page_dirty_lock(summary_page);
>>  	srcu_read_unlock(&kvm->srcu, idx);
>> +
>> +	put_page(ind_page);
>> +	put_page(summary_page);
>>  	return summary_set ? 0 : 1;
>>  }
>>  
>> @@ -2765,9 +2695,7 @@ static int set_adapter_int(struct kvm_kernel_irq_routing_entry *e,
>>  	adapter = get_io_adapter(kvm, e->adapter.adapter_id);
>>  	if (!adapter)
>>  		return -1;
>> -	down_read(&adapter->maps_lock);
>>  	ret = adapter_indicators_set(kvm, adapter, &e->adapter);
>> -	up_read(&adapter->maps_lock);
>>  	if ((ret > 0) && !adapter->masked) {
>>  		ret = kvm_s390_inject_airq(kvm, adapter);
>>  		if (ret == 0)
>> @@ -2818,23 +2746,27 @@ int kvm_set_routing_entry(struct kvm *kvm,
>>  			  struct kvm_kernel_irq_routing_entry *e,
>>  			  const struct kvm_irq_routing_entry *ue)
>>  {
>> -	int ret;
>> +	u64 uaddr;
>>  
>>  	switch (ue->type) {
>> +	/* we store the userspace addresses instead of the guest addresses */
>>  	case KVM_IRQ_ROUTING_S390_ADAPTER:
>>  		e->set = set_adapter_int;
>> -		e->adapter.summary_addr = ue->u.adapter.summary_addr;
>> -		e->adapter.ind_addr = ue->u.adapter.ind_addr;
>> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);
>> +		if (uaddr == -EFAULT)
>> +			return -EFAULT;
>> +		e->adapter.summary_addr = uaddr;
>> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);
>> +		if (uaddr == -EFAULT)
>> +			return -EFAULT;
> 
> AFAIK, leaving e->adapter.summary_addr set is not an issue.
> 
> Interesting, in kvm_s390_adapter_map(), we didn't synchronize again slot
> updates when doing the gmap_translate(), which looks wrong to me ...
> 
> It seems to be the same thing here. I do wonder if it is safe to do a
> gmap_translate() here, looks like this can race with
> kvm_arch_commit_memory_region().
> 
> I would have assumed we need e.g., the slots_lock while doing the
> gmap_translate() - or a srcu_read_lock(&vcpu->kvm->srcu) or similar ...
> 
> 
> Apart from that, looks good to me.
> 

I think you missed this mail.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages
  2020-02-17  9:43   ` David Hildenbrand
  2020-02-20 12:18     ` David Hildenbrand
@ 2020-02-20 13:31     ` Christian Borntraeger
  2020-02-20 13:34       ` David Hildenbrand
  1 sibling, 1 reply; 132+ messages in thread
From: Christian Borntraeger @ 2020-02-20 13:31 UTC (permalink / raw)
  To: David Hildenbrand, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik



On 17.02.20 10:43, David Hildenbrand wrote:
> On 14.02.20 23:26, Christian Borntraeger wrote:
>> From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>>
>> The adapter interrupt page containing the indicator bits is currently
>> pinned. That means that a guest with many devices can pin a lot of
>> memory pages in the host. This also complicates the reference tracking
>> which is needed for memory management handling of protected virtual
>> machines. It might also have some strange side effects for madvise
>> MADV_DONTNEED and other things.
>>
>> We can simply try to get the userspace page set the bits and free the
>> page. By storing the userspace address in the irq routing entry instead
>> of the guest address we can actually avoid many lookups and list walks
>> so that this variant is very likely not slower.
>>
>> If userspace messes around with the memory slots the worst thing that
>> can happen is that we write to some other memory within that process.
>> As we get the the page with FOLL_WRITE this can also not be used to
>> write to shared read-only pages.
>>
>> Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>> [borntraeger@de.ibm.com: patch simplification]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  Documentation/virt/kvm/devices/s390_flic.rst |  11 +-
>>  arch/s390/include/asm/kvm_host.h             |   3 -
>>  arch/s390/kvm/interrupt.c                    | 170 ++++++-------------
>>  3 files changed, 53 insertions(+), 131 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst
>> index 954190da7d04..ea96559ba501 100644
>> --- a/Documentation/virt/kvm/devices/s390_flic.rst
>> +++ b/Documentation/virt/kvm/devices/s390_flic.rst
>> @@ -108,16 +108,9 @@ Groups:
>>        mask or unmask the adapter, as specified in mask
>>  
>>      KVM_S390_IO_ADAPTER_MAP
>> -      perform a gmap translation for the guest address provided in addr,
>> -      pin a userspace page for the translated address and add it to the
>> -      list of mappings
>> -
>> -      .. note:: A new mapping will be created unconditionally; therefore,
>> -	        the calling code should avoid making duplicate mappings.
>> -
>> +      This is now a no-op. The mapping is purely done by the irq route.
>>      KVM_S390_IO_ADAPTER_UNMAP
>> -      release a userspace page for the translated address specified in addr
>> -      from the list of mappings
>> +      This is now a no-op. The mapping is purely done by the irq route.
>>  
> 
> The interface should have accepted a hva from the very start and not
> guest addresses ...

right, but that is history. 

> 
> [...]

>> +	/*
>> +	 * We resolve the gpa to hva when setting the IRQ routing. the set_irq
>> +	 * code uses get_user_pages_remote to do the actual write.
> 
> nit: "get_user_pages_remote()"

ack.

> 
>> +	 */
>>  	case KVM_S390_IO_ADAPTER_MAP:
>> -		ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr);
>> -		break;
>>  	case KVM_S390_IO_ADAPTER_UNMAP:
>> -		ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr);
>> -		break;
>> +		return 0;
>>  	default:
>>  		ret = -EINVAL;
>>  	}
>> @@ -2699,19 +2622,21 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
>>  	return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit;
>>  }
>>  
>> -static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter,
>> -					  u64 addr)
>> +static struct page *get_map_page(struct kvm *kvm,
>> +				 struct s390_io_adapter *adapter,
>> +				 u64 uaddr)
>>  {
>> -	struct s390_map_info *map;
>> +	struct page *page = NULL;
>>  
>>  	if (!adapter)
>>  		return NULL;
> 
> AFAIKs, this check is not necessary.

Right otherwise we would crash earlier.


> 
>> -
>> -	list_for_each_entry(map, &adapter->maps, list) {
>> -		if (map->guest_addr == addr)
>> -			return map;
>> -	}
>> -	return NULL;
>> +	if (!uaddr)
>> +		return NULL;
> 
> I do wonder if that check is necessary. I don't think so but might be
> missing something.

Nothing should break when we remove this check. get_user_pages_remote will
also return NULL (as newer kernels usually forbid mapping things at 0).

Will remove. 

[...]

>> @@ -2818,23 +2746,27 @@ int kvm_set_routing_entry(struct kvm *kvm,
>>  			  struct kvm_kernel_irq_routing_entry *e,
>>  			  const struct kvm_irq_routing_entry *ue)
>>  {
>> -	int ret;
>> +	u64 uaddr;
>>  
>>  	switch (ue->type) {
>> +	/* we store the userspace addresses instead of the guest addresses */
>>  	case KVM_IRQ_ROUTING_S390_ADAPTER:
>>  		e->set = set_adapter_int;
>> -		e->adapter.summary_addr = ue->u.adapter.summary_addr;
>> -		e->adapter.ind_addr = ue->u.adapter.ind_addr;
>> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);
>> +		if (uaddr == -EFAULT)
>> +			return -EFAULT;
>> +		e->adapter.summary_addr = uaddr;
>> +		uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);
>> +		if (uaddr == -EFAULT)
>> +			return -EFAULT;
> 
> AFAIK, leaving e->adapter.summary_addr set is not an issue.
> 
> Interesting, in kvm_s390_adapter_map(), we didn't synchronize again slot
> updates when doing the gmap_translate(), which looks wrong to me ...
> 
> It seems to be the same thing here. I do wonder if it is safe to do a
> gmap_translate() here, looks like this can race with
> kvm_arch_commit_memory_region().
> 
> I would have assumed we need e.g., the slots_lock while doing the
> gmap_translate() - or a srcu_read_lock(&vcpu->kvm->srcu) or similar ...

gmap_translate does this via the gmap and it holds the mm sem. gmap_unmap_segment
takes the same lock. So I think we are ok here.
 
> 
> Apart from that, looks good to me.
> 


^ permalink raw reply	[flat|nested] 132+ messages in thread

* Re: [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages
  2020-02-20 13:31     ` Christian Borntraeger
@ 2020-02-20 13:34       ` David Hildenbrand
  0 siblings, 0 replies; 132+ messages in thread
From: David Hildenbrand @ 2020-02-20 13:34 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank
  Cc: KVM, Cornelia Huck, Thomas Huth, Ulrich Weigand,
	Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik

>> AFAIK, leaving e->adapter.summary_addr set is not an issue.
>>
>> Interesting, in kvm_s390_adapter_map(), we didn't synchronize again slot
>> updates when doing the gmap_translate(), which looks wrong to me ...
>>
>> It seems to be the same thing here. I do wonder if it is safe to do a
>> gmap_translate() here, looks like this can race with
>> kvm_arch_commit_memory_region().
>>
>> I would have assumed we need e.g., the slots_lock while doing the
>> gmap_translate() - or a srcu_read_lock(&vcpu->kvm->srcu) or similar ...
> 
> gmap_translate does this via the gmap and it holds the mm sem. gmap_unmap_segment
> takes the same lock. So I think we are ok here.

Ahh, I was looking at __gmap_translate(). Makes sense.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 132+ messages in thread

end of thread, back to index

Thread overview: 132+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-14 22:26 [PATCH v2 00/42] KVM: s390: Add support for protected VMs Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 01/42] mm:gup/writeback: add callbacks for inaccessible pages Christian Borntraeger
2020-02-17  9:14   ` David Hildenbrand
2020-02-17 11:10     ` Christian Borntraeger
2020-02-18  8:27       ` David Hildenbrand
2020-02-18 15:46         ` Sean Christopherson
2020-02-18 16:02           ` Will Deacon
2020-02-18 16:15             ` Christian Borntraeger
2020-02-18 21:35               ` Sean Christopherson
2020-02-19  8:31                 ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 02/42] KVM: s390/interrupt: do not pin adapter interrupt pages Christian Borntraeger
2020-02-17  9:43   ` David Hildenbrand
2020-02-20 12:18     ` David Hildenbrand
2020-02-20 13:31     ` Christian Borntraeger
2020-02-20 13:34       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 03/42] s390/protvirt: introduce host side setup Christian Borntraeger
2020-02-17  9:53   ` David Hildenbrand
2020-02-17 11:11     ` Christian Borntraeger
2020-02-17 11:13       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 04/42] s390/protvirt: add ultravisor initialization Christian Borntraeger
2020-02-17  9:57   ` David Hildenbrand
2020-02-17 11:13     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 05/42] s390/mm: provide memory management functions for protected KVM guests Christian Borntraeger
2020-02-17 10:21   ` David Hildenbrand
2020-02-17 11:28     ` Christian Borntraeger
2020-02-17 12:07       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 06/42] s390/mm: add (non)secure page access exceptions handlers Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 07/42] KVM: s390: protvirt: Add UV debug trace Christian Borntraeger
2020-02-17 10:41   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 08/42] KVM: s390: add new variants of UV CALL Christian Borntraeger
2020-02-17 10:42   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
2020-02-17 10:56   ` David Hildenbrand
2020-02-17 12:04     ` Christian Borntraeger
2020-02-17 12:09       ` David Hildenbrand
2020-02-17 14:53         ` [PATCH 0/2] example changes Christian Borntraeger
2020-02-17 14:53           ` [PATCH 1/2] lock changes Christian Borntraeger
2020-02-17 14:53           ` [PATCH 2/2] merge vm/cpu create Christian Borntraeger
2020-02-17 15:00             ` Janosch Frank
2020-02-17 15:02               ` Christian Borntraeger
2020-02-19 11:02               ` Christian Borntraeger
2020-02-17 19:18           ` [PATCH 0/2] example changes David Hildenbrand
2020-02-18  8:09     ` [PATCH v2 09/42] KVM: s390: protvirt: Add initial vm and cpu lifecycle handling Christian Borntraeger
2020-02-18  8:39   ` [PATCH v2.1] " Christian Borntraeger
2020-02-18  9:12     ` David Hildenbrand
2020-02-18 21:18       ` Christian Borntraeger
2020-02-19  8:32         ` David Hildenbrand
2020-02-19 11:01       ` Christian Borntraeger
2020-02-18  9:56     ` David Hildenbrand
2020-02-18 20:26       ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 10/42] KVM: s390: protvirt: Add KVM api documentation Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 11/42] KVM: s390: protvirt: Secure memory is not mergeable Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 12/42] KVM: s390/mm: Make pages accessible before destroying the guest Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 13/42] KVM: s390: protvirt: Handle SE notification interceptions Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 14/42] KVM: s390: protvirt: Instruction emulation Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 15/42] KVM: s390: protvirt: Add interruption injection controls Christian Borntraeger
2020-02-17 10:59   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 16/42] KVM: s390: protvirt: Implement interruption injection Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 17/42] KVM: s390: protvirt: Add SCLP interrupt handling Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 18/42] KVM: s390: protvirt: Handle spec exception loops Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 19/42] KVM: s390: protvirt: Add new gprs location handling Christian Borntraeger
2020-02-17 11:01   ` David Hildenbrand
2020-02-17 11:33     ` Christian Borntraeger
2020-02-17 14:37     ` Janosch Frank
2020-02-14 22:26 ` [PATCH v2 20/42] KVM: S390: protvirt: Introduce instruction data area bounce buffer Christian Borntraeger
2020-02-17 11:08   ` David Hildenbrand
2020-02-17 14:47     ` Janosch Frank
2020-02-17 15:00       ` Christian Borntraeger
2020-02-17 15:38         ` Janosch Frank
2020-02-17 16:58           ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 21/42] KVM: s390: protvirt: handle secure guest prefix pages Christian Borntraeger
2020-02-17 11:11   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 22/42] KVM: s390/mm: handle guest unpin events Christian Borntraeger
2020-02-17 14:23   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 23/42] KVM: s390: protvirt: Write sthyi data to instruction data area Christian Borntraeger
2020-02-17 14:24   ` David Hildenbrand
2020-02-17 18:40     ` Christian Borntraeger
2020-02-17 19:16       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 24/42] KVM: s390: protvirt: STSI handling Christian Borntraeger
2020-02-18  8:35   ` David Hildenbrand
2020-02-18  8:44     ` Christian Borntraeger
2020-02-18  9:08       ` David Hildenbrand
2020-02-18  9:11         ` Christian Borntraeger
2020-02-18  9:13           ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 25/42] KVM: s390: protvirt: disallow one_reg Christian Borntraeger
2020-02-18  8:40   ` David Hildenbrand
2020-02-18  8:57     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 26/42] KVM: s390: protvirt: Do only reset registers that are accessible Christian Borntraeger
2020-02-18  8:42   ` David Hildenbrand
2020-02-18  9:20     ` Christian Borntraeger
2020-02-18  9:28       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 27/42] KVM: s390: protvirt: Only sync fmt4 registers Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 28/42] KVM: s390: protvirt: Add program exception injection Christian Borntraeger
2020-02-18  9:33   ` David Hildenbrand
2020-02-18  9:37     ` Christian Borntraeger
2020-02-18  9:39       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 29/42] KVM: s390: protvirt: Add diag 308 subcode 8 - 10 handling Christian Borntraeger
2020-02-18  9:38   ` David Hildenbrand
2020-02-19 12:45     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 30/42] KVM: s390: protvirt: UV calls in support of diag308 0, 1 Christian Borntraeger
2020-02-18  9:44   ` David Hildenbrand
2020-02-19 11:53     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 31/42] KVM: s390: protvirt: Report CPU state to Ultravisor Christian Borntraeger
2020-02-18  9:48   ` David Hildenbrand
2020-02-19 19:36     ` Christian Borntraeger
2020-02-19 19:46       ` Christian Borntraeger
2020-02-20 10:52         ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 32/42] KVM: s390: protvirt: Support cmd 5 operation state Christian Borntraeger
2020-02-18  9:50   ` David Hildenbrand
2020-02-19 11:06     ` Christian Borntraeger
2020-02-19 11:08       ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 33/42] KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112 Christian Borntraeger
2020-02-18  9:53   ` David Hildenbrand
2020-02-18 10:02     ` David Hildenbrand
2020-02-18 10:05     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 34/42] KVM: s390: protvirt: do not inject interrupts after start Christian Borntraeger
2020-02-18  9:53   ` David Hildenbrand
2020-02-18 10:02     ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 35/42] KVM: s390: protvirt: Add UV cpu reset calls Christian Borntraeger
2020-02-18  9:54   ` David Hildenbrand
2020-02-14 22:26 ` [PATCH v2 36/42] DOCUMENTATION: Protected virtual machine introduction and IPL Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 37/42] s390/uv: Fix handling of length extensions (already in s390 tree) Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 38/42] s390: protvirt: Add sysfs firmware interface for Ultravisor information Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 39/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: error cases Christian Borntraeger
2020-02-18 16:25   ` Will Deacon
2020-02-18 16:30     ` Christian Borntraeger
2020-02-18 16:33       ` Will Deacon
2020-02-14 22:26 ` [PATCH v2 40/42] example for future extension: mm:gup/writeback: add callbacks for inaccessible pages: source indication Christian Borntraeger
2020-02-17 14:15   ` Ulrich Weigand
2020-02-17 14:38     ` Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 41/42] potential fixup for "s390/mm: provide memory management functions for protected KVM guests" Christian Borntraeger
2020-02-14 22:26 ` [PATCH v2 42/42] KVM: s390: rstify new ioctls in api.rst Christian Borntraeger

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git