All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10 v4] KVM: introduce readonly memslot
@ 2012-07-17 14:39 Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:39 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

In current code, if we map a readonly memory space from host to guest
and the page is not currently mapped in the host, we will get a fault-pfn
and async is not allowed, then the vm will crash.

As Avi's suggestion, We introduce readonly memory region to map ROM/ROMD
to the guest, read access is happy for readonly memslot, write access on
readonly memslot will cause KVM_EXIT_MMIO exit.

The test case attached is used to test this feather.

Changlog:
- fix memslot flag check
- fix caching the mmio info into spte which is caused by write on readonly
  memslot
- extend mmio-exit info to indicate whether the mmio-exit is caused by
  readonly fault


[-- Attachment #2: migrate-perf.tar.bz2 --]
[-- Type: application/x-bzip, Size: 305078 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 01/10] KVM: fix missing check for memslot flags
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
@ 2012-07-17 14:40 ` Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:40 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Check flags when memslot is registered from userspace as Avi's suggestion

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 78cf42f..7cb29bb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -689,6 +689,14 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
 	slots->generation++;
 }

+static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
+{
+	if (mem->flags & ~KVM_MEM_LOG_DIRTY_PAGES)
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * Allocate some memory and give it an address in the guest physical address
  * space.
@@ -709,6 +717,10 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	struct kvm_memory_slot old, new;
 	struct kvm_memslots *slots, *old_memslots;

+	r = check_memory_region_flags(mem);
+	if (r)
+		goto out;
+
 	r = -EINVAL;
 	/* General sanity checks */
 	if (mem->memory_size & (PAGE_SIZE - 1))
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
@ 2012-07-17 14:41 ` Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:41 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Quote Avi's comment:
| KVM_MEMSLOT_INVALID is actually an internal symbol, not used by
| userspace.  Please move it to kvm_host.h.

Also, move KVM_MEMSLOT_INVALID to the highest bit

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 include/linux/kvm.h      |    1 -
 include/linux/kvm_host.h |    2 ++
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2ce09aa..dc3aa2a 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -103,7 +103,6 @@ struct kvm_userspace_memory_region {

 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
-#define KVM_MEMSLOT_INVALID      (1UL << 1)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 14bfde4..8e6bc51 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -34,6 +34,8 @@
 #define KVM_MMIO_SIZE 8
 #endif

+#define KVM_MEMSLOT_INVALID	(1UL << 31)
+
 /*
  * If we support unaligned MMIO, at most one fragment will be split into two:
  */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
@ 2012-07-17 14:41 ` Xiao Guangrong
  2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:41 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

It can instead of hva_to_pfn_atomic

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c       |    5 +----
 include/linux/kvm_host.h |    3 ++-
 virt/kvm/kvm_main.c      |   14 ++++++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4ed543a..13d3c69 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2480,15 +2480,12 @@ static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
 				     bool no_dirty_log)
 {
 	struct kvm_memory_slot *slot;
-	unsigned long hva;

 	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, no_dirty_log);
 	if (!slot)
 		return get_fault_pfn();

-	hva = gfn_to_hva_memslot(slot, gfn);
-
-	return hva_to_pfn_atomic(hva);
+	return gfn_to_pfn_memslot_atomic(slot, gfn);
 }

 static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8e6bc51..e4815e9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -420,7 +420,6 @@ void kvm_release_page_dirty(struct page *page);
 void kvm_set_page_dirty(struct page *page);
 void kvm_set_page_accessed(struct page *page);

-pfn_t hva_to_pfn_atomic(unsigned long addr);
 pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
 pfn_t gfn_to_pfn_async(struct kvm *kvm, gfn_t gfn, bool *async,
 		       bool write_fault, bool *writable);
@@ -428,6 +427,8 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
 pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable);
 pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
+pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
+
 void kvm_release_pfn_dirty(pfn_t);
 void kvm_release_pfn_clean(pfn_t pfn);
 void kvm_set_pfn_dirty(pfn_t pfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7cb29bb..6c1e746 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1156,12 +1156,6 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	return pfn;
 }

-pfn_t hva_to_pfn_atomic(unsigned long addr)
-{
-	return hva_to_pfn(addr, true, NULL, true, NULL);
-}
-EXPORT_SYMBOL_GPL(hva_to_pfn_atomic);
-
 static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
 			  bool write_fault, bool *writable)
 {
@@ -1211,6 +1205,14 @@ pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 	return hva_to_pfn(addr, false, NULL, true, NULL);
 }

+pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
+
+	return hva_to_pfn(addr, true, NULL, true, NULL);
+}
+EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
+
 int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
 								  int nr_pages)
 {
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (2 preceding siblings ...)
  2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
@ 2012-07-17 14:42 ` Xiao Guangrong
  2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:42 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

This set of functions is only used to read data from host space, in the
later patch, we will only get a readonly hva in gfn_to_hva_read, and
the function name is a good hint to let gfn_to_hva_read to pair with
kvm_read_hva()/kvm_read_hva_atomic()

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   29 +++++++++++++++++++++++++----
 1 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6c1e746..a89c7b8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1048,6 +1048,27 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_hva);

+/*
+ * The hva returned by this function is only allowed to be read.
+ * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
+ */
+static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
+{
+	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
+}
+
+static int kvm_read_hva(void *data, void __user *hva, int len)
+{
+	return __copy_from_user(data, hva, len);
+
+}
+
+static int kvm_read_hva_atomic(void *data, void __user *hva, int len)
+{
+	return __copy_from_user_inatomic(data, hva, len);
+
+}
+
 pfn_t get_fault_pfn(void)
 {
 	get_page(fault_page);
@@ -1316,10 +1337,10 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	int r;
 	unsigned long addr;

-	addr = gfn_to_hva(kvm, gfn);
+	addr = gfn_to_hva_read(kvm, gfn);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
-	r = __copy_from_user(data, (void __user *)addr + offset, len);
+	r = kvm_read_hva(data, (void __user *)addr + offset, len);
 	if (r)
 		return -EFAULT;
 	return 0;
@@ -1354,11 +1375,11 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	int offset = offset_in_page(gpa);

-	addr = gfn_to_hva(kvm, gfn);
+	addr = gfn_to_hva_read(kvm, gfn);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
 	pagefault_disable();
-	r = __copy_from_user_inatomic(data, (void __user *)addr + offset, len);
+	r = kvm_read_hva_atomic(data, (void __user *)addr + offset, len);
 	pagefault_enable();
 	if (r)
 		return -EFAULT;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 05/10] KVM: reorganize hva_to_pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (3 preceding siblings ...)
  2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
@ 2012-07-17 14:43 ` Xiao Guangrong
  2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:43 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

We do too many things in hva_to_pfn, this patch reorganize the code,
let it be better readable

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |  161 +++++++++++++++++++++++++++++++--------------------
 1 files changed, 99 insertions(+), 62 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a89c7b8..54fb47b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1096,84 +1096,121 @@ static inline int check_user_page_hwpoison(unsigned long addr)
 	return rc == -EHWPOISON;
 }

-static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
-			bool write_fault, bool *writable)
+/*
+ * The atomic path to get the writable pfn which will be stored in @pfn,
+ * true indicates success, otherwise false is returned.
+ */
+static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
+			    bool write_fault, bool *writable, pfn_t *pfn)
 {
 	struct page *page[1];
-	int npages = 0;
-	pfn_t pfn;
+	int npages;

-	/* we can do it either atomically or asynchronously, not both */
-	BUG_ON(atomic && async);
+	if (!(async || atomic))
+		return false;

-	BUG_ON(!write_fault && !writable);
+	npages = __get_user_pages_fast(addr, 1, 1, page);
+	if (npages == 1) {
+		*pfn = page_to_pfn(page[0]);
+
+		if (writable)
+			*writable = true;
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * The slow path to get the pfn of the specified host virtual address,
+ * 1 indicates success, -errno is returned if error is detected.
+ */
+static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
+			   bool *writable, pfn_t *pfn)
+{
+	struct page *page[1];
+	int npages = 0;
+
+	might_sleep();

 	if (writable)
-		*writable = true;
+		*writable = write_fault;

-	if (atomic || async)
-		npages = __get_user_pages_fast(addr, 1, 1, page);
+	if (async) {
+		down_read(&current->mm->mmap_sem);
+		npages = get_user_page_nowait(current, current->mm,
+					      addr, write_fault, page);
+		up_read(&current->mm->mmap_sem);
+	} else
+		npages = get_user_pages_fast(addr, 1, write_fault,
+					     page);

-	if (unlikely(npages != 1) && !atomic) {
-		might_sleep();
+	if (npages != 1)
+		return npages;

-		if (writable)
-			*writable = write_fault;
-
-		if (async) {
-			down_read(&current->mm->mmap_sem);
-			npages = get_user_page_nowait(current, current->mm,
-						     addr, write_fault, page);
-			up_read(&current->mm->mmap_sem);
-		} else
-			npages = get_user_pages_fast(addr, 1, write_fault,
-						     page);
-
-		/* map read fault as writable if possible */
-		if (unlikely(!write_fault) && npages == 1) {
-			struct page *wpage[1];
-
-			npages = __get_user_pages_fast(addr, 1, 1, wpage);
-			if (npages == 1) {
-				*writable = true;
-				put_page(page[0]);
-				page[0] = wpage[0];
-			}
-			npages = 1;
+	/* map read fault as writable if possible */
+	if (unlikely(!write_fault)) {
+		struct page *wpage[1];
+
+		npages = __get_user_pages_fast(addr, 1, 1, wpage);
+		if (npages == 1) {
+			*writable = true;
+			put_page(page[0]);
+			page[0] = wpage[0];
 		}
+
+		npages = 1;
 	}
+	*pfn = page_to_pfn(page[0]);
+	return npages;
+}

-	if (unlikely(npages != 1)) {
-		struct vm_area_struct *vma;
+static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
+			bool write_fault, bool *writable)
+{
+	struct vm_area_struct *vma;
+	pfn_t pfn = 0;
+	int npages;

-		if (atomic)
-			return get_fault_pfn();
+	/* we can do it either atomically or asynchronously, not both */
+	BUG_ON(atomic && async);

-		down_read(&current->mm->mmap_sem);
-		if (npages == -EHWPOISON ||
-			(!async && check_user_page_hwpoison(addr))) {
-			up_read(&current->mm->mmap_sem);
-			get_page(hwpoison_page);
-			return page_to_pfn(hwpoison_page);
-		}
+	BUG_ON(!write_fault && !writable);

-		vma = find_vma_intersection(current->mm, addr, addr+1);
-
-		if (vma == NULL)
-			pfn = get_fault_pfn();
-		else if ((vma->vm_flags & VM_PFNMAP)) {
-			pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) +
-				vma->vm_pgoff;
-			BUG_ON(!kvm_is_mmio_pfn(pfn));
-		} else {
-			if (async && (vma->vm_flags & VM_WRITE))
-				*async = true;
-			pfn = get_fault_pfn();
-		}
-		up_read(&current->mm->mmap_sem);
-	} else
-		pfn = page_to_pfn(page[0]);
+	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
+		return pfn;

+	if (atomic)
+		return get_fault_pfn();
+
+	npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn);
+	if (npages == 1)
+		return pfn;
+
+	down_read(&current->mm->mmap_sem);
+	if (npages == -EHWPOISON ||
+	      (!async && check_user_page_hwpoison(addr))) {
+		get_page(hwpoison_page);
+		pfn = page_to_pfn(hwpoison_page);
+		goto exit;
+	}
+
+	vma = find_vma_intersection(current->mm, addr, addr + 1);
+
+	if (vma == NULL)
+		pfn = get_fault_pfn();
+	else if ((vma->vm_flags & VM_PFNMAP)) {
+		pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) +
+			vma->vm_pgoff;
+		BUG_ON(!kvm_is_mmio_pfn(pfn));
+	} else {
+		if (async && (vma->vm_flags & VM_WRITE))
+			*async = true;
+		pfn = get_fault_pfn();
+	}
+
+exit:
+	up_read(&current->mm->mmap_sem);
 	return pfn;
 }

-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (4 preceding siblings ...)
  2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
@ 2012-07-17 14:43 ` Xiao Guangrong
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:43 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

In current code, we always map writable pfn for the read-fault, in order
to support readonly memslot, we map writable pfn only if 'writable'
is not NULL

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 54fb47b..e9eab07 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1109,6 +1109,14 @@ static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
 	if (!(async || atomic))
 		return false;

+	/*
+	 * Fast pin a writable pfn only if it is a write fault request
+	 * or the caller allows to map a writable pfn for a read fault
+	 * request.
+	 */
+	if (!(write_fault || writable))
+		return false;
+
 	npages = __get_user_pages_fast(addr, 1, 1, page);
 	if (npages == 1) {
 		*pfn = page_to_pfn(page[0]);
@@ -1149,7 +1157,7 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 		return npages;

 	/* map read fault as writable if possible */
-	if (unlikely(!write_fault)) {
+	if (unlikely(!write_fault) && writable) {
 		struct page *wpage[1];

 		npages = __get_user_pages_fast(addr, 1, 1, wpage);
@@ -1165,6 +1173,20 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 	return npages;
 }

+/*
+ * Pin guest page in memory and return its pfn.
+ * @addr: host virtual address which maps memory to the guest
+ * @atomic: whether this function can sleep
+ * @async: whether this function need to wait IO complete if the
+ *         host page is not in the memory
+ * @write_fault: whether we should get a writable host page
+ * @writable: whether it allows to map a writable host page for !@write_fault
+ *
+ * The function will map a writable host page for these two cases:
+ * 1): @write_fault = true
+ * 2): @write_fault = false && @writable, @writable will tell the caller
+ *     whether the mapping is writable.
+ */
 static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 			bool write_fault, bool *writable)
 {
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (5 preceding siblings ...)
  2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
@ 2012-07-17 14:44 ` Xiao Guangrong
  2012-07-19 10:15   ` Avi Kivity
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:44 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Introduce readonly_fault_pfn, in the later patch, it indicates failure
when we try to get a writable pfn from the readonly memslot

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |    1 +
 virt/kvm/kvm_main.c      |   92 +++++++++++++++++++++++++++------------------
 2 files changed, 56 insertions(+), 37 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e4815e9..a2302e7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -385,6 +385,7 @@ extern struct page *bad_page;
 int is_error_page(struct page *page);
 int is_error_pfn(pfn_t pfn);
 int is_hwpoison_pfn(pfn_t pfn);
+int is_readonly_fault_pfn(pfn_t pfn);
 int is_noslot_pfn(pfn_t pfn);
 int is_invalid_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e9eab07..b70f1a4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -109,6 +109,9 @@ static pfn_t hwpoison_pfn;
 static struct page *fault_page;
 static pfn_t fault_pfn;

+static struct page *readonly_fault_page;
+static pfn_t readonly_fault_pfn;
+
 inline int kvm_is_mmio_pfn(pfn_t pfn)
 {
 	if (pfn_valid(pfn)) {
@@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);

 int is_error_page(struct page *page)
 {
-	return page == bad_page || page == hwpoison_page || page == fault_page;
+	return page == bad_page || page == hwpoison_page || page == fault_page
+		|| page == readonly_fault_page;
 }
 EXPORT_SYMBOL_GPL(is_error_page);

 int is_error_pfn(pfn_t pfn)
 {
-	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn;
+	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn
+		|| pfn == readonly_fault_pfn;
 }
 EXPORT_SYMBOL_GPL(is_error_pfn);

@@ -965,6 +970,12 @@ int is_hwpoison_pfn(pfn_t pfn)
 }
 EXPORT_SYMBOL_GPL(is_hwpoison_pfn);

+int is_readonly_fault_pfn(pfn_t pfn)
+{
+	return pfn == readonly_fault_pfn;
+}
+EXPORT_SYMBOL_GPL(is_readonly_fault_pfn);
+
 int is_noslot_pfn(pfn_t pfn)
 {
 	return pfn == bad_pfn;
@@ -973,7 +984,8 @@ EXPORT_SYMBOL_GPL(is_noslot_pfn);

 int is_invalid_pfn(pfn_t pfn)
 {
-	return pfn == hwpoison_pfn || pfn == fault_pfn;
+	return pfn == hwpoison_pfn || pfn == fault_pfn ||
+			pfn == readonly_fault_pfn;
 }
 EXPORT_SYMBOL_GPL(is_invalid_pfn);

@@ -1076,6 +1088,12 @@ pfn_t get_fault_pfn(void)
 }
 EXPORT_SYMBOL_GPL(get_fault_pfn);

+static pfn_t get_readonly_fault_pfn(void)
+{
+	get_page(readonly_fault_page);
+	return readonly_fault_pfn;
+}
+
 int get_user_page_nowait(struct task_struct *tsk, struct mm_struct *mm,
 	unsigned long start, int write, struct page **page)
 {
@@ -2809,42 +2827,49 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 	kvm_arch_vcpu_put(vcpu);
 }

-int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
-		  struct module *module)
+static void kvm_uninit_dummy_pages(void)
 {
-	int r;
-	int cpu;
-
-	r = kvm_arch_init(opaque);
-	if (r)
-		goto out_fail;
+	if (fault_page)
+		__free_page(fault_page);
+	if (readonly_fault_page)
+		__free_page(readonly_fault_page);
+	if (hwpoison_page)
+		__free_page(hwpoison_page);
+	if (bad_page)
+		__free_page(bad_page);
+}

+static int kvm_init_dummy_pages(void)
+{
 	bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	hwpoison_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	readonly_fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);

-	if (bad_page == NULL) {
-		r = -ENOMEM;
-		goto out;
-	}
+	if (!bad_page || !hwpoison_page || !fault_page || !readonly_fault_page)
+		return -ENOMEM;

 	bad_pfn = page_to_pfn(bad_page);
+	hwpoison_pfn = page_to_pfn(hwpoison_page);
+	fault_pfn = page_to_pfn(fault_page);
+	readonly_fault_pfn = page_to_pfn(readonly_fault_page);

-	hwpoison_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
-
-	if (hwpoison_page == NULL) {
-		r = -ENOMEM;
-		goto out_free_0;
-	}
+	return 0;
+}

-	hwpoison_pfn = page_to_pfn(hwpoison_page);
+int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
+		  struct module *module)
+{
+	int r;
+	int cpu;

-	fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = kvm_arch_init(opaque);
+	if (r)
+		goto out_fail;

-	if (fault_page == NULL) {
-		r = -ENOMEM;
+	r = kvm_init_dummy_pages();
+	if (r)
 		goto out_free_0;
-	}
-
-	fault_pfn = page_to_pfn(fault_page);

 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
 		r = -ENOMEM;
@@ -2920,12 +2945,7 @@ out_free_1:
 out_free_0a:
 	free_cpumask_var(cpus_hardware_enabled);
 out_free_0:
-	if (fault_page)
-		__free_page(fault_page);
-	if (hwpoison_page)
-		__free_page(hwpoison_page);
-	__free_page(bad_page);
-out:
+	kvm_uninit_dummy_pages();
 	kvm_arch_exit();
 out_fail:
 	return r;
@@ -2945,8 +2965,6 @@ void kvm_exit(void)
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
 	free_cpumask_var(cpus_hardware_enabled);
-	__free_page(fault_page);
-	__free_page(hwpoison_page);
-	__free_page(bad_page);
+	kvm_uninit_dummy_pages();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (6 preceding siblings ...)
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
@ 2012-07-17 14:45 ` Xiao Guangrong
  2012-07-19 10:16   ` Avi Kivity
  2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong
  9 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:45 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

In the later patch, it indicates failure when we try to get a writable
hva from the readonly slot

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b70f1a4..c056736 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
 	return PAGE_OFFSET;
 }

+static inline unsigned long readonly_bad_hva(void)
+{
+	return PAGE_OFFSET + PAGE_SIZE;
+}
+
+static int kvm_is_readonly_bad_hva(unsigned long addr)
+{
+	return addr == readonly_bad_hva();
+}
+
 int kvm_is_error_hva(unsigned long addr)
 {
-	return addr == bad_hva();
+	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
 }
 EXPORT_SYMBOL_GPL(kvm_is_error_hva);

-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 09/10] KVM: introduce readonly memslot
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (7 preceding siblings ...)
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
@ 2012-07-17 14:45 ` Xiao Guangrong
  2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:45 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

In current code, if we map a readonly memory space from host to guest
and the page is not currently mapped in the host, we will get a fault-pfn
and async is not allowed, then the vm will crash

We introduce readonly memory region to map ROM/ROMD to the guest, read access
is happy for readonly memslot, write access on readonly memslot will cause
KVM_EXIT_MMIO exit

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 Documentation/virtual/kvm/api.txt |   10 +++-
 arch/x86/include/asm/kvm.h        |    1 +
 arch/x86/kvm/mmu.c                |   10 ++++
 arch/x86/kvm/x86.c                |    1 +
 include/linux/kvm.h               |    6 ++-
 virt/kvm/kvm_main.c               |   84 ++++++++++++++++++++++++++++--------
 6 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 310fe50..4b3d3f1 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -857,7 +857,8 @@ struct kvm_userspace_memory_region {
 };

 /* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES  1UL
+#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
+#define KVM_MEM_READONLY	(1UL << 1)

 This ioctl allows the user to create or modify a guest physical memory
 slot.  When changing an existing slot, it may be moved in the guest
@@ -873,9 +874,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.

-The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
+The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which
 instructs kvm to keep track of writes to memory within the slot.  See
-the KVM_GET_DIRTY_LOG ioctl.
+the KVM_GET_DIRTY_LOG ioctl. Another flag is KVM_MEM_READONLY when the
+KVM_CAP_READONLY_MEM capability, it indicates the guest memory is read-only,
+that means, guest is only allowed to read it. Writes will be posted to
+userspace as KVM_EXIT_MMIO exits.

 When the KVM_CAP_SYNC_MMU capability, changes in the backing of the memory
 region are automatically reflected into the guest.  For example, an mmap()
diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 246617e..521bf25 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -25,6 +25,7 @@
 #define __KVM_HAVE_DEBUGREGS
 #define __KVM_HAVE_XSAVE
 #define __KVM_HAVE_XCRS
+#define __KVM_HAVE_READONLY_MEM

 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 13d3c69..d4eee8e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2618,6 +2618,16 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
 static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 {
 	kvm_release_pfn_clean(pfn);
+
+	/*
+	 * Do not cache the mmio info caused by writing the readonly gfn
+	 * into the spte otherwise read access on readonly gfn also can
+	 * caused mmio page fault and treat it as mmio access.
+	 * Return 1 to tell kvm to emulate it.
+	 */
+	if (is_readonly_fault_pfn(pfn))
+		return 1;
+
 	if (is_hwpoison_pfn(pfn)) {
 		kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
 		return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8171836..46e13a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2153,6 +2153,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_GET_TSC_KHZ:
 	case KVM_CAP_PCI_2_3:
 	case KVM_CAP_KVMCLOCK_CTRL:
+	case KVM_CAP_READONLY_MEM:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index dc3aa2a..94867d0 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -102,7 +102,8 @@ struct kvm_userspace_memory_region {
 };

 /* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES  1UL
+#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
+#define KVM_MEM_READONLY	(1UL << 1)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
@@ -617,6 +618,9 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
 #define KVM_CAP_PPC_ALLOC_HTAB 80
+#ifdef __KVM_HAVE_READONLY_MEM
+#define KVM_CAP_READONLY_MEM 81
+#endif

 #ifdef KVM_CAP_IRQ_ROUTING

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c056736..50e18c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -694,7 +694,13 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)

 static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
 {
-	if (mem->flags & ~KVM_MEM_LOG_DIRTY_PAGES)
+	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
+
+#ifdef KVM_CAP_READONLY_MEM
+	valid_flags |= KVM_MEM_READONLY;
+#endif
+
+	if (mem->flags & ~valid_flags)
 		return -EINVAL;

 	return 0;
@@ -1052,18 +1058,32 @@ out:
 	return size;
 }

-static unsigned long gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
-				     gfn_t *nr_pages)
+static bool memslot_is_readonly(struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
+static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
+				     gfn_t *nr_pages, bool write)
 {
 	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
 		return bad_hva();

+	if (memslot_is_readonly(slot) && write)
+		return readonly_bad_hva();
+
 	if (nr_pages)
 		*nr_pages = slot->npages - (gfn - slot->base_gfn);

 	return gfn_to_hva_memslot(slot, gfn);
 }

+static unsigned long gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
+				     gfn_t *nr_pages)
+{
+	return __gfn_to_hva_many(slot, gfn, nr_pages, true);
+}
+
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
@@ -1076,7 +1096,7 @@ EXPORT_SYMBOL_GPL(gfn_to_hva);
  */
 static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
 {
-	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
+	return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
 }

 static int kvm_read_hva(void *data, void __user *hva, int len)
@@ -1201,6 +1221,17 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 	return npages;
 }

+static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
+{
+	if (unlikely(!(vma->vm_flags & VM_READ)))
+		return false;
+
+	if (write_fault && (unlikely(!(vma->vm_flags & VM_WRITE))))
+		return false;
+
+	return true;
+}
+
 /*
  * Pin guest page in memory and return its pfn.
  * @addr: host virtual address which maps memory to the guest
@@ -1225,8 +1256,6 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	/* we can do it either atomically or asynchronously, not both */
 	BUG_ON(atomic && async);

-	BUG_ON(!write_fault && !writable);
-
 	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
 		return pfn;

@@ -1254,7 +1283,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 			vma->vm_pgoff;
 		BUG_ON(!kvm_is_mmio_pfn(pfn));
 	} else {
-		if (async && (vma->vm_flags & VM_WRITE))
+		if (async && vma_is_valid(vma, write_fault))
 			*async = true;
 		pfn = get_fault_pfn();
 	}
@@ -1264,21 +1293,41 @@ exit:
 	return pfn;
 }

-static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
-			  bool write_fault, bool *writable)
+static pfn_t
+__gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
+		     bool *async, bool write_fault, bool *writable)
 {
-	unsigned long addr;
+	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);

-	if (async)
-		*async = false;
+	if (kvm_is_readonly_bad_hva(addr))
+		return get_readonly_fault_pfn();

-	addr = gfn_to_hva(kvm, gfn);
 	if (kvm_is_error_hva(addr)) {
 		get_page(bad_page);
 		return page_to_pfn(bad_page);
 	}

-	return hva_to_pfn(addr, atomic, async, write_fault, writable);
+	/* Do not map writable pfn in the readonly memslot. */
+	if (writable && memslot_is_readonly(slot))
+		writable = NULL;
+
+	return hva_to_pfn(addr, atomic, async, write_fault,
+			  writable);
+}
+
+
+static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
+			  bool write_fault, bool *writable)
+{
+	struct kvm_memory_slot *slot;
+
+	if (async)
+		*async = false;
+
+	slot = gfn_to_memslot(kvm, gfn);
+
+	return __gfn_to_pfn_memslot(slot, gfn, atomic, async, write_fault,
+				    writable);
 }

 pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
@@ -1309,15 +1358,12 @@ EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);

 pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
-	return hva_to_pfn(addr, false, NULL, true, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL);
 }

 pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
-
-	return hva_to_pfn(addr, true, NULL, true, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);

-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 10/10] KVM: indicate readonly access fault
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (8 preceding siblings ...)
  2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
@ 2012-07-17 14:46 ` Xiao Guangrong
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:46 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Introduce write_readonly_mem in mmio-exit-info to indicate this exit is
caused by write access on readonly memslot

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c       |   12 ++++++++----
 include/linux/kvm.h      |    3 +++
 include/linux/kvm_host.h |    1 +
 virt/kvm/kvm_main.c      |    3 +++
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 46e13a1..cc5f8f0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3703,9 +3703,10 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,

 	ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
 	if (ret < 0)
-		return 0;
+		return ret;
+
 	kvm_mmu_pte_write(vcpu, gpa, val, bytes);
-	return 1;
+	return 0;
 }

 struct read_write_emulator_ops {
@@ -3735,7 +3736,7 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes)
 static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa,
 			void *val, int bytes)
 {
-	return !kvm_read_guest(vcpu->kvm, gpa, val, bytes);
+	return kvm_read_guest(vcpu->kvm, gpa, val, bytes);
 }

 static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -3797,7 +3798,8 @@ static int emulator_read_write_onepage(unsigned long addr, void *val,
 	if (ret)
 		goto mmio;

-	if (ops->read_write_emulate(vcpu, gpa, val, bytes))
+	ret = ops->read_write_emulate(vcpu, gpa, val, bytes);
+	if (!ret)
 		return X86EMUL_CONTINUE;

 mmio:
@@ -3819,6 +3821,7 @@ mmio:
 		frag->gpa = gpa;
 		frag->data = val;
 		frag->len = now;
+		frag->write_readonly_mem = (ret == -EPERM);

 		gpa += now;
 		val += now;
@@ -3836,6 +3839,7 @@ static void set_mmio_exit_info(struct kvm_vcpu *vcpu,
 	run->mmio.phys_addr = frag->gpa;
 	run->mmio.len = frag->len;
 	run->mmio.is_write = vcpu->mmio_is_write = write;
+	run->mmio.write_readonly_mem = frag->write_readonly_mem;

 	if (write)
 		memcpy(run->mmio.data, frag->data, frag->len);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 94867d0..9261541 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -222,6 +222,9 @@ struct kvm_run {
 			__u8  data[8];
 			__u32 len;
 			__u8  is_write;
+#ifdef __KVM_HAVE_READONLY_MEM
+			__u8 write_readonly_mem;
+#endif
 		} mmio;
 		/* KVM_EXIT_HYPERCALL */
 		struct {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a2302e7..7e45014 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -142,6 +142,7 @@ struct kvm_mmio_fragment {
 	gpa_t gpa;
 	void *data;
 	unsigned len;
+	bool write_readonly_mem;
 };

 struct kvm_vcpu {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 50e18c0..8d4bc55 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1527,6 +1527,9 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	unsigned long addr;

 	addr = gfn_to_hva(kvm, gfn);
+	if (kvm_is_readonly_bad_hva(addr))
+		return -EPERM;
+
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
 	r = __copy_to_user((void __user *)addr + offset, data, len);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
@ 2012-07-19 10:15   ` Avi Kivity
  2012-07-20  2:56     ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2012-07-19 10:15 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 07/17/2012 05:44 PM, Xiao Guangrong wrote:
> Introduce readonly_fault_pfn, in the later patch, it indicates failure
> when we try to get a writable pfn from the readonly memslot
> 
> +
>  inline int kvm_is_mmio_pfn(pfn_t pfn)
>  {
>  	if (pfn_valid(pfn)) {
> @@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);
> 
>  int is_error_page(struct page *page)
>  {
> -	return page == bad_page || page == hwpoison_page || page == fault_page;
> +	return page == bad_page || page == hwpoison_page || page == fault_page
> +		|| page == readonly_fault_page;

All those checks are slow, and get_page(fault_page) etc. isn't very
scalable.

We should move to ERR_PTR() etc.  We could use ENOENT, EHWPOISON,
EFAULT, and EROFS for the above, or maybe there are better matches.

>  }
>  EXPORT_SYMBOL_GPL(is_error_page);
> 
>  int is_error_pfn(pfn_t pfn)
>  {
> -	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn;
> +	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn
> +		|| pfn == readonly_fault_pfn;
>  }

And a similar change here.
-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
@ 2012-07-19 10:16   ` Avi Kivity
  2012-07-20  3:01     ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2012-07-19 10:16 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 07/17/2012 05:45 PM, Xiao Guangrong wrote:
> In the later patch, it indicates failure when we try to get a writable
> hva from the readonly slot
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b70f1a4..c056736 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
>  	return PAGE_OFFSET;
>  }
> 
> +static inline unsigned long readonly_bad_hva(void)
> +{
> +	return PAGE_OFFSET + PAGE_SIZE;
> +}
> +
> +static int kvm_is_readonly_bad_hva(unsigned long addr)
> +{
> +	return addr == readonly_bad_hva();
> +}
> +
>  int kvm_is_error_hva(unsigned long addr)
>  {
> -	return addr == bad_hva();
> +	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
>  }

addr >= PAGE_OFFSET.  Or change it to use -E*.



-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-19 10:15   ` Avi Kivity
@ 2012-07-20  2:56     ` Xiao Guangrong
  0 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-20  2:56 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

On 07/19/2012 06:15 PM, Avi Kivity wrote:
> On 07/17/2012 05:44 PM, Xiao Guangrong wrote:
>> Introduce readonly_fault_pfn, in the later patch, it indicates failure
>> when we try to get a writable pfn from the readonly memslot
>>
>> +
>>  inline int kvm_is_mmio_pfn(pfn_t pfn)
>>  {
>>  	if (pfn_valid(pfn)) {
>> @@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);
>>
>>  int is_error_page(struct page *page)
>>  {
>> -	return page == bad_page || page == hwpoison_page || page == fault_page;
>> +	return page == bad_page || page == hwpoison_page || page == fault_page
>> +		|| page == readonly_fault_page;
> 
> All those checks are slow, and get_page(fault_page) etc. isn't very
> scalable.
> 
> We should move to ERR_PTR() etc.  We could use ENOENT, EHWPOISON,
> EFAULT, and EROFS for the above, or maybe there are better matches.
> 

Good point. I will do it in the next version, thank you, Avi!


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-19 10:16   ` Avi Kivity
@ 2012-07-20  3:01     ` Xiao Guangrong
  0 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-20  3:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

On 07/19/2012 06:16 PM, Avi Kivity wrote:
> On 07/17/2012 05:45 PM, Xiao Guangrong wrote:
>> In the later patch, it indicates failure when we try to get a writable
>> hva from the readonly slot
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>> ---
>>  virt/kvm/kvm_main.c |   12 +++++++++++-
>>  1 files changed, 11 insertions(+), 1 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index b70f1a4..c056736 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
>>  	return PAGE_OFFSET;
>>  }
>>
>> +static inline unsigned long readonly_bad_hva(void)
>> +{
>> +	return PAGE_OFFSET + PAGE_SIZE;
>> +}
>> +
>> +static int kvm_is_readonly_bad_hva(unsigned long addr)
>> +{
>> +	return addr == readonly_bad_hva();
>> +}
>> +
>>  int kvm_is_error_hva(unsigned long addr)
>>  {
>> -	return addr == bad_hva();
>> +	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
>>  }
> 
> addr >= PAGE_OFFSET.  Or change it to use -E*.

I prefer to the first one, addr >= PAGE_OFFSET, all virtual addresses
between 0 and (~0ULL) are valid, Using PAGE_OFFSET is more readable.

[ is_error_pfn is suitable to use -err because the the range of physical
  address is always limited, for example,  0 ~ 64G on x86.]


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-07-20  3:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
2012-07-19 10:15   ` Avi Kivity
2012-07-20  2:56     ` Xiao Guangrong
2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
2012-07-19 10:16   ` Avi Kivity
2012-07-20  3:01     ` Xiao Guangrong
2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.