KVM Archive on lore.kernel.org
 help / color / Atom feed
From: Sean Christopherson <sean.j.christopherson@intel.com>
To: James Hogan <jhogan@kernel.org>,
	Paul Mackerras <paulus@ozlabs.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>
Cc: "David Hildenbrand" <david@redhat.com>,
	"Cornelia Huck" <cohuck@redhat.com>,
	"Sean Christopherson" <sean.j.christopherson@intel.com>,
	"Vitaly Kuznetsov" <vkuznets@redhat.com>,
	"Wanpeng Li" <wanpengli@tencent.com>,
	"Jim Mattson" <jmattson@google.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"James Morse" <james.morse@arm.com>,
	"Julien Thierry" <julien.thierry.kdev@gmail.com>,
	"Suzuki K Poulose" <suzuki.poulose@arm.com>,
	linux-mips@vger.kernel.org, kvm-ppc@vger.kernel.org,
	kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
	"Christoffer Dall" <christoffer.dall@arm.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>
Subject: [PATCH v4 01/19] KVM: x86: Allocate new rmap and large page tracking when moving memslot
Date: Tue, 17 Dec 2019 12:40:23 -0800
Message-ID: <20191217204041.10815-2-sean.j.christopherson@intel.com> (raw)
In-Reply-To: <20191217204041.10815-1-sean.j.christopherson@intel.com>

Reallocate a rmap array and recalcuate large page compatibility when
moving an existing memslot to correctly handle the alignment properties
of the new memslot.  The number of rmap entries required at each level
is dependent on the alignment of the memslot's base gfn with respect to
that level, e.g. moving a large-page aligned memslot so that it becomes
unaligned will increase the number of rmap entries needed at the now
unaligned level.

Not updating the rmap array is the most obvious bug, as KVM accesses
garbage data beyond the end of the rmap.  KVM interprets the bad data as
pointers, leading to non-canonical #GPs, unexpected #PFs, etc...

  general protection fault: 0000 [#1] SMP
  CPU: 0 PID: 1909 Comm: move_memory_reg Not tainted 5.4.0-rc7+ #139
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:rmap_get_first+0x37/0x50 [kvm]
  Code: <48> 8b 3b 48 85 ff 74 ec e8 6c f4 ff ff 85 c0 74 e3 48 89 d8 5b c3
  RSP: 0018:ffffc9000021bbc8 EFLAGS: 00010246
  RAX: ffff00617461642e RBX: ffff00617461642e RCX: 0000000000000012
  RDX: ffff88827400f568 RSI: ffffc9000021bbe0 RDI: ffff88827400f570
  RBP: 0010000000000000 R08: ffffc9000021bd00 R09: ffffc9000021bda8
  R10: ffffc9000021bc48 R11: 0000000000000000 R12: 0030000000000000
  R13: 0000000000000000 R14: ffff88827427d700 R15: ffffc9000021bce8
  FS:  00007f7eda014700(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f7ed9216ff8 CR3: 0000000274391003 CR4: 0000000000162eb0
  Call Trace:
   kvm_mmu_slot_set_dirty+0xa1/0x150 [kvm]
   __kvm_set_memory_region.part.64+0x559/0x960 [kvm]
   kvm_set_memory_region+0x45/0x60 [kvm]
   kvm_vm_ioctl+0x30f/0x920 [kvm]
   do_vfs_ioctl+0xa1/0x620
   ksys_ioctl+0x66/0x70
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x4c/0x170
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f7ed9911f47
  Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 6f 2c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffc00937498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 0000000001ab0010 RCX: 00007f7ed9911f47
  RDX: 0000000001ab1350 RSI: 000000004020ae46 RDI: 0000000000000004
  RBP: 000000000000000a R08: 0000000000000000 R09: 00007f7ed9214700
  R10: 00007f7ed92149d0 R11: 0000000000000246 R12: 00000000bffff000
  R13: 0000000000000003 R14: 00007f7ed9215000 R15: 0000000000000000
  Modules linked in: kvm_intel kvm irqbypass
  ---[ end trace 0c5f570b3358ca89 ]---

The disallow_lpage tracking is more subtle.  Failure to update results
in KVM creating large pages when it shouldn't, either due to stale data
or again due to indexing beyond the end of the metadata arrays, which
can lead to memory corruption and/or leaking data to guest/userspace.

Note, the arrays for the old memslot are freed by the unconditional call
to kvm_free_memslot() in __kvm_set_memory_region().

Fixes: 05da45583de9b ("KVM: MMU: large page support")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/x86.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8bb2fb1705ff..04d1bf89da0e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9703,6 +9703,13 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
 {
 	int i;
 
+	/*
+	 * Clear out the previous array pointers for the KVM_MR_MOVE case.  The
+	 * old arrays will be freed by __kvm_set_memory_region() if installing
+	 * the new memslot is successful.
+	 */
+	memset(&slot->arch, 0, sizeof(slot->arch));
+
 	for (i = 0; i < KVM_NR_PAGE_SIZES; ++i) {
 		struct kvm_lpage_info *linfo;
 		unsigned long ugfn;
@@ -9777,6 +9784,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				const struct kvm_userspace_memory_region *mem,
 				enum kvm_mr_change change)
 {
+	if (change == KVM_MR_MOVE)
+		return kvm_arch_create_memslot(kvm, memslot,
+					       mem->memory_size >> PAGE_SHIFT);
+
 	return 0;
 }
 
-- 
2.24.1


  reply index

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-17 20:40 [PATCH v4 00/19] KVM: Dynamically size memslot arrays Sean Christopherson
2019-12-17 20:40 ` Sean Christopherson [this message]
2019-12-17 20:48   ` [PATCH v4 01/19] KVM: x86: Allocate new rmap and large page tracking when moving memslot Sean Christopherson
2019-12-17 20:50     ` Sean Christopherson
2019-12-17 21:56   ` Peter Xu
2019-12-17 22:20     ` Sean Christopherson
2019-12-17 22:37       ` Peter Xu
2019-12-17 20:40 ` [PATCH v4 02/19] KVM: Reinstall old memslots if arch preparation fails Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 03/19] KVM: Don't free new memslot if allocation of said memslot fails Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 04/19] KVM: PPC: Move memslot memory allocation into prepare_memory_region() Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 05/19] KVM: x86: Allocate memslot resources during prepare_memory_region() Sean Christopherson
2019-12-17 22:07   ` Peter Xu
2019-12-17 20:40 ` [PATCH v4 06/19] KVM: Drop kvm_arch_create_memslot() Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 07/19] KVM: Explicitly free allocated-but-unused dirty bitmap Sean Christopherson
2019-12-17 22:24   ` Peter Xu
2019-12-17 22:51     ` Sean Christopherson
2019-12-18 16:17       ` Peter Xu
2019-12-17 20:40 ` [PATCH v4 08/19] KVM: Refactor error handling for setting memory region Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 09/19] KVM: Move setting of memslot into helper routine Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 10/19] KVM: Drop "const" attribute from old memslot in commit_memory_region() Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 11/19] KVM: x86: Free arrays for old memslot when moving memslot's base gfn Sean Christopherson
2019-12-17 22:48   ` Peter Xu
2019-12-17 20:40 ` [PATCH v4 12/19] KVM: Move memslot deletion to helper function Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 13/19] KVM: Simplify kvm_free_memslot() and all its descendents Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 14/19] KVM: Clean up local variable usage in __kvm_set_memory_region() Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 15/19] KVM: Provide common implementation for generic dirty log functions Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 16/19] KVM: Ensure validity of memslot with respect to kvm_get_dirty_log() Sean Christopherson
2019-12-24 18:19   ` Peter Xu
2020-01-14 18:25     ` Sean Christopherson
2020-02-06 22:03       ` Peter Xu
2020-02-07 18:52         ` Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 17/19] KVM: Terminate memslot walks via used_slots Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 18/19] KVM: Dynamically size memslot array based on number of used slots Sean Christopherson
2019-12-17 20:40 ` [PATCH v4 19/19] KVM: selftests: Add test for KVM_SET_USER_MEMORY_REGION Sean Christopherson
2019-12-18 11:29   ` Christian Borntraeger
2019-12-18 11:39   ` Christian Borntraeger
2019-12-18 16:39     ` Sean Christopherson
2019-12-18 11:40 ` [PATCH v4 00/19] KVM: Dynamically size memslot arrays Christian Borntraeger
2019-12-18 18:10 ` Marc Zyngier

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191217204041.10815-2-sean.j.christopherson@intel.com \
    --to=sean.j.christopherson@intel.com \
    --cc=borntraeger@de.ibm.com \
    --cc=christoffer.dall@arm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=f4bug@amsat.org \
    --cc=frankja@linux.ibm.com \
    --cc=james.morse@arm.com \
    --cc=jhogan@kernel.org \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=paulus@ozlabs.org \
    --cc=pbonzini@redhat.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git