* [PATCH 1/3] KVM: x86: Avoid overflowing nr_mmu_pages in kvm_arch_commit_memory_region()
From: Maciej S. Szmigiero @ 2021-11-26 0:31 UTC
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Igor Mammedov, kvm, linux-kernel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
With kvm->nr_memslot_pages capped at ULONG_MAX, multiplying it by
KVM_PERMILLE_MMU_PAGES (20) is not safe since the multiplication can
overflow an unsigned long variable.

Rewrite the "* 20 / 1000" computation as "/ 50" instead to avoid such an
overflow.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
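For illustration, a minimal userspace sketch of the overflow being fixed
here (with stand-in copies of the kernel constants; not kernel code):

#include <limits.h>
#include <stdio.h>

/* stand-ins for the kernel constants */
#define KVM_PERMILLE_MMU_PAGES 20
#define KVM_MEMSLOT_PAGES_TO_MMU_PAGES_RATIO 50

int main(void)
{
	unsigned long nr_memslot_pages = ULONG_MAX;	/* worst case */

	/* Old formula: the multiplication wraps around. */
	unsigned long old = nr_memslot_pages * KVM_PERMILLE_MMU_PAGES / 1000;

	/* New formula: a plain division cannot overflow. */
	unsigned long new = nr_memslot_pages / KVM_MEMSLOT_PAGES_TO_MMU_PAGES_RATIO;

	/* "old" ends up roughly 20x smaller than the intended "new" */
	printf("old = %lu, new = %lu\n", old, new);
	return 0;
}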
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/x86.c | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1fcb345bc107..8cd1d254c948 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -135,7 +135,7 @@
#define KVM_HPAGE_MASK(x) (~(KVM_HPAGE_SIZE(x) - 1))
#define KVM_PAGES_PER_HPAGE(x) (KVM_HPAGE_SIZE(x) / PAGE_SIZE)
-#define KVM_PERMILLE_MMU_PAGES 20
+#define KVM_MEMSLOT_PAGES_TO_MMU_PAGES_RATIO 50
#define KVM_MIN_ALLOC_MMU_PAGES 64UL
#define KVM_MMU_HASH_SHIFT 12
#define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04e8dabc187d..69330b395f12 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11753,8 +11753,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
(change == KVM_MR_CREATE || change == KVM_MR_DELETE)) {
unsigned long nr_mmu_pages;
- nr_mmu_pages = kvm->nr_memslot_pages * KVM_PERMILLE_MMU_PAGES;
- nr_mmu_pages /= 1000;
+ nr_mmu_pages = kvm->nr_memslot_pages / KVM_MEMSLOT_PAGES_TO_MMU_PAGES_RATIO;
nr_mmu_pages = max(nr_mmu_pages, KVM_MIN_ALLOC_MMU_PAGES);
kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
}
* [PATCH 2/3] KVM: Use atomic_long_cmpxchg() instead of an open-coded variant
From: Maciej S. Szmigiero @ 2021-11-26 0:31 UTC
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Igor Mammedov, kvm, linux-kernel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
Open-coding a cmpxchg()-like operation is significantly less readable than
a direct call.

The open-coded version also compiles to multiple instructions with a
branch on x86, instead of just a single instruction.

Since the open-coded variant technically didn't guarantee actual
atomicity, also add a comment there noting that this property isn't
strictly required in this case.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
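For reference, a minimal userspace C11 sketch of the semantic difference,
using <stdatomic.h> as a stand-in for the kernel's atomic_long_* API:

#include <stdatomic.h>
#include <stdio.h>

int main(void)
{
	atomic_long last_used = 1;	/* stands in for slots->last_used_slot */
	long old = 1, new = 2;

	/*
	 * Open-coded variant: the load and the store are two separate
	 * operations, so another thread could update last_used in between
	 * and have its update silently overwritten.
	 */
	if (atomic_load(&last_used) == old)
		atomic_store(&last_used, new);

	/*
	 * cmpxchg variant: the compare and the swap happen as one atomic
	 * step (a single "lock cmpxchg" instruction on x86).
	 */
	long expected = new;
	atomic_compare_exchange_strong(&last_used, &expected, 3);

	printf("last_used = %ld\n", atomic_load(&last_used));	/* 3 */
	return 0;
}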
virt/kvm/kvm_main.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d4399db06d49..367c1cba26d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1378,8 +1378,12 @@ static void kvm_replace_memslot(struct kvm *kvm,
hash_del(&old->id_node[idx]);
interval_tree_remove(&old->hva_node[idx], &slots->hva_tree);
- if ((long)old == atomic_long_read(&slots->last_used_slot))
- atomic_long_set(&slots->last_used_slot, (long)new);
+ /*
+ * The atomicity isn't strictly required here since we are
+ * operating on an inactive memslots set anyway.
+ */
+ atomic_long_cmpxchg(&slots->last_used_slot,
+ (unsigned long)old, (unsigned long)new);
if (!new) {
kvm_erase_gfn_node(slots, old);
* [PATCH 3/3] KVM: Make kvm_for_each_memslot_in_gfn_range() strict and use iterators
From: Maciej S. Szmigiero @ 2021-11-26 0:31 UTC
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Igor Mammedov, kvm, linux-kernel
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
Make kvm_for_each_memslot_in_gfn_range() return only memslots that at
least partially intersect the requested range, and modify its
implementation to use iterators at the same time.

This also simplifies kvm_check_memslot_overlap() a bit.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
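A short caller-side sketch of the reworked macro (not standalone code),
based on the iterator helpers this patch introduces; the __kvm_zap_rmaps()
hunk below does the same:

	struct kvm_memslot_iter iter;
	struct kvm_memory_slot *memslot;

	kvm_for_each_memslot_in_gfn_range(&iter, slots, gfn_start, gfn_end) {
		memslot = kvm_memslot_iter_slot(&iter);

		/*
		 * Every memslot returned here is now guaranteed to overlap
		 * [gfn_start, gfn_end), so the caller no longer needs to
		 * filter out non-intersecting slots itself.
		 */

		/* ... operate on the overlapping part of "memslot" ... */
	}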
arch/x86/kvm/mmu/mmu.c | 11 ++---
include/linux/kvm_host.h | 104 +++++++++++++++++++++++++--------------
virt/kvm/kvm_main.c | 17 ++-----
3 files changed, 75 insertions(+), 57 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8f0035517450..009eb27d34d3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5715,23 +5715,22 @@ static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
{
const struct kvm_memory_slot *memslot;
struct kvm_memslots *slots;
- struct rb_node *node;
+ struct kvm_memslot_iter iter;
bool flush = false;
gfn_t start, end;
- int i, idx;
+ int i;
if (!kvm_memslots_have_rmaps(kvm))
return flush;
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
slots = __kvm_memslots(kvm, i);
- idx = slots->node_idx;
- kvm_for_each_memslot_in_gfn_range(node, slots, gfn_start, gfn_end) {
- memslot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
+ kvm_for_each_memslot_in_gfn_range(&iter, slots, gfn_start, gfn_end) {
+ memslot = kvm_memslot_iter_slot(&iter);
start = max(gfn_start, memslot->base_gfn);
end = min(gfn_end, memslot->base_gfn + memslot->npages);
- if (start >= end)
+ if (WARN_ON_ONCE(start >= end))
continue;
flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3932bb091099..580d9abd97c1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -841,74 +841,104 @@ struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
return NULL;
}
-static inline
-struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)
+/* Iterator used for walking memslots that overlap a gfn range. */
+struct kvm_memslot_iter {
+ struct kvm_memslots *slots;
+ gfn_t end;
+ struct rb_node *node;
+};
+
+static inline void kvm_memslot_iter_start(struct kvm_memslot_iter *iter,
+ struct kvm_memslots *slots,
+ gfn_t start, gfn_t end)
{
int idx = slots->node_idx;
- struct rb_node *node, *result = NULL;
+ struct rb_node *tmp;
+ struct kvm_memory_slot *slot;
- for (node = slots->gfn_tree.rb_node; node; ) {
- struct kvm_memory_slot *slot;
+ iter->slots = slots;
+ iter->end = end;
- slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
- if (gfn < slot->base_gfn) {
- result = node;
- node = node->rb_left;
- } else
- node = node->rb_right;
+ /*
+ * Find the so called "upper bound" of a key - the first node that has
+ * its key strictly greater than the searched one (the start gfn in our case).
+ */
+ iter->node = NULL;
+ for (tmp = slots->gfn_tree.rb_node; tmp; ) {
+ slot = container_of(tmp, struct kvm_memory_slot, gfn_node[idx]);
+ if (start < slot->base_gfn) {
+ iter->node = tmp;
+ tmp = tmp->rb_left;
+ } else {
+ tmp = tmp->rb_right;
+ }
}
- return result;
-}
-
-static inline
-struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)
-{
- struct rb_node *node;
-
/*
* Find the slot with the lowest gfn that can possibly intersect with
* the range, so we'll ideally have slot start <= range start
*/
- node = kvm_memslots_gfn_upper_bound(slots, start);
- if (node) {
- struct rb_node *pnode;
-
+ if (iter->node) {
/*
* A NULL previous node means that the very first slot
* already has a higher start gfn.
* In this case slot start > range start.
*/
- pnode = rb_prev(node);
- if (pnode)
- node = pnode;
+ tmp = rb_prev(iter->node);
+ if (tmp)
+ iter->node = tmp;
} else {
/* a NULL node below means no slots */
- node = rb_last(&slots->gfn_tree);
+ iter->node = rb_last(&slots->gfn_tree);
}
- return node;
+ if (iter->node) {
+ /*
+ * It is possible in the slot start < range start case that the
+ * found slot ends before or at range start (slot end <= range start)
+ * and so it does not overlap the requested range.
+ *
+ * In such non-overlapping case the next slot (if it exists) will
+ * already have slot start > range start, otherwise the logic above
+ * would have found it instead of the current slot.
+ */
+ slot = container_of(iter->node, struct kvm_memory_slot, gfn_node[idx]);
+ if (slot->base_gfn + slot->npages <= start)
+ iter->node = rb_next(iter->node);
+ }
}
-static inline
-bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
+static inline struct kvm_memory_slot *kvm_memslot_iter_slot(struct kvm_memslot_iter *iter)
+{
+ return container_of(iter->node, struct kvm_memory_slot, gfn_node[iter->slots->node_idx]);
+}
+
+static inline bool kvm_memslot_iter_is_valid(struct kvm_memslot_iter *iter)
{
struct kvm_memory_slot *memslot;
- memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
+ if (!iter->node)
+ return false;
+
+ memslot = kvm_memslot_iter_slot(iter);
/*
* If this slot starts beyond or at the end of the range so does
* every next one
*/
- return memslot->base_gfn >= end;
+ return memslot->base_gfn < iter->end;
+}
+
+static inline void kvm_memslot_iter_next(struct kvm_memslot_iter *iter)
+{
+ iter->node = rb_next(iter->node);
}
-/* Iterate over each memslot *possibly* intersecting [start, end) range */
-#define kvm_for_each_memslot_in_gfn_range(node, slots, start, end) \
- for (node = kvm_for_each_in_gfn_first(slots, start); \
- node && !kvm_for_each_in_gfn_no_more(slots, node, end); \
- node = rb_next(node)) \
+/* Iterate over each memslot at least partially intersecting [start, end) range */
+#define kvm_for_each_memslot_in_gfn_range(iter, slots, start, end) \
+ for (kvm_memslot_iter_start(iter, slots, start, end); \
+ kvm_memslot_iter_is_valid(iter); \
+ kvm_memslot_iter_next(iter))
/*
* KVM_SET_USER_MEMORY_REGION ioctl allows the following operations:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 367c1cba26d2..d6503b92b3cf 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1797,22 +1797,11 @@ static int kvm_set_memslot(struct kvm *kvm,
static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id,
gfn_t start, gfn_t end)
{
- int idx = slots->node_idx;
- struct rb_node *node;
-
- kvm_for_each_memslot_in_gfn_range(node, slots, start, end) {
- struct kvm_memory_slot *cslot;
- gfn_t cend;
+ struct kvm_memslot_iter iter;
- cslot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
- cend = cslot->base_gfn + cslot->npages;
- if (cslot->id == id)
- continue;
-
- /* kvm_for_each_in_gfn_no_more() guarantees that cslot->base_gfn < nend */
- if (cend > start)
+ kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end)
+ if (kvm_memslot_iter_slot(&iter)->id != id)
return true;
- }
return false;
}