* [PATCH] kvm: Nested KVM MMUs need PAE root too
@ 2019-06-22 17:42 Jiří Paleček
2019-08-26 12:16 ` Vitaly Kuznetsov
2019-09-11 16:04 ` Paolo Bonzini
0 siblings, 2 replies; 8+ messages in thread
From: Jiří Paleček @ 2019-06-22 17:42 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On AMD processors, in 32-bit PAE mode, nested KVM instances don't
work. The L0 host gets a kernel oops, which is related to
arch.mmu->pae_root being NULL.
The reason for this is that when setting up nested KVM instance,
arch.mmu is set to &arch.guest_mmu (while normally, it would be
&arch.root_mmu). However, the initialization and allocation of
pae_root only creates it in root_mmu. KVM code (ie. in
mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
unallocated arch.guest_mmu->pae_root.
This fix just allocates (and frees) pae_root in both guest_mmu and
root_mmu (and also lm_root if it was allocated). The allocation is
subject to previous restrictions ie. it won't allocate anything on
64-bit and AFAIK not on Intel.
See bug 203923 for details.
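For reference, a simplified sketch of where the oops happens (this is not the
exact upstream code; the shadow-root setup is reduced to the relevant write,
and the function name here is made up):

	/*
	 * Sketch only: while L2 is running, vcpu->arch.mmu points at
	 * guest_mmu.  mmu_alloc_shadow_roots() then writes the four PAE
	 * roots through mmu->pae_root, which was only ever allocated for
	 * root_mmu.
	 */
	static int shadow_roots_sketch(struct kvm_vcpu *vcpu)
	{
		struct kvm_mmu *mmu = vcpu->arch.mmu;	/* == &arch.guest_mmu here */
		int i;

		if (mmu->shadow_root_level == PT32E_ROOT_LEVEL) {
			for (i = 0; i < 4; ++i)
				/* NULL dereference: guest_mmu->pae_root was never allocated */
				mmu->pae_root[i] = INVALID_PAGE;
		}
		return 0;
	}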
Signed-off-by: Jiri Palecek <jpalecek@web.de>
Tested-by: Jiri Palecek <jpalecek@web.de>
---
arch/x86/kvm/mmu.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 24843cf49579..efa8285bb56d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5592,13 +5592,13 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
PT_PAGE_TABLE_LEVEL, lock_flush_tlb);
}
-static void free_mmu_pages(struct kvm_vcpu *vcpu)
+static void free_mmu_pages(struct kvm_mmu *mmu)
{
- free_page((unsigned long)vcpu->arch.mmu->pae_root);
- free_page((unsigned long)vcpu->arch.mmu->lm_root);
+ free_page((unsigned long)mmu->pae_root);
+ free_page((unsigned long)mmu->lm_root);
}
-static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
+static int alloc_mmu_pages(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
{
struct page *page;
int i;
@@ -5619,9 +5619,9 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
if (!page)
return -ENOMEM;
- vcpu->arch.mmu->pae_root = page_address(page);
+ mmu->pae_root = page_address(page);
for (i = 0; i < 4; ++i)
- vcpu->arch.mmu->pae_root[i] = INVALID_PAGE;
+ mmu->pae_root[i] = INVALID_PAGE;
return 0;
}
@@ -5629,6 +5629,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
int kvm_mmu_create(struct kvm_vcpu *vcpu)
{
uint i;
+ int ret;
vcpu->arch.mmu = &vcpu->arch.root_mmu;
vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
@@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
- return alloc_mmu_pages(vcpu);
+
+ ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
+ if (ret)
+ return ret;
+
+ ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
+ if (ret)
+ goto fail_allocate_root;
+
+ return ret;
+ fail_allocate_root:
+ free_mmu_pages(&vcpu->arch.guest_mmu);
+ return ret;
}
static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
@@ -6102,7 +6115,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
{
kvm_mmu_unload(vcpu);
- free_mmu_pages(vcpu);
+ free_mmu_pages(&vcpu->arch.root_mmu);
+ free_mmu_pages(&vcpu->arch.guest_mmu);
mmu_free_memory_caches(vcpu);
}
--
2.23.0.rc1
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
2019-06-22 17:42 [PATCH] kvm: Nested KVM MMUs need PAE root too Jiří Paleček
@ 2019-08-26 12:16 ` Vitaly Kuznetsov
2019-08-26 16:11 ` Sean Christopherson
2019-08-31 12:45 ` Jiri Palecek
2019-09-11 16:04 ` Paolo Bonzini
1 sibling, 2 replies; 8+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-26 12:16 UTC (permalink / raw)
To: Jiří Paleček; +Cc: kvm, Paolo Bonzini
Jiří Paleček <jpalecek@web.de> writes:
> On AMD processors, in 32-bit PAE mode, nested KVM instances don't
> work. The L0 host gets a kernel oops, which is related to
> arch.mmu->pae_root being NULL.
The date on the message is "Date: Sat, 22 Jun 2019 19:42:04 +0200" and I
got a bit confused at first.
>
> The reason for this is that when setting up nested KVM instance,
> arch.mmu is set to &arch.guest_mmu (while normally, it would be
> &arch.root_mmu).
The MMU switch to guest_mmu happens when we're about to start the L2 guest,
and we switch back to root_mmu when we want to execute L1 again (e.g. after a
vmexit), so this is not a one-time thing ('when setting up nested KVM
instance' makes it sound like one).
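Roughly, the pattern is the following (a sketch with illustrative function
names; the real switching is done in the nested SVM/VMX entry and exit paths):

	/* Sketch only; these helpers are illustrative, not the real ones. */
	static void enter_l2_sketch(struct kvm_vcpu *vcpu)
	{
		/* about to run L2: use guest_mmu to shadow L1's NPT/EPT */
		vcpu->arch.mmu = &vcpu->arch.guest_mmu;
	}

	static void return_to_l1_sketch(struct kvm_vcpu *vcpu)
	{
		/* nested vmexit back to L1: root_mmu takes over again */
		vcpu->arch.mmu = &vcpu->arch.root_mmu;
	}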
> However, the initialization and allocation of
> pae_root only creates it in root_mmu. KVM code (ie. in
> mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
> unallocated arch.guest_mmu->pae_root.
Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
>
> This fix just allocates (and frees) pae_root in both guest_mmu and
> root_mmu (and also lm_root if it was allocated). The allocation is
> subject to previous restrictions ie. it won't allocate anything on
> 64-bit and AFAIK not on Intel.
Right, it is only NPT 32 bit which uses that (and it's definitely
undertested).
>
> See bug 203923 for details.
Personally, I'd prefer this to be the full link
https://bugzilla.kernel.org/show_bug.cgi?id=203923
as there are multiple bugzillas out there.
>
> Signed-off-by: Jiri Palecek <jpalecek@web.de>
> Tested-by: Jiri Palecek <jpalecek@web.de>
This is weird. I always thought "Signed-off-by" implies some form of
testing (unless stated otherwise) :-)
>
> ---
> arch/x86/kvm/mmu.c | 30 ++++++++++++++++++++++--------
> 1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24843cf49579..efa8285bb56d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5592,13 +5592,13 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
> PT_PAGE_TABLE_LEVEL, lock_flush_tlb);
> }
>
> -static void free_mmu_pages(struct kvm_vcpu *vcpu)
> +static void free_mmu_pages(struct kvm_mmu *mmu)
> {
> - free_page((unsigned long)vcpu->arch.mmu->pae_root);
> - free_page((unsigned long)vcpu->arch.mmu->lm_root);
> + free_page((unsigned long)mmu->pae_root);
> + free_page((unsigned long)mmu->lm_root);
> }
>
> -static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> +static int alloc_mmu_pages(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
> {
> struct page *page;
> int i;
> @@ -5619,9 +5619,9 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> if (!page)
> return -ENOMEM;
>
> - vcpu->arch.mmu->pae_root = page_address(page);
> + mmu->pae_root = page_address(page);
> for (i = 0; i < 4; ++i)
> - vcpu->arch.mmu->pae_root[i] = INVALID_PAGE;
> + mmu->pae_root[i] = INVALID_PAGE;
>
> return 0;
> }
> @@ -5629,6 +5629,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> int kvm_mmu_create(struct kvm_vcpu *vcpu)
> {
> uint i;
> + int ret;
>
> vcpu->arch.mmu = &vcpu->arch.root_mmu;
> vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
> @@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
>
> vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
> - return alloc_mmu_pages(vcpu);
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
> + if (ret)
> + return ret;
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> + if (ret)
> + goto fail_allocate_root;
(personal preference) here you're just jumping over 'return' so I'd
prefer this to be written as:
	ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
	if (!ret)
		return 0;

	free_mmu_pages(&vcpu->arch.guest_mmu);
	return ret;
and no label/goto required.
> +
> + return ret;
> + fail_allocate_root:
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> + return ret;
> }
>
> static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> @@ -6102,7 +6115,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
> void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
> {
> kvm_mmu_unload(vcpu);
> - free_mmu_pages(vcpu);
> + free_mmu_pages(&vcpu->arch.root_mmu);
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> mmu_free_memory_caches(vcpu);
> }
>
> --
> 2.23.0.rc1
>
--
Vitaly
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
2019-08-26 12:16 ` Vitaly Kuznetsov
@ 2019-08-26 16:11 ` Sean Christopherson
2019-08-31 12:45 ` Jiri Palecek
1 sibling, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2019-08-26 16:11 UTC (permalink / raw)
To: Vitaly Kuznetsov; +Cc: Jiří Paleček, kvm, Paolo Bonzini
On Mon, Aug 26, 2019 at 02:16:14PM +0200, Vitaly Kuznetsov wrote:
> Jiří Paleček <jpalecek@web.de> writes:
> > @@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> > vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
> >
> > vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
> > - return alloc_mmu_pages(vcpu);
> > +
> > + ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
> > + if (ret)
> > + return ret;
> > +
> > + ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> > + if (ret)
> > + goto fail_allocate_root;
>
> (personal preference) here you're just jumping over 'return' so I'd
> prefer this to be written as:
>
> 	ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> 	if (!ret)
> 		return 0;
>
> 	free_mmu_pages(&vcpu->arch.guest_mmu);
> 	return ret;
>
> and no label/goto required.
Or alternatively:
	ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
	if (ret)
		free_mmu_pages(&vcpu->arch.guest_mmu);

	return ret;
since error handling is usually *not* the fall through path.
> > +
> > + return ret;
> > + fail_allocate_root:
> > + free_mmu_pages(&vcpu->arch.guest_mmu);
> > + return ret;
> > }
> >
> > static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> > @@ -6102,7 +6115,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
> > void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
> > {
> > kvm_mmu_unload(vcpu);
> > - free_mmu_pages(vcpu);
> > + free_mmu_pages(&vcpu->arch.root_mmu);
> > + free_mmu_pages(&vcpu->arch.guest_mmu);
> > mmu_free_memory_caches(vcpu);
> > }
> >
> > --
> > 2.23.0.rc1
> >
>
> --
> Vitaly
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
2019-08-26 12:16 ` Vitaly Kuznetsov
2019-08-26 16:11 ` Sean Christopherson
@ 2019-08-31 12:45 ` Jiri Palecek
2019-09-03 20:33 ` Sean Christopherson
1 sibling, 1 reply; 8+ messages in thread
From: Jiri Palecek @ 2019-08-31 12:45 UTC (permalink / raw)
To: Vitaly Kuznetsov; +Cc: kvm, Paolo Bonzini, Sean Christopherson
Hello
On 26. 08. 19 14:16, Vitaly Kuznetsov wrote:
> Jiří Paleček <jpalecek@web.de> writes:
>
>
>> The reason for this is that when setting up nested KVM instance,
>> arch.mmu is set to &arch.guest_mmu (while normally, it would be
>> &arch.root_mmu).
> The MMU switch to guest_mmu happens when we're about to start the L2 guest,
> and we switch back to root_mmu when we want to execute L1 again (e.g. after a
> vmexit), so this is not a one-time thing ('when setting up nested KVM
> instance' makes it sound like one).
OK, I'll rephrase it.
>
>> However, the initialization and allocation of
>> pae_root only creates it in root_mmu. KVM code (ie. in
>> mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
>> unallocated arch.guest_mmu->pae_root.
> Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
Thanks
>> This fix just allocates (and frees) pae_root in both guest_mmu and
>> root_mmu (and also lm_root if it was allocated). The allocation is
>> subject to previous restrictions ie. it won't allocate anything on
>> 64-bit and AFAIK not on Intel.
> Right, it is only NPT 32 bit which uses that (and it's definitely
> undertested).
>
>> See bug 203923 for details.
> Personally, I'd prefer this to be the full link
> https://bugzilla.kernel.org/show_bug.cgi?id=203923
> as there are multiple bugzillas out there.
Will do
>> Signed-off-by: Jiri Palecek <jpalecek@web.de>
>> Tested-by: Jiri Palecek <jpalecek@web.de>
> This is weird. I always thought "Signed-off-by" implies some form of
> testing (unless stated otherwise) :-)
Well, I thought it was quite common that someone authors a patch but
doesn't have the means to test it. Anyway, after reading Kernel Newbies, I
added that to indicate that I did test it, and if there's a need to test
anything reasonably sized on this particular configuration, I'm open to it.
>
>
>> ---
>> arch/x86/kvm/mmu.c | 30 ++++++++++++++++++++++--------
>> 1 file changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 24843cf49579..efa8285bb56d 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
>> vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
>>
>> vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
>> - return alloc_mmu_pages(vcpu);
>> +
>> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
>> + if (ret)
>> + return ret;
>> +
>> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
>> + if (ret)
>> + goto fail_allocate_root;
> (personal preference) here you're just jumping over 'return' so I'd
> prefer this to be written as:
>
> 	ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> 	if (!ret)
> 		return 0;
>
> 	free_mmu_pages(&vcpu->arch.guest_mmu);
> 	return ret;
>
> and no label/goto required.
OK, I'll change it. However, I like Sean Christopherson's solution more.
Regards
Jiri Palecek
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
2019-08-31 12:45 ` Jiri Palecek
@ 2019-09-03 20:33 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2019-09-03 20:33 UTC (permalink / raw)
To: Jiri Palecek; +Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini
On Sat, Aug 31, 2019 at 02:45:43PM +0200, Jiri Palecek wrote:
> >>Signed-off-by: Jiri Palecek <jpalecek@web.de>
> >>Tested-by: Jiri Palecek <jpalecek@web.de>
> >This is weird. I always thought "Signed-off-by" implies some form of
> >testing (unless stated otherwise) :-)
> Well, I thought it was quite common that someone authors a patch but
> doesn't have the means to test it. Anyway, after reading Kernel Newbies, I
> added that to indicate that I did test it, and if there's a need to test
> anything reasonably sized on this particular configuration, I'm open to it.
Not being able to test a patch isn't uncommon in the absolute sense, but
it's certainly uncommon when viewed as a percentage of the total number of
patches sent to LKML. A SoB is generally taken to imply basic functional
testing unless otherwise stated, e.g. most folks add a note in the cover
letter or patch comment when something has only been compile tested or not
tested at all.
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
2019-06-22 17:42 [PATCH] kvm: Nested KVM MMUs need PAE root too Jiří Paleček
2019-08-26 12:16 ` Vitaly Kuznetsov
@ 2019-09-11 16:04 ` Paolo Bonzini
1 sibling, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2019-09-11 16:04 UTC (permalink / raw)
To: Jiří Paleček; +Cc: kvm
On 22/06/19 19:42, Jiří Paleček wrote:
> On AMD processors, in 32-bit PAE mode, nested KVM instances don't
> work. The L0 host gets a kernel oops, which is related to
> arch.mmu->pae_root being NULL.
>
> The reason for this is that when setting up nested KVM instance,
> arch.mmu is set to &arch.guest_mmu (while normally, it would be
> &arch.root_mmu). However, the initialization and allocation of
> pae_root only creates it in root_mmu. KVM code (ie. in
> mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
> unallocated arch.guest_mmu->pae_root.
>
> This fix just allocates (and frees) pae_root in both guest_mmu and
> root_mmu (and also lm_root if it was allocated). The allocation is
> subject to previous restrictions ie. it won't allocate anything on
> 64-bit and AFAIK not on Intel.
>
> See bug 203923 for details.
>
> Signed-off-by: Jiri Palecek <jpalecek@web.de>
> Tested-by: Jiri Palecek <jpalecek@web.de>
>
> ---
> arch/x86/kvm/mmu.c | 30 ++++++++++++++++++++++--------
> 1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24843cf49579..efa8285bb56d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5592,13 +5592,13 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
> PT_PAGE_TABLE_LEVEL, lock_flush_tlb);
> }
>
> -static void free_mmu_pages(struct kvm_vcpu *vcpu)
> +static void free_mmu_pages(struct kvm_mmu *mmu)
> {
> - free_page((unsigned long)vcpu->arch.mmu->pae_root);
> - free_page((unsigned long)vcpu->arch.mmu->lm_root);
> + free_page((unsigned long)mmu->pae_root);
> + free_page((unsigned long)mmu->lm_root);
> }
>
> -static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> +static int alloc_mmu_pages(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
> {
> struct page *page;
> int i;
> @@ -5619,9 +5619,9 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> if (!page)
> return -ENOMEM;
>
> - vcpu->arch.mmu->pae_root = page_address(page);
> + mmu->pae_root = page_address(page);
> for (i = 0; i < 4; ++i)
> - vcpu->arch.mmu->pae_root[i] = INVALID_PAGE;
> + mmu->pae_root[i] = INVALID_PAGE;
>
> return 0;
> }
> @@ -5629,6 +5629,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> int kvm_mmu_create(struct kvm_vcpu *vcpu)
> {
> uint i;
> + int ret;
>
> vcpu->arch.mmu = &vcpu->arch.root_mmu;
> vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
> @@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
>
> vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
> - return alloc_mmu_pages(vcpu);
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
> + if (ret)
> + return ret;
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> + if (ret)
> + goto fail_allocate_root;
> +
> + return ret;
> + fail_allocate_root:
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> + return ret;
> }
>
> static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> @@ -6102,7 +6115,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
> void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
> {
> kvm_mmu_unload(vcpu);
> - free_mmu_pages(vcpu);
> + free_mmu_pages(&vcpu->arch.root_mmu);
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> mmu_free_memory_caches(vcpu);
> }
>
> --
> 2.23.0.rc1
>
Queued, thanks. The goto usage is somewhat borderline, but not unheard of.
Paolo
[parent not found: <87y2z7rmgw.fsf@debian>]
* Re: [PATCH] kvm: Nested KVM MMUs need PAE root too
[not found] <87y2z7rmgw.fsf@debian>
@ 2019-09-04 9:53 ` Vitaly Kuznetsov
2019-09-04 11:03 ` Jiri Palecek
0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Kuznetsov @ 2019-09-04 9:53 UTC (permalink / raw)
To: Jiří Paleček; +Cc: kvm, Paolo Bonzini, Sean Christopherson
Jiří Paleček <jpalecek@web.de> writes:
>
Your Subject line needs to be 'PATCH v2' as you're sending an updated
version.
> On AMD processors, in 32-bit PAE mode, nested KVM instances don't
> work. The L0 host gets a kernel oops, which is related to
> arch.mmu->pae_root being NULL.
>
> The reason for this is that when running nested KVM instance, arch.mmu
> is set to &arch.guest_mmu (while normally, it would be
> &arch.root_mmu). However, the initialization and allocation of
> pae_root only creates it in root_mmu. KVM code (ie. in
> mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
> unallocated arch.guest_mmu->pae_root.
>
> This fix just allocates (and frees) pae_root in both guest_mmu and
> root_mmu (and also lm_root if it was allocated). The allocation is
> subject to previous restrictions ie. it won't allocate anything on
> 64-bit and AFAIK not on Intel.
>
> See https://bugzilla.kernel.org/show_bug.cgi?id=203923 for details.
>
> Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
> Signed-off-by: Jiri Palecek <jpalecek@web.de>
> Tested-by: Jiri Palecek <jpalecek@web.de>
Well, this was commented on in v1 :-)
> ---
> arch/x86/kvm/mmu.c | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24843cf49579..6882963374e7 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5592,13 +5592,13 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
> PT_PAGE_TABLE_LEVEL, lock_flush_tlb);
> }
>
> -static void free_mmu_pages(struct kvm_vcpu *vcpu)
> +static void free_mmu_pages(struct kvm_mmu *mmu)
> {
> - free_page((unsigned long)vcpu->arch.mmu->pae_root);
> - free_page((unsigned long)vcpu->arch.mmu->lm_root);
> + free_page((unsigned long)mmu->pae_root);
> + free_page((unsigned long)mmu->lm_root);
> }
>
> -static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> +static int alloc_mmu_pages(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
> {
> struct page *page;
> int i;
> @@ -5619,9 +5619,9 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> if (!page)
> return -ENOMEM;
>
> - vcpu->arch.mmu->pae_root = page_address(page);
> + mmu->pae_root = page_address(page);
> for (i = 0; i < 4; ++i)
> - vcpu->arch.mmu->pae_root[i] = INVALID_PAGE;
> + mmu->pae_root[i] = INVALID_PAGE;
>
> return 0;
> }
> @@ -5629,6 +5629,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> int kvm_mmu_create(struct kvm_vcpu *vcpu)
> {
> uint i;
> + int ret;
>
> vcpu->arch.mmu = &vcpu->arch.root_mmu;
> vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
> @@ -5646,7 +5647,16 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
>
> vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
> - return alloc_mmu_pages(vcpu);
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
> + if (ret)
> + return ret;
> +
> + ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> + if (ret)
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> +
> + return ret;
> }
>
> static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> @@ -6102,7 +6112,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
> void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
> {
> kvm_mmu_unload(vcpu);
> - free_mmu_pages(vcpu);
> + free_mmu_pages(&vcpu->arch.root_mmu);
> + free_mmu_pages(&vcpu->arch.guest_mmu);
> mmu_free_memory_caches(vcpu);
> }
>
> --
> 2.23.0.rc1
>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
--
Vitaly
Thread overview: 8+ messages
2019-06-22 17:42 [PATCH] kvm: Nested KVM MMUs need PAE root too Jiří Paleček
2019-08-26 12:16 ` Vitaly Kuznetsov
2019-08-26 16:11 ` Sean Christopherson
2019-08-31 12:45 ` Jiri Palecek
2019-09-03 20:33 ` Sean Christopherson
2019-09-11 16:04 ` Paolo Bonzini
[not found] <87y2z7rmgw.fsf@debian>
2019-09-04 9:53 ` Vitaly Kuznetsov
2019-09-04 11:03 ` Jiri Palecek