On 2014-06-29 16:32, Jan Kiszka wrote:
> On 2014-06-29 16:27, Gleb Natapov wrote:
>> On Sun, Jun 29, 2014 at 04:01:04PM +0200, Borislav Petkov wrote:
>>> On Sun, Jun 29, 2014 at 04:42:47PM +0300, Gleb Natapov wrote:
>>>> Please do so and let us know.
>>>
>>> Yep, just did. Reverting ae9fedc793 fixes the issue.
>>>
>>>> reinj:1 means that the previous injection failed due to another #PF
>>>> that happened during the event injection itself. This may happen if
>>>> the GDT or the first instruction of a fault handler is not mapped by
>>>> shadow pages, but here it says that the new page fault is at the same
>>>> address as the previous one, as if the GDT or the #PF handler were
>>>> mapped there. Strange. Especially since the #DF is injected
>>>> successfully, so the GDT should be fine. Maybe a wrong cpl makes svm
>>>> crazy?
>>>
>>> Well, I'm not going to even pretend to know kvm well enough to know
>>> *when* we're saving VMCB state, but if we're saving the wrong CPL and
>>> then doing the pagetable walk, I can very well imagine the walker
>>> getting confused. One possible issue could be the U/S bit (bit 2) in
>>> the PTE, which allows access to supervisor pages only when CPL < 3.
>>> I.e., CPL has an effect on the pagetable walk, and a wrong CPL level
>>> could break it.
>>>
>>> All a conjecture though...
>>>
>> Looks plausible, though it's still strange that the second #PF is at
>> the same address as the first one. Anyway, now we have the commit to
>> blame.
>
> I suspect there is a gap between cause and effect. I'm tracing CPL
> changes currently, and my first impression is that QEMU triggers an
> unwanted switch from CPL 3 to 0 on vmport access:
>
>  qemu-system-x86-11883 [001]  7493.378630: kvm_entry:            vcpu 0
>  qemu-system-x86-11883 [001]  7493.378631: bprint:               svm_vcpu_run: entry cpl 0
>  qemu-system-x86-11883 [001]  7493.378636: bprint:               svm_vcpu_run: exit cpl 3
>  qemu-system-x86-11883 [001]  7493.378637: kvm_exit:             reason io rip 0x400854 info 56580241 400855
>  qemu-system-x86-11883 [001]  7493.378640: kvm_emulate_insn:     0:400854:ed (prot64)
>  qemu-system-x86-11883 [001]  7493.378642: kvm_userspace_exit:   reason KVM_EXIT_IO (2)
>  qemu-system-x86-11883 [001]  7493.378655: bprint:               kvm_arch_vcpu_ioctl_get_sregs: ss.dpl 0
>  qemu-system-x86-11883 [001]  7493.378684: bprint:               kvm_arch_vcpu_ioctl_set_sregs: ss.dpl 0
>  qemu-system-x86-11883 [001]  7493.378685: bprint:               svm_set_segment: cpl = 0
>  qemu-system-x86-11883 [001]  7493.378711: kvm_pio:              pio_read at 0x5658 size 4 count 1 val 0x3442554a
>
> Yeah... do we have to manually sync save.cpl into ss.dpl on get_sregs
> on AMD?

Applying this logic:

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ec8366c..b5e994a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1462,6 +1462,7 @@ static void svm_get_segment(struct kvm_vcpu *vcpu,
 		 */
 		if (var->unusable)
 			var->db = 0;
+		var->dpl = to_svm(vcpu)->vmcb->save.cpl;
 		break;
 	}
 }

...and my VM runs smoothly so far. Does it make sense in all scenarios?

Jan
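
P.S.: For anyone following along, here is a minimal, self-contained
user-space sketch of the round-trip shown in the trace above. It is not
kernel code; all struct and function names are made up. It only models
how deriving CPL from a stale cached SS.DPL loses the ring-3 state on a
GET_SREGS/SET_SREGS round-trip, and how refreshing DPL from the
hardware-maintained save.cpl in get_segment avoids that.

/*
 * Illustration only: a toy model of the GET_SREGS/SET_SREGS round-trip.
 * Names are hypothetical, not taken from arch/x86/kvm/svm.c.
 */
#include <stdio.h>

struct seg   { int dpl; };
struct vmcb  { int cpl; struct seg ss; };  /* hardware CPL + cached SS attrs */
struct sregs { struct seg ss; };           /* what GET/SET_SREGS carries     */

/* Models the get_segment side: returns the cached SS; with the fix it
 * refreshes DPL from the hardware-maintained save.cpl. */
static void get_segment(const struct vmcb *v, struct seg *out, int with_fix)
{
	*out = v->ss;
	if (with_fix)
		out->dpl = v->cpl;
}

/* Models the SET_SREGS side once CPL is derived from SS.DPL. */
static void set_segment(struct vmcb *v, const struct seg *in)
{
	v->ss = *in;
	v->cpl = in->dpl;
}

int main(void)
{
	for (int with_fix = 0; with_fix <= 1; with_fix++) {
		/* Guest runs in ring 3, but the cached SS.DPL is stale (0). */
		struct vmcb v = { .cpl = 3, .ss = { .dpl = 0 } };
		struct sregs s;

		get_segment(&v, &s.ss, with_fix);  /* userspace: KVM_GET_SREGS on the vmport exit */
		set_segment(&v, &s.ss);            /* userspace: KVM_SET_SREGS writes it back     */

		printf("%s fix: CPL after round-trip = %d\n",
		       with_fix ? "with   " : "without", v.cpl);
	}
	return 0;
}

Without the extra assignment the round-trip ends with CPL 0, matching
the unwanted CPL 3 -> 0 switch in the trace; with it, CPL 3 survives.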