From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Nadav Har'El" Subject: Re: [PATCH 13/24] Implement VMREAD and VMWRITE Date: Wed, 4 Aug 2010 19:09:07 +0300 Message-ID: <20100804160907.GC15156@fermat.math.technion.ac.il> References: <1276431753-nyh@il.ibm.com> <201006131229.o5DCTDhk013030@rice.haifa.ibm.com> <4C15F802.7060106@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: kvm@vger.kernel.org To: Avi Kivity Return-path: Received: from mailgw13.technion.ac.il ([132.68.225.13]:57692 "EHLO mailgw13.technion.ac.il" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751440Ab0HDQJM (ORCPT ); Wed, 4 Aug 2010 12:09:12 -0400 Content-Disposition: inline In-Reply-To: <4C15F802.7060106@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On Mon, Jun 14, 2010, Avi Kivity wrote about "Re: [PATCH 13/24] Implement VMREAD and VMWRITE": > >+#ifdef CONFIG_X86_64 > >+ switch (vmcs_field_type(field)) { > >+ case VMCS_FIELD_TYPE_U64: case VMCS_FIELD_TYPE_ULONG: > >+ if (!is_long_mode(vcpu)) { > >+ kvm_register_write(vcpu, reg+1, field_value>> 32); > > > > What's this reg+1 thing? I thought vmread simply ignores the upper half. Thanks. Now that I look at this, I really can't figure out what it was supposed to be doing. Maybe it was supposed to attempt to support running 64-bit guests on a 32-bit host, or something, I don't know. Anyway, I removed it now. > >+ kvm_write_guest_virt_system(gva,&field_value, > >+ vmcs_field_size(vmcs_field_type(field), vcpu), > >+ vcpu, NULL); > > > > vmread doesn't support 64-bit writes to memory outside long mode, so > you'll have to truncate the write. > > I think you'll be better off returning a 32-bit size in > vmcs_field_size() in these cases. I think Gleb's correction (see my separate reply to him) was more accurate, that the length of the write actually has nothing to do with the field size - in 32-bit mode we always write 4 bytes, and in 64-bit mode, we always write 8 bytes - even if the given field to VMREAD was shorter or longer than those sizes. So now I have the code like this: kvm_write_guest_virt_system(gva, &field_value, (is_long_mode(vcpu) ? 8 : 4), vcpu, NULL); > >+ if (!nested_map_current(vcpu)) { > >+ printk(KERN_INFO "%s invalid shadow vmcs\n", __func__); > >+ set_rflags_to_vmx_fail_invalid(vcpu); > >+ return 1; > >+ } > > > > Can do the read_any() here. Right, and indeed it will make the code look better. Thanks, done. > >+ if (read_succeed) { > >+ clear_rflags_cf_zf(vcpu); > >+ skip_emulated_instruction(vcpu); > >+ } else { > >+ set_rflags_to_vmx_fail_valid(vcpu); > >+ vmcs_write32(VM_INSTRUCTION_ERROR, 12); > > > > s_e_i() in any case but an exception. Yes, I missed a bunch of those and will be a lot more careful now. Of course the vmcs_write32 above was also completely broken and I fixed it to write to vmcs12 (I discussed this issue in a reply to a different patch). > >+ kvm_read_guest_virt(gva,&field_value, > >+ vmcs_field_size(field_type, vcpu), vcpu, NULL); > > > > Check for exception. I am not sure what I should really do here... In emulating VMWRITE, we try to read from memory the pointer of where to store the result, and kvm_read_guest_virt fails (return !=0). What shall I do in such a case, queue a PF_VECTOR? Or did you have something else in mind? Thanks, and I'm attaching below a newer version of this patch with most of your comments fixed (except the one I asked about in the last paragraph). Nadav. ---- Subject: [PATCH 14/26] nVMX: Implement VMREAD and VMWRITE Implement the VMREAD and VMWRITE instructions. With these instructions, L1 can read and write to the VMCS it is holding. The values are read or written to the fields of the vmcs_fields structure introduced in the previous patch. Signed-off-by: Nadav Har'El --- arch/x86/kvm/vmx.c | 182 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 180 insertions(+), 2 deletions(-) --- .before/arch/x86/kvm/vmx.c 2010-08-04 19:07:21.000000000 +0300 +++ .after/arch/x86/kvm/vmx.c 2010-08-04 19:07:21.000000000 +0300 @@ -4180,6 +4180,184 @@ static int handle_vmclear(struct kvm_vcp return 1; } +enum vmcs_field_type { + VMCS_FIELD_TYPE_U16 = 0, + VMCS_FIELD_TYPE_U64 = 1, + VMCS_FIELD_TYPE_U32 = 2, + VMCS_FIELD_TYPE_ULONG = 3 +}; + +static inline int vmcs_field_type(unsigned long field) +{ + if (0x1 & field) /* one of the *_HIGH fields, all are 32 bit */ + return VMCS_FIELD_TYPE_U32; + return (field >> 13) & 0x3 ; +} + +static inline int vmcs_field_size(int field_type, struct kvm_vcpu *vcpu) +{ + switch (field_type) { + case VMCS_FIELD_TYPE_U16: + return 2; + case VMCS_FIELD_TYPE_U32: + return 4; + case VMCS_FIELD_TYPE_U64: + return 8; + case VMCS_FIELD_TYPE_ULONG: + return is_long_mode(vcpu) ? 8 : 4; + } + BUG(); /* can never happen */ +} + +static inline int vmcs_field_readonly(unsigned long field) +{ + return (((field >> 10) & 0x3) == 1); +} + +static inline bool vmcs12_read_any(struct kvm_vcpu *vcpu, + unsigned long field, u64 *ret) +{ + short offset = vmcs_field_to_offset(field); + char *p; + + if (offset < 0) + return 0; + + p = ((char *)(get_vmcs12_fields(vcpu))) + offset; + + switch (vmcs_field_type(field)) { + case VMCS_FIELD_TYPE_ULONG: + *ret = *((unsigned long *)p); + return 1; + case VMCS_FIELD_TYPE_U16: + *ret = (u16) *((unsigned long *)p); + return 1; + case VMCS_FIELD_TYPE_U32: + *ret = (u32) *((unsigned long *)p); + return 1; + case VMCS_FIELD_TYPE_U64: + *ret = *((u64 *)p); + return 1; + default: + return 0; /* can never happen. */ + } +} + +static int handle_vmread(struct kvm_vcpu *vcpu) +{ + unsigned long field; + u64 field_value; + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); + u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); + gva_t gva = 0; + + if (!nested_vmx_check_permission(vcpu)) + return 1; + + /* decode instruction info and find the field to read */ + field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf)); + if(!vmcs12_read_any(vcpu, field, &field_value)){ + nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); + skip_emulated_instruction(vcpu); + return 1; + } + + /* + * and now check if reuqest to put the value in register or memory. + * Note that the number of bits actually written is 32 or 64 depending + * in the mode, not on the given field's length. + */ + if (vmx_instruction_info & (1u << 10)) { + kvm_register_write(vcpu, (((vmx_instruction_info) >> 3) & 0xf), + field_value); + } else { + if (get_vmx_mem_address(vcpu, exit_qualification, + vmx_instruction_info, &gva)) + return 1; + /* ok to use *_system, because handle_vmread verified cpl=0 */ + kvm_write_guest_virt_system(gva, &field_value, + (is_long_mode(vcpu) ? 8 : 4), vcpu, NULL); + } + + nested_vmx_succeed(vcpu); + skip_emulated_instruction(vcpu); + return 1; +} + + +static int handle_vmwrite(struct kvm_vcpu *vcpu) +{ + unsigned long field; + u64 field_value = 0; + gva_t gva; + int field_type; + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); + u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); + char *p; + short offset; + + if (!nested_vmx_check_permission(vcpu)) + return 1; + + field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf)); + + if (vmcs_field_readonly(field)) { + nested_vmx_failValid(vcpu, + VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT); + skip_emulated_instruction(vcpu); + return 1; + } + + field_type = vmcs_field_type(field); + + offset = vmcs_field_to_offset(field); + if (offset < 0) { + nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); + skip_emulated_instruction(vcpu); + return 1; + } + p = ((char *) get_vmcs12_fields(vcpu)) + offset; + + if (vmx_instruction_info & (1u << 10)) + field_value = kvm_register_read(vcpu, + (((vmx_instruction_info) >> 3) & 0xf)); + else { + if (get_vmx_mem_address(vcpu, exit_qualification, + vmx_instruction_info, &gva)) + return 1; + kvm_read_guest_virt(gva, &field_value, + vmcs_field_size(field_type, vcpu), vcpu, NULL); + } + + switch (field_type) { + case VMCS_FIELD_TYPE_U16: + *(u16 *)p = field_value; + break; + case VMCS_FIELD_TYPE_U32: + *(u32 *)p = field_value; + break; + case VMCS_FIELD_TYPE_U64: +#ifdef CONFIG_X86_64 + *(unsigned long *)p = field_value; +#else + *(unsigned long *)p = field_value; + *(((unsigned long *)p)+1) = field_value >> 32; +#endif + break; + case VMCS_FIELD_TYPE_ULONG: + *(unsigned long *)p = field_value; + break; + default: + nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); + skip_emulated_instruction(vcpu); + return 1; + } + + nested_vmx_succeed(vcpu); + skip_emulated_instruction(vcpu); + return 1; +} + static bool verify_vmcs12_revision(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) { if (vmcs12->revision_id == VMCS12_REVISION) @@ -4546,9 +4724,9 @@ static int (*kvm_vmx_exit_handlers[])(st [EXIT_REASON_VMLAUNCH] = handle_vmx_insn, [EXIT_REASON_VMPTRLD] = handle_vmptrld, [EXIT_REASON_VMPTRST] = handle_vmptrst, - [EXIT_REASON_VMREAD] = handle_vmx_insn, + [EXIT_REASON_VMREAD] = handle_vmread, [EXIT_REASON_VMRESUME] = handle_vmx_insn, - [EXIT_REASON_VMWRITE] = handle_vmx_insn, + [EXIT_REASON_VMWRITE] = handle_vmwrite, [EXIT_REASON_VMOFF] = handle_vmoff, [EXIT_REASON_VMON] = handle_vmon, [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold, -- Nadav Har'El | Wednesday, Aug 4 2010, 25 Av 5770 nyh@math.technion.ac.il |----------------------------------------- Phone +972-523-790466, ICQ 13349191 |I am the boss of the house, and I have my http://nadav.harel.org.il |wife's permission to say so!