From mboxrd@z Thu Jan 1 00:00:00 1970 From: Razvan Cojocaru Subject: Re: [PATCH V2 3/3] xen/vm_event: Deny register writes if refused by vm_event reply Date: Fri, 26 Jun 2015 12:17:32 +0300 Message-ID: <558D18AC.70203@bitdefender.com> References: <1434359007-9302-1-git-send-email-rcojocaru@bitdefender.com> <1434359007-9302-4-git-send-email-rcojocaru@bitdefender.com> <558D2954020000780008A029@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <558D2954020000780008A029@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: tim@xen.org, kevin.tian@intel.com, wei.liu2@citrix.com, ian.campbell@citrix.com, stefano.stabellini@eu.citrix.com, jun.nakajima@intel.com, andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, xen-devel@lists.xen.org, eddie.dong@intel.com, Aravind.Gopalakrishnan@amd.com, suravee.suthikulpanit@amd.com, keir@xen.org, boris.ostrovsky@oracle.com List-Id: xen-devel@lists.xenproject.org On 06/26/2015 11:28 AM, Jan Beulich wrote: >>>> On 15.06.15 at 11:03, wrote: >> Deny register writes if a vm_client subscribed to mov_to_msr events >> forbids them. Currently supported for MSR, CR0, CR3 and CR4 events. > > Is the first sentence referring the mov_to_msr stale? Yes, the patch now applies to MSR, CR0, CR3 and CR4. I'll update the patch description. >> --- a/xen/arch/x86/hvm/event.c >> +++ b/xen/arch/x86/hvm/event.c >> @@ -90,7 +90,7 @@ static int hvm_event_traps(uint8_t sync, vm_event_request_t *req) >> return 1; >> } >> >> -void hvm_event_cr(unsigned int index, unsigned long value, unsigned long old) >> +bool_t hvm_event_cr(unsigned int index, unsigned long value, unsigned long old) >> { >> struct arch_domain *currad = ¤t->domain->arch; >> unsigned int ctrlreg_bitmask = monitor_ctrlreg_bitmask(index); >> @@ -109,7 +109,10 @@ void hvm_event_cr(unsigned int index, unsigned long value, unsigned long old) >> >> hvm_event_traps(currad->monitor.write_ctrlreg_sync & ctrlreg_bitmask, >> &req); >> + return 1; >> } >> + >> + return 0; >> } >> >> void hvm_event_msr(unsigned int msr, uint64_t value) > > Why is knowing whether an event was sent relevant for > hvm_event_cr(), but not for hvm_event_msr()? MSR events don't honor onchangeonly, so they're always being sent. A CR event won't be sent if onchangeonly is true and the register is being set to the same value again. Since the change prompted the question, I should perhaps add a comment to explain that? >> @@ -468,6 +469,37 @@ void hvm_do_resume(struct vcpu *v) >> } >> } >> >> + ASSERT(v == current); >> + >> + if ( d->arch.event_write_data ) > > unlikely() to not penalize the common case? Also an ASSERT() like > this belongs either at the beginning of the function or, if really only > relevant with the changes you do here, inside the if(). Ack, will move the ASSERT() and wrap the condition with an unlikely(). >> @@ -3189,12 +3221,13 @@ static void hvm_update_cr(struct vcpu *v, unsigned int cr, unsigned long value) >> hvm_update_guest_cr(v, cr); >> } >> >> -int hvm_set_cr0(unsigned long value) >> +int hvm_set_cr0(unsigned long value, bool_t event_only) > > "event_only" seems pretty misleading, as it limits the function's > operation only when ... > >> @@ -3224,6 +3257,22 @@ int hvm_set_cr0(unsigned long value) >> goto gpf; >> } >> >> + if ( event_only && unlikely(currad->monitor.write_ctrlreg_enabled & >> + monitor_ctrlreg_bitmask(VM_EVENT_X86_CR0)) && >> + value != old_value ) > > ... a number of other conditions are true. Maybe "may_defer" or > some such making clear this is conditional behavior? > > Also - indentation. Ack on both counts. >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -2010,9 +2010,9 @@ static int vmx_cr_access(unsigned long exit_qualification) >> } >> case VMX_CONTROL_REG_ACCESS_TYPE_CLTS: { >> unsigned long old = curr->arch.hvm_vcpu.guest_cr[0]; >> + hvm_event_crX(CR0, curr->arch.hvm_vcpu.guest_cr[0], old); >> curr->arch.hvm_vcpu.guest_cr[0] &= ~X86_CR0_TS; >> vmx_update_guest_cr(curr, 0); >> - hvm_event_crX(CR0, curr->arch.hvm_vcpu.guest_cr[0], old); >> HVMTRACE_0D(CLTS); >> break; >> } > > I suppose it is intentional to ignore a possible deny here? If so, > shouldn't the be documented by way of a comment? Actually on second thought I should probably not ignore a deny there. While this wasn't required in tests, it's probably better to be consistent. > Also, since you already touch this code, please add a blank line > between declaration and statements. Ack. >> --- a/xen/arch/x86/hvm/vmx/vvmx.c >> +++ b/xen/arch/x86/hvm/vmx/vvmx.c >> @@ -1048,15 +1048,16 @@ static void load_shadow_guest_state(struct vcpu *v) >> >> nvcpu->guest_cr[0] = __get_vvmcs(vvmcs, CR0_READ_SHADOW); >> nvcpu->guest_cr[4] = __get_vvmcs(vvmcs, CR4_READ_SHADOW); >> - hvm_set_cr0(__get_vvmcs(vvmcs, GUEST_CR0)); >> - hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4)); >> - hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3)); >> + hvm_set_cr0(__get_vvmcs(vvmcs, GUEST_CR0), 1); >> + hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4), 1); >> + hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3), 1); > > Are you sure there are no dependencies between the individual > registers getting set? And what about the order here versus how > you carry out the deferred writes in hvm_do_resume()? I don't > think any good can come from updating CR3 before CR4... Ack, I'll mirror this order in hvm_do_resume(). >> control = __get_vvmcs(vvmcs, VM_ENTRY_CONTROLS); >> if ( control & VM_ENTRY_LOAD_GUEST_PAT ) >> hvm_set_guest_pat(v, __get_vvmcs(vvmcs, GUEST_PAT)); >> if ( control & VM_ENTRY_LOAD_PERF_GLOBAL_CTRL ) >> - hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, __get_vvmcs(vvmcs, GUEST_PERF_GLOBAL_CTRL)); >> + hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, >> + __get_vvmcs(vvmcs, GUEST_PERF_GLOBAL_CTRL), 1); > > So one special MSR gets potentially denied writes to - how about > all the other ones like PAT (visible above), EFER, etc? And once > you send events for multiple ones - how would you know which > ones were denied access to be the time to reach hvm_do_resume()? > > All the same questions of course apply to nested SVM code too, just > that there the problems are leass easily visible from looking at the > patch. "Regular" MSRs (the ones that go through hvm_msr_write_intercept()) are sufficient in all the introspection use cases we've tested. The code would of course have the problem you've mentioned if the code would be modified to include PAT & friends, but for now this doesn't seem to be an issue. Should I add a comment, maybe in hvm_do_resume()? >> @@ -1249,15 +1250,16 @@ static void load_vvmcs_host_state(struct vcpu *v) >> __vmwrite(vmcs_h2g_field[i].guest_field, r); >> } >> >> - hvm_set_cr0(__get_vvmcs(vvmcs, HOST_CR0)); >> - hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4)); >> - hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3)); >> + hvm_set_cr0(__get_vvmcs(vvmcs, HOST_CR0), 1); >> + hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4), 1); >> + hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3), 1); >> >> control = __get_vvmcs(vvmcs, VM_EXIT_CONTROLS); >> if ( control & VM_EXIT_LOAD_HOST_PAT ) >> hvm_set_guest_pat(v, __get_vvmcs(vvmcs, HOST_PAT)); >> if ( control & VM_EXIT_LOAD_PERF_GLOBAL_CTRL ) >> - hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, __get_vvmcs(vvmcs, HOST_PERF_GLOBAL_CTRL)); >> + hvm_msr_write_intercept(MSR_CORE_PERF_GLOBAL_CTRL, >> + __get_vvmcs(vvmcs, HOST_PERF_GLOBAL_CTRL), 1); > > Considering these together with the above - do you really want/ > need to intercept and send events for both host and guest shadow > state changes? Guest MSRs should suffice indeed. I was just trying to be consistent, but no, the host changes should not be necessary. >> --- a/xen/arch/x86/mm/p2m.c >> +++ b/xen/arch/x86/mm/p2m.c >> @@ -1417,6 +1417,35 @@ static void p2m_vm_event_fill_regs(vm_event_request_t *req) >> void p2m_mem_access_emulate_check(struct vcpu *v, >> const vm_event_response_t *rsp) >> { >> + ASSERT(v->domain->arch.event_write_data != NULL); >> + >> + if ( rsp->flags & MEM_ACCESS_DENY ) >> + { >> + struct monitor_write_data *w = >> + &v->domain->arch.event_write_data[v->vcpu_id]; >> + > > I think the ASSERT() above belongs here. Ack. >> --- a/xen/include/asm-x86/domain.h >> +++ b/xen/include/asm-x86/domain.h >> @@ -250,6 +250,21 @@ struct pv_domain >> struct mapcache_domain mapcache; >> }; >> >> +struct monitor_write_data { >> + struct { >> + uint8_t msr : 1; >> + uint8_t cr0 : 1; >> + uint8_t cr3 : 1; >> + uint8_t cr4 : 1; > > unsigned int Ack. > >> + } do_write; >> + >> + uint64_t msr; > > unsigned int or uint32_t Ack. > >> --- a/xen/include/public/vm_event.h >> +++ b/xen/include/public/vm_event.h >> @@ -158,6 +158,11 @@ struct vm_event_regs_x86 { >> * MEM_ACCESS_EMULATE_NOWRITE. >> */ >> #define MEM_ACCESS_SET_EMUL_READ_DATA (1 << 8) >> + /* >> + * Deny completion of the operation that triggered the event. >> + * Currently only useful for MSR write events. >> + */ >> +#define MEM_ACCESS_DENY (1 << 9) > > Same as for the description - isn't the comment, referring to only > MSR writes, stale? It is. Will update it. Thanks, Razvan