From mboxrd@z Thu Jan 1 00:00:00 1970
From: Razvan Cojocaru
Subject: Re: [PATCH RFC V9 4/5] xen, libxc: Request page fault injection via libxc
Date: Wed, 10 Sep 2014 12:30:48 +0300
Message-ID: <54101A48.8030401@bitdefender.com>
References: <53FF38A6020000780002EB2B@mail.emea.novell.com> <54002F43.4070802@bitdefender.com> <5400638A020000780002EFD6@mail.emea.novell.com> <540421E1.9020505@bitdefender.com> <540453C8020000780002F59C@mail.emea.novell.com> <54045E7C.50604@bitdefender.com> <54047D1D020000780002F73A@mail.emea.novell.com> <54058B4E.9060001@bitdefender.com> <20140902132434.GA24202@deinos.phlegethon.org> <20140909201435.GB82414@deinos.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1XReEb-000604-O3 for xen-devel@lists.xenproject.org; Wed, 10 Sep 2014 09:30:41 +0000
Received: from smtp01.buh.bitdefender.com (smtp.bitdefender.biz [10.17.80.75]) by mx-sr.buh.bitdefender.com (Postfix) with ESMTP id 8A61680086 for ; Wed, 10 Sep 2014 12:30:38 +0300 (EEST)
In-Reply-To: <20140909201435.GB82414@deinos.phlegethon.org>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Tim Deegan, George Dunlap
Cc: "Tian, Kevin", Ian Campbell, Stefano Stabellini, Jun Nakajima, Andrew Cooper, "Dong, Eddie", Jan Beulich, xen-devel, Ian Jackson
List-Id: xen-devel@lists.xenproject.org

On 09/09/2014 11:14 PM, Tim Deegan wrote:
> At 17:57 +0100 on 09 Sep (1410281829), George Dunlap wrote:
>> On Tue, Sep 2, 2014 at 2:24 PM, Tim Deegan wrote:
>>> Hi,
>>>
>>> At 12:18 +0300 on 02 Sep (1409656686), Razvan Cojocaru wrote:
>>>> While we need to set the data per-domain and have whatever VCPU inject
>>>> the page fault - _but_only_if_ it's in usermode and its CR3 points to
>>>> something interesting.
>>>
>>> That's a strange and specific thing to ask the hypervisor to do for
>>> you. Given that you can already trap CR3 changes as mem-events can
>>> you trigger the fault injection in response to the context switch?
>>> I guess that would probably catch it in kernel mode. :(
>>
>> I was wondering, rather than special-casing inject_trap, would it make
>> sense to be able for the memory controller to get notifications when
>> certain more complex conditions happen (e.g., "some vcpu is in user
>> mode with this CR3")? Then the controller could ask to be notified
>> when the event happens, and when it does, just call inject_fault.
>
> Yes, that sounds like a better place to put this kind of test. As
> part of the mem_event trigger framework it doesn't seem nearly so out
> of place (and it avoids many of the problems of clashes between
> different event injection paths).

Do you mean someplace here (hvm.c)?

int hvm_set_cr3(unsigned long value)
{
    struct vcpu *v = current;
    struct page_info *page;
    unsigned long old;

    if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) &&
         (value != v->arch.hvm_vcpu.guest_cr[3]) )
    {
        /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
        HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value);
        page = get_page_from_gfn(v->domain, value >> PAGE_SHIFT,
                                 NULL, P2M_ALLOC);
        if ( !page )
            goto bad_cr3;

        put_page(pagetable_get_page(v->arch.guest_table));
        v->arch.guest_table = pagetable_from_page(page);

        HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR3 value = %lx", value);
    }

    old = v->arch.hvm_vcpu.guest_cr[3];
    v->arch.hvm_vcpu.guest_cr[3] = value;
    paging_update_cr3(v);
    hvm_memory_event_cr3(value, old);
    return X86EMUL_OKAY;

 bad_cr3:
    gdprintk(XENLOG_ERR, "Invalid CR3\n");
    domain_crash(v->domain);
    return X86EMUL_UNHANDLEABLE;
}

Alongside hvm_memory_event_cr3(value, old), have another function that
checks the new value against an array of CR3s, checks whether v is in
user mode, and sends out an event if both hold?

As I've explained in my earlier reply to Tamas, the way we use the
injection request hypercall now, the conditions normally already hold,
so the injection happens immediately.

Also, I'm not sure the "is it user mode?" check would work reliably at
the time hvm_set_cr3() is called. Don't CR3 loads always happen in
kernel mode?

>> That way, inject_fault isn't special-cased at all; and one could
>> imagine designing the "condition" such that any number of interesting
>> conditions could be trapped.
>>
>> Thoughts?
>>
>> But ultimately, as Tim said, you're basically just *hoping* that it
>> won't take too long to happen to be at the hypervisor when the proper
>> condition happens. If the process in question isn't getting many
>> interrupts, or is spending the vast majority of its time in the
>> kernel, you may end up waiting an unbounded amount of time to be able
>> to "catch" it in user mode. It seems like it would be better to find
>> a reliable way to trap on the return into user mode, in which case you
>> wouldn't need to have a special "wait for this complicated event to
>> happen" call at all, would you?
>
> Yeah; I was thinking about page-protecting the process's stack as an
> approach to this. Breakpointing the return address might work too but
> would probably cause more false alarms -- you'd at least need to walk
> up past the libc/win32 wrappers to avoid trapping every thread.
>
> Ideally there'd be something vcpu-specific we could tinker with
> (e.g. arranging MSRs so that SYSRET will fault) once we see the right
> CR3 (assuming intercepting CR3 is cheap enough in this case).

All valid suggestions; however, they would seem to have a greater impact
on guest responsiveness than the current approach, since there would be
quite a lot of CR3 loads and SYSRETs to intercept.


Thanks,
Razvan Cojocaru
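
For illustration only, here is a minimal, standalone sketch of the check
discussed above: compare a vCPU's CR3 against a small watch list and
report a match only when the vCPU is in user mode. It is not Xen code.
The names cr3_watch, vcpu_state, cr3_watch_add(), cr3_is_watched() and
should_send_event(), as well as the MAX_WATCHED_CR3 limit, are all
hypothetical, and the privilege level is modelled as a plain integer
rather than read from the guest's segment state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WATCHED_CR3 8   /* hypothetical limit, chosen for the sketch */

/* Hypothetical per-domain watch list; not an existing Xen structure. */
struct cr3_watch {
    uint64_t cr3[MAX_WATCHED_CR3];
    size_t count;
};

/* Stand-in for the small part of vCPU state that the check needs. */
struct vcpu_state {
    uint64_t cr3;
    unsigned int cpl;       /* privilege level: 0 = kernel, 3 = user */
};

/* Add a CR3 value to the watch list; returns false when the list is full. */
bool cr3_watch_add(struct cr3_watch *w, uint64_t cr3)
{
    assert(w != NULL);
    if (w->count >= MAX_WATCHED_CR3)
        return false;
    w->cr3[w->count++] = cr3;
    return true;
}

/* Linear scan: is this CR3 value one of the watched ones? */
bool cr3_is_watched(const struct cr3_watch *w, uint64_t cr3)
{
    assert(w != NULL);
    for (size_t i = 0; i < w->count; i++)
        if (w->cr3[i] == cr3)
            return true;
    return false;
}

/* True only when the vCPU is in user mode AND its CR3 is watched. */
bool should_send_event(const struct cr3_watch *w, const struct vcpu_state *v)
{
    assert(v != NULL);
    return v->cpl == 3 && cr3_is_watched(w, v->cr3);
}
```

If a check like should_send_event() were evaluated at the point where
CR3 is written, the vCPU would be expected to be at privilege level 0
(writing CR3 is a privileged operation), so the user-mode condition
would not hold there. That is the concern raised above about doing the
"is it user mode?" test from hvm_set_cr3().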