From: George Dunlap
Subject: Re: [PATCH RFC V9 4/5] xen, libxc: Request page fault injection via libxc
Date: Wed, 10 Sep 2014 11:39:28 +0100
To: Razvan Cojocaru
Cc: "Tian, Kevin", Ian Campbell, Stefano Stabellini, Jun Nakajima,
 Andrew Cooper, Tim Deegan, "Dong, Eddie", Jan Beulich, Tamas K Lengyel,
 xen-devel, Ian Jackson

On Wed, Sep 10, 2014 at 9:55 AM, Razvan Cojocaru wrote:
> On 09/10/2014 11:48 AM, Andrew Cooper wrote:
>> On 10/09/2014 09:09, Razvan Cojocaru wrote:
>>> On 09/09/2014 09:38 PM, Tamas K Lengyel wrote:
>>>> > But ultimately, as Tim said, you're basically just *hoping* that it
>>>> > won't take too long to happen to be at the hypervisor when the proper
>>>> > condition happens. If the process in question isn't getting many
>>>> > interrupts, or is spending the vast majority of its time in the
>>>> > kernel, you may end up waiting an unbounded amount of time to be able
>>>> > to "catch" it in user mode. It seems like it would be better to find
>>>> > a reliable way to trap on the return into user mode, in which case you
>>>> > wouldn't need to have a special "wait for this complicated event to
>>>> > happen" call at all, would you?
>>>>
>>>> Indeed, but it is assumed that the trap injection request is being made
>>>> by the caller in the proper context (when it knows that the condition
>>>> will be true sooner rather than later).
>>>>
>>>>
>>>> How is it known that the condition will be true soon? Some more
>>>> information on what you consider 'proper context' would be valuable.
>>> It's actually pretty simple for us: the application always requests an
>>> injection when the guest is already in the address space of the
>>> interesting application, and in user mode.
>>
>> Does this mean that you always request a pagefault as a direct result of
>> a mem_event, when the vcpu is blocked in the correct context?
>
> Yes, exactly.
>
>> If so, how about extending the mem_event response mechanism with
>> trap/fault information?
>
> For this particular case, that is indeed a very good suggestion -
> however, things may change. From what I understand, it is likely that in
> the future we (or somebody else doing memory introspection) will need to
> request a page fault injection in other cases. The risks described above
> will of course exist in that case, but they are acceptable.

Sorry -- do you mean that you don't actually need this functionality
right now, but you think that maybe someone else might need it, or you
may need it in the future?

That doesn't sound very promising; at the moment it sounds like you're
not actually even testing this mechanism to make sure that it works
the way you hope it does.

I definitely think that the clean way to approach this is to try to
allow you to trap on a mem_event at the point where you want to inject
the page fault, and then allow the controller to inject the page fault
with the existing mechanisms. If at the moment you can already do
that reliably, then you don't need any extra functionality. (Although
being able to say, "continue this vcpu and inject a trap" might be
useful functionality.) If in the future you think you may not be able
to, then we can try to add mem_event notifications that can help you.

> Also, I'm not sure the "is it user mode?" check would reliably work at
> the time of calling hvm_set_cr3(). Are CR3 loads not always happening in
> kernel-mode?

Yes, cr3 writes happen in kernel mode, so just trapping cr3 accesses
won't get you what you need. But you could:

1. Trap on a cr3 access *being set to a certain value*
2. Then trap on some other condition which would signal the return to
   userspace (int3, page of the return address, &c)
3. Then inject the trap.

That would just be two round-trips to the introspector, which
shouldn't be all that bad.

 -George
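
For illustration only, the three steps above could be modelled as a small state machine on the introspector side. This is a toy sketch in Python, not Xen code: the event names ("cr3_write", "user_return"), the Introspector class and its fields are all invented here and do not correspond to any Xen, libxc or mem_event interface. It assumes the hypervisor only reports the events the controller has armed, so events that do not match are not counted as round-trips.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    WAIT_CR3 = auto()        # step 1: waiting for the target cr3 value to be loaded
    WAIT_USERSPACE = auto()  # step 2: waiting for a sign of the return to userspace
    DONE = auto()            # step 3 has run: the trap was injected


@dataclass
class Introspector:
    """Hypothetical controller model; names are assumptions, not a real API."""
    target_cr3: int
    fault_addr: int
    state: State = State.WAIT_CR3
    round_trips: int = 0
    injected: list = field(default_factory=list)

    def on_event(self, kind: str, value: int) -> None:
        """Process one (hypothetical) event; unarmed events are ignored."""
        if self.state is State.WAIT_CR3:
            # Step 1: only a cr3 write of the target value is interesting.
            if kind == "cr3_write" and value == self.target_cr3:
                self.round_trips += 1
                self.state = State.WAIT_USERSPACE
        elif self.state is State.WAIT_USERSPACE:
            # Step 2: some condition signalling the return to userspace
            # (e.g. an int3 or an access to the page of the return address).
            if kind == "user_return":
                self.round_trips += 1
                # Step 3: record the page fault that would be injected.
                self.injected.append(self.fault_addr)
                self.state = State.DONE


ctl = Introspector(target_cr3=0x1000, fault_addr=0x7F0000)
events = [
    ("cr3_write", 0x2000),  # some other address space: ignored
    ("user_return", 0),     # step 2 not armed yet: ignored
    ("cr3_write", 0x1000),  # step 1 matches
    ("user_return", 0),     # step 2 matches, step 3 injects
    ("user_return", 0),     # already done: ignored
]
for kind, value in events:
    ctl.on_event(kind, value)
print(ctl.state.name, ctl.round_trips, [hex(a) for a in ctl.injected])
```

In this model the injection completes after exactly two matching events, which is the "two round-trips to the introspector" cost mentioned above. The sketch does not model the target being scheduled out between steps 1 and 2.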