From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
Date: Thu, 25 Jun 2015 10:00:03 +0100
Message-ID: <558BC313.6060305@citrix.com>
References: <1434999372-3688-1-git-send-email-edmund.h.white@intel.com>
 <558A4297.9090806@bitdefender.com>
 <558AB2B3.4030106@bitdefender.com>
 <558ADE4D.9060303@intel.com>
 <558B28FE.5060803@intel.com>
 <558B3563.9070402@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <558B3563.9070402@intel.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ed White, "Lengyel, Tamas"
Cc: Ravi Sahita, Wei Liu, Razvan Cojocaru, Tim Deegan, Ian Jackson, Xen-devel, Jan Beulich, Daniel De Graaf
List-Id: xen-devel@lists.xenproject.org

On 24/06/15 23:55, Ed White wrote:
> On 06/24/2015 03:45 PM, Lengyel, Tamas wrote:
>> On Wed, Jun 24, 2015 at 6:02 PM, Ed White wrote:
>>
>>> On 06/24/2015 02:34 PM, Lengyel, Tamas wrote:
>>>> Hi Ed,
>>>> I tried the system using memsharing and I collected the following crash
>>>> log. In this test I ran memsharing on all pages of the domain before
>>>> activating altp2m and creating the view. Afterwards I used my updated
>>>> xen-access to create a copy of this p2m with only R/X permissions. The idea
>>>> would be that the altp2m view remains completely shared, while the hostp2m
>>>> would be able to do its CoW propagation as the domain is executing.
>>>>
>>>> (XEN) mm locking order violation: 278 > 239
>>>> (XEN) Xen BUG at mm-locks.h:68
>>>> (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]----
>>>> (XEN) CPU: 2
>>>> (XEN) RIP: e008:[] p2m_altp2m_propagate_change+0x85/0x4a9
>>>> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d6v0)
>>>> (XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
>>>> (XEN) rdx: ffff8302163a8000 rsi: 000000000000000a rdi: ffff82d0802a069c
>>>> (XEN) rbp: ffff8302163afa68 rsp: ffff8302163af9e8 r8: ffff83021c000000
>>>> (XEN) r9: 0000000000000003 r10: 00000000000000ef r11: 0000000000000003
>>>> (XEN) r12: ffff83010cc51820 r13: 0000000000000000 r14: ffff830158d90000
>>>> (XEN) r15: 0000000000025697 cr0: 0000000080050033 cr4: 00000000001526f0
>>>> (XEN) cr3: 00000000dbba3000 cr2: 00000000778c9714
>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>>>> (XEN) Xen stack trace from rsp=ffff8302163af9e8:
>>>> (XEN)    ffff8302163af9f8 00000000803180f8 000000000000000c ffff82d0801892ee
>>>> (XEN)    ffff82d0801fb4d1 ffff83010cc51de0 000000000008ff49 ffff82d08012f86a
>>>> (XEN)    ffff83010cc51820 ffff83010cc51820 0000000000000000 0000000000000000
>>>> (XEN)    ffff83010cc51820 0000000000000000 ffff8300dbb334b8 ffff8302163afa00
>>>> (XEN)    ffff8302163afb18 ffff82d0801fd549 0000000500000009 ffff830200000001
>>>> (XEN)    0000000000000001 ffff830158d90000 0000000000000002 000000000008ff49
>>>> (XEN)    0000000000025697 000000000000000c ffff8302163afae8 80c000008ff49175
>>>> (XEN)    80c00000d0a97175 01ff83010cc51820 0000000000000097 ffff8300dbb33000
>>>> (XEN)    ffff8302163afb78 000000000008ff49 0000000000000000 0000000000000001
>>>> (XEN)    0000000000025697 ffff83010cc51820 ffff8302163afb38 ffff82d0801fd644
>>>> (XEN)    ffffffffffffffff 00000000000d0a97 ffff8302163afb98 ffff82d0801f23c5
>>>> (XEN)    ffff830158d90000 000000000cc51820 ffff830158d90000 000000000000000c
>>>> (XEN)    000000000008ff49 ffff83010cc51820 0000000000025697 00000000000d0a97
>>>> (XEN)    000000000008ff49 ffff830158d90000 ffff8302163afbd8 ffff82d0801f45c8
>>>> (XEN)    ffff83010cc51820 000000000000000c ffff83008fd41170 000000000008ff49
>>>> (XEN)    0000000000025697 ffff82e001a152e0 ffff8302163afc58 ffff82d080205b51
>>>> (XEN)    0000000000000009 000000000008ff49 ffff8300d0a97000 ffff83008fd41160
>>>> (XEN)    ffff82e001a152f0 ffff82e0011fe920 ffff83010cc51820 0000000c00000000
>>>> (XEN)    0000000000025697 0000000000000003 ffff83010cc51820 ffff8302163afd34
>>>> (XEN)    0000000000025697 0000000000000000 ffff8302163afca8 ffff82d0801f1f7d
>>>> (XEN) Xen call trace:
>>>> (XEN)    [] p2m_altp2m_propagate_change+0x85/0x4a9
>>>> (XEN)    [] ept_set_entry_sve+0x5fa/0x6e6
>>>> (XEN)    [] ept_set_entry+0xf/0x11
>>>> (XEN)    [] p2m_set_entry+0xd4/0x112
>>>> (XEN)    [] set_shared_p2m_entry+0x2d0/0x39b
>>>> (XEN)    [] __mem_sharing_unshare_page+0x83f/0xbd6
>>>> (XEN)    [] __get_gfn_type_access+0x224/0x2b0
>>>> (XEN)    [] hvm_hap_nested_page_fault+0x21f/0x795
>>>> (XEN)    [] vmx_vmexit_handler+0x1764/0x1af3
>>>> (XEN)    [] vmx_asm_vmexit_handler+0x41/0xc0
>>> The crash here is because I haven't successfully forced all the shared
>>> pages in the host p2m to become unshared before copying,
>>> which is the intended behaviour.
>>>
>>> I think I know how that has happened and how to fix it, but what you're
>>> trying to do won't work by design. By the time a copy from host p2m to
>>> altp2m occurs, the sharing is supposed to be broken.
>>>
>> Hm. If the sharing gets broken before the hostp2m->altp2m copy, maybe doing
>> sharing after the view has been created is a better route? I guess the
>> sharing code would need to be adapted to check if altp2m is enabled for
>> that to work..
>>
>>
>>> You're coming up with some ways of attempting to use altp2m that we
>>> hadn't thought of.
>>> That's a good thing, and just what we want, but
>>> there are limits to what we can support without more far-reaching
>>> changes to existing parts of Xen. This isn't going to be do-able for
>>> 4.6.
>>>
>> My main concern is just getting it to work, hitting 4.6 is not a priority.
>> I understand that my stuff is highly experimental ;) While the gfn
>> remapping feature is intriguing, in my setup I already have a copy of the
>> page I would want to present during a singlestep-altp2mswitch - in the
>> origin domains memory. AFAIU the gfn remapping would work only within the
>> domains existing p2m space.
> Understood, but for us hitting 4.6 with the initial version of altp2m
> is *the* priority. And yes, remapping is restricted to pages from the
> same host p2m.

It is fine for experimental features to have known interaction issues. 
I don't necessarily see this as a blocker to 4.6, although it would
indeed be better if it could be fixed in time.

~Andrew
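
For readers following the quoted crash log: the message "mm locking order violation: 278 > 239" appears to say that a lock ranked 239 was requested while a lock ranked 278 was already held, and the check then ended in the reported BUG. The following is a minimal, single-threaded sketch of that kind of ordering check. It is not the code from mm-locks.h; the names `ordered_lock`, `current_level`, `ordered_lock_acquire` and `ordered_lock_release` are hypothetical, and the sketch returns `false` where the hypervisor would stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model: every lock has a fixed rank, and ranks must be
 * taken in non-decreasing order. Not the real Xen data structure. */
struct ordered_lock {
    int level;        /* rank of this lock in the global ordering */
    int saved_level;  /* rank that was current before this lock was taken */
    bool held;
};

/* Rank of the most recently taken lock (0 means nothing is held). */
static int current_level = 0;

/* Take the lock if doing so respects the ordering; otherwise report
 * the violation in the same "held > requested" shape as the log. */
static bool ordered_lock_acquire(struct ordered_lock *l)
{
    if (current_level > l->level) {
        fprintf(stderr, "locking order violation: %d > %d\n",
                current_level, l->level);
        return false;
    }
    l->saved_level = current_level;
    current_level = l->level;
    l->held = true;
    return true;
}

/* Drop the lock and restore the rank that was current before it. */
static void ordered_lock_release(struct ordered_lock *l)
{
    assert(l->held);
    current_level = l->saved_level;
    l->held = false;
}
```

Under these assumptions, taking a rank-239 lock and then a rank-278 lock succeeds, while taking them in the opposite order is rejected, which matches the shape of the failure in the call trace: a path that already holds a higher-ranked lock reaches code that wants a lower-ranked one.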