From: David Vrabel
Subject: [PATCHv4 0/2] x86/ept: reduce translation invalidation impact
Date: Mon, 14 Dec 2015 14:39:04 +0000
To: xen-devel@lists.xenproject.org
Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper, Tim Deegan, David Vrabel, Jan Beulich

This series improves the performance of EPT by further reducing the
impact of the translation invalidations (ept_sync_domain()), by
deferring invalidations until the p2m write lock is released.

Prior to this change, a 16 VCPU guest could not be successfully
migrated on an (admittedly slow) 160 PCPU box because the p2m write
lock was held for such extended periods of time.  This starved the
read lock needed (by the toolstack) to map the domain's memory,
triggering the watchdog.

After this change, a 64 VCPU guest could be successfully migrated.

ept_sync_domain() is very expensive because:

a) it uses on_selected_cpus() and the IPI cost can be particularly
   high for a multi-socket machine.

b) on_selected_cpus() is serialized by its own spin lock.

On this particular box, ept_sync_domain() could take ~3-5 ms.

Simply using a fair rw lock was not sufficient to resolve this (but it
was an improvement), as the cost of the ept_sync_domain() calls was
still delaying the read locks enough for the watchdog to trigger (the
toolstack maps a batch of 1024 GFNs at a time, which means trying to
acquire the p2m read lock 1024 times).

Changes in v4:
- __ept_sync_domain() is a no-op -- invalidates are done before VMENTER.
- Initialize ept->invalidate to all ones so the initial invalidate is
  always done.

Changes in v3:
- Drop the already-applied "x86/ept: remove unnecessary sync after
  resolving misconfigured entries".
- Replace "mm: don't free pages until mm locks are released" with
  "x86/ept: invalidate guest physical mappings on VMENTER".

Changes in v2:
- Use a per-p2m (not per-CPU) list for page table pages to be freed.
- Hold the write lock while updating the synced_mask.

David
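
For readers unfamiliar with the approach, below is a minimal,
self-contained sketch (not the actual Xen patches) of the pattern the
series describes: writers only *record* a pending invalidation per CPU
while holding the p2m write lock instead of IPIing every CPU
synchronously; each CPU tests and clears its own flag just before
(re)entering the guest and only then performs the expensive flush; and
the flags start as all ones so the very first entry always invalidates.
The names p2m_lock, invalidate[], do_invept() and vmenter() are
illustrative stand-ins, not Xen identifiers, and pthread primitives
substitute for Xen's locks.

/* Sketch of deferred EPT invalidation: mark under the write lock,
 * flush on the next guest entry. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static pthread_rwlock_t p2m_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Per-CPU "translations stale" flags; start as all ones so the first
 * guest entry on each CPU always invalidates. */
static bool invalidate[NR_CPUS] = { true, true, true, true };

/* Stand-in for the expensive flush of this CPU's cached translations
 * (INVEPT in the real hypervisor). */
static void do_invept(int cpu)
{
    printf("cpu%d: flushing stale EPT translations\n", cpu);
}

/*
 * P2M write path: update the tables under the write lock and merely
 * mark every CPU as needing an invalidation.  No IPIs and no
 * on_selected_cpus()-style serialisation while the lock is held, so
 * the lock is released quickly and readers (e.g. a toolstack mapping
 * guest memory in 1024-GFN batches) are not starved.
 */
static void p2m_write_and_defer_sync(void)
{
    pthread_rwlock_wrlock(&p2m_lock);

    /* ... modify the p2m / EPT page tables here ... */

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        invalidate[cpu] = true;   /* deferred: flushed at guest entry */

    pthread_rwlock_unlock(&p2m_lock);
}

/*
 * Per-CPU guest entry path: flush only if this CPU has stale
 * translations, then clear the flag.
 */
static void vmenter(int cpu)
{
    pthread_rwlock_rdlock(&p2m_lock);
    bool stale = invalidate[cpu];
    invalidate[cpu] = false;
    pthread_rwlock_unlock(&p2m_lock);

    if (stale)
        do_invept(cpu);

    /* ... enter the guest ... */
}

int main(void)
{
    vmenter(0);                  /* initial entry: always flushes     */
    p2m_write_and_defer_sync();  /* cheap while the write lock is held */
    vmenter(0);                  /* flushes once, after the update    */
    vmenter(0);                  /* nothing pending: no flush         */
    return 0;
}

In the actual patches the pending set is presumably a per-domain CPU
mask (the changelog's ept->invalidate), and the entry-path check would
more likely be an atomic test-and-clear than a lock acquisition; the
read lock is taken above only to keep the sketch self-contained.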