From: "Jan Beulich"
Subject: Re: [RFC Design Doc] Add vNVDIMM support for Xen
Date: Mon, 15 Feb 2016 04:07:47 -0700
To: Haozhong Zhang
Cc: Juergen Gross, Kevin Tian, Wei Liu, Ian Campbell, Stefano Stabellini,
    George Dunlap, Andrew Cooper, Ian Jackson, xen-devel@lists.xen.org,
    Jun Nakajima, Xiao Guangrong, Keir Fraser

>>> On 15.02.16 at 09:43, Haozhong Zhang wrote:
> On 02/03/16 03:15, Konrad Rzeszutek Wilk wrote:
>> > Similarly to that in KVM/QEMU, enabling vNVDIMM in Xen is composed of
>> > three parts:
>> > (1) Guest clwb/clflushopt/pcommit enabling,
>> > (2) Memory mapping, and
>> > (3) Guest ACPI emulation.
>>
>> .. MCE? and vMCE?
>
> An NVDIMM can generate UCR errors like normal RAM. Xen may handle them
> in a way similar to what mc_memerr_dhandler() does, with some
> differences in the data structure and the broken-page offlining parts:
>
> Broken NVDIMM pages should be marked as "offlined", so that the Xen
> hypervisor can refuse further requests to map them into a DomU.
>
> The real problem here is which data structure to use for recording
> information about NVDIMM pages. Because an NVDIMM is usually much
> larger than normal RAM, using struct page_info for NVDIMM pages would
> occupy too much memory.

I don't see how your alternative below would be less memory-hungry:
since guests have at least partial control of their GFN space, a
malicious guest could punch holes into the contiguous GFN range that
you appear to be thinking about, thus causing arbitrary splitting of
the control structure.

Also - see how you all of a sudden came to think of using struct
page_info here (implying hypervisor control of these NVDIMM ranges)?

> (4) When an MCE for the host NVDIMM SPA range [start_mfn, end_mfn]
>     happens:
>     (a) search xen_nvdimm_pages_list for the affected nvdimm_pages
>         structures;
>     (b) for each affected nvdimm_pages, if it belongs to a domain d
>         and its broken field is already set, shut down the domain d to
>         prevent a malicious guest from accessing the broken page
>         (similarly to what offline_page() does);
>     (c) for each affected nvdimm_pages, set its broken field to 1; and
>     (d) for each affected nvdimm_pages, if it belongs to a domain d,
>         inject into d a vMCE that covers its GFN range.

I don't see why you'd want to mark the entire range bad: all that's
known to be broken is a single page. Hence this would be another source
of splits of the proposed control structures.

Jan
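
To make the data-structure discussion above concrete, below is a
minimal, self-contained C sketch of the kind of range-tracking
structure under discussion and of the MCE flow in steps (4)(a)-(d).
It is an illustration under assumptions, not Xen code: struct
nvdimm_pages, xen_nvdimm_pages_list and the stub services are all
hypothetical stand-ins (the list is singly linked here only for
brevity), and it follows the proposal literally in marking the whole
containing range broken.

    /*
     * Hypothetical sketch only: none of these names exist in Xen
     * as-is.  Models steps (4)(a)-(d) for an MCE hitting a single
     * host frame of an NVDIMM range.
     */
    #include <stdbool.h>
    #include <stddef.h>

    struct domain { int domain_id; };   /* stand-in for Xen's struct domain */

    struct nvdimm_pages {
        struct nvdimm_pages *next;      /* linked into xen_nvdimm_pages_list */
        unsigned long start_mfn;        /* first frame of the host SPA range */
        unsigned long end_mfn;          /* one past the last frame */
        unsigned long gfn;              /* guest mapping base, if owned */
        struct domain *owner;           /* NULL while unassigned */
        bool broken;                    /* set once an MCE hit the range */
    };

    static struct nvdimm_pages *xen_nvdimm_pages_list;

    /* Stubs standing in for the real hypervisor services. */
    static void stub_domain_shutdown(struct domain *d) { (void)d; }
    static void stub_inject_vmce(struct domain *d, unsigned long gfn,
                                 unsigned long nr_frames)
    { (void)d; (void)gfn; (void)nr_frames; }

    static void nvdimm_handle_mce(unsigned long broken_mfn)
    {
        struct nvdimm_pages *p;

        /* (a) find the affected range(s). */
        for ( p = xen_nvdimm_pages_list; p; p = p->next )
        {
            if ( broken_mfn < p->start_mfn || broken_mfn >= p->end_mfn )
                continue;

            /* (b) a second hit on an already-broken, owned range:
             * shut the domain down rather than let it keep using
             * the broken page. */
            if ( p->owner && p->broken )
                stub_domain_shutdown(p->owner);

            /* (c) mark the range broken. */
            p->broken = true;

            /* (d) notify the owning domain; per the proposal the
             * vMCE covers the whole GFN range of the structure. */
            if ( p->owner )
                stub_inject_vmce(p->owner, p->gfn,
                                 p->end_mfn - p->start_mfn);
        }
    }

Note how the vMCE in step (d) covers the whole GFN range even though
only one frame is known to be bad - which is exactly what the reply
above objects to.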
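
To illustrate the splitting concern: if instead only the single
faulting frame is marked bad, a contiguous record has to be carved up
around it, turning one structure into up to three per MCE. A sketch,
reusing the hypothetical struct nvdimm_pages from above, with malloc()
standing in for Xen's allocator and error handling elided:

    #include <stdlib.h>

    /*
     * Carve the single broken frame out of an otherwise healthy
     * range record.  One MCE can turn one record into up to three.
     */
    static struct nvdimm_pages *
    split_out_broken_frame(struct nvdimm_pages *p, unsigned long broken_mfn)
    {
        unsigned long off = broken_mfn - p->start_mfn;
        struct nvdimm_pages *bad, *tail = NULL;

        /* Healthy tail after the broken frame, if any. */
        if ( broken_mfn + 1 < p->end_mfn )
        {
            tail = malloc(sizeof(*tail));
            *tail = *p;
            tail->start_mfn = broken_mfn + 1;
            tail->gfn = p->gfn + off + 1;
        }

        /* One-frame record for the broken page itself. */
        bad = malloc(sizeof(*bad));
        *bad = *p;
        bad->start_mfn = broken_mfn;
        bad->end_mfn = broken_mfn + 1;
        bad->gfn = p->gfn + off;
        bad->broken = true;

        /* Shrink the original to the healthy head (which may end up
         * empty if the first frame of the range was the broken one). */
        p->end_mfn = broken_mfn;

        /* Re-link: head -> bad -> tail -> old next. */
        bad->next = tail ? tail : p->next;
        if ( tail )
            tail->next = p->next;
        p->next = bad;

        return bad;
    }

Combined with guest-controlled holes in the GFN space, this per-page
carving is what makes the worst-case number of control structures hard
to bound.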