From: "Jan Beulich"
Subject: Re: [RFC Design Doc] Add vNVDIMM support for Xen
Date: Mon, 15 Feb 2016 04:07:47 -0700
To: Haozhong Zhang
Cc: Juergen Gross, Kevin Tian, Wei Liu, Ian Campbell, Stefano Stabellini,
    George Dunlap, Andrew Cooper, Ian Jackson, xen-devel@lists.xen.org,
    Jun Nakajima, Xiao Guangrong, Keir Fraser

>>> On 15.02.16 at 09:43, Haozhong Zhang wrote:
> On 02/03/16 03:15, Konrad Rzeszutek Wilk wrote:
>> > Similarly to that in KVM/QEMU, enabling vNVDIMM in Xen is composed of
>> > three parts:
>> > (1) Guest clwb/clflushopt/pcommit enabling,
>> > (2) Memory mapping, and
>> > (3) Guest ACPI emulation.
>>
>> .. MCE? and vMCE?
>
> An NVDIMM can generate UCR errors like normal RAM. Xen may handle them
> in a way similar to what mc_memerr_dhandler() does, with some
> differences in the data structure and the broken-page offlining parts:
>
> Broken NVDIMM pages should be marked as "offlined", so that the Xen
> hypervisor can refuse further requests to map them into a DomU.
>
> The real problem here is which data structure to use for recording
> information about NVDIMM pages. Because an NVDIMM is usually much
> larger than normal RAM, using struct page_info for NVDIMM pages would
> occupy too much memory.

I don't see how your alternative below would be less memory-hungry:
since guests have at least partial control of their GFN space, a
malicious guest could punch holes into the contiguous GFN range that
you appear to be thinking about, thus causing arbitrary splitting of
the control structure.

Also - see how you all of a sudden came to think of using struct
page_info here (implying hypervisor control of these NVDIMM ranges)?

> (4) When an MCE for the host NVDIMM SPA range [start_mfn, end_mfn]
>     happens:
>     (a) search xen_nvdimm_pages_list for the affected nvdimm_pages
>         structures;
>     (b) for each affected nvdimm_pages, if it belongs to a domain d
>         and its broken field is already set, shut down the domain d to
>         prevent a malicious guest from accessing the broken page
>         (similarly to what offline_page() does);
>     (c) for each affected nvdimm_pages, set its broken field to 1; and
>     (d) for each affected nvdimm_pages, if it belongs to a domain d,
>         inject into d a vMCE that covers its GFN range.

I don't see why you'd want to mark the entire range bad: all that's
known to be broken is a single page. Hence this would be another source
of splits of the proposed control structures.

Jan
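
To make the data-structure discussion above concrete, below is a
minimal, self-contained C sketch of the kind of range-tracking
structure under discussion and of the MCE flow in steps (4)(a)-(d).
It is an illustration under assumptions, not Xen code: struct
nvdimm_pages, xen_nvdimm_pages_list and the stub services are all
hypothetical stand-ins (the list is singly linked here only for
brevity), and it follows the proposal literally in marking the whole
containing range broken.

    /*
     * Hypothetical sketch only: none of these names exist in Xen
     * as-is.  Models steps (4)(a)-(d) for an MCE hitting a single
     * host frame of an NVDIMM range.
     */
    #include <stdbool.h>
    #include <stddef.h>

    struct domain { int domain_id; };   /* stand-in for Xen's struct domain */

    struct nvdimm_pages {
        struct nvdimm_pages *next;      /* linked into xen_nvdimm_pages_list */
        unsigned long start_mfn;        /* first frame of the host SPA range */
        unsigned long end_mfn;          /* one past the last frame */
        unsigned long gfn;              /* guest mapping base, if owned */
        struct domain *owner;           /* NULL while unassigned */
        bool broken;                    /* set once an MCE hit the range */
    };

    static struct nvdimm_pages *xen_nvdimm_pages_list;

    /* Stubs standing in for the real hypervisor services. */
    static void stub_domain_shutdown(struct domain *d) { (void)d; }
    static void stub_inject_vmce(struct domain *d, unsigned long gfn,
                                 unsigned long nr_frames)
    { (void)d; (void)gfn; (void)nr_frames; }

    static void nvdimm_handle_mce(unsigned long broken_mfn)
    {
        struct nvdimm_pages *p;

        /* (a) find the affected range(s). */
        for ( p = xen_nvdimm_pages_list; p; p = p->next )
        {
            if ( broken_mfn < p->start_mfn || broken_mfn >= p->end_mfn )
                continue;

            /* (b) a second hit on an already-broken, owned range:
             * shut the domain down rather than let it keep using
             * the broken page. */
            if ( p->owner && p->broken )
                stub_domain_shutdown(p->owner);

            /* (c) mark the range broken. */
            p->broken = true;

            /* (d) notify the owning domain; per the proposal the
             * vMCE covers the whole GFN range of the structure. */
            if ( p->owner )
                stub_inject_vmce(p->owner, p->gfn,
                                 p->end_mfn - p->start_mfn);
        }
    }

Note how the vMCE in step (d) covers the whole GFN range even though
only one frame is known to be bad - which is exactly what the reply
above objects to.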
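
To illustrate the splitting concern: if instead only the single
faulting frame is marked bad, a contiguous record has to be carved up
around it, turning one structure into up to three per MCE. A sketch,
reusing the hypothetical struct nvdimm_pages from above, with malloc()
standing in for Xen's allocator and error handling elided:

    #include <stdlib.h>

    /*
     * Carve the single broken frame out of an otherwise healthy
     * range record.  One MCE can turn one record into up to three.
     */
    static struct nvdimm_pages *
    split_out_broken_frame(struct nvdimm_pages *p, unsigned long broken_mfn)
    {
        unsigned long off = broken_mfn - p->start_mfn;
        struct nvdimm_pages *bad, *tail = NULL;

        /* Healthy tail after the broken frame, if any. */
        if ( broken_mfn + 1 < p->end_mfn )
        {
            tail = malloc(sizeof(*tail));
            *tail = *p;
            tail->start_mfn = broken_mfn + 1;
            tail->gfn = p->gfn + off + 1;
        }

        /* One-frame record for the broken page itself. */
        bad = malloc(sizeof(*bad));
        *bad = *p;
        bad->start_mfn = broken_mfn;
        bad->end_mfn = broken_mfn + 1;
        bad->gfn = p->gfn + off;
        bad->broken = true;

        /* Shrink the original to the healthy head (which may end up
         * empty if the first frame of the range was the broken one). */
        p->end_mfn = broken_mfn;

        /* Re-link: head -> bad -> tail -> old next. */
        bad->next = tail ? tail : p->next;
        if ( tail )
            tail->next = p->next;
        p->next = bad;

        return bad;
    }

Combined with guest-controlled holes in the GFN space, this per-page
carving is what makes the worst-case number of control structures hard
to bound.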