From: Haozhong Zhang <haozhong.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Juergen Gross <JGross@suse.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Xiao Guangrong <guangrong.xiao@linux.intel.com>
Subject: Re: [RFC Design Doc v2] Add vNVDIMM support for Xen
Date: Wed, 3 Aug 2016 18:08:14 +0800
Message-ID: <20160803100814.i252bg3fvocmaurn@hz-desktop>
In-Reply-To: <57A1D9E002000078001021C9@prv-mh.provo.novell.com>

On 08/03/16 03:47, Jan Beulich wrote:
> >>> On 03.08.16 at 11:37, <haozhong.zhang@intel.com> wrote:
> > On 08/03/16 02:45, Jan Beulich wrote:
> >> >>> On 03.08.16 at 08:54, <haozhong.zhang@intel.com> wrote:
> >> > On 08/02/16 08:46, Jan Beulich wrote:
> >> >> >>> On 18.07.16 at 02:29, <haozhong.zhang@intel.com> wrote:
> >> >> >  (4) Because the reserved area is now used by Xen hypervisor, it
> >> >> >      should not be accessible by Dom0 any more. Therefore, if a host
> >> >> >      pmem device is recorded by Xen hypervisor, Xen will unmap its
> >> >> >      reserved area from Dom0. Our design also needs to extend Linux
> >> >> >      NVDIMM driver to "balloon out" the reserved area after it
> >> >> >      successfully reports a pmem device to Xen hypervisor.
> >> >> 
> >> >> ... "balloon out" ... _after_? That'd be unsafe.
> >> >>
> >> > 
> >> > Before ballooning is accomplished, the pmem driver does not create any
> >> > device node under /dev/ and hence no one except the pmem driver can
> >> > access the reserved area on pmem, so I think it's okay to balloon
> >> > after reporting.
> >> 
> >> Right now Dom0 isn't allowed to access any memory in use by Xen
> >> (and not explicitly shared), and I don't think we should deviate
> >> from that model for pmem.
> > 
> > In this design, Xen hypervisor unmaps the reserved area from Dom0 so
> > that Dom0 cannot access the reserved area afterwards. And "balloon" is
> > in fact not memory ballooning, because the Linux kernel never allocates
> > from pmem like normal RAM. In my current implementation, it just
> > removes the reserved area from a resource struct covering pmem.
> 
> Ah, in that case please either use a different term, or explain what
> "balloon out" is meant to mean in this context.
> 
> >> >> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
> >> >> > 
> >> >> >  Before a pmem file is assigned to a domain, we need to know the host
> >> >> >  SPA ranges that are allocated to this file. We do this work in xl.
> >> >> > 
> >> >> >  If a pmem device /dev/pmem0 is given, xl will read
> >> >> >  /sys/block/pmem0/device/{resource,size} respectively for the start
> >> >> >  SPA and size of the pmem device.
> >> >> > 
> >> >> >  If a pre-allocated file /mnt/dax/file is given,
> >> >> >  (1) xl first finds the host pmem device where /mnt/dax/file is. Then
> >> >> >      it uses the method above to get the start SPA of the host pmem
> >> >> >      device.
> >> >> >  (2) xl then uses fiemap ioctl to get the extent mappings of
> >> >> >      /mnt/dax/file, and adds the corresponding physical offsets and
> >> >> >      lengths in each mapping entry to the above start SPA to get the
> >> >> >      SPA ranges pre-allocated for this file.
> >> >> 
> >> >> Remind me again: These extents never change, not even across
> >> >> reboot? I think this would be good to be written down here explicitly.
> >> > 
> >> > Yes
> >> > 
> >> >> Hadn't there been talk of using labels to be able to allow a guest to
> >> >> own the exact same physical range again after reboot of guest or
> >> >> host?
> >> > 
> >> > You mean labels in NVDIMM label storage area? As defined in Intel
> >> > NVDIMM Namespace Specification, labels are used to specify
> >> > namespaces. For a pmem interleave set (possibly across several DIMMs),
> >> > at most one pmem namespace (and hence at most one label) is
> >> > allowed. Therefore, labels cannot be used to partition pmem.
> >> 
> >> Okay. But then how do particular ranges get associated with the
> >> owning guest(s)? Merely by SPA would seem rather fragile to me.
> >> 
> > 
> > By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
> > in a domain config file, the SPA ranges occupied by /mnt/dax/foo are
> > mapped to the domain.  If the same file is used every time the domain is
> > created, the same virtual device will be seen by that domain.
> 
> So what if the file got deleted and re-created in between? Since
> I don't think you can specify the SPAs to use when creating such
> a file, such an operation would be quite different from removing
> and re-adding e.g. a specific PCI device (to be used by a guest)
> on a host (while the guest is not running).
> 

If the file is modified in between, the guest will see a virtual pmem
device with different data. But the usage of pmem is similar to a disk:
if a file with the same content is given every time, the guest gets a
virtual pmem/disk with the same data as at the last reboot/shutdown.
Keeping the data unchanged across multiple boots is out of the scope
of Xen.
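
To make the lookup in 4.2.3 (quoted above) concrete, here is a rough
sketch of the arithmetic only, not the actual xl code. It assumes the
file system sits directly on the pmem device, so that the physical
offsets reported by fiemap are relative to the device's start SPA. The
names Extent, parse_sysfs_int and spa_ranges are made up for this
illustration, and merging adjacent extents is a convenience that the
design does not require.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Extent(NamedTuple):
    """One fiemap mapping entry (hypothetical container for this sketch)."""
    physical: int  # byte offset of the extent on the pmem device
    length: int    # extent length in bytes


def parse_sysfs_int(text: str) -> int:
    """Parse a sysfs attribute value such as '0x100000000\\n' or '1073741824\\n'."""
    return int(text.strip(), 0)


def spa_ranges(dev_start_spa: int, dev_size: int,
               extents: Iterable[Extent]) -> List[Tuple[int, int]]:
    """Return (start SPA, length) pairs for a file's extents.

    dev_start_spa and dev_size are the values read from
    /sys/block/pmem0/device/{resource,size}. Adjacent extents are merged.
    """
    ranges: List[Tuple[int, int]] = []
    for ext in sorted(extents, key=lambda e: e.physical):
        # Reject extents that do not lie entirely inside the device.
        if ext.length <= 0 or ext.physical < 0 \
                or ext.physical + ext.length > dev_size:
            raise ValueError("extent outside pmem device: %r" % (ext,))
        start = dev_start_spa + ext.physical
        if ranges and ranges[-1][0] + ranges[-1][1] == start:
            # This extent continues the previous range: extend it.
            prev_start, prev_len = ranges.pop()
            ranges.append((prev_start, prev_len + ext.length))
        else:
            ranges.append((start, ext.length))
    return ranges
```

Since the ranges are recomputed from the file's current extents each
time, a file that was deleted and re-created simply yields whatever
ranges the file system gave it this time.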

Haozhong
