From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dan Williams
Subject: Re: [Qemu-devel] [PATCH v2 2/2] virtio-pmem: Add virtio pmem driver
Date: Wed, 17 Oct 2018 12:36:51 -0700
Message-ID:
References: <20181013050021.11962-1-pagupta@redhat.com>
 <20181013050021.11962-3-pagupta@redhat.com>
 <431127218.21694133.1539803509205.JavaMail.zimbra@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To: <431127218.21694133.1539803509205.JavaMail.zimbra@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
To: Pankaj Gupta
Cc: Kevin Wolf, Jan Kara, Xiao Guangrong, KVM list, Rik van Riel,
 linux-nvdimm, David Hildenbrand, Linux Kernel Mailing List, Dave Jiang,
 Qemu Developers, Christoph Hellwig, Vishal L Verma, Igor Mammedov,
 "Michael S. Tsirkin", Stefan Hajnoczi, zwisler@kernel.org,
 lcapitulino@redhat.com, Paolo Bonzini, Nitesh Narayan Lal
List-Id: linux-nvdimm@lists.01.org

On Wed, Oct 17, 2018 at 12:11 PM Pankaj Gupta wrote:
>
> >
> > On Fri, Oct 12, 2018 at 10:01 PM Pankaj Gupta wrote:
> > >
> > > This patch adds virtio-pmem driver for KVM guest.
> > >
> > > Guest reads the persistent memory range information from
> > > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > > creates a nd_region object with the persistent memory
> > > range information so that existing 'nvdimm/pmem' driver
> > > can reserve this into system memory map. This way
> > > 'virtio-pmem' driver uses existing functionality of pmem
> > > driver to register persistent memory compatible for DAX
> > > capable filesystems.
> > >
> > > This also provides function to perform guest flush over
> > > VIRTIO from 'pmem' driver when userspace performs flush
> > > on DAX memory range.
> >
> > Before we can move forward with this driver we need additional
> > filesystem enabling to detect when the backing device is fronting DAX
> > pmem or a paravirtualized page cache through virtio-pmem. Any
> > interface that requires fsync() and a round trip to the hypervisor to
> > flush host page cache is not DAX.
>
> I saw your proposal[1] for the new mmap flag MAP_DIRECT. IIUIC, a mapping
> should fail for MAP_DIRECT if it requires explicit flush or buffer
> indirection. So, if we disable the MAP_SYNC flag for virtio-pmem, should
> that fail MAP_DIRECT as well? Otherwise, without MAP_DIRECT, virtio-pmem
> should default to the VIRTIO flush mechanism.

Right, although I wouldn't worry about MAP_DIRECT in the short term
since we're still discussing what the upstream interface should look
like. Regardless of whether MAP_DIRECT is specified or not, the
virtio-flush mechanism would always be used for virtio-pmem. I.e.
there is no possibility to get full DAX operation with virtio-pmem,
only the page-cache-bypass subset.

Taking a look at where we could inject this check for filesystems,
it's a bit awkward to do it in xfs_file_mmap(), for example, because
we do not have the backing device for the extents of the inode. So at
a minimum you would need to investigate calling
xfs_inode_supports_dax() from that path and teaching it about a new
dax_device flag. I'm thinking the dax_device flag should be called
DAXDEV_BUFFERED to indicate the presence of software buffering on a
device that otherwise supports bypassing the local page cache.
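
For reference, the distinction being argued above is visible from
userspace in how an application probes for MAP_SYNC. A quick sketch
(not from the patch set; map_pmem_file() is a made-up helper) of the
probe-and-fallback pattern, where the fallback fsync() is exactly the
path virtio-pmem would service with a VIRTIO flush:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Not every libc exposes these; values match include/uapi/linux/mman.h. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Try to establish a synchronous DAX mapping. On a device whose
 * filesystem cannot honor MAP_SYNC -- the position taken above for
 * virtio-pmem -- the kernel rejects the request with EOPNOTSUPP and
 * the caller must fall back to a plain shared mapping whose writes
 * only become durable via fsync(), i.e. a hypervisor round trip.
 */
static void *map_pmem_file(int fd, size_t len, int *needs_fsync)
{
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (addr != MAP_FAILED) {
        *needs_fsync = 0; /* true DAX: CPU cache flush instructions suffice */
        return addr;
    }

    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr != MAP_FAILED)
        *needs_fsync = 1; /* durability requires fsync() -> host flush */
    return addr;
}
```

On anything that is not fs-dax backed, the first mmap() fails and the
caller lands on the fsync() path, which is the only mode virtio-pmem
could offer.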
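
To make the proposed flag concrete, here is a userspace model of the
policy (purely illustrative: DAXDEV_BUFFERED is only proposed in this
thread, and the struct and helper names below are invented for the
sketch, not kernel API):

```c
#include <stdbool.h>

/*
 * Illustrative model only. DAXDEV_BUFFERED is the flag proposed in
 * this thread; neither it nor these helpers exist in the kernel.
 */
enum {
    DAXDEV_ALIVE    = 1UL << 0, /* device registered and usable */
    DAXDEV_BUFFERED = 1UL << 1, /* proposed: software buffer in front
                                   of the memory, as with virtio-pmem */
};

struct dax_device_model {
    unsigned long flags;
};

/* The device can bypass the guest's local page cache. */
static bool model_dax_direct(const struct dax_device_model *dax_dev)
{
    return dax_dev->flags & DAXDEV_ALIVE;
}

/*
 * Modeled xfs_inode_supports_dax()-style policy: page-cache bypass is
 * still available, but MAP_SYNC-style full DAX must be refused when a
 * software buffer (and thus a hypervisor flush) fronts the memory.
 */
static bool model_supports_map_sync(const struct dax_device_model *dax_dev)
{
    return model_dax_direct(dax_dev) &&
           !(dax_dev->flags & DAXDEV_BUFFERED);
}
```

In this model a real pmem device passes both checks, while a
virtio-pmem device keeps only the page-cache-bypass subset, matching
the split described above.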