On Thu, Apr 29, 2021 at 10:02:23PM +0530, Aneesh Kumar K.V wrote:
> On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
> > On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
> > > The nvdimm devices are expected to ensure write persistence during power
> > > failure kind of scenarios.
> > >
> > > The libpmem has architecture specific instructions like dcbf on POWER
> > > to flush the cache data to backend nvdimm device during normal writes
> > > followed by explicit flushes if the backend devices are not synchronous
> > > DAX capable.
> > >
> > > Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> > > and the subsequent flush doesn't translate to actual flush to the backend
> > > file on the host in case of file backed v-nvdimms. This is addressed by
> > > virtio-pmem in case of x86_64 by making explicit flushes translating to
> > > fsync at qemu.
> > >
> > > On SPAPR, the issue is addressed by adding a new hcall to
> > > request for an explicit flush from the guest ndctl driver when the backend
> > > nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> > > here is to convey when the hcall flush is required in a device tree
> > > property. The guest makes the hcall when the property is found, instead
> > > of relying on dcbf.
> >
> > Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
> > virtio-nvdimm device already exists?
> >
> On virtualized ppc64 platforms, guests use papr_scm.ko kernel driver for
> persistent memory support. This was done such that we can use one kernel
> driver to support persistent memory with multiple hypervisors. To avoid
> supporting multiple drivers in the guest, -device nvdimm Qemu command-line
> results in Qemu using PAPR SCM backend. What this patch series does is to
> make sure we expose the correct synchronous fault support, when we back such
> nvdimm device with a file.
>
> The existing PAPR SCM backend enables persistent memory support with the
> help of multiple hypercalls.
>
> #define H_SCM_READ_METADATA     0x3E4
> #define H_SCM_WRITE_METADATA    0x3E8
> #define H_SCM_BIND_MEM          0x3EC
> #define H_SCM_UNBIND_MEM        0x3F0
> #define H_SCM_UNBIND_ALL        0x3FC
>
> Most of them are already implemented in Qemu. This patch series implements
> H_SCM_FLUSH hypercall.

The overall point here is that we didn't define the hypercall. It was
defined in order to support NVDIMM/pmem devices under PowerVM. For
uniformity between PowerVM and KVM guests, we want to support the same
hypercall interface on KVM/qemu as well.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson