On Fri, Apr 30, 2021 at 02:27:18PM +1000, David Gibson wrote:
> On Thu, Apr 29, 2021 at 10:02:23PM +0530, Aneesh Kumar K.V wrote:
> > On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
> > > On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
> > > > The nvdimm devices are expected to ensure write persistence during power
> > > > failure kind of scenarios.
> > > >
> > > > The libpmem has architecture specific instructions like dcbf on POWER
> > > > to flush the cache data to backend nvdimm device during normal writes
> > > > followed by explicit flushes if the backend devices are not synchronous
> > > > DAX capable.
> > > >
> > > > Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> > > > and the subsequent flush doesn't translate to actual flush to the backend
> > > > file on the host in case of file backed v-nvdimms. This is addressed by
> > > > virtio-pmem in case of x86_64 by making explicit flushes translating to
> > > > fsync at qemu.
> > > >
> > > > On SPAPR, the issue is addressed by adding a new hcall to
> > > > request for an explicit flush from the guest ndctl driver when the backend
> > > > nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> > > > here is to convey when the hcall flush is required in a device tree
> > > > property. The guest makes the hcall when the property is found, instead
> > > > of relying on dcbf.
> > >
> > > Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
> > > virtio-nvdimm device already exists?
> > >
> >
> > On virtualized ppc64 platforms, guests use papr_scm.ko kernel driver for
> > persistent memory support. This was done such that we can use one kernel
> > driver to support persistent memory with multiple hypervisors. To avoid
> > supporting multiple drivers in the guest, -device nvdimm Qemu command-line
> > results in Qemu using PAPR SCM backend. What this patch series does is to
> > make sure we expose the correct synchronous fault support, when we back such
> > nvdimm device with a file.
> >
> > The existing PAPR SCM backend enables persistent memory support with the
> > help of multiple hypercalls.
> >
> > #define H_SCM_READ_METADATA 0x3E4
> > #define H_SCM_WRITE_METADATA 0x3E8
> > #define H_SCM_BIND_MEM 0x3EC
> > #define H_SCM_UNBIND_MEM 0x3F0
> > #define H_SCM_UNBIND_ALL 0x3FC
> >
> > Most of them are already implemented in Qemu. This patch series implements
> > H_SCM_FLUSH hypercall.
>
> The overall point here is that we didn't define the hypercall. It was
> defined in order to support NVDIMM/pmem devices under PowerVM. For
> uniformity between PowerVM and KVM guests, we want to support the same
> hypercall interface on KVM/qemu as well.

Okay, that's fine. Now Linux and QEMU have multiple ways of doing this,
but it's fair enough if it's an existing platform hypercall.

Stefan