nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Cc: groug@kaod.org, qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	aneesh.kumar@linux.ibm.com, nvdimm@lists.linux.dev,
	kvm-ppc@vger.kernel.org, bharata@linux.vnet.ibm.com
Subject: Re: [PATCH REBASED v5 2/2] spapr: nvdimm: Introduce spapr-nvdimm device
Date: Tue, 21 Sep 2021 16:32:27 +1000	[thread overview]
Message-ID: <YUl8e5NLb1Jnn5W6@yekko> (raw)
In-Reply-To: <162571304881.1030381.2406869533148471546.stgit@lep8c.aus.stglabs.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 6040 bytes --]

On Wed, Jul 07, 2021 at 09:57:31PM -0500, Shivaprasad G Bhat wrote:
> If the device backend is not persistent memory for the nvdimm, there is
> need for explicit IO flushes on the backend to ensure persistence.
> 
> On SPAPR, the issue is addressed by adding a new hcall to request for
> an explicit flush from the guest when the backend is not pmem. So, the
> approach here is to convey when the hcall flush is required in a device
> tree property. The guest once it knows the device backend is not pmem,
> makes the hcall whenever flush is required.
> 
> To set the device tree property, the patch introduces a new papr specific
> device type inheriting the nvdimm device. When the backend doesn't have
> pmem="yes", the device tree property "ibm,hcall-flush-required" is set,
> and the guest makes hcall H_SCM_FLUSH requesting for an explicit flush.
> 
> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
> ---
>  hw/ppc/spapr_nvdimm.c         |   46 +++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr_nvdimm.h |    4 ++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
> index 4f8931ab15..4dc7c3f147 100644
> --- a/hw/ppc/spapr_nvdimm.c
> +++ b/hw/ppc/spapr_nvdimm.c
> @@ -54,6 +54,8 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
>  {
>      const MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
>      const MachineState *ms = MACHINE(hotplug_dev);
> +    PCDIMMDevice *dimm = PC_DIMM(nvdimm);
> +    MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem);
>      g_autofree char *uuidstr = NULL;
>      QemuUUID uuid;
>      int ret;
> @@ -91,6 +93,14 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
>          return false;
>      }
>  
> +    if (object_dynamic_cast(OBJECT(nvdimm), TYPE_SPAPR_NVDIMM) &&
> +        (memory_region_get_fd(mr) < 0)) {
> +        error_setg(errp, "spapr-nvdimm device requires the "
> +                   "memdev %s to be of memory-backend-file type",
> +                   object_get_canonical_path_component(OBJECT(dimm->hostmem)));

It's not obvious to me why the spapr nvdimm device has an additional
restriction here over the regular nvdimm device.

> +        return false;
> +    }
> +
>      return true;
>  }
>  
> @@ -162,6 +172,21 @@ static int spapr_dt_nvdimm(SpaprMachineState *spapr, void *fdt,
>                               "operating-system")));
>      _FDT(fdt_setprop(fdt, child_offset, "ibm,cache-flush-required", NULL, 0));
>  
> +    if (object_dynamic_cast(OBJECT(nvdimm), TYPE_SPAPR_NVDIMM)) {
> +        bool is_pmem = false;
> +#ifdef CONFIG_LIBPMEM
> +        PCDIMMDevice *dimm = PC_DIMM(nvdimm);
> +        HostMemoryBackend *hostmem = dimm->hostmem;
> +
> +        is_pmem = object_property_get_bool(OBJECT(hostmem), "pmem",
> +                                           &error_abort);

Presenting to the guest a property of the backend worries me
slightly.  How the backends are synchronized between the source and
destination is out of scope for qemu: is there any possibility that we
could migrate from a host where the backend is pmem to one where it is
not (or the reverse).

I think at the least we want a property on the spapr-nvdimm object
which will override what's presented to the guest (which, yes, might
mean lying to the guest).  I think that could be important for
testing, if nothing else.

> +#endif
> +        if (!is_pmem) {
> +            _FDT(fdt_setprop(fdt, child_offset, "ibm,hcall-flush-required",
> +                             NULL, 0));
> +        }
> +    }
> +
>      return child_offset;
>  }
>  
> @@ -585,7 +610,16 @@ static target_ulong h_scm_flush(PowerPCCPU *cpu, SpaprMachineState *spapr,
>      }
>  
>      dimm = PC_DIMM(drc->dev);
> +    if (!object_dynamic_cast(OBJECT(dimm), TYPE_SPAPR_NVDIMM)) {
> +        return H_PARAMETER;
> +    }

Hmm.  If you're going to make flushes specific to spapr nvdimms, you
could put the queue of pending flushes into the spapr-nvdimm object,
rather than having a global list in the machine.

> +
>      backend = MEMORY_BACKEND(dimm->hostmem);
> +#ifdef CONFIG_LIBPMEM
> +    if (object_property_get_bool(OBJECT(backend), "pmem", &error_abort)) {
> +        return H_UNSUPPORTED;

Could you make this not be UNSUPPORTED, but instead fake the flush for
the pmem device?  Either as a no-op, or simulating the guest invoking
the right cpu cache flushes?  That seems like it would be more useful:
that way users who don't care too much about performance could just
always do a flush hcall and not have to have another path for the
"real" pmem case.

> +    }
> +#endif
>      fd = memory_region_get_fd(&backend->mr);
>  
>      if (fd < 0) {
> @@ -766,3 +800,15 @@ static void spapr_scm_register_types(void)
>  }
>  
>  type_init(spapr_scm_register_types)
> +
> +static TypeInfo spapr_nvdimm_info = {
> +    .name          = TYPE_SPAPR_NVDIMM,
> +    .parent        = TYPE_NVDIMM,
> +};
> +
> +static void spapr_nvdimm_register_types(void)
> +{
> +    type_register_static(&spapr_nvdimm_info);
> +}
> +
> +type_init(spapr_nvdimm_register_types)
> diff --git a/include/hw/ppc/spapr_nvdimm.h b/include/hw/ppc/spapr_nvdimm.h
> index 24d8e37b33..fb4e56418e 100644
> --- a/include/hw/ppc/spapr_nvdimm.h
> +++ b/include/hw/ppc/spapr_nvdimm.h
> @@ -13,6 +13,10 @@
>  #include "hw/mem/nvdimm.h"
>  #include "migration/vmstate.h"
>  
> +#define TYPE_SPAPR_NVDIMM "spapr-nvdimm"
> +OBJECT_DECLARE_SIMPLE_TYPE(SpaprNVDIMMDevice, SPAPR_NVDIMM)
> +
> +typedef struct SpaprNVDIMMDevice  SpaprNVDIMMDevice;
>  typedef struct SpaprDrc SpaprDrc;
>  typedef struct SpaprMachineState SpaprMachineState;
>  
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2021-09-21  6:42 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-08  2:57 [PATCH REBASED v5 0/3] spapr: nvdimm: Introduce spapr-nvdimm device Shivaprasad G Bhat
2021-07-08  2:57 ` [PATCH REBASED v5 1/2] spapr: nvdimm: Implement H_SCM_FLUSH hcall Shivaprasad G Bhat
2021-07-08  6:12   ` David Gibson
2021-09-21  6:23   ` David Gibson
2022-02-01 21:41     ` Shivaprasad G Bhat
2021-07-08  2:57 ` [PATCH REBASED v5 2/2] spapr: nvdimm: Introduce spapr-nvdimm device Shivaprasad G Bhat
2021-09-21  6:32   ` David Gibson [this message]
2022-02-01 21:41     ` Shivaprasad G Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YUl8e5NLb1Jnn5W6@yekko \
    --to=david@gibson.dropbear.id.au \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=groug@kaod.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=sbhat@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).