All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Alastair D'Silva <alastair@d-silva.org>
Subject: Re: Feedback requested: Exposing NVDIMM performance statistics in a generic way
Date: Wed, 27 May 2020 12:24:26 -0700	[thread overview]
Message-ID: <CAPcyv4jQmQceE_eptnfnrORfAUnikHConhchYLEUPARYRFOcbA@mail.gmail.com> (raw)
In-Reply-To: <87d06swfr4.fsf@linux.ibm.com>

On Mon, May 25, 2020 at 2:01 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>
> Hello,
>
> I am looking for some community feedback on these two Problem-statements:
>
> 1.How to expose NVDIMM performance statistics in an arch or nvdimm vendor
> agnostic manner ?
>
> 2. Is there a common set of performance statistics for NVDIMMs that all
> vendors should provide ?

It would be nice to try to encourage some common keyword metrics
across vendors, but I suspect that like perf there will always be some
statistics that are architecture specific.

> Problem context
> ===============
> While working on bring up of PAPR SCM based NVDIMMs[1] for arch/powerpc
> we want to expose certain dimm performance statistics like "Media
> Read/Write Counts", "Power-on Seconds" etc to user-space [2]. These
> performance statistics are similar to what ipmctl[3] reports for Intel®
> Optane™ persistent memory via the '-show performance' command line
> arg. However the reported set of performance stats doesn't cover the
> entirety of all performance stats supported by PAPR SCM based NVDimms.
>
> For example here is a subset of performance stats which are specific to
> PAPR SCM NVDimms and that not reported by ipmctl:
>
> * Controller Reset Count
> * Controller Reset Elapsed Time
> * Power-on Seconds
> * Cache Read Hit Count
> * Cache Write Hit Count
>
> Possibility of updating ipmctl to add support for these performance
> statistics is greatly hampered by no support for ACPI on Powerpc
> arch. Secondly vendors who dont support ACPI/NFIT command set
> similar to Intel® Optane™ (Example MSFT) are also left out in
> lurch. Problem-statement#1 points to this specific problem.

ipmctl is vendor specific and OS agnostic.

ndctl is vendor/platform agnostic and Linux specific.

ndctl is built for this abstraction as it depends on libnvdimm of
which ACPI/NFIT is just one of many co-equal bus providers. In short,
I would expect this support to land in ndctl, not ipmctl.

The only thing that has prevented ndctl from adding performance
statistics was the lack of a public specification, otherwise ndctl
aims to abstract and provide a common tool for any publicly specified
persistent memory device.

> Additionally in absence of any pre-agreed set of performance statistics
> which all vendors should support, adding support for such a
> functionality in ipmctl may not bode well of other nvdimm vendors. For
> example if support for reporting "Controller Reset Count" is added to
> ipmctl then it may not be applicable to other vendors such as Intel®
> Optane™. This issue is what Problem-statement#2 refers to.
>
> Possible Solution for Problem#1
> ===============================
>
> One possible solution to Problem#1 can to add support for reporting
> NVDIMM performance statistics in 'ndtcl'. 'libndctl' already has a layer
> that abstracts underlying NVDIMM vendors (via struct ndctl_dimm_ops),
> making supporting different NVDIMM vendors fairly easy. Also ndctl is
> more widely used compared to 'ipmctl', hence adding such a functionality
> to ndctl would make it more widely used.
>
> Above solution was implemented as RFC patch-set[2] that exposes these
> performance statistics through a generic abstraction in libndctl and
> added a presentation layer for this data in ndctl[4]. It added a new
> command line flags '--stat' to ndctl to report *all* nvdimm vendor
> reported performance stats. The output is similar to one below:
>
> # ndctl list -D --stats
> [
>   {
>     "dev":"nmem0",
>     "stats":{
>       "Power-on Seconds":603931,
>       "Media Read Count":0,
>       "Media Write Count":6313,

I wonder if this should be explicit about the platform-specific stats
versus the common ones? At least with the health data implementation
so far it is implementing a common set of keywords across vendors. I
just worry that someone that writes a useful tool for this data needs
to understand that their tool is vendor generic, or tied to a given
implementation. I'm thinking "platform_stats" for the ones that are
tied to the nvdimm-bus-provider vs "stats" that might reasonably show
up on more than one vendor's implementation.

>     }
>   }
> ]
>
> This was done by adding two new dimm-ops callbacks that were
> implemented by the papr_scm implementation within libndctl. These
> callbacks are invoked by newly introduce code in 'util/json-smart.c'
> that format the returned stats from these new dimm-ops and transform
> them into a json-object to later presentation. I would request you to
> look at RFC patch-set[2] to understand the implementation details.

I'm ok to add some stats to ndctl, but I want ndctl to be limited to
general statistics and not performance counters. Performance counters
and performance events should be abstracted through perf where
possible.

> Possibled Solution for Problem#2
> ================================
>
> Solution to Problem-statement#2 is what eludes me though. If there is a
> minimal set of performance stats (similar to what ndctl enforces for
> health-stats) then implementation of such a functionality in
> ndctl/ipmctl would be easy to implement. But is it really possible to
> have such a common set of performance stats that NVDIMM vendors can
> expose.
>
> Patch-set[2] though tries to bypass this problem by letting the vendor
> descide which performance stats to expose. This opens up a possibility
> of this functionality to abused by dimm vendors to reports arbirary data
> through this flag that may not be performance-stats.
>
> Summing-up
> ==========
>
> In light of above, requesting your feedback as to how
> problem-statements#{1, 2} can be addressed within ndctl subsystem. Also
> are these problems even worth solving.

Yes, I think it's worth solving, just the hard part of giving
implementations enough freedom to convey the data they need, but not
enough freedom that we damage ndctl and the kernel's ability to
maintain a common interface across vendors.

>
> References
> ==========
> [1] https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_hcalls.rst
>
> [2] "[ndctl RFC-PATCH 0/4] Add support for reporting PAPR NVDIMM
> Statistics"
> https://lore.kernel.org/linux-nvdimm/20200518110814.145644-1-vaibhav@linux.ibm.com/
>
> [3] https://docs.pmem.io/ipmctl-user-guide/instrumentation/show-device-performance
>
> [4] "[RFC-PATCH 1/4] ndctl,libndctl: Implement new dimm-ops 'new_stats'
> and 'get_stat'"
> https://lore.kernel.org/linux-nvdimm/20200514225258.508463-2-vaibhav@linux.ibm.com

Appreciate the thorough write-up.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

  reply	other threads:[~2020-05-27 19:24 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-25  9:00 Feedback requested: Exposing NVDIMM performance statistics in a generic way Vaibhav Jain
2020-05-27 19:24 ` Dan Williams [this message]
2020-05-28  0:55   ` Dan Williams
2020-05-28 18:59     ` Vaibhav Jain
2020-05-28 22:40       ` Dan Williams
2020-10-23 17:28       ` Michal Suchánek
2020-10-23 19:03         ` Dan Williams
2020-11-23  7:21           ` Vaibhav Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4jQmQceE_eptnfnrORfAUnikHConhchYLEUPARYRFOcbA@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=alastair@d-silva.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=linux-nvdimm@lists.01.org \
    --cc=vaibhav@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.