All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: <alison.schofield@intel.com>, Vishal Verma <vishal.l.verma@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>,
	<nvdimm@lists.linux.dev>, <linux-cxl@vger.kernel.org>
Subject: Re: [ndctl PATCH v6 0/7] Support poison list retrieval
Date: Thu, 18 Jan 2024 13:56:51 -0800	[thread overview]
Message-ID: <65a99ea31393a_2d43c29454@dwillia2-mobl3.amr.corp.intel.com.notmuch> (raw)
In-Reply-To: <cover.1705534719.git.alison.schofield@intel.com>

alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> Changes since v5:
> - Use a private parser for cxl_poison events. (Dan)
>   Previously used the default parser and re-parsed per the cxl-list
>   needs. Replace that with a private parsing method for cxl_poison.
> - Add a private context to support private parsers. 
> - Add helpers to use with the cxl_poison parser.
> - cxl list json: drop nr_records field (Dan)
> - cxl list option: replace "poison" w "media-errors" (Dan)
> - cxl list json: replace "poison" w "media_errors" (Dan)
> - Link to v5: https://lore.kernel.org/linux-cxl/cover.1700615159.git.alison.schofield@intel.com/
> 
> 
> Begin cover letter:
> 
> Add the option to add a memory devices poison list to the cxl-list
> json output. Offer the option by memdev and by region. Sample usage:
> 
> # cxl list -m mem1 --media-errors
> [
>   {
>     "memdev":"mem1",
>     "pmem_size":1073741824,
>     "ram_size":1073741824,
>     "serial":1,
>     "numa_node":1,
>     "host":"cxl_mem.1",
>     "media_errors":[
>       {
>         "dpa":0,
>         "dpa_length":64,
>         "source":"Injected"
>       },
>       {
>         "region":"region5",

It feels odd to list the region here. I feel like what really matters is
to list the endpoint decoder and if someone wants to associate endpoint
decoder to region, or endpoint decoder to memdev there are other queries
for that.

Then this format does not change between the "region" listing and
"memdev" listing, they both just output the endpoint decoder and leave
the rest to follow-on queries.

For example I expect operations software has already recorded the
endpoint decoder to region mapping, so when this data comes in the
endpoint decoder is a key to make that association. Otherwise:

    cxl list -RT -e $endpoint_decoder

...can answer follow up questions about what is impacted by a given
media error record.

>         "dpa":1073741824,
>         "dpa_length":64,

The dpa_length is also the hpa_length, right? So maybe just call the
field "length".

>         "hpa":1035355557888,
>         "source":"Injected"
>       },
>       {
>         "region":"region5",
>         "dpa":1073745920,
>         "dpa_length":64,
>         "hpa":1035355566080,
>         "source":"Injected"

This "source" field feels like debug data. In production nobody is going
to be doing poison injection, and if the administrator injected it then
its implied they know that status. Otherwise a media-error is a
media-error regardless of the source.

>       }
>     ]
>   }
> ]
> 
> # cxl list -r region5 --media-errors
> [
>   {
>     "region":"region5",
>     "resource":1035355553792,
>     "size":2147483648,
>     "type":"pmem",
>     "interleave_ways":2,
>     "interleave_granularity":4096,
>     "decode_state":"commit",
>     "media_errors":[
>       {
>         "memdev":"mem1",
>         "dpa":1073741824,
>         "dpa_length":64,
>         "hpa":1035355557888,
>         "source":"Injected"
>       },
>       {
>         "memdev":"mem1",
>         "dpa":1073745920,
>         "dpa_length":64,
>         "hpa":1035355566080,
>         "source":"Injected"
>       }
>     ]
>   }
> ]
> 
> Alison Schofield (7):
>   libcxl: add interfaces for GET_POISON_LIST mailbox commands
>   cxl: add an optional pid check to event parsing
>   cxl/event_trace: add a private context for private parsers
>   cxl/event_trace: add helpers get_field_[string|data]()
>   cxl/list: collect and parse media_error records
>   cxl/list: add --media-errors option to cxl list
>   cxl/test: add cxl-poison.sh unit test
> 
>  Documentation/cxl/cxl-list.txt |  71 +++++++++++
>  cxl/event_trace.c              |  53 +++++++-
>  cxl/event_trace.h              |   9 +-
>  cxl/filter.h                   |   3 +
>  cxl/json.c                     | 218 +++++++++++++++++++++++++++++++++
>  cxl/lib/libcxl.c               |  47 +++++++
>  cxl/lib/libcxl.sym             |   6 +
>  cxl/libcxl.h                   |   2 +
>  cxl/list.c                     |   2 +
>  test/cxl-poison.sh             | 133 ++++++++++++++++++++
>  test/meson.build               |   2 +
>  11 files changed, 543 insertions(+), 3 deletions(-)
>  create mode 100644 test/cxl-poison.sh
> 
> 
> base-commit: a871e6153b11fe63780b37cdcb1eb347b296095c
> -- 
> 2.37.3
> 
> 



  parent reply	other threads:[~2024-01-18 21:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-18  0:27 [ndctl PATCH v6 0/7] Support poison list retrieval alison.schofield
2024-01-18  0:28 ` [PATCH v6 1/7] libcxl: add interfaces for GET_POISON_LIST mailbox commands alison.schofield
2024-01-19 18:32   ` Dave Jiang
2024-01-18  0:28 ` [PATCH v6 2/7] cxl: add an optional pid check to event parsing alison.schofield
2024-01-19 18:35   ` Dave Jiang
2024-01-18  0:28 ` [PATCH v6 3/7] cxl/event_trace: add a private context for private parsers alison.schofield
2024-01-19 21:08   ` Dave Jiang
2024-01-18  0:28 ` [PATCH v6 4/7] cxl/event_trace: add helpers get_field_[string|data]() alison.schofield
2024-01-19 21:18   ` Dave Jiang
2024-01-18  0:28 ` [PATCH v6 5/7] cxl/list: collect and parse media_error records alison.schofield
2024-01-18  0:28 ` [PATCH v6 6/7] cxl/list: add --media-errors option to cxl list alison.schofield
2024-01-18  0:28 ` [PATCH v6 7/7] cxl/test: add cxl-poison.sh unit test alison.schofield
2024-01-18 21:56 ` Dan Williams [this message]
2024-01-18 23:34   ` [ndctl PATCH v6 0/7] Support poison list retrieval Alison Schofield
2024-01-18 23:55     ` Dan Williams
2024-02-07 22:54       ` Alison Schofield

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=65a99ea31393a_2d43c29454@dwillia2-mobl3.amr.corp.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=alison.schofield@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.