From: Dave Jiang <dave.jiang@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
linux-cxl@vger.kernel.org, dan.j.williams@intel.com
Cc: linuxarm@huawei.com, ira.weiny@intel.com,
vishal.l.verma@intel.com, alison.schofield@intel.com
Subject: Re: [RFC PATCH 1/2] cxl: RAS: Multiple header recording support
Date: Tue, 17 Jan 2023 11:05:34 -0700
Message-ID: <db285965-1bf4-7e46-df94-d4ce349baf8f@intel.com>
In-Reply-To: <20230113154058.16227-2-Jonathan.Cameron@huawei.com>

On 1/13/23 8:40 AM, Jonathan Cameron wrote:
> Similar to PCIe, CXL devices may support logging multiple headers
> corresponding to multiple errors as reported via the CXL RAS capability.
>
> Unlike PCIe, in CXL there is no Multiple Header Recording Enable bit
> and the CXL r3.0 specification is sparse on details. As such, the
> kernel should allow for any reasonable interpretation including
> endpoints for which the capability bit is set that behave as per
> the PCIe equivalent definitions (with assumption that the missing
> 'enable bit' is set). Note that behaving as if Multiple Headers
> are being logged is also valid behavior when they are not so this
> approach should be safe with all sensible specification interpretations.
>
> By repeatedly attempting to clear a single bit corresponding to the reported
> First Error (may need multiple goes if multiple records of same type
> are tracked by the hardware) the additional header logs may be obtained.
>
> Note that each trace record only records the FE in the status.
> We could record them all as done without Multi header recording
> capability but that seemed less intuitive to me.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Looks reasonable.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
> drivers/cxl/core/pci.c | 17 ++++++++++++-----
> drivers/cxl/cxl.h | 1 +
> 2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 184ead6a2796..6fd311e313c6 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -673,10 +673,13 @@ static bool cxl_report_and_clear(struct cxl_dev_state *cxlds)
>  	void __iomem *addr;
>  	u32 status;
>  	u32 fe;
> +	bool mh;
>
>  	if (!cxlds->regs.ras)
>  		return false;
>
> +next_record:
> +	mh = false;
>  	addr = cxlds->regs.ras + CXL_RAS_UNCORRECTABLE_STATUS_OFFSET;
>  	status = readl(addr);
>  	if (!(status & CXL_RAS_UNCORRECTABLE_STATUS_MASK))
> @@ -684,11 +687,13 @@ static bool cxl_report_and_clear(struct cxl_dev_state *cxlds)
>
>  	/* If multiple errors, log header points to first error from ctrl reg */
>  	if (hweight32(status) > 1) {
> -		void __iomem *rcc_addr =
> -			cxlds->regs.ras + CXL_RAS_CAP_CONTROL_OFFSET;
> -
> -		fe = BIT(FIELD_GET(CXL_RAS_CAP_CONTROL_FE_MASK,
> -				   readl(rcc_addr)));
> +		u32 capctrl = readl(cxlds->regs.ras + CXL_RAS_CAP_CONTROL_OFFSET);
> +		fe = BIT(FIELD_GET(CXL_RAS_CAP_CONTROL_FE_MASK, capctrl));
> +		if (FIELD_GET(CXL_RAS_CAP_CONTROL_MH_REC_CAP, capctrl)) {
> +			mh = true;
> +			/* Report and clear only first error */
> +			status = fe;
> +		}
>  	} else {
>  		fe = status;
>  	}
> @@ -696,6 +701,8 @@ static bool cxl_report_and_clear(struct cxl_dev_state *cxlds)
>  	header_log_copy(cxlds, hl);
>  	trace_cxl_aer_uncorrectable_error(dev, status, fe, hl);
>  	writel(status & CXL_RAS_UNCORRECTABLE_STATUS_MASK, addr);
> +	if (mh)
> +		goto next_record;
>
>  	return true;
>  }
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index aa3af3bb73b2..ee31a99073c2 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -138,6 +138,7 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> #define CXL_RAS_CORRECTABLE_MASK_MASK GENMASK(6, 0)
> #define CXL_RAS_CAP_CONTROL_OFFSET 0x14
> #define CXL_RAS_CAP_CONTROL_FE_MASK GENMASK(5, 0)
> +#define CXL_RAS_CAP_CONTROL_MH_REC_CAP BIT(9)
> #define CXL_RAS_HEADER_LOG_OFFSET 0x18
> #define CXL_RAS_CAPABILITY_LENGTH 0x58
> #define CXL_HEADERLOG_SIZE SZ_512
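
For anyone following along, the drain loop in the quoted patch can be modelled in user space. What follows is a minimal sketch, not the driver code: struct fake_ras, the fake_* accessors, count_bits() and report_and_clear() are hypothetical names, and the write-1-to-clear model (each written bit retires the oldest record of that type, and the First Error pointer then refers to the next oldest record) is an assumption borrowed from the PCIe-style interpretation described in the commit message. It uses a for loop in place of the goto and returns the number of records reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_RECORDS 8

/* Hypothetical stand-in for a device's uncorrectable error state. */
struct fake_ras {
	bool mh_rec_cap;               /* models the Multiple Header Recording Capability bit */
	int nr;                        /* number of pending records */
	unsigned int rec[MAX_RECORDS]; /* status bit index per record, oldest first */
};

static void fake_ras_push(struct fake_ras *ras, unsigned int bit)
{
	assert(ras->nr < MAX_RECORDS && bit < 32);
	ras->rec[ras->nr++] = bit;
}

/* Status register: one bit per error type that has at least one record. */
static uint32_t fake_read_status(const struct fake_ras *ras)
{
	uint32_t status = 0;

	for (int i = 0; i < ras->nr; i++)
		status |= UINT32_C(1) << ras->rec[i];
	return status;
}

/* First Error pointer: bit index of the oldest pending record. */
static unsigned int fake_read_fe(const struct fake_ras *ras)
{
	return ras->nr ? ras->rec[0] : 0;
}

/* Assumed write-1-to-clear: retire the oldest record for each bit written. */
static void fake_write_status(struct fake_ras *ras, uint32_t val)
{
	int out = 0;

	for (int i = 0; i < ras->nr; i++) {
		uint32_t bit = UINT32_C(1) << ras->rec[i];

		if (val & bit) {
			val &= ~bit; /* later records of the same type stay pending */
			continue;
		}
		ras->rec[out++] = ras->rec[i];
	}
	ras->nr = out;
}

static unsigned int count_bits(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Report and clear pending errors; returns how many records were reported. */
static int report_and_clear(struct fake_ras *ras)
{
	int reported = 0;

	for (;;) {
		uint32_t status = fake_read_status(ras);
		uint32_t fe;
		bool mh = false;

		if (!status)
			break;

		if (count_bits(status) > 1) {
			fe = UINT32_C(1) << fake_read_fe(ras);
			if (ras->mh_rec_cap) {
				mh = true;
				status = fe; /* report and clear only the first error */
			}
		} else {
			fe = status;
		}

		printf("record: status=%#x fe=%#x\n",
		       (unsigned int)status, (unsigned int)fe);
		fake_write_status(ras, status);
		reported++;

		if (!mh)
			break;
	}
	return reported;
}
```

In this model, with records queued for bits 3, 3 and 7 and mh_rec_cap set, report_and_clear() returns 3 and leaves the status empty. With mh_rec_cap clear it reports once, clears every set bit in a single write, and one record for bit 3 stays pending.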