From: Ruidong Tian <tianruidong@linux.alibaba.com>
To: catalin.marinas@arm.com, will@kernel.org, lpieralisi@kernel.org,
	guohanjun@huawei.com, sudeep.holla@arm.com,
	xueshuai@linux.alibaba.com, baolin.wang@linux.alibaba.com,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
Subject: [PATCH 0/2] ARM Error Source Table V1 Support
Date: Mon,  4 Mar 2024 19:15:15 +0800	[thread overview]
Message-ID: <20240304111517.33001-1-tianruidong@linux.alibaba.com> (raw)

This series adds support for the ARM Error Source Table (AEST) based on
the 1.1 version of ACPI for the Armv8 RAS Extensions [0].

The Arm Error Source Table (AEST) enables kernel-first handling of errors
in a system that supports the Armv8 RAS extensions. A hardware error
triggers a RAS interrupt to the kernel. In irq context, the kernel scans
all AEST nodes to find the nodes that have recorded an error, then uses a
workqueue to log these hardware errors.
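
For readers unfamiliar with the flow, here is a rough user-space sketch of
the "scan in irq context, log later" split described above. It is not code
from the series: struct node, struct record, scan_nodes() and log_pending()
are hypothetical names, a plain array stands in for the genpool, a direct
function call stands in for the workqueue, and clearing the status during
the scan is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PENDING 8

/* Hypothetical error node; 'status' stands in for an error status register. */
struct node {
	const char *name;
	uint64_t status;	/* non-zero means an error has been recorded */
};

/* Hypothetical record queued by the "irq" path for deferred logging. */
struct record {
	const char *name;
	uint64_t status;
};

/* Fixed array standing in for the pool of collected errors. */
static struct record pending[MAX_PENDING];
static size_t npending;

/*
 * "irq context": scan every node, queue a record for each node in error
 * and clear its status (assumption). No logging happens here.
 * Returns the number of records queued.
 */
static size_t scan_nodes(struct node *nodes, size_t n)
{
	size_t queued = 0;

	assert(nodes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!nodes[i].status)
			continue;
		if (npending == MAX_PENDING)
			break;	/* pool full: leave remaining errors latched */
		pending[npending].name = nodes[i].name;
		pending[npending].status = nodes[i].status;
		npending++;
		nodes[i].status = 0;
		queued++;
	}
	return queued;
}

/*
 * "workqueue context": drain the queued records and log them.
 * Returns the number of records logged.
 */
static size_t log_pending(void)
{
	size_t logged = npending;

	for (size_t i = 0; i < npending; i++)
		printf("node %s: status 0x%llx\n", pending[i].name,
		       (unsigned long long)pending[i].status);
	npending = 0;
	return logged;
}
```

The point of the split is that the interrupt path only copies a small
record per failing node, while the slower logging runs later from a
context that is allowed to take its time.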

I have tested this series on a PTG Yitian710 SoC. Both corrected and
uncorrected errors were tested to verify the non-fatal vs fatal
scenarios.

Future work:
1. Have UEs trigger memory_failure() rather than a panic.
2. Add CE storm mitigation.
3. Support AEST V2.

This series is based on Tyler Baicar's patches [1], for which no v2 has
been sent to the mailing list yet. Changes from the original patches:
1. Add a genpool to collect all AEST errors, and log them in a workqueue
rather than in irq context.
2. Use a single aest_proc function for both the system register interface
and the MMIO interface.
3. Restructure some structures and functions to make them clearer.
4. Address all review comments on Tyler Baicar's patches from the mailing
list.
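
To illustrate item 2, one possible way to share a single processing
function between two register interfaces is to hide the access method
behind a function pointer. This is a user-space sketch under that
assumption, not the code in the series: struct ras_access, sysreg_read(),
mmio_read() and proc_node() are hypothetical names, and a plain variable
stands in for the system register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor: how to read a node's error status. */
struct ras_access {
	uint64_t (*read_status)(void *ctx);
	void *ctx;
};

/* Stand-in for a system register; a real kernel would use a sysreg read. */
static uint64_t fake_sysreg_status;

static uint64_t sysreg_read(void *ctx)
{
	(void)ctx;	/* system registers need no base address */
	return fake_sysreg_status;
}

/* MMIO flavour: ctx points at the (here simulated) status register. */
static uint64_t mmio_read(void *ctx)
{
	return *(volatile uint64_t *)ctx;
}

/*
 * Single processing function for both interfaces. Returns 1 and stores
 * the status in *status_out when the node reports an error, 0 otherwise.
 */
static int proc_node(const struct ras_access *acc, uint64_t *status_out)
{
	uint64_t status;

	assert(acc != NULL && acc->read_status != NULL);

	status = acc->read_status(acc->ctx);
	if (!status)
		return 0;
	if (status_out)
		*status_out = status;
	return 1;
}
```

With this shape, only the accessor differs per node type and the error
handling logic is written once.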

[0]: https://developer.arm.com/documentation/den0085/0101/
[1]: https://lore.kernel.org/all/20211124170708.3874-1-baicar@os.amperecomputing.com/

Tyler Baicar (2):
  ACPI/AEST: Initial AEST driver
  trace, ras: add ARM RAS extension trace event

 MAINTAINERS                  |  11 +
 arch/arm64/include/asm/ras.h |  38 ++
 drivers/acpi/arm64/Kconfig   |  10 +
 drivers/acpi/arm64/Makefile  |   1 +
 drivers/acpi/arm64/aest.c    | 728 +++++++++++++++++++++++++++++++++++
 include/linux/acpi_aest.h    |  91 +++++
 include/linux/cpuhotplug.h   |   1 +
 include/ras/ras_event.h      |  55 +++
 8 files changed, 935 insertions(+)
 create mode 100644 arch/arm64/include/asm/ras.h
 create mode 100644 drivers/acpi/arm64/aest.c
 create mode 100644 include/linux/acpi_aest.h

-- 
2.33.1



