linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/1] arm64: Add workaround for Fujitsu A64FX erratum 010001
@ 2019-01-22  8:54 Zhang, Lei
  2019-01-22  8:54 ` [PATCH v2 1/1] " Zhang, Lei
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Lei @ 2019-01-22  8:54 UTC
  To: 'Mark Rutland', 'catalin.marinas@arm.com',
	'will.deacon@arm.com',
	'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org'
  Cc: Zhang, Lei

On some variants of the Fujitsu A64FX core (versions 1.0 and 1.1),
memory accesses may cause an undefined fault (Data abort, DFSC=0b111111).
This problem will be fixed in the next version of the Fujitsu A64FX.
I would like to post a workaround to avoid this problem on existing versions.
The workaround replaces the fault handler for Data abort
DFSC=0b111111 with a new one that ignores this undefined fault.
The change affects only the Fujitsu A64FX.
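
To illustrate the check described above, here is a minimal stand-alone
sketch in C. It is not the code from the patch. The field positions
(EC in ESR bits [31:26], DFSC in bits [5:0]) and the data abort EC
values follow the Arm architecture; the function, enum and macro names
are hypothetical and chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx field layout (Arm architecture). Macro names are illustrative. */
#define ESR_EC_SHIFT        26
#define ESR_EC_MASK         0x3FU
#define ESR_EC_DABT_LOW     0x24U  /* Data abort from a lower EL */
#define ESR_EC_DABT_CUR     0x25U  /* Data abort from the current EL */
#define ESR_FSC_MASK        0x3FU  /* DFSC lives in bits [5:0] */
#define FSC_UNDEFINED_FAULT 0x3FU  /* DFSC=0b111111, the code named above */

/* Hypothetical result type: what the fault path should do next. */
enum fault_action {
	FAULT_HANDLE_NORMALLY,
	FAULT_IGNORE_AND_RETRY,
};

/*
 * Decide whether a fault is the undefined fault described above.
 * cpu_has_erratum stands in for whatever capability check the errata
 * framework provides; it is an assumption of this sketch.
 */
static enum fault_action classify_data_abort(uint32_t esr, bool cpu_has_erratum)
{
	uint32_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
	uint32_t dfsc = esr & ESR_FSC_MASK;
	bool is_data_abort = (ec == ESR_EC_DABT_LOW) || (ec == ESR_EC_DABT_CUR);

	/* Only data aborts with DFSC=0b111111 on affected CPUs are ignored. */
	if (cpu_has_erratum && is_data_abort && dfsc == FSC_UNDEFINED_FAULT)
		return FAULT_IGNORE_AND_RETRY;

	return FAULT_HANDLE_NORMALLY;
}
```

Gating on the erratum flag first is what keeps other chips on the
normal fault path: without it, the same ESR value is handled as before.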

Details of this problem:
> * Under what conditions can the fault occur? e.g. is this in place of
>   some other fault, or completely spurious?
This fault can occur completely spuriously under a
specific hardware condition and instruction order.
 
> * Does this only occur for data abort? i.e. not instruction aborts?
Yes. This fault only occurs for data aborts.

> * How often does this fault occur?
In my tests, this fault occurs about once in every several boots
during the OS boot sequence. After OS boot has completed,
it has never occurred.
I therefore expect this fault to be rare after OS boot has completed.

> * Does this only apply to Stage-1, or can the same faults be taken at
>   Stage-2?
This fault can be taken only at Stage-1.

> I'm a bit surprised by the single retry. Is there any guarantee that a 
> thread will eventually stop delivering this fault code?
Yes. With this patch, a thread is guaranteed to stop receiving this fault code.
The hardware condition that causes this fault is reset at exception entry,
so the single retry guarantees that at least one instruction
is executed.
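
The forward-progress argument above can be sketched with a toy model
in C. This is only an illustration of the reasoning, not a model of the
real hardware: the struct, its fields and both functions are
hypothetical. The single assumption taken from the text is that the
condition causing the fault is cleared at exception entry.

```c
#include <stdbool.h>

/* Toy CPU state. All names here are hypothetical. */
struct toy_cpu {
	bool fault_condition;   /* internal state that triggers the fault */
	unsigned long retired;  /* memory accesses that completed */
	unsigned long spurious; /* undefined faults taken */
};

/* Try one memory access. Returns true if it completed, false if it faulted. */
static bool toy_try_access(struct toy_cpu *cpu)
{
	if (cpu->fault_condition) {
		/* Exception entry resets the condition, as stated above. */
		cpu->fault_condition = false;
		cpu->spurious++;
		return false;
	}

	cpu->retired++;
	return true;
}

/*
 * Model a handler that ignores the fault and returns to the faulting
 * instruction. Returns how many attempts the access needed.
 */
static unsigned int toy_access_with_retry(struct toy_cpu *cpu)
{
	unsigned int attempts = 1;

	while (!toy_try_access(cpu))
		attempts++;

	return attempts;
}
```

In this model an access needs at most two attempts, because the second
attempt always starts with the condition cleared.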

Changes since [v1], following Mark's review:

 * Adopted the errata framework.

I have confirmed the following:
 * Fujitsu A64FX - The problem no longer happens.
 * QEMU          - Boots without problems.

I would greatly appreciate it if someone could test this patch on different
chips to verify that it has no harmful effect on them.

If there is no problem on other chips, please merge this patch.

This patch is based on linux-5.0-rc2.

Zhang Lei (1):
  arm64: Add workaround for Fujitsu A64FX erratum 010001.

 Documentation/arm64/silicon-errata.txt |  1 +
 arch/arm64/Kconfig                     | 13 +++++++++++++
 arch/arm64/include/asm/cpucaps.h       |  3 ++-
 arch/arm64/include/asm/cputype.h       |  4 ++++
 arch/arm64/kernel/cpu_errata.c         |  8 ++++++++
 arch/arm64/mm/fault.c                  | 24 +++++++++++++++++++++++-
 6 files changed, 51 insertions(+), 2 deletions(-)

-- 
1.8.3.1

Thread overview: 4+ messages
2019-01-22  8:54 [PATCH v2 0/1] arm64: Add workaround for Fujitsu A64FX erratum 010001 Zhang, Lei
2019-01-22  8:54 ` [PATCH v2 1/1] " Zhang, Lei
2019-01-25 18:08   ` Catalin Marinas
2019-01-29 10:54     ` Zhang, Lei
