linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: James Morse <james.morse@arm.com>
To: Dave Martin <Dave.Martin@arm.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Arnd Bergmann <arnd@arndb.de>, Nicolas Pitre <nico@linaro.org>,
	Tony Lindgren <tony@atomide.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Tyler Baicar <tbaicar@codeaurora.org>,
	Will Deacon <will.deacon@arm.com>,
	Oleg Nesterov <oleg@redhat.com>, Olof Johansson <olof@lixom.net>,
	Santosh Shilimkar <santosh.shilimkar@ti.com>,
	linux-arm-kernel@lists.infradead.org,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 07/11] signal/arm64: Document conflicts with SI_USER and SIGFPE, SIGTRAP, SIGBUS
Date: Mon, 15 Jan 2018 19:30:26 +0000	[thread overview]
Message-ID: <5A5D0152.3090707@arm.com> (raw)
In-Reply-To: <20180115163028.GU22781@e103592.cambridge.arm.com>

Hi Dave,

Thanks for going through all these,

On 15/01/18 16:30, Dave Martin wrote:
> On Thu, Jan 11, 2018 at 06:59:36PM -0600, Eric W. Biederman wrote:
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index 9b7f89df49db..abe200587334 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -596,7 +596,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)

>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"synchronous external abort"	},
> 
> This si_code seems to be a fallback for if ACPI is absent or doesn't
> know what to do with this error.
> 
> -> SIGBUS/BUS_OBJERR?
> 
> Can probably legitimately happen for userspace for suitable MMIO mappings.

It can happen for normal memory too, there are specific ESR values for
parity/checksum errors when read/writing memory. I think this first one is
'other/unknown', and its up to the CPU how to classify them.


> Perhaps it's more serious though in the presence of ACPI.  Do we expect
> that ACPI can diagnose all localisable errors?

Its not just ACPI, the CPU's v8.2 RAS Extensions use this
synchronous-external-abort as notification of a RAS error, (the other details
are written to to memory-mapped nodes). With the v8.2 RAS Extensions the ESR
tells us if the error was contained.

For ACPI we rely on firmware to set an appropriate severity in the CPER records
generated by firmware. The APEI helpers will call panic() if they find a fatal
error.

For systems with neither {firmware,kernel}-first RAS, BUS_OBJERR looks like a
good choice.


>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 0 (translation table walk)"	},
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 1 (translation table walk)"	},
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 2 (translation table walk)"	},
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 3 (translation table walk)"	},
> 
> Pagetable screwup or kernel/system/CPU bug -> SIGKILL, or panic().

(RAS mechanisms may claim this and send their own signals, if not:)

SIGKILL is probably a better choice here, while we do have an address, there is
nothing user-space can do about it.


>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"synchronous parity or ECC error" },	// Reserved when RAS is implemented
> 
> Possibly SIGBUS/BUS_MCEERR_AR (though I don't know exactly what
> userspace is supposed to do with this or whether this implies the
> existence or certain kernel features for managing the error that
> may not be present on arm64...)

I'd like to keep the MCEERR signals to errors that we know are contained, the
kernel has understood and handled.

(These features do exist for arm64, enabling CONFIG_MEMORY_FAILURE and a few
APEI options allows all this to work today with suitable firmware. My Seattle
claims to support it).


> Otherwise, SIGKILL.

Sounds good,


>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 0 synchronous parity error (translation table walk)"	},	// Reserved when RAS is implemented
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 1 synchronous parity error (translation table walk)"	},	// Reserved when RAS is implemented
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 2 synchronous parity error (translation table walk)"	},	// Reserved when RAS is implemented
>> +	{ do_sea,		SIGBUS,  BUS_FIXME,	"level 3 synchronous parity error (translation table walk)"	},	// Reserved when RAS is implemented
> 
> Process page tables corrupt: if the kernel couldn't fix this, the
> process can't reasonably fix it -> SIGKILL
> 
> Since this is a RAS-type error it could be triggered by a cosmic ray
> rather than requiring a kernel or system bug or other major failure, so
> we probably shouldn't panic the system if the error is localisable to a
> particular process.

Without the RAS-Extensions severity to tell us the error is contained I'm not
sure what we can expect. But given the page-tables are per-process, and we never
swap them to disk etc, its probably a safe bet that it doesn't matter either way
for these.


Thanks,

James

  parent reply	other threads:[~2018-01-15 19:32 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-12  0:57 [PATCH 00/11] siginfo fixes/cleanups esp SI_USER Eric W. Biederman
2018-01-12  0:59 ` [PATCH 01/11] signal: Simplify and fix kdb_send_sig Eric W. Biederman
2018-01-12  0:59 ` [PATCH 02/11] signal/sh: Ensure si_signo is initialized in do_divide_error Eric W. Biederman
2018-01-12  0:59 ` [PATCH 03/11] signal/openrisc: Fix do_unaligned_access to send the proper signal Eric W. Biederman
2018-01-12 13:25   ` Stafford Horne
2018-01-12 17:37     ` Eric W. Biederman
2018-01-12  0:59 ` [PATCH 04/11] signal/parisc: Document a conflict with SI_USER with SIGFPE Eric W. Biederman
2018-01-12 22:29   ` Helge Deller
2018-01-13 21:06     ` Eric W. Biederman
2018-01-14  1:46       ` Eric W. Biederman
2018-02-23  0:15     ` Eric W. Biederman
2018-02-25 19:49       ` Helge Deller
2018-02-27  2:19         ` Eric W. Biederman
2018-01-12  0:59 ` [PATCH 05/11] signal/metag: " Eric W. Biederman
2018-01-12  0:59 ` [PATCH 06/11] signal/powerpc: Document conflicts with SI_USER and SIGFPE and SIGTRAP Eric W. Biederman
2018-01-12  0:59 ` [PATCH 07/11] signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS Eric W. Biederman
2018-01-15 16:30   ` [PATCH 07/11] signal/arm64: Document conflicts with SI_USER and SIGFPE, SIGTRAP, SIGBUS Dave Martin
2018-01-15 17:23     ` Eric W. Biederman
2018-01-16 17:24       ` Dave Martin
2018-01-16 22:28         ` Eric W. Biederman
2018-01-17 11:46           ` Dave Martin
2018-01-17 11:57           ` Russell King - ARM Linux
2018-01-17 12:15             ` Dave Martin
2018-01-17 12:37               ` Russell King - ARM Linux
2018-01-17 15:37                 ` Dave Martin
2018-01-17 15:49                   ` Russell King - ARM Linux
2018-01-17 16:11                     ` Dave Martin
2018-01-17 16:45                 ` Eric W. Biederman
2018-01-17 17:14                   ` Russell King - ARM Linux
2018-01-24 21:28                     ` Eric W. Biederman
2018-01-17 17:17       ` Dave Martin
2018-01-17 17:24         ` Eric W. Biederman
2018-01-17 17:39           ` Dave Martin
2018-01-15 19:30     ` James Morse [this message]
2018-01-12  0:59 ` [PATCH 08/11] signal/arm: Document conflicts with SI_USER and SIGFPE Eric W. Biederman
2018-01-15 17:49   ` Russell King - ARM Linux
2018-01-15 20:12     ` Eric W. Biederman
2018-01-16 17:41     ` Dave Martin
2018-01-19 12:05     ` Dave Martin
2018-01-12  0:59 ` [PATCH 09/11] signal: Reduce copy_siginfo to just a memcpy Eric W. Biederman
2018-01-12  0:59 ` [PATCH 10/11] signal: Introduce clear_siginfo Eric W. Biederman
2018-01-12  0:59 ` [PATCH 11/11] signal: Ensure generic siginfos the kernel sends have all bits initialized Eric W. Biederman
2018-01-12 20:29 ` [PATCH 0/2] siginfo fixes Eric W. Biederman
2018-01-12 20:31   ` [PATCH 1/2] mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copy Eric W. Biederman
2018-01-12 20:31   ` [PATCH 2/2] x86/mm/pkeys: Fix fill_sig_info_pkey Eric W. Biederman
2018-01-14 11:44     ` [tip:x86/urgent] " tip-bot for Eric W. Biederman
2018-01-16  0:39   ` [PATCH 00/22] siginfo unification Eric W. Biederman
2018-01-16  0:39     ` [PATCH 01/22] signal: Document all of the signals that use the _sigfault union member Eric W. Biederman
2018-01-16  0:39     ` [PATCH 02/22] signal: Document the strange si_codes used by ptrace event stops Eric W. Biederman
2018-01-16  0:39     ` [PATCH 03/22] signal: Document glibc's si_code of SI_ASYNCNL Eric W. Biederman
2018-01-16  0:39     ` [PATCH 04/22] signal: Ensure no siginfo union member increases the size of struct siginfo Eric W. Biederman
2018-01-16  0:39     ` [PATCH 05/22] signal: Clear si_sys_private before copying siginfo to userspace Eric W. Biederman
2018-01-16  0:39     ` [PATCH 06/22] signal: Remove _sys_private and _overrun_incr from struct compat_siginfo Eric W. Biederman
2018-01-16  0:39     ` [PATCH 07/22] ia64/signal: switch to generic struct siginfo Eric W. Biederman
2018-01-16  0:39     ` [PATCH 08/22] signal/ia64: switch the last arch-specific copy_siginfo_to_user() to generic version Eric W. Biederman
2018-01-16  0:39     ` [PATCH 09/22] signal/mips: switch mips to generic siginfo Eric W. Biederman
2018-01-16  0:39     ` [PATCH 10/22] signal: Remove unnecessary ifdefs now that there is only one struct siginfo Eric W. Biederman
2018-01-16  0:39     ` [PATCH 11/22] signal: kill __ARCH_SI_UID_T Eric W. Biederman
2018-01-16  0:39     ` [PATCH 12/22] signal: unify compat_siginfo_t Eric W. Biederman
2018-01-16  0:40     ` [PATCH 13/22] signal: Move addr_lsb into the _sigfault union for clarity Eric W. Biederman
2018-03-16 19:00       ` Dave Hansen
2018-03-16 19:24         ` Dave Hansen
2018-03-16 20:06           ` Eric W. Biederman
2018-03-16 20:33             ` Dave Hansen
2018-03-16 21:08               ` Eric W. Biederman
2018-01-16  0:40     ` [PATCH 14/22] signal/powerpc: Remove redefinition of NSIGTRAP on powerpc Eric W. Biederman
2018-01-16  0:40     ` [PATCH 15/22] signal/ia64: Move the ia64 specific si_codes to asm-generic/siginfo.h Eric W. Biederman
2018-01-16  0:40     ` [PATCH 16/22] signal/frv: Move the frv " Eric W. Biederman
2018-01-16  0:40     ` [PATCH 17/22] signal/tile: Move the tile " Eric W. Biederman
2018-01-16  0:40     ` [PATCH 18/22] signal/blackfin: Move the blackfin " Eric W. Biederman
2018-01-16  0:40     ` [PATCH 19/22] signal/blackfin: Remove pointless UID16_SIGINFO_COMPAT_NEEDED Eric W. Biederman
2018-01-16  0:40     ` [PATCH 20/22] signal: Unify and correct copy_siginfo_from_user32 Eric W. Biederman
2018-01-16  0:40     ` [PATCH 21/22] signal: Remove the code to clear siginfo before calling copy_siginfo_from_user32 Eric W. Biederman
2018-01-16  0:40     ` [PATCH 22/22] signal: Unify and correct copy_siginfo_to_user32 Eric W. Biederman
2018-01-19 18:03       ` Al Viro
2018-01-19 21:04         ` Eric W. Biederman
2018-01-23 21:05     ` [PATCH 00/10] siginfo infrastructure Eric W. Biederman
2018-01-23 21:07       ` [PATCH 01/10] ptrace: Use copy_siginfo in setsiginfo and getsiginfo Eric W. Biederman
2018-01-23 21:07       ` [PATCH 02/10] signal/arm64: Better isolate the COMPAT_TASK portion of ptrace_hbptriggered Eric W. Biederman
2018-01-23 21:07       ` [PATCH 03/10] signal: Don't use structure initializers for struct siginfo Eric W. Biederman
2018-01-23 21:07       ` [PATCH 04/10] signal: Replace memset(info,...) with clear_siginfo for clarity Eric W. Biederman
2018-01-23 21:07       ` [PATCH 05/10] signal: Add send_sig_fault and force_sig_fault Eric W. Biederman
2018-01-23 21:07       ` [PATCH 06/10] signal: Helpers for faults with specialized siginfo layouts Eric W. Biederman
2018-01-24 19:26         ` Ram Pai
2018-01-24 20:54           ` Eric W. Biederman
2018-01-23 21:07       ` [PATCH 07/10] signal/powerpc: Remove unnecessary signal_code parameter of do_send_trap Eric W. Biederman
2018-01-23 21:07       ` [PATCH 08/10] signal/ptrace: Add force_sig_ptrace_errno_trap and use it where needed Eric W. Biederman
2018-01-23 21:07       ` [PATCH 09/10] mm/memory_failure: Remove unused trapno from memory_failure Eric W. Biederman
2018-01-23 21:07       ` [PATCH 10/10] signal/memory-failure: Use force_sig_mceerr and send_sig_mceerr Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5A5D0152.3090707@arm.com \
    --to=james.morse@arm.com \
    --cc=Dave.Martin@arm.com \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=ebiederm@xmission.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nico@linaro.org \
    --cc=oleg@redhat.com \
    --cc=olof@lixom.net \
    --cc=santosh.shilimkar@ti.com \
    --cc=tbaicar@codeaurora.org \
    --cc=tony@atomide.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).