From: Xiaofei Tan <tanxiaofei@huawei.com>
To: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Linuxarm <linuxarm@huawei.com>, Will Deacon <will@kernel.org>,
Dave Martin <Dave.Martin@arm.com>,
linux-arm-kernel@lists.infradead.org,
Shiju Jose <shiju.jose@huawei.com>
Subject: Re: Question about SEA handling process happened in user space
Date: Thu, 2 Apr 2020 14:35:15 +0800
Message-ID: <5E8587A3.6030101@huawei.com>
In-Reply-To: <2b0e5507-ad75-9af1-6afe-aa87d8cf597f@arm.com>
Hi James,
On 2020/3/31 0:49, James Morse wrote:
> Hi Xiaofei,
>
> On 3/30/20 2:10 PM, Xiaofei Tan wrote:
>> I'm a little confused about how an SEA that happens in user space is handled.
>
>> According to the description of the FnV bit of register ESR_ELx in the ARMv8.4 spec, FAR is
>> valid only for synchronous External abort on a translation table walk.
>
>> But for this FAR-valid scenario (attached code, lines 684 to 687),
>> we send SIGKILL to kill the process. For some other scenarios, such as line 680,
>> FAR is not valid, but we send SIGBUS and pass the error address to the process so it can try to recover.
>
> 'FAR is not valid': it's optional. The ESR_EL1.FnV bit can be set for the 'catch
> all' external abort fault-status-code. This lets the CPU tell us that it doesn't
> know what the faulting virtual address is.
>
> do_sea() checks for this:
> |	if (esr & ESR_ELx_FnV)
> |		siaddr = NULL;
> |	else
> |		siaddr = (void __user *)addr;
>
> If we can't know the address, there isn't much we can do.
> (This check is really making sure we don't pass junk to user-space when FnV is set)
>
>
>> Should this be a problem?
>
> I'm not quite sure what your question is.
>
> If the CPU doesn't tell us the address, we can't tell user-space what it is. The
> alternative is to upgrade to SIGKILL in that case.
>
>
> If you see this instead of the address provided via firmware-first, there is a
> series to improve that here:
> https://lore.kernel.org/linux-acpi/20200228174817.74278-1-james.morse@arm.com/
>
> (We skip this signal code if APEI promises it did all the work. This lets you
> take the signal from memory_failure() instead, which may have better information.)
>
There may be a race here.
APEI runs memory_failure() in a bottom half for memory errors, so it may not have
finished by the time SEA handling ends here, and the application process may resume running.
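From the application's side, the two signal sources mentioned here can be told apart
by si_code. Below is a minimal user-space sketch, not kernel code. It assumes Linux,
where memory_failure() reports errors with BUS_MCEERR_AR or BUS_MCEERR_AO, while the
do_sea entries in the quoted table use BUS_OBJERR. The enum and function names are
made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical classification of where a SIGBUS most likely came from. */
enum sigbus_source {
	SIGBUS_FROM_SEA_PATH,        /* BUS_OBJERR, as in the quoted do_sea entries */
	SIGBUS_FROM_MEMORY_FAILURE,  /* BUS_MCEERR_AR or BUS_MCEERR_AO */
	SIGBUS_OTHER                 /* alignment faults, raise(), kill(), ... */
};

/* Inspect the siginfo_t handed to an SA_SIGINFO handler for SIGBUS. */
static enum sigbus_source sketch_classify_sigbus(const siginfo_t *info)
{
	assert(info != NULL);

	switch (info->si_code) {
	case BUS_MCEERR_AR:
	case BUS_MCEERR_AO:
		return SIGBUS_FROM_MEMORY_FAILURE;
	case BUS_OBJERR:
		return SIGBUS_FROM_SEA_PATH;
	default:
		return SIGBUS_OTHER;
	}
}
```

A handler installed with sigaction() and SA_SIGINFO could call sketch_classify_sigbus()
on its siginfo_t argument. If the process resumes before the bottom half has run, it
would not yet have seen the memory_failure() variant.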
>
> If it's the SIGKILL entries: these are for the translation table walk.
> There is no point telling user-space about corruption in its page tables as it
> can't do anything about it. The kernel's handling of this is to kill the
> process. (page tables make up a very small amount of memory, so this should be
> rarer than the regular 'external abort' case)
>
>
> Thanks,
>
> James
>
>
>
>> 680 { do_sea, SIGBUS, BUS_OBJERR, "synchronous external abort" },
>> 684 { do_sea, SIGKILL, SI_KERNEL, "level 0 (translation table walk)" },
>> 685 { do_sea, SIGKILL, SI_KERNEL, "level 1 (translation table walk)" },
>> 686 { do_sea, SIGKILL, SI_KERNEL, "level 2 (translation table walk)" },
>> 687 { do_sea, SIGKILL, SI_KERNEL, "level 3 (translation table walk)" },
>> 688 { do_sea, SIGBUS, BUS_OBJERR, "synchronous parity or ECC error" }, // Reserved when RAS is implemented
>> 692 { do_sea, SIGKILL, SI_KERNEL, "level 0 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
>> 693 { do_sea, SIGKILL, SI_KERNEL, "level 1 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
>> 694 { do_sea, SIGKILL, SI_KERNEL, "level 2 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
>> 695 { do_sea, SIGKILL, SI_KERNEL, "level 3 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
>> 696 { do_bad, SIGKILL, SI_KERNEL, "unknown 32" },
>
> .
>
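To make the table and the FnV check above easier to follow, here is a stand-alone
sketch of how I read them. It is user-space C, not the kernel implementation. It
assumes the table index equals the fault status code in ESR bits [5:0] (the
"unknown 32" entry at line 696 suggests line 680 is index 16), and that FnV is
ESR bit 10. The struct, macro and function names are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: fault status code in bits [5:0], FnV in bit 10. */
#define SKETCH_ESR_FSC_MASK	UINT64_C(0x3f)
#define SKETCH_ESR_FNV		(UINT64_C(1) << 10)

struct sea_outcome {
	int signo;		/* SIGBUS, SIGKILL, or 0 if not a quoted do_sea entry */
	int addr_valid;		/* nonzero when FnV is clear, so FAR may be reported */
	uintptr_t addr;		/* fault address, meaningful only if addr_valid */
};

/* Map an ESR value and FAR to the signal the quoted entries would select. */
static void sketch_sea_outcome(uint64_t esr, uintptr_t far,
			       struct sea_outcome *out)
{
	unsigned int fsc = (unsigned int)(esr & SKETCH_ESR_FSC_MASK);

	assert(out != NULL);
	out->signo = 0;
	out->addr_valid = 0;
	out->addr = 0;

	if (fsc == 16 || fsc == 24)
		out->signo = SIGBUS;	/* lines 680 and 688 */
	else if ((fsc >= 20 && fsc <= 23) || (fsc >= 28 && fsc <= 31))
		out->signo = SIGKILL;	/* lines 684-687 and 692-695 */
	else
		return;			/* not handled by do_sea in the quote */

	/* Mirrors the quoted check: report no address when FnV is set. */
	if (!(esr & SKETCH_ESR_FNV)) {
		out->addr_valid = 1;
		out->addr = far;
	}
}
```

For example, an ESR with status code 16 and FnV clear yields SIGBUS with the address
reported, the same code with FnV set yields SIGBUS without an address, and codes
20 to 23 yield SIGKILL.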
--
thanks
tanxiaofei