From: "Baicar, Tyler" <tbaicar@codeaurora.org>
To: James Morse <james.morse@arm.com>
Cc: christoffer.dall@linaro.org, marc.zyngier@arm.com,
	pbonzini@redhat.com, rkrcmar@redhat.com, linux@armlinux.org.uk,
	catalin.marinas@arm.com, will.deacon@arm.com, rjw@rjwysocki.net,
	lenb@kernel.org, matt@codeblueprint.co.uk,
	robert.moore@intel.com, lv.zheng@intel.com, nkaje@codeaurora.org,
	zjzhang@codeaurora.org, mark.rutland@arm.com,
	akpm@linux-foundation.org, eun.taik.lee@samsung.com,
	sandeepa.s.prabhu@gmail.com, labbott@redhat.com,
	shijie.huang@arm.com, rruigrok@codeaurora.org,
	paul.gortmaker@windriver.com, tn@semihalf.com, fu.wei@linaro.org,
	rostedt@goodmis.org, bristot@redhat.com,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-efi@vger.kernel.org, devel@acpica.org, Suzuki.Po
Subject: Re: [PATCH V7 04/10] arm64: exception: handle Synchronous External Abort
Date: Tue, 24 Jan 2017 11:41:38 -0700
Message-ID: <4ee2633d-e4a9-6237-4298-4454e7d268ad@codeaurora.org>
In-Reply-To: <5885D46A.7000304@arm.com>

On 1/23/2017 3:01 AM, James Morse wrote:
> Hi Tyler,
>
> On 20/01/17 20:35, Baicar, Tyler wrote:
>> On 1/19/2017 10:55 AM, James Morse wrote:
>>> On 18/01/17 23:26, Baicar, Tyler wrote:
>>>> On 1/17/2017 3:31 AM, James Morse wrote:
>>>>> On 12/01/17 18:15, Tyler Baicar wrote:
>>>>>> +    info.si_addr  = (void __user *)addr;
>>>>> addr here was read from FAR_EL1, but for some of the classes of exception you
>>>>> have listed below this register isn't updated with the faulting address.
>>>>>
>>>>> The ARM-ARM version 'k' in D1.10.5 "Summary of registers on faults taken to an
>>>>> Exception level that is using AArch64" has:
>>>>>> The architecture permits that the FAR_ELx is UNKNOWN for Synchronous External
>>>>>> Aborts other than Synchronous External Aborts on Translation Table Walks. In
>>>>>> this case, the ISS.FnV bit returned in ESR_ELx  indicates whether FAR_ELx is
>>>>>> valid.
>>>>> This is a problem if we get 'synchronous external abort' or 'synchronous parity
>>>>> error' while a user space process was running.
>>>> It looks like this would just cause an incorrect address to be printed in the
>>>> above pr_err. Unless I'm missing something, I don't see arm64_notify_die or
>>>> anything that gets called from there using the info.si_addr variable.
>>> I may be misreading something here...
>>>
>>> This patch has:
>>>>      info.si_addr  = (void __user *)addr;
>>>>      arm64_notify_die("", regs, &info, esr);
>>>   From arch/arm64/kernel/traps.c:arm64_notify_die():
>>>>      if (user_mode(regs)) {
>>>>          current->thread.fault_address = 0;
>>>>          current->thread.fault_code = err;
>>>>          force_sig_info(info->si_signo, info, current);
>>>>      }
>>> So if the SEA interrupted userspace, we put maybe-unknown addr into
>>> force_sig_info() to deliver a signal to user space. User-space then gets a copy
>>> of the info struct containing the maybe-unknown addr.
>>>
>>> I think this is an existing bug, but if we are separating the synchronous
>>> external aborts from the generic do_bad handler, we should probably check the
>>> FnV bit. (I think we should still print it out)
>>>
>>>
>>>> What do you suggest I do here? The firmware should be reporting the physical and
>>>> virtual address information if it is available in the HEST entry that the kernel
>>>> will parse.
>>> It's not just firmware that may trigger this, other SoCs may use it for parity or
>>> ECC errors, and they may not always have a valid address in FAR_EL1.
>>>
>>> I think we should check the FnV bit in the esr variable and set info.si_addr to
>>> 0 if the addr we have isn't valid:
>>> 'For some implementations, the value of si_addr may be inaccurate.' [0]
>> Okay, that makes sense, we don't want userspace to be notified with an incorrect
>> address. I will add the check to verify it's valid. Which bit in the ESR is the
>> FnV bit? I'm not finding the bit breakdown of the ISS that shows it.
> The bits in ISS vary depending on the EC, so a little digging is required.
> "D7.2.27 ESR_ELx, Exception Syndrome Register (ELx)" lists the EC values, from
> there 'Instruction Abort' and 'Data Abort' both list FnV as bit 10. Version 'k'
> of the ARM-ARM has this on pages D7-1953 to D7-1956.
Got it! I'll add the check for this in my next patchset.
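
For reference, here is a minimal sketch of what such a check might look like.
It is not the code from the eventual patch. The macro and function names
(ESR_EC_*, ESR_ISS_FNV, far_is_valid, sea_si_addr) are hypothetical and chosen
for illustration only. It assumes the ESR_ELx layout described above: EC in
bits [31:26], and FnV as ISS bit 10 for Instruction Abort and Data Abort
classes, where FnV set means FAR_ELx does not hold a valid address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; values follow the ESR_ELx layout. */
#define ESR_EC_SHIFT     26
#define ESR_EC_MASK      0x3fU
#define ESR_EC_IABT_LOW  0x20U	/* Instruction Abort from a lower EL */
#define ESR_EC_IABT_CUR  0x21U	/* Instruction Abort from the current EL */
#define ESR_EC_DABT_LOW  0x24U	/* Data Abort from a lower EL */
#define ESR_EC_DABT_CUR  0x25U	/* Data Abort from the current EL */
#define ESR_ISS_FNV      (1U << 10)	/* FAR not Valid */

/* The ISS encoding depends on EC, so only abort classes carry FnV. */
static bool esr_is_abort(uint32_t esr)
{
	uint32_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;

	return ec == ESR_EC_IABT_LOW || ec == ESR_EC_IABT_CUR ||
	       ec == ESR_EC_DABT_LOW || ec == ESR_EC_DABT_CUR;
}

/* FAR is usable only for an abort class with FnV clear. */
static bool far_is_valid(uint32_t esr)
{
	return esr_is_abort(esr) && !(esr & ESR_ISS_FNV);
}

/* Address to report in si_addr: the FAR value if valid, otherwise 0. */
static uint64_t sea_si_addr(uint32_t esr, uint64_t far)
{
	return far_is_valid(esr) ? far : 0;
}
```

In the handler this would stand in for the unconditional assignment, along
the lines of info.si_addr = (void __user *)sea_si_addr(esr, addr), while the
pr_err could still print the raw addr as James suggested.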

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

