From: Borislav Petkov <bp@alien8.de>
To: "Baicar, Tyler" <tbaicar@codeaurora.org>
Cc: linux-efi@vger.kernel.org, kvm@vger.kernel.org,
	matt@codeblueprint.co.uk, catalin.marinas@arm.com,
	will.deacon@arm.com, robert.moore@intel.com,
	paul.gortmaker@windriver.com, lv.zheng@intel.com,
	kvmarm@lists.cs.columbia.edu, fu.wei@linaro.org,
	rafael@kernel.org, zjzhang@codeaurora.org, linux@armlinux.org.uk,
	gengdongjiu@huawei.com, linux-acpi@vger.kernel.org,
	eun.taik.lee@samsung.com, shijie.huang@arm.com,
	labbott@redhat.com, lenb@kernel.org, harba@codeaurora.org,
	john.garry@huawei.com, marc.zyngier@arm.com,
	punit.agrawal@arm.com, rostedt@goodmis.org, nkaje@codeaurora.org,
	sandeepa.s.prabhu@gmail.com,
	linux-arm-kernel@lists.infradead.org, tony.luck@intel.com,
	rjw@rjwysocki.net, rruigrok@codeaurora.org,
	linux-kernel@vger.kernel.org, astone@redhat.com,
	hanjun.guo@linaro.org, joe@perches.com, pbonzini@redhat.com,
	akpm@linux-foundation.org, bristot@redhat.com, shiju.jose@huawe
Subject: Re: [PATCH V15 04/11] efi: parse ARM processor error
Date: Mon, 24 Apr 2017 19:52:40 +0200
Message-ID: <20170424175240.3nvhbxzwicxnk6og@pd.tnic>
In-Reply-To: <e54c9893-446f-9e77-b78b-2548d394719b@codeaurora.org>

On Fri, Apr 21, 2017 at 12:22:09PM -0600, Baicar, Tyler wrote:
> I guess it's not really needed. It just may be useful considering there can
> be numerous error info structures, numerous context info structures, and a
> variable length vendor information section. I can move this print to only in
> the length check failure cases.

And? Why does the user care?

I mean, it is good for debugging when you wanna see you're parsing the
error info data properly but otherwise it doesn't improve the error
reporting one bit.

> Because these are part of the error information structure. I wouldn't think
> FW would populate error information structures that are different versions
> in the same processor error, but it could be possible from the spec (at
> least once there are different versions of the table).

Same argument as above.

> There is an error information 64 bit value in the ARM processor error
> information structure. (UEFI spec 2.6 table 261)

So that's IP-dependent and explained in the following tables. Any plans
on decoding that too?

> Why's that? Dumping this vendor specific error information is similar to the
> unrecognized CPER section reporting which is also meant for vendor specific
> information https://lkml.org/lkml/2017/4/18/751

And how do those naked bytes help the user understand the error happening?

Even in your example you have:

[  140.739210] {1}[Hardware Error]:   00000000: 4d415201 4d492031 453a4d45 435f4343  .RAM1 IMEM:ECC_C
[  140.739214] {1}[Hardware Error]:   00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........

Which looks like some correctable ECC DRAM error and is actually begging
to be decoded in a human-readable form. So let's do that completely and
not dump partially decoded information.
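
To illustrate the point about decoding, here is a minimal sketch of how the words in the quoted dump map back to the text in its ASCII column. It assumes the dump prints 32-bit little-endian words, which is what the ASCII column suggests. The helper names are made up for this example, and treating the first byte as a non-text prefix is only a guess, since the layout of this vendor section is not described in the thread.

```python
import struct

# The two 16-byte rows from the quoted log, as 32-bit words in print order.
DUMP_WORDS = [
    0x4d415201, 0x4d492031, 0x453a4d45, 0x435f4343,
    0x53515f45, 0x44525f42, 0x00000000, 0x00000000,
]


def words_to_bytes(words):
    """Pack 32-bit words back into a little-endian byte stream (assumed order)."""
    return struct.pack("<%dI" % len(words), *words)


def printable(raw):
    """Mimic the ASCII column of the dump: non-printable bytes become '.'."""
    return "".join(chr(b) if 0x20 <= b < 0x7f else "." for b in raw)


raw = words_to_bytes(DUMP_WORDS)
ascii_column = printable(raw)

# Hypothetical reading: skip the leading 0x01 byte, keep text up to the first NUL.
text = raw[1:].split(b"\x00", 1)[0].decode("ascii")

print(ascii_column)
print(text)
```

The recovered string is still only a vendor-defined tag. Turning it into a message a user can act on would need the vendor's definition of the fields, which is not part of this thread.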

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
