From: Mark Salter <msalter@redhat.com>
To: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Geoff Levand <geoff@infradead.org>,
	Riku Voipio <riku.voipio@linaro.org>,
	linux-acpi@vger.kernel.org, Hanjun Guo <hanjun.guo@linaro.org>,
	Sudeep Holla <sudeep.holla@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/acpi: Add fixup for HPE m400 quirks
Date: Fri, 29 Jun 2018 09:05:30 -0400
Message-ID: <f09927a6ab2f951bf24c02846e603a987663e5bb.camel@redhat.com>
In-Reply-To: <d1ed8486-077b-6360-9c98-03aff5051762@arm.com>

On Thu, 2018-06-28 at 11:06 +0100, James Morse wrote:
> Hi Mark,
> 
> On 26/06/18 21:20, Mark Salter wrote:
> > On Tue, 2018-06-26 at 15:51 +0100, James Morse wrote:
> > > On 25/06/18 16:34, Mark Salter wrote:
> > > > On Fri, 2018-06-22 at 11:19 -0400, Mark Salter wrote:
> > > > > I'm going to hack something to get to the ghes info earlier in boot and
> > > > > check the things you mention above wrt Error Status Block and GHES.0.
> > > > 
> > > > So I had to end up instrumenting the EFI stub to see where the error came
> > > > from. At the start of the stub, there is no GHES.2 error. The error first
> > > > shows up after the stub's call to ExitBootServices returns.
> > > 
> > > What's the notification type of GHES.2? I'm guessing POLLed or some kind of IRQ.
> > > These systems don't have EL3, so the CPU must continue running while something
> > > external generates the CPER records. The records being visible is the last point
> > > the faulty-access could have been made, with the window of time depending on how
> > > fast this external-thing receives and processes the error.
> > 
> > There's a System Control Processor (slimpro) on the SoC which can interact with
> > the CPU in various ways and which has access to memory and other hw.
> 
> Thanks, saves me guessing!
> 
> 
> > > > So it looks
> > > > like the firmware itself is causing the error. There's still a chance that
> > > > the stub is doing something wrong with the memory map passed to the
> > > > firmware, so I'll try to eliminate that as well.
> > > 
> > > adding delay loops will help prove the EFIStub is innocent.
> > 
> > Didn't change anything.
> 

Just closing the loop on this...

> Okay, so just to clarify, a delay before ExitBootServices doesn't cause the
> error to show up before ExitBootServices, so the error hasn't occurred prior to
> this point.

Correct. I have never seen the error before ExitBootServices.

> And a delay after ExitBootServices allows us to see the error before we exit
> into head.S. (this rules out a bug in head.S)
> The delays should be long enough to tell us this slimpro isn't generating the
> error records N seconds after reset.

No delay is needed after ExitBootServices. The error is there right after the
call returns to the stub.
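
A minimal sketch of that before/after check, written as a simplified userspace
model rather than the actual stub instrumentation. The struct mirrors only the
header fields of an ACPI Generic Error Status Block; classify_error_window()
and the two fake firmware calls are hypothetical names used for illustration,
and the fake calls merely stand in for ExitBootServices:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the header of an ACPI Generic Error Status Block. */
struct error_status_block {
	uint32_t block_status;    /* non-zero once an error record is present */
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};

enum error_window {
	ERR_NONE,               /* no record before or after the call */
	ERR_BEFORE_CALL,        /* record already present before the call */
	ERR_VISIBLE_AFTER_CALL  /* record first visible once the call returned */
};

typedef void (*fw_call_t)(void *ctx);

/*
 * Sample the status block, run the firmware call, then sample again.
 * The block is read through a volatile pointer because something other
 * than the CPU running this code may write it.
 */
static enum error_window
classify_error_window(const volatile struct error_status_block *esb,
		      fw_call_t call, void *ctx)
{
	assert(esb != NULL && call != NULL);

	if (esb->block_status != 0)
		return ERR_BEFORE_CALL;

	call(ctx);

	return esb->block_status != 0 ? ERR_VISIBLE_AFTER_CALL : ERR_NONE;
}

/* Hypothetical stand-in for a firmware call that leaves a record behind. */
static void fake_faulty_firmware_call(void *ctx)
{
	struct error_status_block *esb = ctx;

	esb->block_status = 1;
}

/* Hypothetical stand-in for a firmware call with no side effects. */
static void fake_clean_firmware_call(void *ctx)
{
	(void)ctx;
}
```

A result of ERR_VISIBLE_AFTER_CALL only bounds the window: the record becoming
visible is the latest point at which the faulty access could have happened.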


> 
> Given this I agree we should disable_hest based on the DMI platform name and the
> UEFI version number. (it may be earlier firmware didn't have this bug).
> 
> 
> I don't have anything to test this on, so I've picked the DMI strings out of the
> dmesg output on that bugzilla entry. Any chance you could give it a test?
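
The matching logic proposed above (disable HEST only when both the platform
name and the firmware version match a known-bad entry) can be sketched as
below. This is a userspace model under stated assumptions, not the patch
itself: the table strings are placeholders and not the real m400 DMI values,
and struct fw_identity and should_disable_hest() are hypothetical names. Kernel
code would more likely use the existing DMI matching helpers such as
dmi_check_system() with a struct dmi_system_id table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair of identifying strings, as DMI would report them. */
struct fw_identity {
	const char *product_name;
	const char *bios_version;
};

/* Placeholder entries only; these are NOT the real m400 DMI strings. */
static const struct fw_identity hest_quirks[] = {
	{ "example-board", "example-fw-1.0" },
};

/*
 * Return true only when both the product name and the firmware version
 * match a known-bad entry, so other firmware versions keep HEST enabled.
 */
static bool should_disable_hest(const struct fw_identity *sys)
{
	size_t i;

	assert(sys != NULL);

	/* Missing DMI data: do not apply the quirk. */
	if (sys->product_name == NULL || sys->bios_version == NULL)
		return false;

	for (i = 0; i < sizeof(hest_quirks) / sizeof(hest_quirks[0]); i++) {
		if (strcmp(sys->product_name, hest_quirks[i].product_name) == 0 &&
		    strcmp(sys->bios_version, hest_quirks[i].bios_version) == 0)
			return true;
	}

	return false;
}
```

Matching on the firmware version as well as the platform name keeps the quirk
from applying to firmware that may not have the bug.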
> 
> 
> > > Are redhat able to rebuild UEFI on these systems? (Can it be fixed?)
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1285107 is about the m400
> > > description of the GIC, comments 15 and 16 show a UEFI patch to something other
> > > than the upstream platforms tree[0], and new firmware being tested.
> > > (although this may be wishful thinking)
> > 
> > HPE would respond to bug reports until the m400 reached EOL. They have been
> > pretty clear that no more firmware updates will be done.
> 
> Thanks, it was a bit murky from that ticket...
> 
> 
> Thanks for doing this!
> 
> James
