From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>,
	Riku Voipio <riku.voipio@linaro.org>,
	Mark Salter <msalter@redhat.com>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	James Morse <james.morse@arm.com>,
	Hanjun Guo <hanjun.guo@linaro.org>,
	Sudeep Holla <sudeep.holla@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC/RFT PATCH 0/2] disable_hest quirk on HP m400 with bad UEFI firmwware
Date: Thu, 28 Jun 2018 13:51:30 +0100
Message-ID: <20180628125130.GA9025@red-moon>
In-Reply-To: <CAKv+Gu9ZHAGzcfgHShM1qvZ3qf4gb1pD_XfoceJu9Qy+5_pEsg@mail.gmail.com>

On Thu, Jun 28, 2018 at 12:25:06PM +0200, Ard Biesheuvel wrote:
> Hi James,
> 
> On 28 June 2018 at 12:06, James Morse <james.morse@arm.com> wrote:
> > There are reports[0] that HPE's 'ProLiant m400 Server' (aka moonshot) has
> > broken RAS support, and adding disable_hest to the kernel cmdline is the
> > only way to make the board boot if APEI support is built into the kernel.
> >
> > After Mark Salter's investigation[1] we know that UEFI's ExitBootServices
> > is doing something that causes a fatal error to be written to GHES.2.
> > Once the kernel finds this, it falsely assumes it was due to something that
> > happened during boot, and panic()s.
> >
> > This series adds a DMI quirks table to hest.c, and adds a helper that lets
> > us query the UEFI system table version, to set hest_disabled on this
> > platform.
> >
> > Testing the HEST table vendor and revision is a problem as this would
> > match all 'HPE ProLiant', some of which may be a totally different CPU
> > architecture.
> >
> >
> > I don't have access to an m400, these DMI and UEFI values were taken from
> > the crashlog report at [0], then tested with the equivalent fields on
> > Seattle.
> >
> 
> I understand the desire to keep running these M400s as long as they
> have some life left in them, but the reality is that they are end of
> life already, and not many were manufactured to begin with.
> 
> Given how the upstream kernel is aimed at future development, I don't
> think we should fix this in the upstream kernel at all. Distros are
> free to do what they like, of course, and I'm sure RedHat already have
> a fix for this in their downstream kernel. But putting this upstream
> means we will never be able to remove it again, which would be
> especially unfortunate given that it is the first ever DMI quirk for
> arm64, which we tried *very* hard to avoid, also because we don't
> initialize the DMI framework as early as x86 does, and so once we open
> the floodgates, we will run into issues where we will need to reorder
> the init sequence to make DMI data available early enough.
> 
> As for the efi.h patch: I don't object to adding code that makes the
> spec revision available, but note that this is *not* a firmware build
> number, and so it should not be used as such. Also, given that m400 is
> EOL and unmaintained, no firmware updates are expected, and so
> assuming that there will be a UEFI 2.7 based update in the future
> seems rather optimistic.
> 
> Ultimately, it is not up to me to decide whether
> 
> a) DMI quirks will be permitted on arm64
> b) we care about m400 enough to put this quirk in the upstream kernel
> 
> but I'd prefer it if we steered clear of this.

I apologise to James (and Mark), who went all the way to debug this FW
bug and worked around it with a series that is upstreamable. I was in
two minds about this, but in the end I agree with you: your reasoning
is straightforward and it is an acceptable reason not to merge this
series. If HPE do not care, I do not think we should either. For the
time being let's keep the floodgates watertight, with my apologies.

Thanks,
Lorenzo
