From: "Alex G." <mr.nuke.me@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org,
	rjw@rjwysocki.net, lenb@kernel.org, tony.luck@intel.com,
	tbaicar@codeaurora.org, will.deacon@arm.com, james.morse@arm.com,
	shiju.jose@huawei.com, zjzhang@codeaurora.org,
	gengdongjiu@huawei.com, linux-kernel@vger.kernel.org,
	alex_gagniuc@dellteam.com, austin_bolen@dell.com,
	shyam_iyer@dell.com, devel@acpica.org, mchehab@kernel.org,
	robert.moore@intel.com, erik.schmauss@intel.com
Subject: Re: [RFC PATCH v2 3/4] acpi: apei: Do not panic() when correctable errors are marked as fatal.
Date: Thu, 19 Apr 2018 12:40:54 -0500
Message-ID: <ffe6b459-c80a-d3c0-451d-0d21ae63da54@gmail.com>
In-Reply-To: <20180419164528.GD5635@pd.tnic>

SURPRISE!!!

On 04/19/2018 11:45 AM, Borislav Petkov wrote:
> On Thu, Apr 19, 2018 at 11:26:57AM -0500, Alex G. wrote:
>> At a very high level, I'm working with Dell on improving server
>> reliability, with a focus on NVME hotplug and surprise removal. One of
>> the features we don't support is surprise removal of NVME drives;
>> hotplug is supported with 'prepare to remove'. This is one of the
>> reasons NVME is not on feature parity with SAS and SATA.
> 
> Ok, first question: is surprise removal something purely mechanical or
> do you need firmware support for it? In the sense that you need to tell
> the firmware that you will be removing the drive.

SURPRISE!!! removal only means that the system was not expecting the
drive to be yanked. An example is removing a USB flash drive without
first unmounting it and removing the usb device (echo 0 >
/sys/bus/usb/.../authorized).

PCIe removal and hotplug are fairly well spec'd, and NVMe rides on that
without issue. It's much easier and faster for an OS to just follow the
spec and handle things on its own.

Interference from firmware only comes in with EFI/ACPI and FFS. From a
purely technical point of view, firmware has nothing to do with this.
From a firmware-centric view, unfortunately, firmware wants the ability
to log errors to the BMC... and hotplug events.

Does firmware need to know that a drive will be removed? I'm not aware
of any such requirement. I think the main purpose of 'prepare to remove'
is to shut down any traffic on the link. This way, link removal does not
generate PCIe errors which may otherwise end up crashing the OS.
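
A minimal sketch of what an orderly removal looks like from the OS side,
assuming the standard PCI sysfs `remove` attribute. The function name, the
SYSFS_ROOT override (for dry runs against a scratch directory), and the
example address are illustrative only and are not taken from any Dell tool
or from this patch series:

```shell
#!/bin/sh
# Sketch: quiesce a PCIe device through sysfs before it is physically pulled.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

pci_prepare_remove() {
    bdf="$1"   # PCI address in domain:bus:dev.fn form, e.g. 0000:3b:00.0
    attr="$SYSFS_ROOT/bus/pci/devices/$bdf/remove"

    if [ ! -e "$attr" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi

    # Writing 1 asks the kernel to unbind the driver and remove the device,
    # so no requests are in flight when the link goes down.
    echo 1 > "$attr"
}

# Example (needs root on a real system):
#   pci_prepare_remove 0000:3b:00.0
```

A surprise removal is the case where nothing like this runs first, so
whatever was queued on the link at that moment fails.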


> I'm sceptical, though, as it has "surprise" in the name so I'm guessing
> the firmware doesn't know about it, the drive physically disappears and
> the FW starts spewing PCIe errors...

It's not the FW that spews out errors. It's the hardware. It's very
likely that a device which is actively used will have several DMA
transactions already queued up and lots of traffic going through the
link. When the link dies and the traffic can't be delivered, Unsupported
Request errors are very common.

On the r740xd, FW just hides those errors from the OS with no further
notification. On this machine BIOS sets things up such that non-posted
requests report fatal (PCIe) errors. FW still tries very hard to hide
this from the OS, and I think the heuristic is that if the drive's
physical presence is gone, it doesn't even report the error.

There are a lot of problems with the approach, but one thing to keep in
mind is that the FW was written at a time when OSes were more than happy
to crash at any PCIe error reported through GHES.

Alex

>> I'm not sure if this is the example you're looking for, but
>> take an r740xd server, and slowly unplug an Intel NVME drives at an
>> angle. You're likely to crash the machine.
> 
> No no, that's actually a great example!
> 
> Thx.
> 
