linux-hardening.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeshua Smith <jeshuas@nvidia.com>
To: Borislav Petkov <bp@alien8.de>, "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"gpiccoli@igalia.com" <gpiccoli@igalia.com>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hardening@vger.kernel.org"
	<linux-hardening@vger.kernel.org>,
	"linux-tegra@vger.kernel.org" <linux-tegra@vger.kernel.org>,
	Thierry Reding <treding@nvidia.com>,
	Jonathan Hunter <jonathanh@nvidia.com>
Subject: RE: [PATCH V2] ACPI: APEI: Use ERST timeout for slow devices
Date: Wed, 25 Oct 2023 14:09:37 +0000	[thread overview]
Message-ID: <MN2PR12MB33738FA73A69BC6AEB64BD63DBDEA@MN2PR12MB3373.namprd12.prod.outlook.com> (raw)
In-Reply-To: <20231024152708.GFZTfiTL9C6onZKn99@fat_crate.local>

Hi Boris,

You asked several questions, and it's not clear to me if you are suggesting the answers be sent as email reply, or if you're asking for the answers to be added to the commit message. Below are my email replies to those questions.

Borislav Petkov wrote:
> When I see "may" in commit messages "Slow devices such as flash may not meet the default 1ms timeout value" then I wanna know what devices are those?

The ERST table specifies an interface for how the OS can serialize error records to a "persistent store". The details of the persistent storage device are implementation defined. The ERST table provides microsecond values for the "nominal" and "maximum" amount of time it takes for the implemented device to process and complete an EXECUTE_OPERATION (a record write, read, or clear request). The current APEI ERST code hardcodes the timeout to 1ms, and ignores the actual timing information that the platform has provided in the ERST table for the platform's implementation. This is a problem for any device that can or will take more than 1ms worst case to process and complete requested operations. On a platform that uses NOR flash as the "persistent store", for example, it can easily take longer than 1ms.

Detailed NOR flash example:
A Micron nor-flash spec sheet: https://media-www.micron.com/-/media/client/global/documents/products/data-sheet/nor-flash/serial-nor/mt25q/die-rev-b/mt25q_qlkt_u_02g_cbb_0.pdf?rev=9b167fbf2b3645efba6385949a72e453

Page 82 lists "Page program time (256 bytes)" as 120us typical, 1800us max.

A 32KB error log would be (32K/256) = 128 nor-flash pages.

Writing 128 nor-flash pages would then take 120us * 128 = 15ms typical, or 1800us * 128 = 230.4ms max.

> What is the actual use case here?

Actual use case:
Kernel panic -> Pstore calls APEI's ERST code to write the ~32KB error log to persistent store -> ERST code writes the error log to nor-flash, which takes more than 1ms to complete. This is expected, as communicated by the platform to the OS via the maximum time field in the ERST table.

Currently APIE's ERST code will flag a timeout of the write operation after 1ms and return an error to Pstore. My patch will let the write (or read or clear) operation take as long as the maximum time ERST says it could take before flagging a timeout and returning an error. The ERST table has a "attributes" field that includes a "slow" bit to allow the platform to indicate that the address range for the error log "has slow access times", but the spec doesn't define what is considered a slow access time. My patch assumes that since the current timeout is 1ms, any access times greater than 1ms would reasonably be considered "slow", and therefore the extended (ERST-defined) timeout is only applied for implementations that indicate that they are "slow". I assume that platforms which bother to set the "slow" bit will also specify actual timings, and platforms which don't are OK with the current 1ms timeout.

> Upthread there's a question about the ACPI spec. That should be explained too. Because I have no clue what "the ERST max execution time value" is.

As I replied to Tony upthread, the Serialization Actions table entry for OPERATION_TIMINGS (https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#serialization-actions) says that it is the "value in microseconds that the platform expects would be the maximum amount of time it will take to process and complete an EXECUTE_OPERATION." Based on the spec, I take that to be the worst case time between when the OS does "EXECUTE_OPERATION" and when the OS sees "CHECK_BUSY_STATUS" return FALSE rather than TRUE, for the worst case time of a read/write/clear operation for the maximum size supported by the platform's ERST implementation.

Does that answer your questions?

  reply	other threads:[~2023-10-25 14:09 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-12 22:34 [PATCH V2] ACPI: APEI: Use ERST timeout for slow devices Jeshua Smith
2023-08-04 15:55 ` Jeshua Smith
2023-08-04 16:31   ` Luck, Tony
2023-08-05  1:04     ` Jeshua Smith
2023-08-18 16:50       ` Jeshua Smith
2023-09-11 16:15         ` Jeshua Smith
2023-10-02 16:10           ` Jeshua Smith
2023-10-23 15:45             ` Jeshua Smith
2023-10-24 14:32               ` Rafael J. Wysocki
2023-10-24 15:27                 ` Borislav Petkov
2023-10-25 14:09                   ` Jeshua Smith [this message]
2023-10-25 14:22                     ` Borislav Petkov
2023-10-24 15:24 ` Tony Luck
2023-10-24 18:51   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MN2PR12MB33738FA73A69BC6AEB64BD63DBDEA@MN2PR12MB3373.namprd12.prod.outlook.com \
    --to=jeshuas@nvidia.com \
    --cc=bp@alien8.de \
    --cc=gpiccoli@igalia.com \
    --cc=james.morse@arm.com \
    --cc=jonathanh@nvidia.com \
    --cc=keescook@chromium.org \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=tony.luck@intel.com \
    --cc=treding@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).