From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28563C433FE for ; Mon, 14 Feb 2022 00:27:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238871AbiBNA2B (ORCPT ); Sun, 13 Feb 2022 19:28:01 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:43712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235413AbiBNA15 (ORCPT ); Sun, 13 Feb 2022 19:27:57 -0500 Received: from smtp-relay-internal-1.canonical.com (smtp-relay-internal-1.canonical.com [185.125.188.123]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 006E913F3A for ; Sun, 13 Feb 2022 16:27:50 -0800 (PST) Received: from mail-ot1-f69.google.com (mail-ot1-f69.google.com [209.85.210.69]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id D5B4D40337 for ; Mon, 14 Feb 2022 00:27:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1644798469; bh=ej9ajKafdGwYpCYGQHrOi84wn2e/P4L47SMVJrwt6j4=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=dgToTokZ+aa+7I/VTtm8bKASud+h9N+Crq2HNTW0a23E5ZtadUtWJ8luckHd5p3xl Svqb6U+gb7A+VxyzyUZSol+F6MJekUmpWqLm2oldCB/EPrPyH9WA/noIdjIq/NS8hz 6B0LRhDeNUcwb+XFwAeNSVEuBNdsdoknF2TlYclxk0LBBfxUigV1DKbcedjqHL+JUt oST1nQ3N4+vcMElxV9Iww9ULyilNuqFVB+u7jf/5Vn1OTvGFwJrLGbE+ThK/411e/y dEiVVvcYbgWTGR0YzxhffvxsxrCt5KHyarVv+1Ogqqi01K8FlGFiStI43xuZOTBmj/ euovf6Cvc8e2w== Received: by mail-ot1-f69.google.com with SMTP id a4-20020a9d5c84000000b005a1daff4564so9466048oti.2 for ; Sun, 13 Feb 2022 16:27:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ej9ajKafdGwYpCYGQHrOi84wn2e/P4L47SMVJrwt6j4=; b=XRFL10vlHEuSVemMO1XcrFZfJC5ExpFGjqGO7r8tRQAEMSqLqMWnz4gWxnQhi2klAC wGh8FeeK1DcLLWFD8OtUV0hrxY29O/SCHbpkRPVqu/tfUamA9/BmMI1dBXhzUfJtMSez UtiBNg/5gBcIOipc15Tc1bZn0u9GBSYO7XHhJILkkeNFLHNbAagfA8NB8c0Qc7aKBTb0 P+F8QX226BsFlDq2Ds9YAHJ6cV9mE76NvMJ5AscaOIxZ5HbcGuJ0NARyUcqerKKKKFN7 V8pU97gokAYBfMQTUEAmkWxvmno63BOyZTvInPq5v8vpqxgJKuCfHTgVJnn6nXrEzNRx udgQ== X-Gm-Message-State: AOAM531lZvN2jvILg65xnO+ZXqd20hIKFniHLiCJubsGkvJE5CjvDlQl awegdFjM5sGxWw+qZl0qnHb0QCoQ+7IXW/Qk44oyklQD04npLzfbZQDRyq6HwXvZVvLPSO9BAuK I7BuVdPZqcrqqWEiFcbEAbxiVuelzuAYhZFW/l/MO1v76BUG5GR4nplDJcw== X-Received: by 2002:a9d:1ca5:: with SMTP id l37mr4209235ota.233.1644798468559; Sun, 13 Feb 2022 16:27:48 -0800 (PST) X-Google-Smtp-Source: ABdhPJyN1ZSAVS0GC78LiPm5hAJLWowe6qWTFLHN8Mj+qs0vO1FBPjaBZJa9R1QvrGkgKJJIQGrW81OKX3Z0Ufukxwg= X-Received: by 2002:a9d:1ca5:: with SMTP id l37mr4209230ota.233.1644798468296; Sun, 13 Feb 2022 16:27:48 -0800 (PST) MIME-Version: 1.0 References: <20220209213638.GA587920@bhelgaas> In-Reply-To: From: Kai-Heng Feng Date: Mon, 14 Feb 2022 08:27:37 +0800 Message-ID: Subject: Re: [PATCH v3] PCI: vmd: Honor ACPI _OSC on PCIe features To: Jonathan Derrick Cc: Bjorn Helgaas , "Rafael J. Wysocki" , Keith Busch , Bjorn Helgaas , Linux PM , Nirmal Patel , Lorenzo Pieralisi , Rob Herring , =?UTF-8?Q?Krzysztof_Wilczy=C5=84ski?= , Linux PCI , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 11, 2022 at 1:53 AM Jonathan Derrick wrote: > > > > On 2/9/2022 2:36 PM, Bjorn Helgaas wrote: > > On Tue, Dec 07, 2021 at 02:15:04PM +0100, Rafael J. Wysocki wrote: > >> On Tue, Dec 7, 2021 at 12:12 AM Keith Busch wrote: > >>> On Fri, Dec 03, 2021 at 11:15:41AM +0800, Kai-Heng Feng wrote: > >>>> When Samsung PCIe Gen4 NVMe is connected to Intel ADL VMD, the > >>>> combination causes AER message flood and drags the system performance > >>>> down. > >>>> > >>>> The issue doesn't happen when VMD mode is disabled in BIOS, since AER > >>>> isn't enabled by acpi_pci_root_create() . When VMD mode is enabled, AER > >>>> is enabled regardless of _OSC: > >>>> [ 0.410076] acpi PNP0A08:00: _OSC: platform does not support [AER] > >>>> ... > >>>> [ 1.486704] pcieport 10000:e0:06.0: AER: enabled with IRQ 146 > >>>> > >>>> Since VMD is an aperture to regular PCIe root ports, honor ACPI _OSC to > >>>> disable PCIe features accordingly to resolve the issue. > >>> > >>> At least for some versions of this hardare, I recall ACPI is unaware of > >>> any devices in the VMD domain; the platform can not see past the VMD > >>> endpoint, so I throught the driver was supposed to always let the VMD > >>> domain use OS native support regardless of the parent's ACPI _OSC. > >> > >> This is orthogonal to whether or not ACPI is aware of the VMD domain > >> or the devices in it. > >> > >> If the platform firmware does not allow the OS to control specific > >> PCIe features at the physical host bridge level, that extends to the > >> VMD "bus", because it is just a way to expose a hidden part of the > >> PCIe hierarchy. > > > > I don't understand what's going on here. Do we understand the AER > > message flood? Are we just papering over it by disabling AER? > > > > If an error occurs below a VMD, who notices and reports it? If we > > disable native AER below VMD because of _OSC, as this patch does, I > > guess we're assuming the platform will handle AER events below VMD. > > Is that really true? Does the platform know how to find AER log > > registers of devices below VMD? > ACPI (and the specific UEFI implementation) might remain unaware of > VMD domains. It's possible that the system management mode (SMM) > controller which typically handles firmware-first errors would be > capable of handling VMD errors in the vendor-specific manner. > However if _OSC hadn't taken into account VMD ports, SMM wouldn't > be capable of handling those errors and silently disabling AER on > VMD domains is a bad idea. Are VMD ports on server platforms also 'apertures' for root ports like those on consumer platforms? > > The bugzilla made it sound like a specific platform/drive combination. > What about a DMI match to mask the Corrected Physical Layer bits? We confirmed _all_ PCIe Gen4 NVMes on that platform have the issue. So using DMI to match won't scale... Kai-Heng > > > > >> The platform firmware does that through ACPI _OSC under the host > >> bridge device (not under the VMD device) which it is very well aware > >> of.