All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xuesong Chen <xuesong.chen@linux.alibaba.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: catalin.marinas@arm.com, lorenzo.pieralisi@arm.com,
	james.morse@arm.com, will@kernel.org, rafael@kernel.org,
	tony.luck@intel.com, bp@alien8.de, mingo@kernel.org,
	bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Huang Ying <ying.huang@intel.com>,
	xuesong.chen@linux.alibaba.com
Subject: Re: [PATCH v3 2/2] ACPI: APEI: Filter the PCI MCFG address with an arch-agnostic method
Date: Wed, 27 Oct 2021 13:29:49 +0800	[thread overview]
Message-ID: <b392f501-e9fb-2c75-42b6-d94e8b8e6ace@linux.alibaba.com> (raw)
In-Reply-To: <20211026204722.GA158130@bhelgaas>



On 27/10/2021 04:47, Bjorn Helgaas wrote:
> On Tue, Oct 26, 2021 at 05:16:47PM +0800, Xuesong Chen wrote:
>> On 26/10/2021 07:37, Bjorn Helgaas wrote:
> 
>>> My point was that when ECAM is implemented correctly, a CPU does a
>>> single MMIO load to do a PCI config read and a single MMIO store to do
>>> a PCI config write.  In that case there no need for any locking, so
>>> there's no need for APEI to reserve those resources.
>>
>> Ah, got it. That means the PCI ECAM has a implicit mutual exclusion with EINJ
>> if the hardware implemention is correct, so we can remove the MCFG from
>> the APEI's safely.
> 
> Well, not quite.  ECAM doesn't *need* mutual exclusion.  Single loads
> and stores are atomic by definition.
> 

OK, because ECAM config access has intrinsic atomic primitive, so no need to
reserve those ECAM config space resources for APEI, that's why commit d91525eb8ee6 
("ACPI, EINJ: Enhance error injection tolerance level") fix make sense... 

>>> I think apei_resources_request() should continue to reserve MCFG areas
>>> on tegra194 and xgene, but it does not need to reserve them on other
>>> ARM64 platforms.
>>
>> As a summary: we need to reserve the MCFG areas on those platforms with a
>> quirk ECAM implementation since there's no lockless method to access the
>> configuration space, on other platforms we don't need to reserve the MCFG
>> resources (so can remove it safely).
>>
>> So we need to add another patch to handle the case of tegra194 and xgene...
>> I will try to figure it out. 
> 
> I looked through these again and found another problem case (thunder).
> Here are my notes from my research.
> 
> Normal ECAM users require no device-specific support.  The platform
> supplies an MCFG table, the generic code works, no mutual exclusion is
> required, and APEI doesn't need to reserve the MCFG areas.
> 
> The problem cases are platforms that supply an MCFG table but require
> some device-specific workarounds.  We can identify these because they
> have quirks in pci-mcfg.c.  Here are the existing quirks and the
> pci_ecam_ops structs they supply:
> 
>   AL_ECAM             al_pcie_ops                 # OK
>   QCOM_ECAM32         pci_32b_ops                 # OK
>   HISI_QUAD_DOM       hisi_pcie_ops               # OK
>   THUNDER_PEM_QUIRK   thunder_pem_ecam_ops        # problem
>   THUNDER_PEM_QUIRK   thunder_pem_ecam_ops        # problem
>   THUNDER_ECAM_QUIRK  pci_thunder_ecam_ops        # OK
>   tegra               tegra194_pcie_ops           # problem
>   XGENE_V1_ECAM_MCFG  xgene_v1_pcie_ecam_ops      # problem
>   XGENE_V2_ECAM_MCFG  xgene_v2_pcie_ecam_ops      # problem
>   ALTRA_ECAM_QUIRK    pci_32b_read_ops            # OK
> 
> The ones marked "OK" have .map_bus(), .read(), and .write() methods
> that need no mutual exclusion because they boil down to just a single
> MMIO load or store.  These are fine and there shouldn't be a problem
> if an EINJ action accesses the ECAM space.
> 
> The others do require mutual exclusion:
> 
>   - thunder_pem_ecam_ops: thunder_pem_config_read() calls
>     thunder_pem_bridge_read(), which does a writeq() to PEM_CFG_RD
>     followed by a readq().  The writeq() and readq() must be atomic to
>     avoid corruption.
> 
>   - tegra194_pcie_ops: tegra194_map_bus() programs the ATU.  This and
>     the subsequent ECAM read/write must be atomic.
> 
>   - xgene_v1_pcie_ecam_ops and xgene_v2_pcie_ecam_ops:
>     xgene_pcie_map_bus() sets the RTID.  This and the subsequent ECAM
>     read/write must be atomic.
> 
> I had to look at all these ops individually to find them, so I don't
> see an easy way to identify these problem cases at run-time.
> 
> I personally would not have an issue with having APEI try to reserve
> the MCFG regions for any platform that has an MCFG quirk.  That would
> prevent the al, qcom, hisi, thunder-ecam, and altra drivers from using
> EINJ even though it would probably be safe for them.  But we already
> know those platforms are not really ACPI-compliant, so ...

OK, understood. Since those platforms are not really ACPI-compliant, so
we can unify all the quirks together. Let me send an inital solution about
this for your review and see if there's room for further improvement...

Thanks,
Xuesong

> 
> Bjorn
> 

WARNING: multiple messages have this Message-ID (diff)
From: Xuesong Chen <xuesong.chen@linux.alibaba.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: catalin.marinas@arm.com, lorenzo.pieralisi@arm.com,
	james.morse@arm.com, will@kernel.org, rafael@kernel.org,
	tony.luck@intel.com, bp@alien8.de, mingo@kernel.org,
	bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Huang Ying <ying.huang@intel.com>,
	xuesong.chen@linux.alibaba.com
Subject: Re: [PATCH v3 2/2] ACPI: APEI: Filter the PCI MCFG address with an arch-agnostic method
Date: Wed, 27 Oct 2021 13:29:49 +0800	[thread overview]
Message-ID: <b392f501-e9fb-2c75-42b6-d94e8b8e6ace@linux.alibaba.com> (raw)
In-Reply-To: <20211026204722.GA158130@bhelgaas>



On 27/10/2021 04:47, Bjorn Helgaas wrote:
> On Tue, Oct 26, 2021 at 05:16:47PM +0800, Xuesong Chen wrote:
>> On 26/10/2021 07:37, Bjorn Helgaas wrote:
> 
>>> My point was that when ECAM is implemented correctly, a CPU does a
>>> single MMIO load to do a PCI config read and a single MMIO store to do
>>> a PCI config write.  In that case there no need for any locking, so
>>> there's no need for APEI to reserve those resources.
>>
>> Ah, got it. That means the PCI ECAM has a implicit mutual exclusion with EINJ
>> if the hardware implemention is correct, so we can remove the MCFG from
>> the APEI's safely.
> 
> Well, not quite.  ECAM doesn't *need* mutual exclusion.  Single loads
> and stores are atomic by definition.
> 

OK, because ECAM config access has intrinsic atomic primitive, so no need to
reserve those ECAM config space resources for APEI, that's why commit d91525eb8ee6 
("ACPI, EINJ: Enhance error injection tolerance level") fix make sense... 

>>> I think apei_resources_request() should continue to reserve MCFG areas
>>> on tegra194 and xgene, but it does not need to reserve them on other
>>> ARM64 platforms.
>>
>> As a summary: we need to reserve the MCFG areas on those platforms with a
>> quirk ECAM implementation since there's no lockless method to access the
>> configuration space, on other platforms we don't need to reserve the MCFG
>> resources (so can remove it safely).
>>
>> So we need to add another patch to handle the case of tegra194 and xgene...
>> I will try to figure it out. 
> 
> I looked through these again and found another problem case (thunder).
> Here are my notes from my research.
> 
> Normal ECAM users require no device-specific support.  The platform
> supplies an MCFG table, the generic code works, no mutual exclusion is
> required, and APEI doesn't need to reserve the MCFG areas.
> 
> The problem cases are platforms that supply an MCFG table but require
> some device-specific workarounds.  We can identify these because they
> have quirks in pci-mcfg.c.  Here are the existing quirks and the
> pci_ecam_ops structs they supply:
> 
>   AL_ECAM             al_pcie_ops                 # OK
>   QCOM_ECAM32         pci_32b_ops                 # OK
>   HISI_QUAD_DOM       hisi_pcie_ops               # OK
>   THUNDER_PEM_QUIRK   thunder_pem_ecam_ops        # problem
>   THUNDER_PEM_QUIRK   thunder_pem_ecam_ops        # problem
>   THUNDER_ECAM_QUIRK  pci_thunder_ecam_ops        # OK
>   tegra               tegra194_pcie_ops           # problem
>   XGENE_V1_ECAM_MCFG  xgene_v1_pcie_ecam_ops      # problem
>   XGENE_V2_ECAM_MCFG  xgene_v2_pcie_ecam_ops      # problem
>   ALTRA_ECAM_QUIRK    pci_32b_read_ops            # OK
> 
> The ones marked "OK" have .map_bus(), .read(), and .write() methods
> that need no mutual exclusion because they boil down to just a single
> MMIO load or store.  These are fine and there shouldn't be a problem
> if an EINJ action accesses the ECAM space.
> 
> The others do require mutual exclusion:
> 
>   - thunder_pem_ecam_ops: thunder_pem_config_read() calls
>     thunder_pem_bridge_read(), which does a writeq() to PEM_CFG_RD
>     followed by a readq().  The writeq() and readq() must be atomic to
>     avoid corruption.
> 
>   - tegra194_pcie_ops: tegra194_map_bus() programs the ATU.  This and
>     the subsequent ECAM read/write must be atomic.
> 
>   - xgene_v1_pcie_ecam_ops and xgene_v2_pcie_ecam_ops:
>     xgene_pcie_map_bus() sets the RTID.  This and the subsequent ECAM
>     read/write must be atomic.
> 
> I had to look at all these ops individually to find them, so I don't
> see an easy way to identify these problem cases at run-time.
> 
> I personally would not have an issue with having APEI try to reserve
> the MCFG regions for any platform that has an MCFG quirk.  That would
> prevent the al, qcom, hisi, thunder-ecam, and altra drivers from using
> EINJ even though it would probably be safe for them.  But we already
> know those platforms are not really ACPI-compliant, so ...

OK, understood. Since those platforms are not really ACPI-compliant, so
we can unify all the quirks together. Let me send an inital solution about
this for your review and see if there's room for further improvement...

Thanks,
Xuesong

> 
> Bjorn
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-10-27  5:29 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-19  4:50 [PATCH v3 2/2] ACPI: APEI: Filter the PCI MCFG address with an arch-agnostic method Xuesong Chen
2021-10-19  4:50 ` Xuesong Chen
2021-10-19 15:04 ` Bjorn Helgaas
2021-10-19 15:04   ` Bjorn Helgaas
2021-10-20  2:35   ` Xuesong Chen
2021-10-20  2:35     ` Xuesong Chen
2021-10-19 19:23 ` Bjorn Helgaas
2021-10-19 19:23   ` Bjorn Helgaas
2021-10-20  3:16   ` Xuesong Chen
2021-10-20  3:16     ` Xuesong Chen
2021-10-20 18:50     ` Bjorn Helgaas
2021-10-20 18:50       ` Bjorn Helgaas
2021-10-21 15:46       ` Xuesong Chen
2021-10-21 15:46         ` Xuesong Chen
2021-10-21 16:57         ` Bjorn Helgaas
2021-10-21 16:57           ` Bjorn Helgaas
2021-10-22  9:52           ` Xuesong Chen
2021-10-22  9:52             ` Xuesong Chen
2021-10-25 23:37             ` Bjorn Helgaas
2021-10-25 23:37               ` Bjorn Helgaas
2021-10-26  9:16               ` Xuesong Chen
2021-10-26  9:16                 ` Xuesong Chen
2021-10-26 20:47                 ` Bjorn Helgaas
2021-10-26 20:47                   ` Bjorn Helgaas
2021-10-27  5:29                   ` Xuesong Chen [this message]
2021-10-27  5:29                     ` Xuesong Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b392f501-e9fb-2c75-42b6-d94e8b8e6ace@linux.alibaba.com \
    --to=xuesong.chen@linux.alibaba.com \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=helgaas@kernel.org \
    --cc=james.morse@arm.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=mingo@kernel.org \
    --cc=rafael@kernel.org \
    --cc=tony.luck@intel.com \
    --cc=will@kernel.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.