linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Kalle Valo" <kvalo@codeaurora.org>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Marek Behún" <kabel@kernel.org>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	vtolkm@gmail.com, "Rob Herring" <robh@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	linux-pci@vger.kernel.org, ath10k@lists.infradead.org,
	linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] PCI: Disallow retraining link for Atheros chips on non-Gen1 PCIe bridges
Date: Sat, 26 Jun 2021 16:38:16 +0200	[thread overview]
Message-ID: <20210626143816.2p4qwwzuxfeys2y2@pali> (raw)
In-Reply-To: <20210625201936.GA3293099@bjorn-Precision-5520>

On Friday 25 June 2021 15:19:36 Bjorn Helgaas wrote:
> On Mon, Jun 21, 2021 at 04:28:55PM +0200, Pali Rohár wrote:
> > On Wednesday 16 June 2021 16:38:19 Bjorn Helgaas wrote:
> > > On Wed, Jun 02, 2021 at 09:03:02PM +0200, Pali Rohár wrote:
> > > > On Wednesday 02 June 2021 10:55:59 Bjorn Helgaas wrote:
> > > > > On Wed, Jun 02, 2021 at 02:08:16PM +0200, Pali Rohár wrote:
> > > > > > On Tuesday 01 June 2021 19:00:36 Bjorn Helgaas wrote:
> > > > > 
> > > > > > > I wonder if this could be restructured as a generic quirk
> > > > > > > in quirks.c that simply set the bridge's TLS to 2.5 GT/s
> > > > > > > during enumeration.  Or would the retrain fail even in
> > > > > > > that case?
> > > > > > 
> > > > > > If I understand it correctly then PCIe link is already up
> > > > > > when kernel starts enumeration. So setting Bridge TLS to 2.5
> > > > > > GT/s does not change anything here.
> > > > > > 
> > > > > > Moreover it would have side effect that cards which are
> > > > > > already set to 5+ GT/s would be downgraded to 2.5 GT/s
> > > > > > during enumeration and for increasing speed would be needed
> > > > > > another round of "enumeration" to set a new TLS and retrain
> > > > > > link again. As TLS affects link only after link goes into
> > > > > > Recovery state.
> > > > > > 
> > > > > > So this would just complicate card enumeration and settings.
> > > > > 
> > > > > The current quirk complicates the ASPM code.  I'm hoping that
> > > > > if we set the bridge's Target Link Speed during enumeration,
> > > > > the link retrain will "just work" without complicating the
> > > > > ASPM code.
> > > > > 
> > > > > An enumeration quirk wouldn't have to set the bridge's TLS to
> > > > > 2.5 GT/s; the quirk would be attached to specific endpoint
> > > > > devices and could set the bridge's TLS to whatever the
> > > > > endpoint supports.
> > > > 
> > > > Now I see what you mean. Yes, I agree this is a good idea and
> > > > can simplify code. Quirk is not related to ASPM code and
> > > > basically has nothing with it, just I put it into aspm.c because
> > > > this is the only place where link retraining was activated.
> > > > 
> > > > But with this proposal there is one issue. Some kernel drivers
> > > > already overwrite PCI_EXP_LNKCTL2_TLS value. So if PCI
> > > > enumeration code set some value into PCI_EXP_LNKCTL2_TLS bits
> > > > then drivers can change it and once ASPM will try to retrain
> > > > link this may cause this issue.
> > > 
> > > I guess you mean the amdgpu, radeon, and hfi1 drivers.  They
> > > really shouldn't be mucking with that stuff anyway.  But they do
> > > and are unlikely to change because we don't have any good
> > > alternative.
> > 
> > Yea, these are examples of such drivers... Maybe it is a good idea
> > to ask those people why changing PCI_EXP_LNKCTL2_TLS is needed. As
> > these drivers are often derived from codebase of shared multisystem
> > drivers or from common documentation, it is possible that original
> > source has this code as a workaround or common pattern used in other
> > operating systems, not related to linux...
> > 
> > > One way around that would be to add some quirk code to
> > > pcie_capability_write_word().  Ugly, but we do have something sort
> > > of similar in pcie_capability_read_word() already.
> > 
> > Bjorn, do you really want such ugly hack in
> > pcie_capability_write_word?  It is common code used and called from
> > lot of places so it may affect whole system if in future somebody
> > changes it again...
> 
> I don't know which is uglier, a quirk in pcie_capability_write_word()
> or a quirk in aspm.c that has nothing to do with ASPM.  They're both
> ugly :)

Ok :-)

> FWIW, in pcie_capability_write_word() I would envision not a check for
> Atheros, but rather something like a "dev->max_target_link_speed" that
> could be set by an Atheros quirk.  It does get uglier if we want to
> restrict the bridge's link speed via a quirk, then unrestrict it when
> the endpoint is unplugged.
> 
> I know pcie_downgrade_link_to_gen1() only returns failure for corner
> cases that "should not occur,"

It is not only corner case. It happens _always_ with at least two
pci drivers.

As I wrote in other email due to this issue, some quirk code which
allows / disallows link retraining is required in aspm.c file.

> but I don't like the fact that it's
> possible to change Common Clock Configuration without doing the
> retrain.  That would leave us with incorrect ASPM exit latencies,
> which is really hard to debug.

I see... Any idea how to solve this issue?

> Here's the relevant text in the spec (PCIe r5.0):
> 
>   7.5.3.6 Link Capabilities
> 
>     L0s Exit Latency - This field indicates the L0s exit latency for
>     the given PCI Express Link. The value reported indicates the
>     length of time this Port requires to complete transition from L0s
>     to L0. ...
> 
>     Note that exit latencies may be influenced by PCI Express
>     reference clock configuration depending upon whether a component
>     uses a common or separate reference clock.
> 
>   7.5.3.6 Link Control
>     Common Clock Configuration - When Set, this bit indicates that
>     this component and the component at the opposite end of this Link
>     are operating with a distributed common reference clock. ...
> 
>     After changing the value in this bit in both components on a Link,
>     software must trigger the Link to retrain by writing a 1b to the
>     Retrain Link bit of the Downstream Port.

  reply	other threads:[~2021-06-26 14:38 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-26 12:43 [PATCH] PCI: Disallow retraining link for Atheros QCA98xx chips on non-Gen1 PCIe bridges Pali Rohár
2021-03-26 18:13 ` Toke Høiland-Jørgensen
2021-03-27  0:14 ` Krzysztof Wilczyński
2021-03-27 13:29   ` Pali Rohár
2021-03-27 14:42     ` Marek Behún
2021-03-27 17:33       ` Pali Rohár
2021-04-27 10:55 ` [PATCH v2] PCI: Disallow retraining link for Atheros " Pali Rohár
2021-04-30 11:41   ` Pali Rohár
2021-05-05 16:33 ` [PATCH v3] " Pali Rohár
2021-05-11 20:39   ` Pali Rohár
2021-05-28  0:08     ` Pali Rohár
2021-06-01 11:28   ` Krzysztof Wilczyński
2021-06-01 20:05   ` Bjorn Helgaas
2021-06-01 21:18     ` Pali Rohár
2021-06-02  0:00       ` Bjorn Helgaas
2021-06-02 12:08         ` Pali Rohár
2021-06-02 15:55           ` Bjorn Helgaas
2021-06-02 19:03             ` Pali Rohár
2021-06-16 21:38               ` Bjorn Helgaas
2021-06-21 14:28                 ` Pali Rohár
2021-06-25 20:19                   ` Bjorn Helgaas
2021-06-26 14:38                     ` Pali Rohár [this message]
2021-06-21 14:39               ` Pali Rohár
2021-10-05 19:43   ` Jannis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210626143816.2p4qwwzuxfeys2y2@pali \
    --to=pali@kernel.org \
    --cc=ath10k@lists.infradead.org \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=kabel@kernel.org \
    --cc=kvalo@codeaurora.org \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=robh@kernel.org \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=toke@redhat.com \
    --cc=vtolkm@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).