linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: vtolkm@gmail.com, "Bjorn Helgaas" <helgaas@kernel.org>,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	"Rob Herring" <robh@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Marek Behún" <marek.behun@nic.cz>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>
Subject: Re: PCI trouble on mvebu (Turris Omnia)
Date: Fri, 26 Mar 2021 13:50:28 +0100	[thread overview]
Message-ID: <20210326125028.tyqkcc5fvaqbwqkn@pali> (raw)
In-Reply-To: <20210318231629.vhix2cqpt25bgrne@pali>

On Friday 19 March 2021 00:16:29 Pali Rohár wrote:
> On Thursday 18 March 2021 23:43:58 Toke Høiland-Jørgensen wrote:
> > Pali Rohár <pali@kernel.org> writes:
> > 
> > > On Monday 15 March 2021 20:58:06 Pali Rohár wrote:
> > >> On Monday 02 November 2020 16:54:35 Toke Høiland-Jørgensen wrote:
> > >> > Pali Rohár <pali@kernel.org> writes:
> > >> > 
> > >> > > On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
> > >> > >> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:
> > >> > >> 
> > >> > >> > On 30/10/2020 15:23, Pali Rohár wrote:
> > >> > >> >> On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
> > >> > >> >>> Pali Rohár <pali@kernel.org> writes:
> > >> > >> >>>> My experience with that WLE900VX card, aardvark driver and aspm code:
> > >> > >> >>>>
> > >> > >> >>>> Link training in GEN2 mode for this card succeed only once after reset.
> > >> > >> >>>> Repeated link retraining fails and it fails even when aardvark is
> > >> > >> >>>> reconfigured to GEN1 mode. Reset via PERST# signal is required to have
> > >> > >> >>>> working link training.
> > >> > >> >>>>
> > >> > >> >>>> What I did in aardvark driver: Set mode to GEN2, do link training. If
> > >> > >> >>>> success read "negotiated link speed" from "Link Control Status Register"
> > >> > >> >>>> (for WLE900VX it is 0x1 - GEN1) and set it into aardvark. And then
> > >> > >> >>>> retrain link again (for WLE900VX now it would be at GEN1). After that
> > >> > >> >>>> card is stable and all future retraining (e.g. from aspm.c) also passes.
> > >> > >> >>>>
> > >> > >> >>>> If I do not change aardvark mode from GEN2 to GEN1 the second link
> > >> > >> >>>> training fails. And if I change mode to GEN1 after this failed link
> > >> > >> >>>> training then nothing happen, link training do not success.
> > >> > >> >>>>
> > >> > >> >>>> So just speculation now... In current setup initialization of card does
> > >> > >> >>>> one link training at GEN2. Then aspm.c is called which is doing second
> > >> > >> >>>> link retraining at GEN2. And if it fails then below patch issue third
> > >> > >> >>>> link retraining at GEN1. If A38x/pci-mvebu has same problem as aardvark
> > >> > >> >>>> then second link retraining must be at GEN1 (not GEN2) to workaround
> > >> > >> >>>> this issue.
> > >> > >> >>>>
> > >> > >> >>>> Bjorn, Toke: what about trying to hack aspm.c code to never do link
> > >> > >> >>>> retraining at GEN2 speed? And always force GEN1 speed prior link
> > >> > >> >>>> training?
> > >> > >> >>> Sounds like a plan. I poked around in aspm.c and must confess to being a
> > >> > >> >>> bit lost in the soup of registers ;)
> > >> > >> >>>
> > >> > >> >>> So if one of you can cook up a patch, that would be most helpful!
> > >> > >> >> I modified Bjorn's patch, explicitly set tls to 1 and added debug info
> > >> > >> >> about cls (current link speed, that what is used by aardvark). It is
> > >> > >> >> untested, I just tried to compile it.
> > >> > >> >>
> > >> > >> >> Can try it?
> > >> > >> >>
> > >> > >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > >> > >> >> index 253c30cc1967..f934c0b52f41 100644
> > >> > >> >> --- a/drivers/pci/pcie/aspm.c
> > >> > >> >> +++ b/drivers/pci/pcie/aspm.c
> > >> > >> >> @@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
> > >> > >> >>   	unsigned long end_jiffies;
> > >> > >> >>   	u16 reg16;
> > >> > >> >>   
> > >> > >> >> +	u32 lnkcap2;
> > >> > >> >> +	u16 lnksta, lnkctl2, cls, tls;
> > >> > >> >> +
> > >> > >> >> +	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
> > >> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
> > >> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
> > >> > >> >> +	cls = lnksta & PCI_EXP_LNKSTA_CLS;
> > >> > >> >> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
> > >> > >> >> +
> > >> > >> >> +	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
> > >> > >> >> +		lnkcap2, (lnkcap2 & 0x3F) >> 1,
> > >> > >> >> +		lnksta, cls,
> > >> > >> >> +		lnkctl2, tls);
> > >> > >> >> +
> > >> > >> >> +	tls = 1;
> > >> > >> >> +	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
> > >> > >> >> +					PCI_EXP_LNKCTL2_TLS, tls);
> > >> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
> > >> > >> >> +	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
> > >> > >> >> +		lnkctl2, tls);
> > >> > >> >> +
> > >> > >> >>   	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
> > >> > >> >>   	reg16 |= PCI_EXP_LNKCTL_RL;
> > >> > >> >>   	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> > >> > >> >> @@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
> > >> > >> >>   			break;
> > >> > >> >>   		msleep(1);
> > >> > >> >>   	} while (time_before(jiffies, end_jiffies));
> > >> > >> >> +	pci_info(parent, "lnksta %#06x new cls %#03x\n",
> > >> > >> >> +		lnksta, (cls & PCI_EXP_LNKSTA_CLS));
> > >> > >> >>   	return !(reg16 & PCI_EXP_LNKSTA_LT);
> > >> > >> >>   }
> > >> > >> >>   
> > >> > >> >
> > >> > >> > Still exhibiting the BAR update error, run tested with next--20201030
> > >> > >> 
> > >> > >> Yup, same for me :(
> > >> 
> > >> I'm answering my own question. This code does not work on Omnia because
> > >> A38x pci-mvebu.c driver is using emulator for PCIe root bridge and it
> > >> does not implement PCI_EXP_LNKCTL2 and PCI_EXP_LNKCTL2 registers. So
> > >> code for forcing link speed has no effect on Omnia...
> > >
> > > Toke, on A38x PCIe controller it is possible to access PCI_EXP_LNKCTL2
> > > register. Just access is not exported via emulated root bridge.
> > >
> > > Documentation for this PCIe controller is public, so anybody can look at
> > > register description. See page 571, A.7 PCI Express 2.0 Port 0 Registers
> > >
> > > http://web.archive.org/web/20200420191927/https://www.marvell.com/content/dam/marvell/en/public-collateral/embedded-processors/marvell-embedded-processors-armada-38x-functional-specifications-2015-11.pdf
> > >
> > > In drivers/pci/controller/pci-mvebu.c you can set a new value for this
> > > register via function call:
> > >
> > >     mvebu_writel(port, val, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > >
> > > So, could you try to set PCI_EXP_LNKCTL2_TLS bits to gen1 in some hw
> > > init function, e.g. mvebu_pcie_setup_hw()?
> > >
> > >     u32 val = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > >     val &= ~PCI_EXP_LNKCTL2_TLS;
> > >     val |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> > >     mvebu_writel(port, val, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > 
> > I pasted this into the top of mvebu_pcie_setup_hw(), and that indeed
> > seems to fix things so that all three PCIE devices work even with ASPM
> > turned on! :)
> 
> Perfect! Now I'm sure that it is same issue as in aardvark driver.
> 
> I will prepare patches for both pci-aardvark.c and pci-mvebu.c to export
> PCI_EXP_LNKCTL2 register via emulated bridge. And so aspm.c code would
> be able to use Bjorn or my patch which I have sent last year.
> 
> Question reminds, if this is issue with QCA wifi chip on that Compex
> card or it is issue with PCIe controllers, now on A38x and A3720 SoC.
> Note that both A38x and A3720 platforms are from Marvell, but they have
> different PCIe controllers (so it does not mean that both must have same
> hw bugs).

Seems that this is really issue in QCA98xx chips. I have send patch
which adds quirk for these wifi chips:

https://lore.kernel.org/linux-pci/20210326124326.21163-1-pali@kernel.org/

> > Do you still need me to test the card on a different machine? Not sure I
> > have an x86 machine with a mini-PCIe slot handy, but I can go hunting if
> > needed...
> 
> Yes, now it is needed to know if we can find any other PCIe controller
> in which this card does not work when ASPM is enabled and above "retrain
> link" kernel code is executed.
> 
> It does not have to be x86. But due to how UEFI, ACPI and other
> firmwares touches PCIe, there is a high chance that on some x86 machine
> this bug can appear too. More firmwares = more problems.
> 
> On arm platforms with native controller drivers there does not have to
> be any firmware (like in Marvell case) so only kernel touches PCIe HW at
> the same time.
> 
> Note that on x86, ASPM may be disabled (if firmware indicates it), so
> command line argument like "pcie_aspm=force" is probably required for
> tests.

  reply	other threads:[~2021-03-26 12:51 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-27 15:43 Toke Høiland-Jørgensen
2020-10-27 17:20 ` Bjorn Helgaas
2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 18:59     ` Toke Høiland-Jørgensen
2020-10-27 20:20       ` Toke Høiland-Jørgensen
2020-10-27 21:22         ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 21:31           ` Toke Høiland-Jørgensen
2020-10-27 22:01             ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 22:12               ` Toke Høiland-Jørgensen
2020-10-27 18:56   ` Toke Høiland-Jørgensen
2020-10-28 13:36     ` Toke Høiland-Jørgensen
2020-10-28 14:42       ` Bjorn Helgaas
2020-10-28 15:08         ` Toke Høiland-Jørgensen
2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-28 23:16             ` Bjorn Helgaas
2020-10-29 10:09               ` Pali Rohár
2020-10-29 10:56                 ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-29 11:12                 ` Toke Høiland-Jørgensen
2020-10-29 19:30                   ` Bjorn Helgaas
2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-29 19:57                     ` Andrew Lunn
2020-10-29 21:55                       ` Thomas Petazzoni
2020-10-29 20:18                     ` Toke Høiland-Jørgensen
2020-10-29 22:09                       ` Toke Høiland-Jørgensen
2020-10-29 20:58                     ` Marek Behun
2020-10-30 10:08                       ` Pali Rohár
2020-10-30 10:45                         ` Marek Behun
2020-10-29 21:54                     ` Thomas Petazzoni
2020-10-29 23:15                       ` Toke Høiland-Jørgensen
2020-10-30  8:23                         ` Thomas Petazzoni
2020-10-30 10:15                         ` Pali Rohár
2020-10-29 10:41               ` Toke Høiland-Jørgensen
2020-10-29 11:18                 ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-30 11:23               ` Pali Rohár
2020-10-30 13:02                 ` Toke Høiland-Jørgensen
2020-10-30 14:23                   ` Pali Rohár
2020-10-30 14:54                     ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-31 12:49                       ` Toke Høiland-Jørgensen
2020-11-02 15:24                         ` Pali Rohár
2020-11-02 15:54                           ` Toke Høiland-Jørgensen
2020-11-02 16:18                             ` ™֟☻̭҇ Ѽ ҉ ®
2020-11-02 16:33                               ` Toke Høiland-Jørgensen
2021-03-15 19:58                             ` Pali Rohár
2021-03-16  9:25                               ` Pali Rohár
2021-03-18 22:43                                 ` Toke Høiland-Jørgensen
2021-03-18 23:16                                   ` Pali Rohár
2021-03-26 12:50                                     ` Pali Rohár [this message]
2021-03-26 15:25                                       ` Toke Høiland-Jørgensen
2021-03-26 15:34                                         ` Pali Rohár
2021-03-26 16:54                                           ` Toke Høiland-Jørgensen
2021-03-26 17:11                                             ` Pali Rohár
2021-03-26 17:51                                               ` Toke Høiland-Jørgensen
2021-03-29 17:09                                                 ` Pali Rohár
2021-03-31 14:02                                                   ` Toke Høiland-Jørgensen
2021-03-31 16:15                                                     ` Pali Rohár
2021-03-31 16:53                                                       ` Toke Høiland-Jørgensen
2020-10-29  1:21             ` Marek Behun
2020-10-29 15:12           ` Rob Herring
2020-10-27 18:03 ` Marek Behun
2020-10-27 19:00   ` Toke Høiland-Jørgensen
2020-10-27 20:19     ` Marek Behun
2020-10-27 20:49       ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210326125028.tyqkcc5fvaqbwqkn@pali \
    --to=pali@kernel.org \
    --cc=helgaas@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=marek.behun@nic.cz \
    --cc=robh@kernel.org \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=toke@redhat.com \
    --cc=vtolkm@gmail.com \
    --subject='Re: PCI trouble on mvebu (Turris Omnia)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).