All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Lorenzo Pieralisi" <lorenzo.pieralisi@arm.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Andrew Murray" <amurray@thegoodpenguin.co.uk>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Marek Behún" <marek.behun@nic.cz>,
	"Remi Pommarel" <repk@triplefau.lt>,
	"Tomasz Maciej Nowak" <tmn505@gmail.com>,
	Xogium <contact@xogium.me>,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
Date: Fri, 10 Jul 2020 21:30:03 +0200	[thread overview]
Message-ID: <20200710193003.2lt3i5ocy5kk3b3p@pali> (raw)
In-Reply-To: <20200710160828.GA63389@bjorn-Precision-5520>

On Friday 10 July 2020 11:08:28 Bjorn Helgaas wrote:
> On Fri, Jul 10, 2020 at 05:44:58PM +0200, Pali Rohár wrote:
> > I can reproduce following issue: Connect Compex WLE900VX card, configure
> > aardvark to gen2 mode. And then card is detected only after the first
> > link training. If kernel tries to retrain link again (e.g. via ASPM
> > code) then card is not detected anymore. 
> 
> Somebody should go over the ASPM retrain link code and the PCIe spec
> with a fine-toothed comb.  Maybe we're doing something wrong there.

I think this is not ASPM related as card simply disappear just after
flipping PCI_EXP_LNKCTL_RL bit second time without changing ASPM bits.

> Or maybe aardvark has some hardware issue and we need some sort of
> quirk to work around it.

It is possible that this is aardvark issue. As I said I really do not
know.

In aardvark driver there is already merged workaround for this issue:
driver force gen1 aardvark mode for gen1 card.

> > Another issue which happens for WLE900VX, WLE600VX and WLE1216VS-20 (but
> > not for WLE200VX): Linux kernel can detect these cards only if it issues
> > card reset via PERST# signal and start link training (via standard pcie
> > endpoint register PCI_EXP_LNKCTL/PCI_EXP_LNKCTL_RL)
> 
> I think you mean "downstream port" (not "endpoint") register?

Yes.

> PCI_EXP_LNKCTL_RL is only applicable to *downstream ports* (root ports
> or switch downstream ports) and is reserved for endpoints.
> 
> > immediately after
> > enable link training in aardvark (via aardvark specific LINK_TRAINING_EN
> > bit). If there is e.g. 100ms delay between enabling link training and
> > setting PCI_EXP_LNKCTL_RL bit then these cards are not detected.
> 
> This sounds problematic.  Hardware should not be dependent on the
> software being "fast enough".  In general we should be able to insert
> arbitrary delays at any point without breaking anything.

Yes, it is problematic. For example following commit broke those cards:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f4c7d053d7f77cd5c1a1ba7c7ce085ddba13d1d7

And this commit fixed it (just msleep was moved to different stage):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6964494582f56a3882c2c53b0edbfe99eb32b2e1

But we somehow need to deal with it until we find root cause.

Basically additional sleep in aardvark init phase can break WLE900VX
cards, but not WLE200VX.

And because WLE900VX works fine with pci-mvebu and WLE200VX works fine
with pci-aardvark we cannot deduce from it if problem for combination of
WLE900VX and aardvark is in WLE900VX or in aardvark.

> But I have the impression that aardvark requires more software
> hand-holding that most hardware does.  If it imposes timing
> requirements on the software, that *should* be documented in the
> aardvark spec.

There is absolutely nothing regarding to timings in documentation which
I saw. In documentation are just instructions/steps how to init PCI
subsystem and it is basically advk_pcie_setup_hw() function.

> > I read in kernel bugzilla that WLE600VX and WLE900VX cards are buggy and
> > more people have problems with them. But issues described in kernel
> > bugzilla (like card is reporting incorrect PCI device id) I'm not
> > observing.
> 
> Pointer?

Hm... I cannot find right now pointer to bugzilla, but I have pointer to
ath9k-devel mailing list with that incorrect device id:

https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html

> Is the incorrect device ID 0xffff?

No, incorrect device ID in that case is 0xabcd and vendor ID is correct
(Qualcomm).

> That could be a symptom
> of a PCIe error.  If we read a device ID that's something other than
> 0, 0xffff, or the correct ID, that would be really weird.  Even 0
> would be really strange.

It is strange and also reason why discussion on that list is long.

As I said, I'm not seeing that problem with wrong device ID.

But I know people who are observing same problem on different boards
(which do not use aardvark) as described in above mailing list thread
with Compex ath10k cards.

> I suspect these wifi cards are a little special because they probably
> play unusual games with power for airplane mode and the like.

This is another/different problem and is already "documented" in kernel
bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=84821#c52

WARNING: multiple messages have this Message-ID
From: "Pali Rohár" <pali@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Tomasz Maciej Nowak" <tmn505@gmail.com>,
	"Lorenzo Pieralisi" <lorenzo.pieralisi@arm.com>,
	linux-pci@vger.kernel.org, Xogium <contact@xogium.me>,
	linux-kernel@vger.kernel.org, "Marek Behún" <marek.behun@nic.cz>,
	"Remi Pommarel" <repk@triplefau.lt>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	linux-arm-kernel@lists.infradead.org,
	"Andrew Murray" <amurray@thegoodpenguin.co.uk>
Subject: Re: [PATCH v3] PCI: aardvark: Don't touch PCIe registers if no card connected
Date: Fri, 10 Jul 2020 21:30:03 +0200	[thread overview]
Message-ID: <20200710193003.2lt3i5ocy5kk3b3p@pali> (raw)
In-Reply-To: <20200710160828.GA63389@bjorn-Precision-5520>

On Friday 10 July 2020 11:08:28 Bjorn Helgaas wrote:
> On Fri, Jul 10, 2020 at 05:44:58PM +0200, Pali Rohár wrote:
> > I can reproduce following issue: Connect Compex WLE900VX card, configure
> > aardvark to gen2 mode. And then card is detected only after the first
> > link training. If kernel tries to retrain link again (e.g. via ASPM
> > code) then card is not detected anymore. 
> 
> Somebody should go over the ASPM retrain link code and the PCIe spec
> with a fine-toothed comb.  Maybe we're doing something wrong there.

I think this is not ASPM related as card simply disappear just after
flipping PCI_EXP_LNKCTL_RL bit second time without changing ASPM bits.

> Or maybe aardvark has some hardware issue and we need some sort of
> quirk to work around it.

It is possible that this is aardvark issue. As I said I really do not
know.

In aardvark driver there is already merged workaround for this issue:
driver force gen1 aardvark mode for gen1 card.

> > Another issue which happens for WLE900VX, WLE600VX and WLE1216VS-20 (but
> > not for WLE200VX): Linux kernel can detect these cards only if it issues
> > card reset via PERST# signal and start link training (via standard pcie
> > endpoint register PCI_EXP_LNKCTL/PCI_EXP_LNKCTL_RL)
> 
> I think you mean "downstream port" (not "endpoint") register?

Yes.

> PCI_EXP_LNKCTL_RL is only applicable to *downstream ports* (root ports
> or switch downstream ports) and is reserved for endpoints.
> 
> > immediately after
> > enable link training in aardvark (via aardvark specific LINK_TRAINING_EN
> > bit). If there is e.g. 100ms delay between enabling link training and
> > setting PCI_EXP_LNKCTL_RL bit then these cards are not detected.
> 
> This sounds problematic.  Hardware should not be dependent on the
> software being "fast enough".  In general we should be able to insert
> arbitrary delays at any point without breaking anything.

Yes, it is problematic. For example following commit broke those cards:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f4c7d053d7f77cd5c1a1ba7c7ce085ddba13d1d7

And this commit fixed it (just msleep was moved to different stage):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6964494582f56a3882c2c53b0edbfe99eb32b2e1

But we somehow need to deal with it until we find root cause.

Basically additional sleep in aardvark init phase can break WLE900VX
cards, but not WLE200VX.

And because WLE900VX works fine with pci-mvebu and WLE200VX works fine
with pci-aardvark we cannot deduce from it if problem for combination of
WLE900VX and aardvark is in WLE900VX or in aardvark.

> But I have the impression that aardvark requires more software
> hand-holding that most hardware does.  If it imposes timing
> requirements on the software, that *should* be documented in the
> aardvark spec.

There is absolutely nothing regarding to timings in documentation which
I saw. In documentation are just instructions/steps how to init PCI
subsystem and it is basically advk_pcie_setup_hw() function.

> > I read in kernel bugzilla that WLE600VX and WLE900VX cards are buggy and
> > more people have problems with them. But issues described in kernel
> > bugzilla (like card is reporting incorrect PCI device id) I'm not
> > observing.
> 
> Pointer?

Hm... I cannot find right now pointer to bugzilla, but I have pointer to
ath9k-devel mailing list with that incorrect device id:

https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html

> Is the incorrect device ID 0xffff?

No, incorrect device ID in that case is 0xabcd and vendor ID is correct
(Qualcomm).

> That could be a symptom
> of a PCIe error.  If we read a device ID that's something other than
> 0, 0xffff, or the correct ID, that would be really weird.  Even 0
> would be really strange.

It is strange and also reason why discussion on that list is long.

As I said, I'm not seeing that problem with wrong device ID.

But I know people who are observing same problem on different boards
(which do not use aardvark) as described in above mailing list thread
with Compex ath10k cards.

> I suspect these wifi cards are a little special because they probably
> play unusual games with power for airplane mode and the like.

This is another/different problem and is already "documented" in kernel
bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=84821#c52

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-07-10 19:30 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-28 14:31 [PATCH] " Pali Rohár
2020-05-28 14:31 ` Pali Rohár
2020-05-28 16:26 ` Bjorn Helgaas
2020-05-28 16:26   ` Bjorn Helgaas
2020-05-28 16:38   ` Pali Rohár
2020-05-28 16:38     ` Pali Rohár
2020-05-28 16:49     ` Bjorn Helgaas
2020-05-28 16:49       ` Bjorn Helgaas
2020-05-29  8:30       ` Pali Rohár
2020-05-29  8:30         ` Pali Rohár
2020-06-30 12:31         ` Pali Rohár
2020-06-30 12:31           ` Pali Rohár
2020-06-30 13:51     ` Bjorn Helgaas
2020-06-30 13:51       ` Bjorn Helgaas
2020-06-30 14:04       ` Pali Rohár
2020-06-30 14:04         ` Pali Rohár
2020-06-30 14:58         ` Bjorn Helgaas
2020-06-30 14:58           ` Bjorn Helgaas
2020-07-01  8:08           ` Pali Rohár
2020-07-01  8:08             ` Pali Rohár
2020-07-01  8:20 ` [PATCH v2] " Pali Rohár
2020-07-01  8:20   ` Pali Rohár
2020-07-01 21:34   ` Bjorn Helgaas
2020-07-01 21:34     ` Bjorn Helgaas
2020-07-02  8:23     ` Pali Rohár
2020-07-02  8:23       ` Pali Rohár
2020-07-02  8:30 ` [PATCH v3] " Pali Rohár
2020-07-02  8:30   ` Pali Rohár
2020-07-09 11:35   ` Lorenzo Pieralisi
2020-07-09 11:35     ` Lorenzo Pieralisi
2020-07-09 12:22     ` Pali Rohár
2020-07-09 12:22       ` Pali Rohár
2020-07-09 14:47       ` Lorenzo Pieralisi
2020-07-09 14:47         ` Lorenzo Pieralisi
2020-07-09 15:09         ` Pali Rohár
2020-07-09 15:09           ` Pali Rohár
2020-07-10  9:18           ` Lorenzo Pieralisi
2020-07-10  9:18             ` Lorenzo Pieralisi
2020-07-10 15:44             ` Pali Rohár
2020-07-10 15:44               ` Pali Rohár
2020-07-10 16:08               ` Bjorn Helgaas
2020-07-10 16:08                 ` Bjorn Helgaas
2020-07-10 19:30                 ` Pali Rohár [this message]
2020-07-10 19:30                   ` Pali Rohár
2020-07-10 20:08                   ` Bjorn Helgaas
2020-07-10 20:08                     ` Bjorn Helgaas
2020-07-13  8:27             ` Pali Rohár
2020-07-13  8:27               ` Pali Rohár
2020-07-13 11:23               ` Lorenzo Pieralisi
2020-07-13 11:23                 ` Lorenzo Pieralisi
2020-07-13 14:50                 ` Pali Rohár
2020-07-13 14:50                   ` Pali Rohár
2020-07-13 16:41                   ` Lorenzo Pieralisi
2020-07-13 16:41                     ` Lorenzo Pieralisi
2020-07-14  7:38                     ` Pali Rohár
2020-07-14  7:38                       ` Pali Rohár
2020-07-15 12:17                 ` Pali Rohár
2020-07-15 12:17                   ` Pali Rohár
2020-07-15 16:21                   ` Lorenzo Pieralisi
2020-07-15 16:21                     ` Lorenzo Pieralisi
2020-07-21  8:57                     ` Pali Rohár
2020-07-21  8:57                       ` Pali Rohár
2020-07-21 10:48                       ` Lorenzo Pieralisi
2020-07-21 10:48                         ` Lorenzo Pieralisi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200710193003.2lt3i5ocy5kk3b3p@pali \
    --to=pali@kernel.org \
    --cc=amurray@thegoodpenguin.co.uk \
    --cc=bhelgaas@google.com \
    --cc=contact@xogium.me \
    --cc=helgaas@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=marek.behun@nic.cz \
    --cc=repk@triplefau.lt \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=tmn505@gmail.com \
    --subject='Re: [PATCH v3] PCI: aardvark: Don'\''t touch PCIe registers if no card connected' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.