linux-pci.vger.kernel.org archive mirror
From: wi nk <wink@technolu.st>
To: Thomas Krause <thomaskrause@posteo.de>
Cc: Kalle Valo <kvalo@codeaurora.org>,
	Govind Singh <govinds@codeaurora.org>,
	linux-pci@vger.kernel.org, Stefani Seibold <stefani@seibold.net>,
	linux-wireless@vger.kernel.org, Devin Bayer <dev@doubly.so>,
	Christoph Hellwig <hch@lst.de>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	ath11k@lists.infradead.org, David Woodhouse <dwmw@amazon.co.uk>
Subject: Re: pci_alloc_irq_vectors fails ENOSPC for XPS 13 9310
Date: Sun, 15 Nov 2020 20:55:20 +0100
Message-ID: <CAHUdJJVoi2_BnubtADpdLQoe1xAuHCvkPF-RMX=dnY3nXoTm5g@mail.gmail.com>
In-Reply-To: <e4ba4457-bd08-42fe-ade7-32059367701a@posteo.de>

On Sun, Nov 15, 2020 at 2:30 PM Thomas Krause <thomaskrause@posteo.de> wrote:
>
>
> On 12.11.20 at 16:44, wi nk wrote:
> > On Thu, Nov 12, 2020 at 10:00 AM Kalle Valo <kvalo@codeaurora.org> wrote:
> >> wi nk <wink@technolu.st> writes:
> >>
> >>> On Thu, Nov 12, 2020 at 8:15 AM Kalle Valo <kvalo@codeaurora.org> wrote:
> >>>> Stefani Seibold <stefani@seibold.net> writes:
> >>>>
> >>>>> On Thursday, 12.11.2020 at 02:10 +0100, wi nk wrote:
> >>>>>> I've yet to see any instability after 45 minutes of exercising it, I
> >>>>>> do see a couple of messages that came out of the driver:
> >>>>>>
> >>>>>> [    8.963389] ath11k_pci 0000:55:00.0: Unknown eventid: 0x16005
> >>>>>> [   11.342317] ath11k_pci 0000:55:00.0: Unknown eventid: 0x1d00a
> >>>>>>
> >>>>>> then when it associates:
> >>>>>>
> >>>>>> [   16.718895] wlp85s0: send auth to ec:08:6b:27:01:ea (try 1/3)
> >>>>>> [   16.722636] wlp85s0: authenticated
> >>>>>> [   16.724150] wlp85s0: associate with ec:08:6b:27:01:ea (try 1/3)
> >>>>>> [   16.726486] wlp85s0: RX AssocResp from ec:08:6b:27:01:ea
> >>>>>> (capab=0x411 status=0 aid=8)
> >>>>>> [   16.738443] wlp85s0: associated
> >>>>>> [   16.764966] IPv6: ADDRCONF(NETDEV_CHANGE): wlp85s0: link becomes
> >>>>>> ready
> >>>>>>
> >>>>>> The adapter is achieving around 500 Mbps on my gigabit connection; my
> >>>>>> 2018 MacBook Pro sees around 650, so it's doing pretty well so far.
> >>>>>>
> >>>>>> Stefani - when you applied the patch that Kalle shared, which branch
> >>>>>> did you apply it to?  I applied it to ath11k-qca6390-bringup and when
> >>>>>> I revert 7fef431be9c9 there is a small merge conflict I needed to
> >>>>>> resolve.  I wonder if either the starting branch, or your chosen
> >>>>>> resolution are related to the instability you see (or I'm just lucky
> >>>>>> so far! :)).
> >>>>>>
> >>>>> I used the vanilla kernel tree
> >>>>> https://git.kernel.org/torvalds/t/linux-5.10-rc2.tar.gz. On top of this
> >>>>> I applied the
> >>>>>
> >>>>> RFT-ath11k-pci-support-platforms-with-one-MSI-vector.patch
> >>>>>
> >>>>> and reverted the patch 7fef431be9c9
> >>>> I also did my testing on v5.10-rc2 and I recommend using that as the
> >>>> baseline when debugging these ath11k problems. It helps to compare the
> >>>> results if everyone has the same baseline.
> >>>>
> >>>> --
> >>>> https://patchwork.kernel.org/project/linux-wireless/list/
> >>>>
> >>>> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
> >>> Absolutely, I'll rebuild to 5.10 later today and apply the same series
> >>> of patches and report back.
> >> Great, thanks.
> >>
> >>> I'll also test out the patch on both versions from Carl to fix
> >>> resuming. It stands to reason that we may be seeing another regression
> >>> between Stefani (5.10) and myself (5.9 bringup branch) as I don't see
> >>> any disconnections or instability once the interface is online.
> >> Yeah, there is something strange happening between v5.9 and v5.10 we
> >> have not yet figured out. Most likely it has something to do with memory
> >> allocations and DMA transfers failing, but no clear understanding yet.
> >>
> >> But to keep things simple let's only discuss the MSI problem on this
> >> thread, and discuss the timeouts in another thread:
> >>
> >> http://lists.infradead.org/pipermail/ath11k/2020-November/000641.html
> >>
> >> I'll include you and the other reporters in that thread.
> >>
> >> --
> >> https://patchwork.kernel.org/project/linux-wireless/list/
> >>
> >> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
> > Ok, I've tried a clean checkout of 5.10-rc2 with the one MSI patch
> > applied and 7fef431be9c9 reverted.  I can't get my machine to boot
> > into anything usable with that configuration.  I'm running Ubuntu so
> > it's starting right into X, and sometime between showing the available
> > users and me clicking the icon to log in, the machine freezes.  I can
> > see in the system tray that the wifi adapter is being activated and
> > appears to have associated with an AP, I just can't do much beyond
> > that as the keyboard backlight wakes up, but the caps lock key doesn't
> > work.  I see similar behavior with the 5.9 configuration, but after a
> > reboot or two I win whatever race is occurring.  With 5.10, I tried
> > maybe 10-15 times with 0 success.
>
> I can confirm this behavior on my configuration. I managed to log in once,
> select the Wifi and connect to it. Curiously enough, it seemed to be
> stable long enough to enter the Wifi passphrase. After the connection
> was established, the system hung, and on each attempt to reboot into the
> graphical system it would freeze at some point (sometimes even before
> showing the login screen).
>
> The kernel was based on 5.10-rc2 and on 5.10-rc3 (I saw the same
> behavior with both), with the patch applied, 7fef431be9c9 reverted and
> firmware downloaded and copied to /lib/firmware/ath11k/QCA6390/hw2.0/.
>
>

I did a bit more digging to see if I could find any new information.
I'm not sure I did, but here's what I did and found.  I spent the time
to get a kdump kernel running and enabled, and I was able to trigger a
crash with SysRq-C (both via the keyboard and via
echo c > /proc/sysrq-trigger) and generate a crash dump.  Actually
viewing the dumps at the moment will require reverting a couple of
printk patches so that the crash utility can read the file
(https://github.com/crash-utility/crash/issues/67), but right now
that's not important since the hang never triggers the mechanism.
As reported here and by Mitchell, the adapter will work occasionally,
but more often it will hang the machine (I too tried 5.10-rc3 with no
noticeable difference).  Whatever is causing the system to hang isn't
triggering the kdump kernel to take over and dump the vmcore.  I've
set watchdog=1, nmi_watchdog=1, hung_task_panic=1 and
softlockup_panic=1 to try to convince the kernel to dump its state
when this happens.  I've not been able to make it write a crash dump;
it just sits 'hung'.  One interesting observation that may be related
is that if the lockup occurs during my login, I can actually see the
system grind to a halt over the course of a number of frames (the
rendering of the login animations starts to stutter and get really
slow, then after a few frames everything is frozen).  If something
were spinning on a lock, I'd expect the soft lockup panic to catch it,
but I don't know these mechanisms well.
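
For anyone repeating this, here is a minimal Python sketch for checking
that the lockup-detection settings are actually in effect on the running
kernel.  It assumes the four settings named in this message are exposed
as sysctls under /proc/sys/kernel (they can also be passed as boot
parameters, which this does not inspect).  The helper names read_knobs
and disabled_knobs are made up for illustration.

```python
from pathlib import Path

# Settings named in this message, assumed to live under /proc/sys/kernel.
KNOBS = ("watchdog", "nmi_watchdog", "hung_task_panic", "softlockup_panic")


def read_knobs(base="/proc/sys/kernel", knobs=KNOBS):
    """Return {knob: int value}, with None where the file is missing or unreadable."""
    values = {}
    for name in knobs:
        try:
            values[name] = int(Path(base, name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def disabled_knobs(values):
    """Return the names of knobs that are absent or set to 0."""
    return [name for name, value in values.items() if not value]


if __name__ == "__main__":
    current = read_knobs()
    for name, value in current.items():
        print(f"kernel.{name} = {value}")
    off = disabled_knobs(current)
    if off:
        print("not enabled:", ", ".join(off))
```

A knob reported as None usually means the kernel was built without the
corresponding detector, which would also explain a missing panic.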

The only consistent behavior that I managed to create is this: if the
wifi adapter and machine are in a 'working' state (i.e. I can browse
the internet, etc.) and I issue SysRq-C to crash the kernel, then let
the crash dump write and the machine reboot, the adapter is no longer
seen by the kernel once booted, and there are zero messages in dmesg
that match "ath11k".  The driver shows up in lsmod, but it logs
nothing and it's as if the adapter is completely invisible.  Powering
the machine off and back on puts it back into the
freezing/wifi-working lottery.
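
The "zero messages in dmesg" check can be scripted so it is easy to
repeat after each reboot.  This is only a sketch: driver_lines is a
hypothetical helper, and it assumes dmesg is on the PATH and readable by
the current user (it may need root when kernel.dmesg_restrict=1).

```python
import subprocess


def driver_lines(log_text, needle="ath11k"):
    """Return the kernel log lines that mention the driver name."""
    return [line for line in log_text.splitlines() if needle in line]


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        raise SystemExit("dmesg not found on this system")
    hits = driver_lines(result.stdout)
    print(f"{len(hits)} kernel log lines mention ath11k")
    for line in hits:
        print(line)
```

A count of zero while lsmod still lists the module matches the state
described above.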

