From: wi nk <wink@technolu.st>
To: Kalle Valo <kvalo@codeaurora.org>
Cc: "ath11k@lists.infradead.org" <ath11k@lists.infradead.org>,
Mitchell Nordine <mail@mitchellnordine.com>
Subject: Re: ath11k: QCA6390 on Dell XPS 13 and kernel crashes
Date: Wed, 9 Dec 2020 16:55:50 +0100
Message-ID: <CAHUdJJWNSKw9aAHQBc8Ftne1J+s5KdVfdMLwgWu+g-ZfeDnitA@mail.gmail.com>
In-Reply-To: <87mtyn9dx0.fsf@codeaurora.org>
On Wed, Dec 9, 2020 at 4:50 PM Kalle Valo <kvalo@codeaurora.org> wrote:
>
> wi nk <wink@technolu.st> writes:
>
> > On Wed, Dec 9, 2020 at 4:35 PM Kalle Valo <kvalo@codeaurora.org> wrote:
> >>
> >> wi nk <wink@technolu.st> writes:
> >>
> >> > So I've managed to stabilise my system now, so either the race is
> >> > gone, or I've done something to win it all the time. So one of the
> >> > avenues of racing I was chasing at first was in the ath11k driver
> >> > itself. There are a couple areas where the single/shared IRQ is being
> >> > forcibly toggled in ways that the documentation says are not great
> >> > (and the original patch was trying to avoid). Fixing those didn't
> >> > seem to have much impact on the stability of things (I've included
> >> > those changes in my patch though). After the last email I was
> >> > thinking about the MHI side of things a bit more and found a number of
> >> > call sites that my naive grepping had missed that do the same thing,
> >> > but via acquiring a lock at the same time. I modified all the calls
> >> > to *_lock_irq and *_unlock_irq to the lock/unlock - save/restore
> >> > variants that accept the flags parameter to capture state. I've now
> >> > booted and loaded the driver 10+ times without a single freeze or
> >> > crash. I'm not sure all of those modifications are necessary (ie:
> >> > which things are re-entrant in this single interrupt operating mode vs
> >> > which ones can use the simpler lock/unlock mechanisms), so I could use
> >> > some advice/guidance there.
> >> >
> >> > Mitchell - if you want to grab this patch and try it, let me know how
> >> > it goes and I can clean it up for the mailing list:
> >> > https://github.com/w1nk/ath11k-debug/blob/master/one-irq-manage.patch
> >> > (apply to ath11k-qca6390-bringup-202011301608)
> >>
> >> Wink, I want to ask more about the very interesting
> >> one-irq-manage.patch you wrote. Have you seen the "sched: RT throttling
> >> activated" crash with that patch? If yes, how many times, for example 5
> >> out of 10 times or something like that?
> >>
> >> Or is it so with one-irq-manage.patch the kernel doesn't crash at all? I
> >> didn't quite understand the situation.
> >>
> >> --
> >> https://patchwork.kernel.org/project/linux-wireless/list/
> >>
> >> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
> >
> > Kalle,
> >
> > Sorry for moving the thread :).
>
> No problem, I'll just make extra questions to make sure that I'm
> understanding things correctly :)
>
> > So I've attempted 2 patches that seem to produce varying degrees of
> > success. The single IRQ patch took the crashing behaviour from hard
> > locking immediately, to that stuttering / RT throttling message
> > consistently. So instead of hard locking 9/10 times and stuttering
> > 1/10, it was inverted.
>
> Ok, got it now.
>
> > The second patch disabling the m2 transition (even without the single
> > IRQ patch) seems to have resolved the issues altogether, but at the
> > expense of disabling this m2 state, which I don't have much idea of
> > the consequences..
>
> Sorry, I have missed that. What second patch are you talking about?
>
> Also can you share your /proc/interrupts in full?
>
> --
> https://patchwork.kernel.org/project/linux-wireless/list/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>
> --
> ath11k mailing list
> ath11k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath11k
Here's /proc/interrupts in full, with the short patch after it:
         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
   0:       7       0       0       0       0       0       0       0  IO-APIC 2-edge timer
   1:       0       0       0       0       0       0       0    2923  IO-APIC 1-edge i8042
   8:       0       0       0       0       0       0       0       0  IO-APIC 8-edge rtc0
   9:       0    9290       0       0       0       0       0       0  IO-APIC 9-fasteoi acpi
  12:       0       0       0       0       0       0      53       0  IO-APIC 12-edge i8042
  14:       0   29816       0       0       0       0       0       0  IO-APIC 14-fasteoi INT34C5:00
  16:       0       0       0       0       0   10376       0       0  IO-APIC 16-fasteoi intel_ish_ipc, i801_smbus, idma64.4
  27:       0       0       0       0       0       0       0       0  IO-APIC 27-fasteoi idma64.0, i2c_designware.0
  31:       0       0       0       0       0       0       0       0  IO-APIC 31-fasteoi idma64.2, i2c_designware.2
  32:       0       0       0       0       0       0       0       0  IO-APIC 32-fasteoi idma64.3, i2c_designware.3
  40:    9681  777197   27906       0       0       0       0       0  IO-APIC 40-fasteoi idma64.1, i2c_designware.1
 120:       0       0       0       0       0       0       0       0  PCI-MSI 114688-edge PCIe PME, pciehp
 121:       0       0       0       0       0       0       0       0  PCI-MSI 118784-edge PCIe PME, pciehp
 122:       0       0       0       0       0       0       0       0  PCI-MSI 458752-edge PCIe PME
 123:       0       0       0       0       0       0       0       0  PCI-MSI 475136-edge PCIe PME
 124:       0       0       1       0       0       0       0       0  PCI-MSI 229376-edge vmd
 125:       0       0       0      27       0       0       0       0  PCI-MSI 229377-edge vmd
 126:       0       0       0       0    4303       0       0       0  PCI-MSI 229378-edge vmd
 127:       0       0       0       0       0    2992       0     434  PCI-MSI 229379-edge vmd
 128:       0       0       0       0       0     593    2504       0  PCI-MSI 229380-edge vmd
 129:       0       0       0       0     699       0    1061    1873  PCI-MSI 229381-edge vmd
 130:    2382     394       0     603       0       0       0       0  PCI-MSI 229382-edge vmd
 131:       0    1670       0     406     646       0       0       0  PCI-MSI 229383-edge vmd
 132:     692       0    2903       0       0       0       0       0  PCI-MSI 229384-edge vmd
 133:       0     518     913    2198       0       0       0       0  PCI-MSI 229385-edge vmd
 134:       0       0       0       0       0       0       0       0  PCI-MSI 229386-edge vmd
 135:       0       0       0       0       0       0       0       0  PCI-MSI 229387-edge vmd
 136:       0       0       0       0       0       0       0       0  PCI-MSI 229388-edge vmd
 137:       0       0       0       0       0       0       0       0  PCI-MSI 229389-edge vmd
 138:       0       0       0       0       0       0       0       0  PCI-MSI 229390-edge vmd
 139:       0       0       0       0       0       0       0       0  PCI-MSI 229391-edge vmd
 140:       0       0       0       0       0       0       0       0  PCI-MSI 229392-edge vmd
 141:       0       0       0       0       0       0       0       0  PCI-MSI 229393-edge vmd
 142:       0       0       0       0       0       0       0       0  PCI-MSI 229394-edge vmd
 143:       0       0       0       0       0       0       0       0  VMD-MSI 124 PCIe PME, aerdrv, pcie-dpc
 144:       0       0       0       0       0       0       1       0  PCI-MSI 212992-edge xhci_hcd
 145:       0       0       0       0       0       0       0      72  PCI-MSI 327680-edge xhci_hcd
 146:       6       0       0       0       0       0       0       0  PCI-MSI 45088768-edge rtsx_pci
 147:       0       0       0       0       0       0       0       0  VMD-MSI 125 nvme0q0
 148:       0       0       0    1859       0       0       0   38399  PCI-MSI 32768-edge i915
 149:       0       0       0       0       0       0       0       0  VMD-MSI 126 nvme0q1
 150:       0       0       0       0       0       0       0       0  VMD-MSI 127 nvme0q2
 151:       0       0       0       0       0       0       0       0  VMD-MSI 128 nvme0q3
 152:       0       0       0       0       0       0       0       0  VMD-MSI 129 nvme0q4
 153:       0       0       0       0       0       0       0       0  VMD-MSI 130 nvme0q5
 154:       0       0       0       0       0       0       0       0  VMD-MSI 131 nvme0q6
 155:       0       0       0       0       0       0       0       0  VMD-MSI 132 nvme0q7
 156:       0       0       0       0       0       0       0       0  VMD-MSI 133 nvme0q8
 157:       0   29816       0       0       0       0       0       0  INT34C5:00 327 DLL0945:00
 158:       0       0       0       0       0       0      48       0  PCI-MSI 360448-edge mei_me
 159:       0       0       0       0       0       0       0    1134  PCI-MSI 514048-edge AudioDSP
 162:       0       0       0  108102       0       0       0       0  PCI-MSI 44564480-edge ce0, ce1, ce2, ce3, ce5, ce7, ce8, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, DP_EXT_IRQ, bhi, mhi, mhi
 NMI:       0       0       0       0       0       0       0       0  Non-maskable interrupts
 LOC:   64516   80387   54151   82574   64663  113373   58033   81555  Local timer interrupts
 SPU:       0       0       0       0       0       0       0       0  Spurious interrupts
 PMI:       0       0       0       0       0       0       0       0  Performance monitoring interrupts
 IWI:       5       2       1     760       1       1       0   16078  IRQ work interrupts
 RTR:       6       0       0       0       0       0       0       0  APIC ICR read retries
 RES:    1834    7304    1432    1807    3015    1552    1417    1498  Rescheduling interrupts
 CAL:   21739   26798   28934   22211   22590   28622   22541   20023  Function call interrupts
 TLB:   51267   49182   59392   48384   46755   56491   48103   46560  TLB shootdowns
 TRM:       2       2       2       2       2       2       2       2  Thermal event interrupts
 THR:       0       0       0       0       0       0       0       0  Threshold APIC interrupts
 DFR:       0       0       0       0       0       0       0       0  Deferred Error APIC interrupts
 MCE:       0       0       0       0       0       0       0       0  Machine check exceptions
 MCP:       3       4       4       4       4       4       4       4  Machine check polls
 ERR:      16
 MIS:       0
 PIN:       0       0       0       0       0       0       0       0  Posted-interrupt notification event
 NPI:       0       0       0       0       0       0       0       0  Nested posted-interrupt event
 PIW:       0       0       0       0       0       0       0       0  Posted-interrupt wakeup event

And the modification that disables the M2 state:

diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
index 3de7b1639ec6..20f670c8b129 100644
--- a/drivers/bus/mhi/core/pm.c
+++ b/drivers/bus/mhi/core/pm.c
@@ -55,12 +55,12 @@ static struct mhi_pm_transitions const dev_state_transitions[] = {
 	},
 	{
 		MHI_PM_M0,
-		MHI_PM_M0 | MHI_PM_M2 | MHI_PM_M3_ENTER |
+		MHI_PM_M0 | MHI_PM_M3_ENTER |
 		MHI_PM_SYS_ERR_DETECT | MHI_PM_SHUTDOWN_PROCESS |
 		MHI_PM_LD_ERR_FATAL_DETECT | MHI_PM_FW_DL_ERR
 	},
 	{
-		MHI_PM_M2,
+		MHI_PM_M0,
 		MHI_PM_M0 | MHI_PM_SYS_ERR_DETECT | MHI_PM_SHUTDOWN_PROCESS |
 		MHI_PM_LD_ERR_FATAL_DETECT
 	},
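
For anyone who wants to reason about what that table edit does without booting a kernel, here is a rough userspace model of it. This is only a sketch under assumptions: the flag values, the struct layout, the reduced two-row tables and the can_transition() helper are all made up for illustration, and I'm assuming the core looks up the row for the current state and then tests the requested state against that row's mask. The real lookup in pm.c may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit flags; the real values live in the MHI core headers. */
enum pm_state {
	PM_M0       = 1u << 0,
	PM_M2       = 1u << 1,
	PM_M3_ENTER = 1u << 2,
	PM_SYS_ERR  = 1u << 3,
};

/* One row per source state: which target states are allowed from it. */
struct pm_transition {
	unsigned int from_state;  /* a single state bit */
	unsigned int to_states;   /* mask of allowed target states */
};

/* Reduced model of the table before the patch: M0 may enter M2. */
static const struct pm_transition stock_table[] = {
	{ PM_M0, PM_M0 | PM_M2 | PM_M3_ENTER | PM_SYS_ERR },
	{ PM_M2, PM_M0 | PM_SYS_ERR },
};

/* Reduced model after the patch: M2 is gone from the M0 mask, and the
 * former M2 row now also claims M0 as its source state. */
static const struct pm_transition patched_table[] = {
	{ PM_M0, PM_M0 | PM_M3_ENTER | PM_SYS_ERR },
	{ PM_M0, PM_M0 | PM_SYS_ERR },
};

/* Assumed lookup: the first row matching the current state decides. */
static bool can_transition(const struct pm_transition *table, size_t n,
			   unsigned int cur, unsigned int next)
{
	assert(table != NULL);
	for (size_t i = 0; i < n; i++) {
		if (table[i].from_state == cur)
			return (table[i].to_states & next) != 0;
	}
	return false; /* no row for this source state */
}
```

In this model, can_transition(stock_table, 2, PM_M0, PM_M2) is true and the same call on patched_table is false, while M0 to M3_ENTER stays allowed in both. The patched table also has no row left for M2 as a source state, so nothing can leave M2, which is fine only as long as nothing can enter it.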