From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail-ej1-x642.google.com ([2a00:1450:4864:20::642])
    by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux))
    id 1kn1Yv-0005og-1k
    for ath11k@lists.infradead.org; Wed, 09 Dec 2020 15:39:30 +0000
Received: by mail-ej1-x642.google.com with SMTP id ce23so2734907ejb.8
    for ; Wed, 09 Dec 2020 07:39:28 -0800 (PST)
MIME-Version: 1.0
References: <87r1nz9emq.fsf@codeaurora.org>
In-Reply-To: <87r1nz9emq.fsf@codeaurora.org>
From: wi nk
Date: Wed, 9 Dec 2020 16:39:16 +0100
Subject: Re: ath11k: QCA6390 on Dell XPS 13 and kernel crashes
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "ath11k"
Errors-To: ath11k-bounces+kvalo=adurom.com@lists.infradead.org
To: Kalle Valo
Cc: "ath11k@lists.infradead.org", Mitchell Nordine

On Wed, Dec 9, 2020 at 4:35 PM Kalle Valo wrote:
>
> wi nk writes:
>
> > So I've managed to stabilise my system now, so either the race is
> > gone, or I've done something to win it all the time. So one of the
> > avenues of racing I was chasing at first was in the ath11k driver
> > itself. There are a couple areas where the single/shared IRQ is being
> > forcibly toggled in ways that the documentation says are not great
> > (and the original patch was trying to avoid). Fixing those didn't
> > seem to have much impact on the stability of things (I've included
> > those changes in my patch though). After the last email I was
> > thinking about the MHI side of things a bit more and found a number of
> > call sites that my naive grepping had missed that do the same thing,
> > but via acquiring a lock at the same time. I modified all the calls
> > to *_lock_irq and *_unlock_irq to the lock/unlock - save/restore
> > variants that accept the flags parameter to capture state. I've now
> > booted and loaded the driver 10+ times without a single freeze or
> > crash.
> > I'm not sure all of those modifications are necessary (ie:
> > which things are re-entrant in this single interrupt operating mode vs
> > which ones can use the simpler lock/unlock mechanisms), so I could use
> > some advice/guidance there.
> >
> > Mitchell - if you want to grab this patch and try it, let me know how
> > it goes and I can clean it up for the mailing list:
> > https://github.com/w1nk/ath11k-debug/blob/master/one-irq-manage.patch
> > (apply to ath11k-qca6390-bringup-202011301608)
>
> Wink, I want to ask more about the very interesting
> one-irq-manage.patch you wrote. Have you seen the "sched: RT throttling
> activated" crash with that patch? If yes, how many times, for example 5
> out of 10 times or something like that?
>
> Or is it so with one-irq-manage.patch the kernel doesn't crash at all? I
> didn't quite understand the situation.
>
> --
> https://patchwork.kernel.org/project/linux-wireless/list/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Kalle,

Sorry for moving the thread :). I've tried 2 patches, with varying
degrees of success. The single IRQ patch changed the crashing behaviour
from hard locking immediately to consistently stuttering with that RT
throttling message. So instead of hard locking 9/10 times and stuttering
1/10, the ratio was inverted. The second patch, which disables the m2
transition, seems to have resolved the issues altogether (even without
the single IRQ patch), but at the expense of disabling this m2 state,
and I don't have much idea of the consequences of that.

-- 
ath11k mailing list
ath11k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath11k
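
For readers following the lock change described in the quoted message
(moving calls from *_lock_irq/*_unlock_irq to the save/restore variants
that take a flags parameter), here is a minimal userspace sketch of why
the two pairs can behave differently. It is not code from
one-irq-manage.patch. It assumes the calls in question belong to the
kernel's spin_lock_irq()/spin_lock_irqsave() family, and it models the
CPU's interrupt-enable state with a single boolean. All model_* names and
the two helper functions are hypothetical, made up for illustration.

```c
#include <stdbool.h>

/* Toy model: one flag stands in for the local CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* Models a *_lock_irq / *_unlock_irq pair: disable on lock,
 * then unconditionally enable on unlock. */
static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Models a *_lock_irqsave / *_unlock_irqrestore pair: remember the
 * previous state on lock and put exactly that state back on unlock. */
static void model_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

static void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Caller already has interrupts disabled, then takes the plain pair.
 * Returns the interrupt state left behind after the unlock. */
static bool irq_state_after_plain_pair(void)
{
    irqs_enabled = false;      /* e.g. already inside an IRQ-off region */
    model_lock_irq();
    model_unlock_irq();        /* turns interrupts back on regardless */
    return irqs_enabled;
}

/* Same situation, but using the save/restore pair. */
static bool irq_state_after_save_restore_pair(void)
{
    bool flags;

    irqs_enabled = false;      /* e.g. already inside an IRQ-off region */
    model_lock_irqsave(&flags);
    model_unlock_irqrestore(flags);  /* restores "disabled" */
    return irqs_enabled;
}
```

In this model, irq_state_after_plain_pair() leaves interrupts enabled even
though the caller had them disabled, while
irq_state_after_save_restore_pair() leaves them disabled. Whether every
converted call site in the driver can actually be reached with interrupts
already off is exactly the open question raised in the quoted message.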