From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from m43-15.mailgun.net ([69.72.43.15])
	by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux))
	id 1kn1jn-0007U8-EW
	for ath11k@lists.infradead.org; Wed, 09 Dec 2020 15:50:44 +0000
From: Kalle Valo
Subject: Re: ath11k: QCA6390 on Dell XPS 13 and kernel crashes
References: <87r1nz9emq.fsf@codeaurora.org>
Date: Wed, 09 Dec 2020 17:50:35 +0200
In-Reply-To: (wi nk's message of "Wed, 9 Dec 2020 16:39:16 +0100")
Message-ID: <87mtyn9dx0.fsf@codeaurora.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "ath11k"
Errors-To: ath11k-bounces+kvalo=adurom.com@lists.infradead.org
To: wi nk
Cc: "ath11k@lists.infradead.org", Mitchell Nordine

wi nk writes:

> On Wed, Dec 9, 2020 at 4:35 PM Kalle Valo wrote:
>>
>> wi nk writes:
>>
>> > So I've managed to stabilise my system now, so either the race is
>> > gone, or I've done something to win it all the time.  So one of the
>> > avenues of racing I was chasing at first was in the ath11k driver
>> > itself.  There are a couple areas where the single/shared IRQ is being
>> > forcibly toggled in ways that the documentation says are not great
>> > (and the original patch was trying to avoid).  Fixing those didn't
>> > seem to have much impact on the stability of things (I've included
>> > those changes in my patch though).  After the last email I was
>> > thinking about the MHI side of things a bit more and found a number of
>> > call sites that my naive grepping had missed that do the same thing,
>> > but via acquiring a lock at the same time.  I modified all the calls
>> > to *_lock_irq and *_unlock_irq to the lock/unlock - save/restore
>> > variants that accept the flags parameter to capture state.  I've now
>> > booted and loaded the driver 10+ times without a single freeze or
>> > crash.
>> > I'm not sure all of those modifications are necessary (i.e.:
>> > which things are re-entrant in this single interrupt operating mode vs
>> > which ones can use the simpler lock/unlock mechanisms), so I could use
>> > some advice/guidance there.
>> >
>> > Mitchell - if you want to grab this patch and try it, let me know how
>> > it goes and I can clean it up for the mailing list:
>> > https://github.com/w1nk/ath11k-debug/blob/master/one-irq-manage.patch
>> > (apply to ath11k-qca6390-bringup-202011301608)
>>
>> Wink, I want to ask more about the very interesting
>> one-irq-manage.patch you wrote. Have you seen the "sched: RT throttling
>> activated" crash with that patch? If yes, how many times, for example 5
>> out of 10 times or something like that?
>>
>> Or is it that with one-irq-manage.patch the kernel doesn't crash at all? I
>> didn't quite understand the situation.
>>
>> --
>> https://patchwork.kernel.org/project/linux-wireless/list/
>>
>> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>
> Kalle,
>
> Sorry for moving the thread :).

No problem, I'll just ask extra questions to make sure that I'm
understanding things correctly :)

> So I've attempted 2 patches that seem to produce varying degrees of
> success.  The single IRQ patch took the crashing behaviour from hard
> locking immediately, to that stuttering / RT throttling message
> consistently.  So instead of hard locking 9/10 times and stuttering
> 1/10, it was inverted.

Ok, got it now.

> The second patch disabling the m2 transition (even without the single
> IRQ patch) seems to have resolved the issues altogether, but at the
> expense of disabling this m2 state, which I don't have much idea of
> the consequences..

Sorry, I missed that. What second patch are you talking about?

Also can you share your /proc/interrupts in full?

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

--
ath11k mailing list
ath11k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath11k