On 2022/12/27 21:11, Mikhail Gavrilov wrote:
> On Tue, Dec 27, 2022 at 4:03 PM Qu Wenruo wrote:
>>
>> I have a similar laptop (G14), only GPU is different (RTX3060), and I
>> failed to reproduce this so far...
>>
>> My gcc is only a small version behind (12.2.0).
>>
>> Thus none of the hardware seems suspicious at all...
>>
>> Anyway I have attached my last struggle for the weird problem.
>> For now, I have no idea why this can even happen...
>
> The new Kernel log is attached.
> This time, the main difference was that the file system did not
> immediately switch to readonly.
> The Steam client stopped a couple of times with a write error, but
> after pressing the resume button, it resumed downloading. For the
> third or fourth time refused to download.
>

I'm a total idiot.

The very first dmesg with the call trace already shows that
submit_one_bio() is called from submit_extent_page(), which means the
bio crossed a stripe boundary, and in that case parent_check is not
populated at all.

Since you're using RAID0 on two NVMe drives, this matches the symptom,
while most of the tests done here use a single device (DUP and SINGLE),
and thus hit no stripe boundary cases at all.

(In fact it should still be possible to trigger on SINGLE, but it is
way too hard to hit.)

With the proper root cause found, this version should mostly handle the
regression correctly.

This version should be close to the formal one I'd later send to the
mailing list.

I cannot thank you enough for all the testing you have provided; it not
only pinned down the bug, but also proved I'm a total idiot...

Thanks,
Qu