On Wed, Dec 20, 2017 at 11:58:57AM +0800, Dou Liyang wrote:
> Hi Thomas,
>
> At 12/20/2017 08:31 AM, Thomas Gleixner wrote:
> > On Tue, 19 Dec 2017, Alexandru Chirvasitu wrote:
> >
> > > I had never heard of 'bisect' before this casual mention (you might tell
> > > I am a bit out of my depth). I've since applied it to Linus' tree between
> >
> > > bebc608 Linux 4.14 (good)
> > >
> > > and
> > >
> > > 4fbd8d1 Linux 4.15-rc1 (bad)
> >
> > Is Linus' current head 4.15-rc4 bad as well?
> >
>

[...]

Yes. Exactly the same symptoms on

1291a0d5 Linux 4.15-rc4

compiled just now from Linus' tree.

> >
> > Thanks for doing that bisect, but unfortunately this commit cannot be the
> > problematic one. It merely adds a config symbol, but it does not change any
> > code at all. It has no effect whatsoever. So something might have gone
> > wrong in your bisecting.
> >
>
> Agree.

I agree too, I think.. For what it's worth, 0696d05, which is that
commit's parent, *also* exhibits the symptoms (behaves identically).

But then I'm at a bit of a loss as to how to use git bisect to pin
this down. I'll look into this some more; I'm sure I must have
misunderstood something in the documentation.

>
> > I CC'ed Dou Liyang. He has changed the early APIC setup code and there has
> > been an issue reported already. Though I lost track of that. Dou, any
> ^^^^^^^^ Is it this one?
>             https://marc.info/?l=linux-kernel&m=151188084018443
> > pointers?
> >
>
> Not sure, but it seems the APIC failed to start in that 32-bit system.
>
> I will look into it.
>
> Alex,
>
> Could you give me your .config file and the dmesg-log of 4.15.0-rc3?

I am attaching the config file I used, as well as the output of
'journalctl --boot=-1 -k' (boot=-1 because that specific boot fails; I
need to reboot with a working kernel).

Unfortunately, the logs don't show the lockup trace and I don't know
how to get them to: I've run

grep -r -i "lockup"

in /var/log/ with no results.
---

Merging the contents of another exchange spawned from the original
thread:

On Wed, Dec 20, 2017 at 02:12:05AM +0000, Dexuan Cui wrote:
> Hi Alexandru,
> Is there any chance this could be a build issue somehow? :-)
>
> I would suggest you check out 4.15.0-rc3 cleanly, run "make mrproper",
> and generate a proper .config and build & install the kernel, and double
> check the grub.cfg to make sure the correct kernel is used to boot the
> machine.
>

Hmm.. Perhaps, but I do

make clean && make mrproper

after every single compile, so I am quite confident that's not it.

The config file I used to build 4.15.0-rc3 from Linus' tree is
attached. I produced it by taking the current one working for me in
4.14.7 on that same machine and issuing

yes "" | make oldconfig

> I tried the 4.15.0-rc3 kernel in my Linux VM running on Hyper-V, and
> everything worked fine.
> Unluckily I didn't have a bare metal machine to test the kernel.
>
> For Linux VM running on Hyper-V, we did get "spurious APIC interrupt
> through vector " and a patchset, which included the patch you identified
> ("genirq: Add config option for reservation mode"), was made to fix the
> issue. But since you're using a physical machine rather than a VM, I
> suspect it should be a different issue.
>
> When you have the issue with 4.15.0-rc3, can you try if the kernel
> parameter pci=nomsi would help? If it works, we'll have more confidence
> to say there is a kernel bug.
>

It did work! Passing that kernel parameter through the relevant line
of grub.cfg corrects 4.15.0-rc3's misbehaviour (that machine is now
booted into it and logged in with no issue).

> Good luck!
>
> Thanks,
> -- Dexuan
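
For anyone repeating the pci=nomsi test, the grub.cfg edit described
above can be sketched as a small filter. This is an illustration under
assumptions, not the exact edit made in the thread: the function name
add_kernel_param and the sample paths are hypothetical, and it only
handles entries whose kernel line starts with the plain "linux" keyword.
It reads grub.cfg-style text on stdin and writes the result to stdout, so
nothing is modified in place.

```shell
#!/bin/sh
# Sketch: append a kernel parameter to the "linux" lines of grub.cfg-style
# text read from stdin. Function name and sample paths are hypothetical.
add_kernel_param() {
    awk -v p="$1" '
        /^[ \t]*linux[ \t]/ {
            found = 0
            n = split($0, words)          # split on runs of blanks
            for (i = 1; i <= n; i++)
                if (words[i] == p) found = 1
            if (!found) $0 = $0 " " p     # append only when missing
        }
        { print }
    '
}

# Example input resembling part of one menu entry (paths are made up).
sample=$(printf '\tlinux /boot/vmlinuz-4.15.0-rc3 root=/dev/sda1 ro quiet\n\tinitrd /boot/initrd.img-4.15.0-rc3')

printf '%s\n' "$sample" | add_kernel_param pci=nomsi
```

After rebooting, the contents of /proc/cmdline show which parameters the
running kernel was actually started with, which confirms whether
pci=nomsi was picked up. Note that grub.cfg is usually a generated file,
so a hand edit like this may be overwritten the next time it is
regenerated.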