On Fri, Oct 21, 2016 at 11:46:04AM +0200, Auger Eric wrote:
> Hi Marc,
>
> On 21/10/2016 11:40, Marc Zyngier wrote:
> > On 21/10/16 10:05, Auger Eric wrote:
> >> Hi Marc,
> >>
> >> On 21/10/2016 10:45, Marc Zyngier wrote:
> >>> +Robert
> >>>
> >>> On 21/10/16 08:01, Auger Eric wrote:
> >>>> Hi,
> >>>>
> >>>> I am not able to boot 4.9-rc1 as a guest on Cavium ThunderX (dt and acpi
> >>>> mode). Bisecting the guest shows that the problem shows up at
> >>>>
> >>>> 91ef84428a86b75a52e15c6fe4f56b446ba75f93
> >>>> irqchip/gic-v3: Reset BPR during initialization
> >>>>
> >>>> If I remove the write to the ICC_BPR1_EL1 register on guest, the VM boots.
> >>>
> >>> That's very odd. A ICC_BPR1_EL1 access when HCR_EL2.IMO is set only
> >>> affects ICH_VMCR_EL2.VBPR1. It is not trapped, since we don't set
> >>> ICH_HCR_EL2.TALL1. It is a very boring sysreg!
> >>>
> >>> So from a pure architectural point of view, I don't see how this can
> >>> fail. I've just run the same configuration on my Freescale board (GICv3
> >>> as well), and can't see any issue at all.
> >>>
> >>>> Investigating KVM code ...
> >>>
> >>> What is the failure syndrome? Do you see it crashing? Locking up? What
> >>> is the PC at that stage?
> >> No guest crash. the guest just locks up. No traces output.
> >
> > But you're able to kill the guest, right,
> Yes I am
> and the CPU is not going to
> > lalaland. We should be able to put a breakpoint on this instruction
> > using qemu + GDB, and step it to find out what's happening. Or even
> > execute the instruction in isolation with a bunch of printks in the guest.
> Yep I will investigate this afternoon.
>

You might be able to debug faster with kvm-unit-tests. The attached patch
applies to the arm/gic branch of my repo,
https://github.com/rhdrjones/kvm-unit-tests/commits/arm/gic
and reproduces the issue.

Before applying the attached patch, running

 $ arm/run arm/gic.flat

passes.
 PASS: gicv3: ipi: self: Completed in 100 ms

After applying the patch, it times out:

 FAIL: gicv3: ipi: self: Timed-out (5s). ACKS: missing=1 extra=0 unexpected=0

Using the monitor and stopping/starting the vcpu to see what it's doing, I
confirmed that we're just spinning in udelay waiting for the interrupt.
So it appears setting this register to zero disables the vcpu's ability
to receive interrupts? I also read the register before writing it and saw
it was 3. I tried writing 3 instead of 0 to see what would happen, but the
failure persisted. I did read back the register after writing it to confirm
the change took effect.

Thanks,
drew