From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nadav Amit
Subject: Re: x86: Question regarding the reset value of LINT0
Date: Thu, 9 Apr 2015 21:49:12 +0300
Message-ID: <8E93BDA9-CCB5-4D7C-9FF0-0CDBCDB78051@gmail.com>
References: <2B474EEE-85C9-47C3-89FF-C56754CFEC0D@gmail.com>
 <55255AF2.2070706@siemens.com>
 <06513D06-1629-4AC0-9014-C6D13C29A1FC@gmail.com>
 <55256004.8030403@siemens.com>
 <55256A89.3030100@siemens.com>
 <06DCB70D-52E7-457B-BEEF-051F20136D7A@gmail.com>
 <5526C4B9.6030101@gmail.com>
In-Reply-To: <5526C4B9.6030101@gmail.com>
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
Content-Type: text/plain; charset=utf-8
To: Avi Kivity
Cc: Bandan Das, Jan Kiszka, Paolo Bonzini, kvm list
Sender: kvm-owner@vger.kernel.org

Avi Kivity wrote:

> On 04/09/2015 09:21 PM, Nadav Amit wrote:
>> Bandan Das wrote:
>>
>>> Nadav Amit writes:
>>>
>>>> Jan Kiszka wrote:
>>>>
>>>>> On 2015-04-08 19:40, Nadav Amit wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>
>>>>>>> On 2015-04-08 18:59, Nadav Amit wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>
>>>>>>>>> On 2015-04-08 18:40, Nadav Amit wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I would appreciate if someone explains the reason for enabling LINT0 during
>>>>>>>>>> APIC reset. This does not correspond with Intel SDM Figure 10-8: “Local
>>>>>>>>>> Vector Table” that says all LVT registers are reset to 0x10000.
>>>>>>>>>>
>>>>>>>>>> In kvm_lapic_reset, I see:
>>>>>>>>>>
>>>>>>>>>>     apic_set_reg(apic, APIC_LVT0,
>>>>>>>>>>                  SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
>>>>>>>>>>
>>>>>>>>>> Which is actually pretty similar to QEMU’s apic_reset_common:
>>>>>>>>>>
>>>>>>>>>>     if (bsp) {
>>>>>>>>>>         /*
>>>>>>>>>>          * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
>>>>>>>>>>          * time typically by BIOS, so PIC interrupt can be delivered to the
>>>>>>>>>>          * processor when local APIC is enabled.
>>>>>>>>>>          */
>>>>>>>>>>         s->lvt[APIC_LVT_LINT0] = 0x700;
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Yet, in both cases, I miss the point - if it is typically done by the BIOS,
>>>>>>>>>> why does QEMU or KVM enable it?
>>>>>>>>>>
>>>>>>>>>> BTW: KVM seems to run fine without it, and I think setting it causes me
>>>>>>>>>> problems in certain cases.
>>>>>>>>> I suspect it has some historic BIOS backgrounds. Already tried to find
>>>>>>>>> more information in the git logs of both code bases? Or something that
>>>>>>>>> indicates of SeaBIOS or BochsBIOS once didn't do this initialization?
>>>>>>>> Thanks. I found no indication of such thing.
>>>>>>>>
>>>>>>>> QEMU’s commit message (0e21e12bb311c4c1095d0269dc2ef81196ccb60a) says:
>>>>>>>>
>>>>>>>>     Don't route PIC interrupts through the local APIC if the local APIC
>>>>>>>>     config says so. By Ari Kivity.
>>>>>>>>
>>>>>>>> Maybe Avi Kivity knows this guy.
>>>>>>> ths? That should have been Thiemo Seufer (IIRC), but he just committed
>>>>>>> the code back then (and is no longer with us, sadly).
>>>>>> Oh… I am sorry - I didn’t know about that.. (I tried to make an unfunny joke
>>>>>> about Avi knowing “Ari”).
>>>>> Ah. No problem. My brain apparently fixed that typo up unnoticed.
>>>>>
>>>>>>> But if that commit went in without any BIOS changes around it, QEMU
>>>>>>> simply had to do the job of the latter to keep things working.
>>>>>> So should I leave it as is? Can I at least disable in KVM during INIT (and
>>>>>> leave it as is for RESET)?
>>>>> No, I don't think there is a need to leave this inaccurate for QEMU if
>>>>> our included BIOS gets it right. I don't know what the backward
>>>>> bug-compatibility of KVM is, though. Maybe you can identify since when
>>>>> our BIOS is fine so that we can discuss time frames.
>>>> I think that it was addressed in commit
>>>> 19c1a7692bf65fc40e56f93ad00cc3eefaad22a4 ("Initialize the LINT LVTs on the
>>>> local APIC of the BSP.”) So it should be included in seabios 0.5.0, which
>>>> means qemu 0.12 - so we are talking about the end of 2009 or start of 2010.
>>> The probability that someone will use a newer version of kernel with something
>>> as old as 0.12 is probably minimal. I think it's ok to change it with a comment
>>> indicating the reason. To be on the safe side, however, a user changeable switch
>>> is something worth considering.
>> I don’t see any existing mechanism for KVM to be aware of its user type and
>> version. I do see another case of KVM hacks that are intended for fixing
>> very old QEMU bugs (see 3a624e29c75 changes in vmx_set_segment, which are
>> from pretty much the same time-frame of the issue I try to fix).
>>
>> Since this is something which would follow around, please advise what would
>> be the format. A new ioctl that would supply the userspace “type” (according
>> to predefined constants) and version?
>
> That would be madness. KVM shouldn't even know that qemu exists, let alone
> track its versions.
>
> Simply add a new toggle KVM_USE_STANDARD_LAPIC_LVT_INIT and document that
> userspace MUST use it. Old userspace won't, and will get the old buggy
> behavior.

I fully agree it would be madness. Yet it appears to be a recurring problem.
Here are similar problems found in a short search:

1. vmx_set_segment setting segment accessed (3a624e29c75)
2. svm_set_cr0 clearing CD and NW (709ddebf81c)
3. Limited number of MTRRs due to SeaBIOS bug (0d234daf7e0a)

Excluding (1), all of the other issues are related to the VM BIOS. Perhaps
KVM should somehow realize which VM BIOS runs? (Yes, it sounds just as bad.)

Nadav
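
For reference, the two LINT0 reset values discussed in this thread can be
written out as a small, self-contained C sketch. It is only an illustration
and not the actual KVM or QEMU code: the macro and function names are
hypothetical, and the boolean parameter merely stands in for the proposed
KVM_USE_STANDARD_LAPIC_LVT_INIT toggle, whose real interface is not defined
anywhere above. The bit positions (mask at bit 16, delivery mode in bits
10:8, ExtINT encoded as 7) follow the LVT layout in the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LVT register fields, per the Intel SDM "Local Vector Table" figure. */
#define LVT_MASKED          (1u << 16)  /* bit 16: interrupt is masked */
#define LVT_DELIVERY_SHIFT  8           /* bits 10:8: delivery mode */
#define LVT_DELIVERY_MASK   (0x7u << LVT_DELIVERY_SHIFT)
#define LVT_MODE_EXTINT     0x7u        /* ExtINT delivery mode encoding */

/* Hypothetical helper: replace the delivery-mode field of an LVT value. */
static inline uint32_t lvt_set_delivery_mode(uint32_t lvt, uint32_t mode)
{
    assert(mode <= 0x7u);               /* the field is only 3 bits wide */
    return (lvt & ~LVT_DELIVERY_MASK) | (mode << LVT_DELIVERY_SHIFT);
}

/*
 * Hypothetical reset policy for LINT0. standard_lvt_init is an assumed
 * stand-in for userspace having opted in to the proposed toggle.
 */
static inline uint32_t lint0_reset_value(bool standard_lvt_init)
{
    if (standard_lvt_init)
        return LVT_MASKED;              /* SDM reset value: masked */

    /* Legacy behavior: unmasked, ExtINT, so PIC interrupts reach the CPU. */
    return lvt_set_delivery_mode(0, LVT_MODE_EXTINT);
}
```

With these definitions, lint0_reset_value(false) gives 0x700, the value both
code snippets quoted above write today, and lint0_reset_value(true) gives
0x10000, the masked reset value the SDM documents.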