From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Wed, 26 Jan 2011 14:31:37 +0100
Message-ID: <4D402239.7090403@web.de>
References: <4D2C8305.2090609@linux.vnet.ibm.com> <4D2ED260.4010801@redhat.com>
 <4D30A38F.3030002@linux.vnet.ibm.com> <4D3303FD.8020509@redhat.com>
 <4D35030E.4080406@linux.vnet.ibm.com> <4D3554F4.6080405@siemens.com>
 <4D3DC49E.2000100@linux.vnet.ibm.com> <4D3DFE5A.802@web.de>
 <4D3E3FDE.80805@linux.vnet.ibm.com> <4D3E7B13.5000303@web.de>
 <4D3EFF01.9080608@linux.vnet.ibm.com> <4D3FD7ED.2000001@web.de>
 <4D400E1B.2020007@linux.vnet.ibm.com> <4D400EF5.5090105@web.de>
 <4D401CE0.4030909@linux.vnet.ibm.com> <4D401E61.1030408@web.de>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enigD5AB4FBED64DC5DD184B47E8"
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org
To: Stefan Berger
Return-path:
Received: from fmmailgate01.web.de ([217.72.192.221]:48524 "EHLO
 fmmailgate01.web.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752732Ab1AZNbt (ORCPT );
 Wed, 26 Jan 2011 08:31:49 -0500
In-Reply-To: <4D401E61.1030408@web.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigD5AB4FBED64DC5DD184B47E8
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: quoted-printable

On 2011-01-26 14:15, Jan Kiszka wrote:
> On 2011-01-26 14:08, Stefan Berger wrote:
>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>>> your guest is stuck?
>>>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>>> index 3cece05..8f4f94c 100644
>>>>>> --- a/arch/x86/kvm/i8259.c
>>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>  {
>>>>>>      int mask, ret = 1;
>>>>>>      mask = 1<< irq;
>>>>>> -    if (s->elcr& mask)    /* level triggered */
>>>>>> +    if (s->elcr& mask)    /* level triggered */ {
>>>>>>          if (level) {
>>>>>>              ret = !(s->irr& mask);
>>>>>>              s->irr |= mask;
>>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>              s->irr&= ~mask;
>>>>>>              s->last_irr&= ~mask;
>>>>>>          }
>>>>>> -    else    /* edge triggered */
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>>>>>> +    }
>>>>>> +    else    /* edge triggered */ {
>>>>>>          if (level) {
>>>>>>              if ((s->last_irr& mask) == 0) {
>>>>>>                  ret = !(s->irr& mask);
>>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>              s->last_irr |= mask;
>>>>>>          } else
>>>>>>              s->last_irr&= ~mask;
>>>>>> -
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>>>>>> +    }
>>>>>>      return (s->imr& mask) ? -1 : ret;
>>>>>>  }
>>>>>>
>>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>>>>>>
>>>>>>      pic_lock(s);
>>>>>>      if (irq>= 0&& irq< PIC_NUM_PINS) {
>>>>>> +if (irq == 3)
>>>>>> +printk("%s\n", __FUNCTION__);
>>>>>>          ret = pic_set_irq1(&s->pics[irq>> 3], irq& 7, level);
>>>>>>          pic_update_irq(s);
>>>>>>          trace_kvm_pic_set_irq(irq>> 3, irq& 7, s->pics[irq>> 3].elcr,
>>>>>>
>>>>>> While it's still working I see this here with the levels changing
>>>>>> 0-1-0.
>>>>>> Though then it stops and levels are only at '1'.
>>>>>>
>>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>>> [...]
>>>>>>
>>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>>> interrupts don't go through anymore.
>>>>>>
>>>>>> In the device model I see interrupts being raised and cleared.
>>>>>> After the last one was cleared in 'my' device model, only
>>>>>> interrupts are raised. This looks like as if the interrupt handler
>>>>>> in the guest Linux was never run, thus the IRQ is never cleared
>>>>>> and we're stuck.
>>>>>>
>>>>> User space is responsible for both setting and clearing that line.
>>>>> IRQ3 means you are using some serial device model? Then you should
>>>>> check what its state is.
>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>> certainly it was the wrong IRQ for it.
>>>>> Moreover, a complete picture of the kernel/user space interaction
>>>>> should be obtainable by using fstrace for capturing kvm events.
>>>>>
>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>> chance...
>>> I don't know your customizations, so it's hard to tell if that should
>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>> machine. Are you adding an ISA model, or what is your use case?
>>>
>> The use case is to add a TPM device interface.
>>
>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>
>> This one typically is connected to the LPC bus.
>
> I see. Do you also have the xen-free version of it? Maybe there are
> still issues with proper qdev integration etc.
>

Without knowing the hardware spec or what is actually behind set_irq,
this looks at least suspicious:

[...]
    if (off == TPM_REG_INT_STATUS) {
        /* clearing of interrupt flags */
        if ((val & INTERRUPTS_SUPPORTED) &&
            (s->loc[locty].ints & INTERRUPTS_SUPPORTED)) {
            s->set_irq(s->irq_opaque, s->irq, 0);
            s->irq_pending = 0;
        }
        s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);
    } else
[...]

The code does not check whether any ints are left after masking out
those provided in val. Does that device already de-assert the line if
you only clear a single interrupt reason?

BTW, irq_pending looks redundant, at least when using the qemu irq
subsystem.

Jan

--------------enigD5AB4FBED64DC5DD184B47E8
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAk1AIjkACgkQitSsb3rl5xQB0gCcC5pf0BHUeAw588oY+1+5W9BU
4UMAoKi1LjqiVqopDTkOWqnmxsO97eNl
=/lbA
-----END PGP SIGNATURE-----

--------------enigD5AB4FBED64DC5DD184B47E8--
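
To make the concern about the TPM_REG_INT_STATUS write concrete, here is a
minimal, self-contained sketch of the alternative ordering: clear the
acknowledged reasons first, then lower the line only if nothing is left
pending. Everything named toy_* and the TOY_INTERRUPTS_SUPPORTED value are
hypothetical stand-ins, not taken from tpm_tis.c. The sketch also assumes
the line should stay asserted while any reason is pending. Whether the
real device behaves that way is exactly the open question in the mail
above, so this is an illustration and not a proposed patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask of implemented interrupt reasons (not the real value). */
#define TOY_INTERRUPTS_SUPPORTED 0x0fu

/* Toy device state: pending reasons plus the current level of the IRQ line. */
struct toy_tis {
    uint32_t ints;  /* bitmask of pending interrupt reasons */
    int line;       /* 1 = asserted, 0 = de-asserted */
};

/* Stand-in for the set_irq callback used by the device model. */
static void toy_set_irq(struct toy_tis *d, int level)
{
    d->line = level;
}

/*
 * Guest write to the interrupt status register: writing 1 to a bit
 * acknowledges that reason. The line is lowered only once no supported
 * reason remains pending (an assumption about the intended semantics).
 */
static void toy_write_int_status(struct toy_tis *d, uint32_t val)
{
    d->ints &= ~(val & TOY_INTERRUPTS_SUPPORTED);
    if ((d->ints & TOY_INTERRUPTS_SUPPORTED) == 0) {
        toy_set_irq(d, 0);
    }
}

/* Usage example: two pending reasons, acknowledged one at a time. */
static void toy_selfcheck(void)
{
    struct toy_tis d = { .ints = 0x3, .line = 1 };

    toy_write_int_status(&d, 0x1);  /* one reason still pending */
    assert(d.line == 1);

    toy_write_int_status(&d, 0x2);  /* last reason acknowledged */
    assert(d.line == 0);
}
```

With the ordering quoted in the mail, the first acknowledge would already
drop the line while bit 1 is still pending. In the sketch the line follows
the remaining pending mask, which also makes a separate irq_pending flag
unnecessary.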