From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Berger
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Tue, 25 Jan 2011 11:49:05 -0500
Message-ID: <4D3EFF01.9080608@linux.vnet.ibm.com>
References: <4D2C8305.2090609@linux.vnet.ibm.com> <4D2ED260.4010801@redhat.com> <4D30A38F.3030002@linux.vnet.ibm.com> <4D3303FD.8020509@redhat.com> <4D35030E.4080406@linux.vnet.ibm.com> <4D3554F4.6080405@siemens.com> <4D3DC49E.2000100@linux.vnet.ibm.com> <4D3DFE5A.802@web.de> <4D3E3FDE.80805@linux.vnet.ibm.com> <4D3E7B13.5000303@web.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity , kvm@vger.kernel.org, qemu-devel@nongnu.org
To: Jan Kiszka
Return-path:
Received: from e9.ny.us.ibm.com ([32.97.182.139]:46361 "EHLO e9.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751307Ab1AYQtI (ORCPT ); Tue, 25 Jan 2011 11:49:08 -0500
Received: from d01dlp02.pok.ibm.com (d01dlp02.pok.ibm.com [9.56.224.85]) by e9.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p0PGOS3B002065 for ; Tue, 25 Jan 2011 11:24:32 -0500
Received: from d01relay05.pok.ibm.com (d01relay05.pok.ibm.com [9.56.227.237]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 057044DE8043 for ; Tue, 25 Jan 2011 11:45:37 -0500 (EST)
Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay05.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p0PGn67e163038 for ; Tue, 25 Jan 2011 11:49:06 -0500
Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p0PGn5Mc016749 for ; Tue, 25 Jan 2011 11:49:05 -0500
In-Reply-To: <4D3E7B13.5000303@web.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>
> Do you see a chance to look closer at the issue yourself? E.g.
> instrument the kernel's irqchip models and dump their states once your
> guest is stuck?

The device runs on IRQ 3, so I applied the following patch:

diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 3cece05..8f4f94c 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 {
 	int mask, ret = 1;
 	mask = 1<< irq;
-	if (s->elcr& mask)	/* level triggered */
+	if (s->elcr& mask)	/* level triggered */ {
 		if (level) {
 			ret = !(s->irr& mask);
 			s->irr |= mask;
@@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->irr&= ~mask;
 			s->last_irr&= ~mask;
 		}
-	else	/* edge triggered */
+if (irq == 3)
+	printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+	}
+	else	/* edge triggered */ {
 		if (level) {
 			if ((s->last_irr& mask) == 0) {
 				ret = !(s->irr& mask);
@@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->last_irr |= mask;
 		} else
 			s->last_irr&= ~mask;
-
+if (irq == 3)
+	printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+	}
 	return (s->imr& mask) ? -1 : ret;
 }
 
@@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
 
 	pic_lock(s);
 	if (irq>= 0&& irq< PIC_NUM_PINS) {
+if (irq == 3)
+printk("%s\n", __FUNCTION__);
 		ret = pic_set_irq1(&s->pics[irq>> 3], irq& 7, level);
 		pic_update_irq(s);
 		trace_kvm_pic_set_irq(irq>> 3, irq& 7, s->pics[irq>> 3].elcr,

While the device is still working, I see the output below, with the
level toggling 0-1-0. Then the toggling stops and the level stays at '1'.

[ 1773.833824] kvm_pic_set_irq
[ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
[ 1773.834161] kvm_pic_set_irq
[ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
[ 1773.834193] kvm_pic_set_irq
[ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
[ 1773.835028] kvm_pic_set_irq
[ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
[ 1773.835542] kvm_pic_set_irq
[ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
[ 1773.889892] kvm_pic_set_irq
[ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
[ 1791.258793] pic_set_irq1 119: level=1, irr = d9
[ 1791.258824] pic_set_irq1 119: level=0, irr = d1
[ 1791.402476] pic_set_irq1 119: level=1, irr = d9
[ 1791.402534] pic_set_irq1 119: level=0, irr = d1
[ 1791.402538] pic_set_irq1 119: level=1, irr = d9
[...]

I believe the last 5 calls shown can be ignored. After that, the
interrupts don't go through anymore. In the device model I see
interrupts being raised and cleared. After the last one was cleared in
'my' device model, interrupts are only raised and never cleared again.
This looks as if the interrupt handler in the guest Linux never ran, so
the IRQ is never cleared and we're stuck.

Regards,
Stefan
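
For anyone following the log: below is a minimal userspace sketch of the
edge-triggered branch of pic_set_irq1() from the diff above. It is only
meant to illustrate why irr can stay at 5b (bit 3 set) no matter how the
level toggles, as long as the guest never acknowledges the interrupt.
The names struct pic_model, model_set_irq_edge() and model_ack() are
hypothetical and not kernel APIs. model_ack() also rests on an
assumption: that acknowledging an edge-triggered pin clears its irr bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_kpic_state. */
struct pic_model {
	uint8_t irr;      /* interrupt request register: pending pins */
	uint8_t last_irr; /* last level seen per pin, used for edge detection */
};

/*
 * Edge-triggered branch only, following the logic shown in the diff.
 * Returns 1 if this call latched a new request on the pin, 0 otherwise.
 */
static int model_set_irq_edge(struct pic_model *s, int pin, int level)
{
	uint8_t mask;
	int latched = 0;

	assert(pin >= 0 && pin < 8);
	mask = (uint8_t)(1u << pin);

	if (level) {
		/* Only a 0 -> 1 transition counts as an edge. */
		if ((s->last_irr & mask) == 0) {
			latched = !(s->irr & mask);
			s->irr |= mask;
		}
		s->last_irr |= mask;
	} else {
		/* Lowering the line re-arms edge detection but leaves irr alone. */
		s->last_irr &= (uint8_t)~mask;
	}
	return latched;
}

/*
 * Assumption for illustration: the guest acknowledging the interrupt
 * clears the pending bit of an edge-triggered pin.
 */
static void model_ack(struct pic_model *s, int pin)
{
	assert(pin >= 0 && pin < 8);
	s->irr &= (uint8_t)~(1u << pin);
}
```

Starting from irr = 0x5b, calling model_set_irq_edge(&s, 3, 1) or
model_set_irq_edge(&s, 3, 0) leaves irr unchanged and latches nothing
new. Only after model_ack(&s, 3) does the next raise latch a fresh
request, which would be consistent with the guest handler never running.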