From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Berger
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Wed, 26 Jan 2011 07:05:47 -0500
Message-ID: <4D400E1B.2020007@linux.vnet.ibm.com>
References: <4D2C8305.2090609@linux.vnet.ibm.com>
 <4D2ED260.4010801@redhat.com>
 <4D30A38F.3030002@linux.vnet.ibm.com>
 <4D3303FD.8020509@redhat.com>
 <4D35030E.4080406@linux.vnet.ibm.com>
 <4D3554F4.6080405@siemens.com>
 <4D3DC49E.2000100@linux.vnet.ibm.com>
 <4D3DFE5A.802@web.de>
 <4D3E3FDE.80805@linux.vnet.ibm.com>
 <4D3E7B13.5000303@web.de>
 <4D3EFF01.9080608@linux.vnet.ibm.com>
 <4D3FD7ED.2000001@web.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org
To: Jan Kiszka
Return-path:
Received: from e9.ny.us.ibm.com ([32.97.182.139]:52838 "EHLO e9.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1752286Ab1AZMFy (ORCPT );
 Wed, 26 Jan 2011 07:05:54 -0500
Received: from d01dlp02.pok.ibm.com (d01dlp02.pok.ibm.com [9.56.224.85])
 by e9.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p0QBf266005491
 for ; Wed, 26 Jan 2011 06:41:17 -0500
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236])
 by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 4F8874DE8040
 for ; Wed, 26 Jan 2011 07:02:22 -0500 (EST)
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170])
 by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p0QC5rVA191942
 for ; Wed, 26 Jan 2011 07:05:53 -0500
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1])
 by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p0QC5qWv007664
 for ; Wed, 26 Jan 2011 05:05:52 -0700
In-Reply-To: <4D3FD7ED.2000001@web.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 01/26/2011 03:14 AM, Jan Kiszka wrote:
> On 2011-01-25 17:49, Stefan Berger wrote:
>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>
>>> Do you see a chance to look closer at the issue yourself? E.g.
>>> instrument the kernel's irqchip models and dump their states once your
>>> guest is stuck?
>> The device runs on IRQ 3. So I applied this patch here.
>>
>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>> index 3cece05..8f4f94c 100644
>> --- a/arch/x86/kvm/i8259.c
>> +++ b/arch/x86/kvm/i8259.c
>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>  {
>>  	int mask, ret = 1;
>>  	mask = 1 << irq;
>> -	if (s->elcr & mask)	/* level triggered */
>> +	if (s->elcr & mask)	/* level triggered */ {
>>  		if (level) {
>>  			ret = !(s->irr & mask);
>>  			s->irr |= mask;
>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>  			s->irr &= ~mask;
>>  			s->last_irr &= ~mask;
>>  		}
>> -	else	/* edge triggered */
>> +if (irq == 3)
>> +	printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>> +	}
>> +	else	/* edge triggered */ {
>>  		if (level) {
>>  			if ((s->last_irr & mask) == 0) {
>>  				ret = !(s->irr & mask);
>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>  			s->last_irr |= mask;
>>  		} else
>>  			s->last_irr &= ~mask;
>> -
>> +if (irq == 3)
>> +	printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>> +	}
>>  	return (s->imr & mask) ? -1 : ret;
>>  }
>>
>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>>
>>  	pic_lock(s);
>>  	if (irq >= 0 && irq < PIC_NUM_PINS) {
>> +if (irq == 3)
>> +printk("%s\n", __FUNCTION__);
>>  		ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
>>  		pic_update_irq(s);
>>  		trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
>>
>>
>>
>> While it's still working I see this here with the levels changing 0-1-0.
>> Though then it stops and levels are only at '1'.
>>
>> [ 1773.833824] kvm_pic_set_irq
>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>> [ 1773.834161] kvm_pic_set_irq
>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.834193] kvm_pic_set_irq
>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>> [ 1773.835028] kvm_pic_set_irq
>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.835542] kvm_pic_set_irq
>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.889892] kvm_pic_set_irq
>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>> [...]
>>
>>
>> I believe the last 5 shown calls can be ignored. After that the
>> interrupts don't go through anymore.
>>
>> In the device model I see interrupts being raised and cleared. After the
>> last one was cleared in 'my' device model, interrupts are only raised.
>> This looks as if the interrupt handler in the guest Linux was never
>> run, thus the IRQ is never cleared and we're stuck.
>>
> User space is responsible for both setting and clearing that line. IRQ3
> means you are using some serial device model? Then you should check what
> its state is.

Good hint. I have now moved it to IRQ11, and it works fine (with kvm-git)
from what I can see. There was no UART on IRQ3 before, but it was certainly
the wrong IRQ for it.

> Moreover, a complete picture of the kernel/user space interaction should
> be obtainable by using ftrace for capturing kvm events.
>

Should it be working on IRQ3? If so, I'd look into it when I get a chance...

   Stefan
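
For reference, a minimal user-space sketch of the latching logic in the quoted pic_set_irq1() diff, written as plain C so it can be run outside the kernel. The names pic_model, model_set_irq, model_ack and model_pending are made up for illustration and are not kernel API; model_ack in particular is only an assumed stand-in for the guest acknowledging an edge-triggered interrupt. It is meant to show how the two branches treat a line that is raised again without having been lowered, not to diagnose the IRQ3 case.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct kvm_kpic_state: only the fields that the
 * quoted pic_set_irq1() reads or writes. */
struct pic_model {
    uint8_t elcr;      /* bit set = pin is level triggered */
    uint8_t imr;       /* bit set = pin is masked */
    uint8_t irr;       /* bit set = interrupt request pending */
    uint8_t last_irr;  /* last line level seen, used for edge detection */
};

/* Same decision logic as pic_set_irq1() in the quoted diff, without the
 * debug printk calls. Returns -1 if the pin is masked, 0 if the request
 * was coalesced with one that was already pending, 1 otherwise. */
static int model_set_irq(struct pic_model *s, int irq, int level)
{
    assert(irq >= 0 && irq < 8);
    uint8_t mask = (uint8_t)(1u << irq);
    int ret = 1;

    if (s->elcr & mask) {               /* level triggered */
        if (level) {
            ret = !(s->irr & mask);
            s->irr |= mask;
            s->last_irr |= mask;
        } else {
            s->irr &= (uint8_t)~mask;
            s->last_irr &= (uint8_t)~mask;
        }
    } else {                            /* edge triggered */
        if (level) {
            /* Only a 0 -> 1 transition latches a new request. */
            if ((s->last_irr & mask) == 0) {
                ret = !(s->irr & mask);
                s->irr |= mask;
            }
            s->last_irr |= mask;
        } else {
            /* Lowering the line re-arms edge detection; irr is kept. */
            s->last_irr &= (uint8_t)~mask;
        }
    }
    return (s->imr & mask) ? -1 : ret;
}

/* Hypothetical helper (assumption): the guest acknowledges the interrupt,
 * which clears the pending bit for this pin. */
static void model_ack(struct pic_model *s, int irq)
{
    assert(irq >= 0 && irq < 8);
    s->irr &= (uint8_t)~(1u << irq);
}

/* Returns 1 if a request is pending on the given pin, 0 otherwise. */
static int model_pending(const struct pic_model *s, int irq)
{
    assert(irq >= 0 && irq < 8);
    return (s->irr >> irq) & 1;
}
```

With a zeroed pic_model (all pins edge triggered, nothing masked), model_set_irq(&s, 3, 1) latches bit 3 in irr, and a following model_set_irq(&s, 3, 0) leaves irr untouched, which matches irr staying at 5b across level=0 in the log above. If the request is acknowledged while the line is still at 1 and the line is then raised again without an intervening 0, nothing new is latched in this model.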