From: Stefan Berger
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Mon, 24 Jan 2011 22:13:34 -0500
To: Jan Kiszka
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org

On 01/24/2011 05:34 PM, Jan Kiszka wrote:
> On 2011-01-24 19:27, Stefan Berger wrote:
>> On 01/18/2011 03:53 AM, Jan Kiszka wrote:
>>> On 2011-01-18 04:03, Stefan Berger wrote:
>>>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>>>> verify this?
>>>>>>>
>>>>>> Here's what I did:
>>>>>>
>>>>>>
>>>>>> interrupt exit requested
>>>>> It appears from this you're using qemu.git. Please try qemu-kvm.git,
>>>>> where the code appears to be correct.
>>>>>
>>>> Cc'ing qemu-devel now. For reference, here is the initial problem
>>>> description:
>>>>
>>>> http://www.spinics.net/lists/kvm/msg48274.html
>>>>
>>>> I didn't know there was another tree...
>>>>
>>>> I have seen now a couple of suspends-while-reading with patches applied
>>>> to the qemu-kvm.git tree and indeed, when run with the same host kernel
>>>> and VM I do not see the debugging dumps due to double-reads that I would
>>>> have anticipated seeing by now. Now what? Can this be easily fixed in
>>>> the other Qemu tree as well?
>>> Please give this a try:
>>>
>>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>>>
>>> I bet (& hope) "kvm: Unconditionally reenter kernel after IO exits"
>>> fixes the issue for you. If other problems pop up with that tree, also
>>> try resetting to that particular commit.
>>>
>>> I'm currently trying to shake all those hidden or forgotten bug fixes
>>> out of qemu-kvm and port them upstream. Most of those subtle differences
>>> should hopefully soon be history.
>>>
>> I did the same test as I did with Avi's tree and haven't seen the
>> consequences of possible double-reads. So, I would say that you should
>> upstream those patches...
>>
>> I searched for the text you mention above using 'gitk' but couldn't find
>> a patch with that headline in your tree. There were others that seem to
>> be related:
>>
>> Gleb Natapov: "do not enter vcpu again if it was stopped during IO"
> Err, I don't think you checked out queues/kvm-upstream. I bet you just
> ran my master branch which is a version of qemu-kvm's master. Am I right? :)
>
You're right. :-) Blame my lack of git knowledge; I have checked out the
branch now. I redid the testing and it passed.
No double-reads or lost bytes from what I could see.
>>>> One thing I'd like to mention is that I have seen what I think are
>>>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>>>> version and not suspending at all. At some point the interrupt counter in
>>>> the guest kernel does not increase anymore even though I see the device
>>>> model raising the IRQ and lowering it. The same tests run literally
>>>> forever in the qemu.git tree version of Qemu.
>>> What about qemu-kvm and -no-kvm-irqchip?
>> That seems to be necessary for both trees, yours and the one Avi pointed
>> me to. If applied, then I did not see the interrupt problem.
> And the fact that you were able to call qemu from my tree with
> -no-kvm-irqchip just underlines my assumption: that switch is refused by
> upstream. Please retry with the latest kvm-upstream queue.
>
> Besides that, this other bug you may see in the in-kernel IRQ path - how
> can we reproduce it?
Unfortunately I don't know. Some things have to come together for the
code I am working on to become available and useful for everyone. It's
going to be a while.

Thanks!
   Stefan
> Jan
>