From: Jan Kiszka
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Tue, 25 Jan 2011 08:26:11 +0100
Message-ID: <4D3E7B13.5000303@web.de>
In-Reply-To: <4D3E3FDE.80805@linux.vnet.ibm.com>
References: <4D2C8305.2090609@linux.vnet.ibm.com> <4D2ED260.4010801@redhat.com>
 <4D30A38F.3030002@linux.vnet.ibm.com> <4D3303FD.8020509@redhat.com>
 <4D35030E.4080406@linux.vnet.ibm.com> <4D3554F4.6080405@siemens.com>
 <4D3DC49E.2000100@linux.vnet.ibm.com> <4D3DFE5A.802@web.de>
 <4D3E3FDE.80805@linux.vnet.ibm.com>
To: Stefan Berger
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org

On 2011-01-25 04:13, Stefan Berger wrote:
> On 01/24/2011 05:34 PM, Jan Kiszka wrote:
>> On 2011-01-24 19:27, Stefan Berger wrote:
>>> On 01/18/2011 03:53 AM, Jan Kiszka wrote:
>>>> On 2011-01-18 04:03, Stefan Berger wrote:
>>>>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>>>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>>>>> verify this?
>>>>>>>>
>>>>>>> Here's what I did:
>>>>>>>
>>>>>>>
>>>>>>> interrupt exit requested
>>>>>> It appears from this you're using qemu.git. Please try qemu-kvm.git,
>>>>>> where the code appears to be correct.
>>>>>>
>>>>> Cc'ing qemu-devel now.
>>>>> For reference, here the initial problem description:
>>>>>
>>>>> http://www.spinics.net/lists/kvm/msg48274.html
>>>>>
>>>>> I didn't know there was another tree...
>>>>>
>>>>> I have seen now a couple of suspends-while-reading with patches
>>>>> applied to the qemu-kvm.git tree and indeed, when run with the same
>>>>> host kernel and VM I do not see the debugging dumps due to
>>>>> double-reads that I would have anticipated seeing by now. Now what?
>>>>> Can this be easily fixed in the other Qemu tree as well?
>>>> Please give this a try:
>>>>
>>>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>>>>
>>>> I bet (& hope) "kvm: Unconditionally reenter kernel after IO exits"
>>>> fixes the issue for you. If other problems pop up with that tree, also
>>>> try resetting to that particular commit.
>>>>
>>>> I'm currently trying to shake all those hidden or forgotten bug fixes
>>>> out of qemu-kvm and port them upstream. Most of those subtle
>>>> differences should hopefully soon be history.
>>>>
>>> I did the same test as I did with Avi's tree and haven't seen the
>>> consequences of possible double-reads. So, I would say that you should
>>> upstream those patches...
>>>
>>> I searched for the text you mention above using 'gitk' but couldn't find
>>> a patch with that headline in your tree. There were others that seem to
>>> be related:
>>>
>>> Gleb Natapov: "do not enter vcpu again if it was stopped during IO"
>> Err, I don't think you checked out queues/kvm-upstream. I bet you just
>> ran my master branch which is a version of qemu-kvm's master. Am I
>> right? :)
>>
>
> You're right. :-) my lack of git knowledge - checked out the branch now.
>
> I redid the testing and it passed. No double-reads and lost bytes from
> what I could see.

Great, thanks.

>
>>>>> One thing I'd like to mention is that I have seen what I think are
>>>>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>>>>> version and not suspending at all. At some point the interrupt
>>>>> counter in the guest kernel does not increase anymore even though I
>>>>> see the device model raising the IRQ and lowering it. The same tests
>>>>> run literally forever in the qemu.git tree version of Qemu.
>>>> What about qemu-kvm and -no-kvm-irqchip?
>>> That seems to be necessary for both trees, yours and the one Avi pointed
>>> me to. If applied, then I did not see the interrupt problem.
>> And the fact that you were able to call qemu from my tree with
>> -no-kvm-irqchip just underlines my assumption: that switch is refused by
>> upstream. Please retry with the latest kvm-upstream queue.
>>
>> Besides that, this other bug you may see in the in-kernel IRQ path - how
>> can we reproduce it?
> Unfortunately I don't know. Some things have to come together for the
> code I am working on to become available and useful for everyone. It's
> going to be a while.

Do you see a chance to look closer at the issue yourself? E.g.
instrument the kernel's irqchip models and dump their states once your
guest is stuck?

Jan