From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60488)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1YXX6S-0004V1-JE for qemu-devel@nongnu.org;
 Mon, 16 Mar 2015 11:39:01 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1YXX6D-0002fc-SG for qemu-devel@nongnu.org;
 Mon, 16 Mar 2015 11:38:52 -0400
Received: from e28smtp05.in.ibm.com ([122.248.162.5]:32787)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1YXX6D-0002Yn-91 for qemu-devel@nongnu.org;
 Mon, 16 Mar 2015 11:38:37 -0400
Received: from /spool/local by e28smtp05.in.ibm.com with IBM ESMTP SMTP Gateway:
 Authorized Use Only! Violators will be prosecuted for from ;
 Mon, 16 Mar 2015 21:08:34 +0530
Date: Tue, 17 Mar 2015 02:38:26 +1100
From: Gavin Shan
Message-ID: <20150316153826.GA8718@shangw>
References: <1426054314-19564-1-git-send-email-gwshan@linux.vnet.ibm.com>
 <1426054314-19564-2-git-send-email-gwshan@linux.vnet.ibm.com>
 <1426283487.3643.132.camel@redhat.com>
 <20150316010459.GA13680@shangw>
 <1426478732.17565.227.camel@kernel.crashing.org>
 <20150316143425.GA6946@shangw>
 <1426518327.3643.177.camel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1426518327.3643.177.camel@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/3] VFIO: Clear INTx pending state on EEH reset
Reply-To: Gavin Shan
To: Alex Williamson
Cc: david@gibson.dropbear.id.au, qemu-ppc@nongnu.org, Gavin Shan, qemu-devel@nongnu.org

On Mon, Mar 16, 2015 at 09:05:27AM -0600, Alex Williamson wrote:
>On Tue, 2015-03-17 at 01:34 +1100, Gavin Shan wrote:
>> On Mon, Mar 16, 2015 at 03:05:32PM +1100, Benjamin Herrenschmidt wrote:
>> >On Mon, 2015-03-16 at 12:04 +1100, Gavin Shan wrote:
>> >>
>> >> (2) QEMU sends IOCTL commands to host to disable MSIx and enable INTx. At
>> >> this stage the INTx is still masked. At a later point, the guest is requesting
>> >> unmasking INTx, which is captured by host. Host checks and finds pending
>> >> INTx, which is sent to QEMU. In QEMU INTx handler (vfio_intx_interrupt()),
>> >> the mmap'ed regions are disabled, "intx.pending" is set and a timer is started
>> >> to reenable mmap'ed regions if "intx.pending" is cleared there. However,
>> >> "intx.pending" is only cleared upon BAR access in slow path, which is never
>> >> happening.
>> >>
>> >> (3) After guest disables MSIx and issues EEH reset, the device driver starts
>> >> to check its firmware state by reading MMIO register, which isn't completed
>> >> by QEMU VFIO BAR slow path (Note: fast path supported by mmap'ed regions has
>> >> been disabled). Eventually, the guest hangs on reading MMIO register. With
>> >> this patch applied to QEMU, I didn't see the problem again.
>> >
>> >Note that it might be a good idea to disable INTx (and synchronize with a cfg
>> >read of some sort) around resetting a device.
>> >
>> >Otherwise, you may hit a known issue if the device is behind a switch and has
>> >sent the INTx "assert" message, and not the "deassert" one before it gets reset.
>> >
>> >That can cause the INTx to effectively be "stuck" in the switch preventing a
>> >subsequent one from being delivered.
>> >
>>
>> Yeah, it makes more sense to disable INTx before issuing EEH reset. I verified
>> that disabling INTx interrupt upon EEH reset can avoid the issue as well. I'll
>> post updated patch accordingly if Alex Williamson doesn't object.
>
>That sounds like a cleaner approach, but you seem to be skipping
>something around why the slow-path clearing of intx.pending isn't
>working for you.  Step (2) says "... is only cleared upon BAR access in
>slow path, which is never happening."  Step (3) "the device driver
>starts to check its firmware state by reading MMIO register, which isn't
>completed by QEMU VFIO BAR slow path".  So it sounds like (3) is doing
>exactly what should allow the QEMU path INTx state machine to advance,
>so why doesn't it work?  Thanks,
>

Thanks for confirming. I'll send out v2 tomorrow.

I'm not deliberately skipping over why the slow path doesn't work. I don't
yet understand well how the QEMU memory model works together with KVM. I
need more time to trace this and will follow up with my findings.

Thanks,
Gavin

>Alex
>
>
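
For reference, the INTx bookkeeping described in steps (2) and (3) can be
written down as a small state machine. The sketch below is a toy Python model
built only from the wording in this thread; it is not QEMU code. The class and
method names (IntxModel, interrupt, timer_fired, bar_access, reset) are
hypothetical, and reset() is an assumption that stands in for the idea in the
patch subject (clearing the pending state on EEH reset).

```python
from dataclasses import dataclass


@dataclass
class IntxModel:
    """Toy model of the INTx state described in the thread (not QEMU code)."""

    pending: bool = False      # stands in for "intx.pending"
    mmap_enabled: bool = True  # fast path through the mmap'ed BAR regions

    def interrupt(self) -> None:
        # Step (2): the handler disables the mmap'ed regions and sets pending.
        self.mmap_enabled = False
        self.pending = True

    def timer_fired(self) -> None:
        # Step (2): the timer re-enables mmap only once pending has been cleared.
        if not self.pending:
            self.mmap_enabled = True

    def bar_access(self) -> str:
        # Fast path: an mmap'ed access never touches this bookkeeping.
        if self.mmap_enabled:
            return "fast"
        # Slow path: the only place pending is cleared in the described flow.
        self.pending = False
        return "slow"

    def reset(self) -> None:
        # Assumption: models "clear INTx pending state on EEH reset".
        self.pending = False


# Expected flow if a BAR access does reach the slow path.
dev = IntxModel()
dev.interrupt()
dev.timer_fired()          # pending is still set, so mmap stays disabled
first = dev.bar_access()   # slow path, clears pending
dev.timer_fired()          # pending is clear, so mmap is re-enabled
second = dev.bar_access()  # back on the fast path

# Flow with no slow-path access at all, recovered by reset().
stuck = IntxModel()
stuck.interrupt()
stuck.timer_fired()
stuck_before = stuck.mmap_enabled
stuck.reset()
stuck.timer_fired()
stuck_after = stuck.mmap_enabled
```

In this model a single slow-path access is enough to advance the state
machine, which is the point of Alex's question: the model cannot say why the
MMIO read in step (3) fails to do so in the real setup.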