From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Gibson
Subject: Re: Reset problem vs. MMIO emulation, hypercalls, etc...
Date: Tue, 7 Aug 2012 22:14:42 +1000
Message-ID: <20120807121442.GN16664@truffala.fritz.box>
References: <1343791031.16975.41.camel@pasglop> <501A740F.2000000@redhat.com> <1343938818.6911.9.camel@pasglop> <20120803174113.GA13174@amt.cnet> <1344033008.24037.67.camel@pasglop> <20120806031344.GG16664@truffala.fritz.box> <1344286677.24037.100.camel@pasglop> <20120807013228.GL16664@truffala.fritz.box> <5020D5EB.9060104@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Benjamin Herrenschmidt, Marcelo Tosatti, kvm@vger.kernel.org, Alexander Graf, Paul Mackerras, kvm-ppc@vger.kernel.org
To: Avi Kivity
Return-path:
Content-Disposition: inline
In-Reply-To: <5020D5EB.9060104@redhat.com>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Tue, Aug 07, 2012 at 11:46:35AM +0300, Avi Kivity wrote:
> On 08/07/2012 04:32 AM, David Gibson wrote:
> > On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
> >> On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
> >> > So, I'm still trying to nut out the implications for H_CEDE, and think
> >> > if there are any other hypercalls that might want to block the guest
> >> > for a time.  We were considering blocking H_PUT_TCE if qemu devices
> >> > had active dma maps on the previously mapped iovas.  I'm not sure if
> >> > the discussions that led to the inclusion of the qemu IOMMU code
> >> > decided that was wholly unnecessary or just not necessary for the
> >> > time being.
> >>
> >> For "sleeping hcalls" they will simply have to set exit_request to
> >> complete the hcall from the kernel perspective, leaving us in a state
> >> where the kernel is about to restart at srr0 + 4, along with some other
> >> flag (stop or halt) to actually freeze the vcpu.
> >>
> >> If such an "async" hcall decides to return an error, it can then set
> >> gpr3 directly using ioctls before restarting the vcpu.
> >
> > Yeah, I'd pretty much convinced myself of that by the end of
> > yesterday.  I hope to send patches implementing these fixes today.
> >
> > There are also some questions about why our in-kernel H_CEDE works
> > kind of differently from x86's hlt instruction implementation (which
> > comes out to qemu unless the irqchip is in-kernel as well).  I don't
> > think we have an urgent problem there though.
>
> It's the other way round, hlt sleeps in the kernel unless the irqchip is
> not in the kernel.

That's the same as what I said.

We never have irqchip in kernel (because we haven't written that yet)
but we still sleep in-kernel for CEDE.  I haven't spotted any problem
with that, but now I'm wondering if there is one, since x86 doesn't do
it in what seems like the analogous situation.

It's possible this works because our decrementer (timer) interrupts
are different at the core level from external interrupts coming from
the PIC, and *are* handled in kernel, but I haven't actually followed
the logic to work out if this is the case.

> Meaning the normal state of things is to sleep in
> the kernel (whether or not you have an emulated interrupt controller in
> the kernel -- the term irqchip in kernel is overloaded for x86).

Uh.. overloaded in what way?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
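
[Editorial sketch, not part of the original message.] The "sleeping
hcall" flow quoted above (complete the hcall so the guest would resume at
srr0 + 4, freeze the vcpu, and optionally overwrite gpr3 with an error
before restarting) can be modelled roughly as below. This is a toy state
machine only, not QEMU or KVM code: ToyVcpu, begin_sleeping_hcall,
finish_sleeping_hcall and run are hypothetical names, the two return
codes are assumed PAPR-style values, and a plain field assignment stands
in for the register-setting ioctl.

```python
from dataclasses import dataclass

# Assumed PAPR-style hcall return codes, used here only for illustration.
H_SUCCESS = 0
H_HARDWARE = -1


@dataclass
class ToyVcpu:
    """Hypothetical stand-in for the vcpu state discussed in the thread."""
    pc: int = 0x1000            # address of the guest's hcall instruction (srr0)
    gpr3: int = 0               # hcall return value is delivered in r3
    exit_request: bool = False  # ask the run loop to come back to userspace
    halted: bool = False        # the "stop or halt" flag that freezes the vcpu


def begin_sleeping_hcall(vcpu: ToyVcpu) -> None:
    """Complete the hcall as far as the kernel is concerned, then freeze."""
    vcpu.pc += 4                # guest will resume after the hcall (srr0 + 4)
    vcpu.gpr3 = H_SUCCESS       # provisional result, may be overwritten later
    vcpu.exit_request = True
    vcpu.halted = True


def finish_sleeping_hcall(vcpu: ToyVcpu, status: int) -> None:
    """Set the final return code (an ioctl in real life) and unfreeze."""
    vcpu.gpr3 = status
    vcpu.exit_request = False
    vcpu.halted = False


def run(vcpu: ToyVcpu, steps: int) -> int:
    """Execute up to `steps` 4-byte instructions; return how many ran."""
    executed = 0
    for _ in range(steps):
        if vcpu.halted:
            break
        vcpu.pc += 4
        executed += 1
    return executed
```

In this model a halted vcpu makes no progress until the asynchronous
work finishes, and the guest only ever observes the final gpr3 value,
which is the property the quoted proposal relies on.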