From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Hildenbrand
Subject: Re: [s390] possible deadlock in handle_sigp?
Date: Mon, 19 Sep 2016 13:25:47 +0200
Message-ID: <20160919132547.507bc252@thinkpad-w530>
In-Reply-To: <18e8d5ed-c6f7-4617-0426-be203beb1965@de.ibm.com>
References: <33773797-04ec-413f-7ba2-4bb7a4350a44@de.ibm.com>
 <20160915212142.5fd5048e@thinkpad-w530>
 <18e8d5ed-c6f7-4617-0426-be203beb1965@de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
To: Christian Borntraeger
Cc: Paolo Bonzini, KVM list, Cornelia Huck, qemu-devel
Sender: kvm-owner@vger.kernel.org

> On 09/15/2016 09:21 PM, David Hildenbrand wrote:
> >> On 09/12/2016 08:03 PM, Paolo Bonzini wrote:
> >>>
> >>> On 12/09/2016 19:37, Christian Borntraeger wrote:
> >>>> On 09/12/2016 06:44 PM, Paolo Bonzini wrote:
> >>>>> I think that two CPUs doing reciprocal SIGPs could in principle end up
> >>>>> waiting on each other to complete their run_on_cpu. If the SIGP has to
> >>>>> be synchronous, the fix is not trivial (you'd have to put the CPU in a
> >>>>> state similar to cpu->halted = 1); otherwise it's enough to replace
> >>>>> run_on_cpu with async_run_on_cpu.
> >>>>
> >>>> IIRC the SIGPs are supposed to be serialized by the big QEMU lock. Will
> >>>> have a look.
> >>>
> >>> Yes, but run_on_cpu drops it when it waits on the qemu_work_cond
> >>> condition variable. (Related: I stumbled upon it because I wanted to
> >>> remove the BQL from run_on_cpu work items.)
> >>
> >> Yes, it seems you are right. If both CPUs have just exited from KVM doing a
> >> crossover SIGP, they will do the arch_exit handling before the run_on_cpu
> >> work, which might result in this hang. (Luckily it seems quite unlikely,
> >> but we still need to fix it.)
> >> We cannot simply use async, as the callbacks also provide the condition
> >> code for the initiator, so this requires some rework.
> >>
> >
> > Smells like having to provide a lock per CPU. Trylock that lock; if that's
> > not possible, cc=busy. SIGP SET ARCHITECTURE has to lock all CPUs.
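
(Just to illustrate what that per-CPU locking would have meant: SIGP SET
ARCHITECTURE alone would need an all-CPU try-lock with rollback, roughly
like the sketch below. This is purely hypothetical -- the per-CPU
"sigp_lock" field and the helper do not exist anywhere, they only show the
complexity that a single lock avoids.)

static bool sigp_trylock_all_cpus(void)
{
    CPUState *cs, *undo;

    CPU_FOREACH(cs) {
        if (qemu_mutex_trylock(&S390_CPU(cs)->sigp_lock)) {
            /* Contention on any CPU: release the locks we already hold
             * and let the caller report CC 2 (busy). */
            CPU_FOREACH(undo) {
                if (undo == cs) {
                    break;
                }
                qemu_mutex_unlock(&S390_CPU(undo)->sigp_lock);
            }
            return false;
        }
    }
    return true;
}

(With a single mutex, SET ARCHITECTURE simply goes through the same
try-lock as every other order.)
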
> >
> > That was the initial design, until I realized that this was all protected by
> > the BQL.
> >
> > David
>
> We only do the slow path things in QEMU. Maybe we could just have one lock that
> we trylock and return a condition code of 2 (busy) if we fail. That seems the
> simplest solution while still being architecturally correct. Something like

According to the architecture, CC=busy is returned when the access path to the
CPU is busy, so this might not be optimal, but it should work for now.

>
>
> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> index f348745..5706218 100644
> --- a/target-s390x/kvm.c
> +++ b/target-s390x/kvm.c
> @@ -133,6 +133,8 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>      KVM_CAP_LAST_INFO
>  };
>  
> +static QemuMutex qemu_sigp_mutex;
> +
>  static int cap_sync_regs;
>  static int cap_async_pf;
>  static int cap_mem_op;
> @@ -358,6 +360,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          rc = compat_disable_facilities(s, fac_mask, ARRAY_SIZE(fac_mask));
>      }
>  
> +    qemu_mutex_init(&qemu_sigp_mutex);
> +
>      return rc;
>  }
>  
> @@ -1845,6 +1849,11 @@ static int handle_sigp(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
>      status_reg = &env->regs[r1];
>      param = (r1 % 2) ? env->regs[r1] : env->regs[r1 + 1];
>  
> +    if (qemu_mutex_trylock(&qemu_sigp_mutex)) {
> +        setcc(cpu, SIGP_CC_BUSY);
> +        return 0;
> +    }
> +
>      switch (order) {
>      case SIGP_SET_ARCH:
>          ret = sigp_set_architecture(cpu, param, status_reg);
> @@ -1854,6 +1863,7 @@ static int handle_sigp(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
>          dst_cpu = s390_cpu_addr2state(env->regs[r3]);
>          ret = handle_sigp_single_dst(dst_cpu, order, param, status_reg);
>      }
> +    qemu_mutex_unlock(&qemu_sigp_mutex);
>  
>      trace_kvm_sigp_finished(order, CPU(cpu)->cpu_index,
>                              dst_cpu ? CPU(dst_cpu)->cpu_index : -1, ret);

This makes SET ARCHITECTURE handling much easier.

David
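
PS: For anyone who wants to see the hang outside of QEMU, below is a
self-contained toy in plain pthreads -- not QEMU code, all names are made
up. Each "vCPU" thread synchronously queues a work item on its peer and
then waits for it to complete (like a synchronous run_on_cpu), but queued
work is only processed in a thread's own main loop, which neither thread
ever reaches:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

typedef struct Work {
    bool done;
} Work;

typedef struct Vcpu {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    Work *queued;               /* at most one pending work item */
} Vcpu;

static Vcpu vcpus[2];

/* Synchronous cross-CPU call: queue work on the target, then sleep until
 * the target has executed it. */
static void toy_run_on_cpu(Vcpu *target, Work *work)
{
    pthread_mutex_lock(&target->lock);
    target->queued = work;
    while (!work->done) {
        /* waits forever: done is never set because the peer never
         * reaches its main loop */
        pthread_cond_wait(&target->cond, &target->lock);
    }
    pthread_mutex_unlock(&target->lock);
}

static void *vcpu_thread(void *arg)
{
    long idx = (long)arg;
    Vcpu *self = &vcpus[idx];
    Vcpu *peer = &vcpus[1 - idx];
    Work work = { .done = false };

    /* "KVM exit": handle a SIGP aimed at the peer -- synchronously. */
    toy_run_on_cpu(peer, &work);

    /* Main loop: only here would work queued on *us* be processed.
     * Neither thread ever gets here, so neither work item completes. */
    pthread_mutex_lock(&self->lock);
    if (self->queued) {
        self->queued->done = true;
        pthread_cond_broadcast(&self->cond);
    }
    pthread_mutex_unlock(&self->lock);
    return NULL;
}

int main(void)
{
    pthread_t threads[2];
    long i;

    for (i = 0; i < 2; i++) {
        pthread_mutex_init(&vcpus[i].lock, NULL);
        pthread_cond_init(&vcpus[i].cond, NULL);
        pthread_create(&threads[i], NULL, vcpu_thread, (void *)i);
    }
    sleep(2);
    printf("after 2s both toy vCPUs are still stuck in toy_run_on_cpu()\n");
    return 0;
}

Build with something like "gcc -pthread toy.c" (the file name is made up);
the message prints after two seconds while both threads are still parked.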