Date: Wed, 05 May 2021 17:46:51 +0100
Message-ID: <875yzxnn5w.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Zenghui Yu <yuzenghui@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu,
	kvm@vger.kernel.org,
	kernel-team@android.com,
	Will Deacon <will@kernel.org>,
	James Morse <james.morse@arm.com>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Andrew Scull <ascull@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Quentin Perret <qperret@google.com>,
	David Brazdil <dbrazdil@google.com>
Subject: Re: [PATCH v2 03/11] KVM: arm64: Make kvm_skip_instr() and co private to HYP
References: <20201102164045.264512-1-maz@kernel.org>
	<20201102164045.264512-4-maz@kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Hi Zenghui,

On Wed, 05 May 2021 15:23:02 +0100,
Zenghui Yu wrote:
>
> Hi Marc,
>
> On 2020/11/3 0:40, Marc Zyngier wrote:
> > In an effort to remove the vcpu PC manipulations from EL1 on nVHE
> > systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
> > to increment PC post emulation is now signalled via a flag in the
> > vcpu structure.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
>
> [...]
>
> > @@ -133,6 +134,8 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >  	__load_guest_stage2(vcpu->arch.hw_mmu);
> >  	__activate_traps(vcpu);
> > +	__adjust_pc(vcpu);
>
> If the INCREMENT_PC flag was set (e.g., for WFx emulation) while we're
> handling a PSCI CPU_ON call targeting this VCPU, the *target_pc* (aka
> entry point address, normally provided by the primary VCPU) will be
> unexpectedly incremented here. That's pretty bad, I think.

How can you online a CPU using PSCI if that CPU is currently spinning
on a WFI? Or is it that we have transitioned via userspace to perform
the vcpu reset? I can imagine it happening in that case.

> This was noticed with a recent guest kernel, at least with commit
> dccc9da22ded ("arm64: Improve parking of stopped CPUs"), which put the
> stopped VCPUs in the WFx loop. The guest kernel shouted at me that
>
> 	"CPU: CPUs started in inconsistent modes"

Ah, the perks of running guests with "quiet"... Well caught.

> *after* rebooting. The problem is that the secondary entry point was
> corrupted by KVM as explained above. All of the secondary processors
> started from set_cpu_boot_mode_flag(), with w0=0. Oh well...
>
> I wrote the diff below and guess it will help. But I have to look at
> all the other places where we adjust PC directly to make a proper fix.
> Please let me know what you think.
>
>
> Thanks,
> Zenghui
>
> ---->8----
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 956cdc240148..ed647eb387c3 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -265,7 +265,12 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.reset_state.be)
>  		kvm_vcpu_set_be(vcpu);
>
> +	/*
> +	 * Don't bother with the KVM_ARM64_INCREMENT_PC flag while
> +	 * using this version of __adjust_pc().
> +	 */
>  	*vcpu_pc(vcpu) = target_pc;
> +	vcpu->arch.flags &= ~KVM_ARM64_INCREMENT_PC;

I think you need to make it a lot stronger: any PC-altering flag will
do the wrong thing here. I'd go and clear all the exception bits:

Thanks,

	M.

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 956cdc240148..54913612d602 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -265,6 +265,12 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.reset_state.be)
 		kvm_vcpu_set_be(vcpu);
 
+	/*
+	 * We're resetting the CPU, make sure there is no
+	 * pending exception or other PC-altering event.
+	 */
+	vcpu->arch.flags &= ~(KVM_ARM64_PENDING_EXCEPTION |
+			      KVM_ARM64_EXCEPT_MASK);
 	*vcpu_pc(vcpu) = target_pc;
 	vcpu_set_reg(vcpu, 0, vcpu->arch.reset_state.r0);
 
-- 
Without deviation from the norm, progress is not possible.