From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <kvm-owner@kernel.org>
Date:   Wed, 18 Aug 2021 11:38:02 +0100
Message-ID: <87r1errqb9.wl-maz@kernel.org>
From:   Marc Zyngier <maz@kernel.org>
To:     Oliver Upton <oupton@google.com>
Cc:     kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
        Peter Shier <pshier@google.com>,
        Ricardo Koller <ricarkol@google.com>,
        Jing Zhang <jingzhangos@google.com>,
        Raghavendra Rao Anata <rananta@google.com>,
        James Morse <james.morse@arm.com>,
        Alexandru Elisei <alexandru.elisei@arm.com>,
        Suzuki K Poulose <suzuki.poulose@arm.com>
Subject: Re: [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state
In-Reply-To: <20210818085047.1005285-3-oupton@google.com>
References: <20210818085047.1005285-1-oupton@google.com>
        <20210818085047.1005285-3-oupton@google.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1
 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

On Wed, 18 Aug 2021 09:50:45 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> The CPU_ON PSCI call takes a payload that KVM uses to configure a
> destination vCPU to run. This payload is non-architectural state and not
> exposed through any existing UAPI. Effectively, we have a race between
> CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
> run again before the VMM saves its state, the requested PC and context
> ID are lost. When restored, the target vCPU will be runnable and start
> executing at its old PC.
> 
> We can avoid this race by making sure the reset payload is serviced
> before userspace can access a vCPU's state. This is, of course, a hairy
> ugly hack. A benefit of such a hack, though, is that we've managed to
> massage the reset state into the architected state, thereby making it
> migratable without forcing userspace to play our game with a UAPI
> addition.

I don't think it is that bad. In a way, it is similar to the "resync
pending exception state" dance that we do on vcpu exit to userspace.
One thing to note is that it only works because this is done from the
vcpu thread itself.

>
> Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
> I really hate this, but my imagination is failing me on any other way to
> cure the race without cluing in userspace. Any ideas?
> 
>  arch/arm64/kvm/arm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 0de4b41c3706..6b124c29c663 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1216,6 +1216,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  		if (copy_from_user(&reg, argp, sizeof(reg)))
>  			break;
>  
> +		/*
> +		 * ugly hack. We could owe a reset due to PSCI and not yet
> +		 * serviced it. Prevent userspace from reading/writing state
> +		 * that will be clobbered by the eventual handling of the reset
> +		 * bit.

This reads a bit odd. You are taking care of two potential issues in
one go here:
- userspace writes won't be overwritten by a pending reset, as they
  will take place after said reset
- userspace reads will reflect the state of the freshly reset CPU
  instead of some stale state
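
To make the two cases concrete, here is a minimal userspace model in C.
It is a sketch, not kernel code: struct toy_vcpu and every toy_*()
helper are made up for illustration, and they only mimic the shape of
the kvm_check_request()/kvm_reset_vcpu() sequence quoted below. The
assumption is that a pending reset carries an entry PC that userspace
cannot see until the reset has been folded into the architected state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_vcpu {
	uint64_t pc;		/* architected state, visible to userspace */
	bool reset_pending;	/* stands in for KVM_REQ_VCPU_RESET */
	uint64_t reset_pc;	/* CPU_ON payload, invisible to userspace */
};

/* Models CPU_ON: stash the payload and flag the reset. */
static void toy_cpu_on(struct toy_vcpu *v, uint64_t entry)
{
	v->reset_pc = entry;
	v->reset_pending = true;
}

/* Models the added check: fold the payload into the architected state. */
static void toy_service_reset(struct toy_vcpu *v)
{
	if (v->reset_pending) {
		v->reset_pending = false;
		v->pc = v->reset_pc;
	}
}

/* Models a register read: service the reset, then read. */
static uint64_t toy_get_pc(struct toy_vcpu *v)
{
	toy_service_reset(v);
	return v->pc;
}

/* Models a register write: service the reset, then write. */
static void toy_set_pc(struct toy_vcpu *v, uint64_t pc)
{
	toy_service_reset(v);
	v->pc = pc;
}

/* Models guest entry, which would also service a pending reset. */
static uint64_t toy_run(struct toy_vcpu *v)
{
	toy_service_reset(v);
	return v->pc;
}

static void toy_demo(void)
{
	struct toy_vcpu a = { .pc = 0x1000 };
	struct toy_vcpu b = { .pc = 0x1000 };

	/* Read case: userspace sees the reset PC, not the stale one. */
	toy_cpu_on(&a, 0x8000);
	assert(toy_get_pc(&a) == 0x8000);

	/* Write case: the reset lands first, so the write survives. */
	toy_cpu_on(&b, 0x8000);
	toy_set_pc(&b, 0x2000);
	assert(toy_run(&b) == 0x2000);
}
```

Without the toy_service_reset() call in the accessors, the read would
return 0x1000 and the write would be clobbered on the next toy_run(),
which is the pair of problems a reworded comment should spell out.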

> +		 */
> +		if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
> +			kvm_reset_vcpu(vcpu);
> +
>  		if (ioctl == KVM_SET_ONE_REG)
>  			r = kvm_arm_set_reg(vcpu, &reg);
>  		else

Otherwise, well spotted.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.