From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas, Will Deacon
Subject: [PATCH v7 2/2] arm64/sve: Rework SVE trap access to minimise memory access
Date: Mon,  1 Feb 2021 12:29:01 +0000
Message-Id: <20210201122901.11331-3-broonie@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210201122901.11331-1-broonie@kernel.org>
References: <20210201122901.11331-1-broonie@kernel.org>
MIME-Version: 1.0
Cc: Julien Grall, Zhang Lei, Mark Brown, Dave Martin,
 linux-arm-kernel@lists.infradead.org, Daniel Kiss

When we take an SVE access trap, only the subset of the SVE Z0-Z31
registers shared with the FPSIMD V0-V31 registers is valid; the rest of
the bits in the SVE registers must be cleared before returning to
userspace.  Currently we do this by saving the current FPSIMD register
state to the task struct and then using that to initialize the copy of
the SVE registers in the task struct so that they can be loaded from
there into the registers.  This requires far more memory access than we
need.

The newly added TIF_SVE_FULL_REGS can be used to reduce this overhead:
instead of doing the conversion immediately we set only TIF_SVE_EXEC
and not TIF_SVE_FULL_REGS.  This means that until we return to
userspace we only need to store the FPSIMD registers, and if (as should
be the common case) the hardware still holds the task's state and does
not need it to be reloaded from the task struct we can initialize the
SVE state entirely in registers.  In the event that we do need to
reload the registers from the task struct, only the FPSIMD subset needs
to be loaded from memory.

If the FPSIMD state is already loaded in the hardware we need to set
the vector length explicitly: the vector length is otherwise only
programmed when loading state from memory, and the expectation is that
it is set whenever TIF_SVE_EXEC is set.  We also need to rebind the
task to the CPU so that the newly allocated SVE state is used when the
task is saved.

This is based on earlier work by Julien Grall implementing a similar
idea.
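As an illustration of the flow described above (not part of the patch):
a minimal C sketch of what the return-to-userspace conversion is
expected to do when TIF_SVE_EXEC is set but TIF_SVE_FULL_REGS is not.
sve_flush_live_regs() and sketch_complete_sve_regs() are hypothetical
names; the other helpers, fields and flags are those referenced in this
series, and kernel context is assumed.

/*
 * Illustrative sketch only: roughly the ret_to_user side conversion
 * described in the commit message.  sve_flush_live_regs() is a
 * hypothetical stand-in for zeroing the upper SVE bits in registers.
 */
static void sketch_complete_sve_regs(struct task_struct *tsk)
{
	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
		/*
		 * Common case: the hardware still holds this task's
		 * FPSIMD state, so the remaining SVE bits can be
		 * zeroed entirely in registers, with no trip through
		 * memory.
		 */
		sve_flush_live_regs();
	} else {
		/*
		 * Otherwise only the FPSIMD subset is reloaded;
		 * sve_load_from_fpsimd_state() takes the vector quanta
		 * so it can size the SVE registers as it loads.
		 */
		sve_load_from_fpsimd_state(&tsk->thread.uw.fpsimd_state,
					   sve_vq_from_vl(tsk->thread.sve_vl) - 1);
	}

	/* The full SVE register state is now valid. */
	set_thread_flag(TIF_SVE_FULL_REGS);
}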
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h  |  2 ++
 arch/arm64/kernel/entry-fpsimd.S |  5 +++++
 arch/arm64/kernel/fpsimd.c       | 35 +++++++++++++++++++++-----------
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index bec5f14b622a..e60aa4ebb351 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -74,6 +74,8 @@ extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
				       unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
 
+extern void sve_set_vq(unsigned long vq_minus_1);
+
 struct arm64_cpu_capabilities;
 extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
 
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 2ca395c25448..3ecec60d3295 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -48,6 +48,11 @@ SYM_FUNC_START(sve_get_vl)
 	ret
 SYM_FUNC_END(sve_get_vl)
 
+SYM_FUNC_START(sve_set_vq)
+	sve_load_vq x0, x1, x2
+	ret
+SYM_FUNC_END(sve_set_vq)
+
 /*
  * Load SVE state from FPSIMD state.
  *
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 58c749ef04c4..05caf207e2ce 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -994,10 +994,10 @@ void fpsimd_release_task(struct task_struct *dead_task)
 /*
  * Trapped SVE access
  *
- * Storage is allocated for the full SVE state, the current FPSIMD
- * register contents are migrated across, and TIF_SVE_EXEC is set so that
- * the SVE access trap will be disabled the next time this task
- * reaches ret_to_user.
+ * Storage is allocated for the full SVE state so that the code
+ * running subsequently has somewhere to save the SVE registers to. We
+ * then rely on ret_to_user to actually convert the FPSIMD registers
+ * to SVE state by flushing as required.
  *
  * TIF_SVE_EXEC should be clear on entry: otherwise,
  * fpsimd_restore_current_state() would have disabled the SVE access
@@ -1016,15 +1016,26 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
 
 	get_cpu_fpsimd_context();
 
-	fpsimd_save();
-
-	/* Force ret_to_user to reload the registers: */
-	fpsimd_flush_task_state(current);
-
-	fpsimd_to_sve(current);
+	/*
+	 * We shouldn't trap if we can execute SVE instructions and
+	 * there should be no SVE state if that is the case.
+	 */
 	if (test_and_set_thread_flag(TIF_SVE_EXEC))
-		WARN_ON(1); /* SVE access shouldn't have trapped */
-	set_thread_flag(TIF_SVE_FULL_REGS);
+		WARN_ON(1);
+	if (test_and_clear_thread_flag(TIF_SVE_FULL_REGS))
+		WARN_ON(1);
+
+	/*
+	 * When the FPSIMD state is loaded:
+	 * - The return path (see fpsimd_restore_current_state) requires
+	 *   the vector length to be loaded beforehand.
+	 * - We need to rebind the task to the CPU so the newly allocated
+	 *   SVE state is used when the task is saved.
+	 */
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
+		fpsimd_bind_task_to_cpu();
+	}
 
 	put_cpu_fpsimd_context();
 }
-- 
2.20.1