From: Mark Brown
To: Catalin Marinas, Will Deacon
Cc: Julien Grall, Zhang Lei, Dave Martin, Daniel Kiss, Julien Grall,
    linux-arm-kernel@lists.infradead.org, Mark Brown
Subject: [PATCH v7 0/3] arm64/sve: Improve performance when handling SVE access traps
Date: Wed, 3 Mar 2021 20:11:14 +0000
Message-Id: <20210303201117.24777-1-broonie@kernel.org>
X-Mailer: git-send-email 2.20.1

This patch series aims to improve the performance of handling SVE access
traps. Earlier versions were originally written by Julien Grall, but based
on discussions on previous versions the patches have been
substantially reworked to use a different approach. The patches are now
different enough that I set myself as the author; hopefully that's OK for
Julien.

Per the syscall ABI, SVE registers will be unknown after a syscall. In
practice, the kernel will disable SVE and the registers will be zeroed
(except the first 128 bits of each vector) on the next SVE instruction.
Currently we do this by saving the FPSIMD state to memory, converting to
the matching SVE state and then reloading the registers on return to
userspace. This requires a lot of memory accesses that we shouldn't need.
Improve this by reworking the SVE state tracking so that we track whether
we should trap on executing SVE instructions separately from whether we
need to save the full register state. This allows us to avoid tracking the
full SVE state until we need to return to userspace, and to convert
directly in registers in the common case where the FPSIMD state is still
in registers at that point, reducing overhead in these cases.

As with current mainline we disable SVE on every syscall. This may not be
ideal for applications that mix SVE and syscall usage. Strategies such as
SH's fpu_counter may perform better, but we need to assess the performance
on a wider range of systems than are currently available before
implementing anything. This rework will make that easier.

It is also possible to optimize the case when the SVE vector length is
128-bit (i.e. the same size as the FPSIMD vectors). This could be explored
in the future; it becomes a lot easier to do with this implementation.

I need to confirm whether this still needs an update in KVM to handle
TIF_SVE_FPSIMD_REGS properly. I'll do that as part of redoing KVM testing,
but that'll take a little while and I felt it was important to get this
out for review now.

v8:
 - Replace TIF_SVE_FULL_REGS with TIF_SVE_FPSIMD_REGS, inverting the sense
   of the flag.
   This is more in line with a convention mentioned by Dave and fixes some
   issues that I turned up in testing after doing some of the other
   updates.
 - Clarify that we only need to do anything with TIF_SVE_FPSIMD_REGS on
   entry to the kernel if TIF_SVE_EXEC is set and that the flag is always
   set on exit to userspace if TIF_SVE_EXEC is set.
 - Use a local pointer for fpsimd_state in task_fpsimd_load().
 - Restructure task_fpsimd_load() for readability.
 - Explicitly ensure that TIF_SVE_EXEC is set in sve_set_vector_length(),
   fpsimd_signal_preserve_current_state(), sve_init_header_from_task().
 - Drop several more hopefully redundant system_supports_sve() checks,
   splitting that out into a separate patch.
 - More use of vq_minus_1.
v7:
 - A few minor cosmetic updates and one bugfix for
   fpsimd_update_current_state().
v6:
 - Substantially rework the patch so that TIF_SVE is now replaced by
   two flags TIF_SVE_EXEC and TIF_SVE_FULL_REGS.
 - Return to disabling SVE after every syscall as for current mainline
   rather than leaving it enabled unless reset via ptrace.
v5:
 - Rebase onto v5.10-rc2.
 - Explicitly support the case where TIF_SVE and TIF_SVE_NEEDS_FLUSH are
   set simultaneously, though this is not currently expected to happen.
 - Extensively revised the documentation for TIF_SVE and
   TIF_SVE_NEEDS_FLUSH to hopefully make things more clear together with
   the above. I hope this addresses the comments on the prior version but
   it really needs fresh eyes to tell if that's actually the case.
 - Make comments in ptrace.c more precise.
 - Remove some redundant checks for system_has_sve().
v4:
 - Rebase onto v5.9-rc2.
 - Address review comments from Dave Martin, mostly documentation but
   also some refactorings to ensure we don't check capabilities multiple
   times and the addition of some WARN_ONs to make sure assumptions we
   are making about what TIF_ flags can be set when are true.
v3:
 - Rebased to current kernels.
 - Addressed review comments from v2, mostly around tweaks in the
   documentation.
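
For readers who want a concrete picture of the register semantics described
in the cover text (the first 128 bits of each vector are kept and the rest
is zeroed on the next SVE instruction), the following is a minimal
user-space sketch in C. It is only an illustration under stated
assumptions, not kernel code: struct model_fpsimd, struct model_sve and
model_fpsimd_to_sve() are hypothetical names made up for this example, and
the series itself aims to do the conversion directly in registers rather
than on in-memory arrays as modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_VREGS     32   /* V0-V31 / Z0-Z31 */
#define FPSIMD_BYTES  16   /* one 128-bit FPSIMD V register */
#define MAX_VL_BYTES  256  /* largest architectural SVE vector length (2048 bits) */

/* Hypothetical model types for illustration; these are not kernel structures. */
struct model_fpsimd {
	uint8_t v[NUM_VREGS][FPSIMD_BYTES];
};

struct model_sve {
	uint8_t z[NUM_VREGS][MAX_VL_BYTES];
};

/*
 * Model of the FPSIMD -> SVE conversion: each Z register takes its low
 * 128 bits from the matching V register and has the remaining bits, up to
 * the current vector length, cleared to zero.
 */
static void model_fpsimd_to_sve(struct model_sve *dst,
				const struct model_fpsimd *src,
				size_t vl_bytes)
{
	assert(vl_bytes >= FPSIMD_BYTES && vl_bytes <= MAX_VL_BYTES);

	for (int i = 0; i < NUM_VREGS; i++) {
		/* Low 128 bits are shared with the FPSIMD view. */
		memcpy(dst->z[i], src->v[i], FPSIMD_BYTES);
		/* Everything above 128 bits is zeroed. */
		memset(dst->z[i] + FPSIMD_BYTES, 0, vl_bytes - FPSIMD_BYTES);
	}
}
```

With a 128-bit vector length (vl_bytes == 16) the memset() has nothing to
clear, which is one way to see why that case could be optimized separately.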

Mark Brown (3):
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64/sve: Split TIF_SVE into separate execute and register state flags
  arm64/sve: Rework SVE trap access to minimise memory access

 arch/arm64/include/asm/fpsimd.h      |   2 +
 arch/arm64/include/asm/thread_info.h |   7 +-
 arch/arm64/kernel/entry-fpsimd.S     |   5 +
 arch/arm64/kernel/fpsimd.c           | 222 +++++++++++++++++++--------
 arch/arm64/kernel/process.c          |   7 +-
 arch/arm64/kernel/ptrace.c           |  13 +-
 arch/arm64/kernel/signal.c           |  18 ++-
 arch/arm64/kernel/syscall.c          |   3 +-
 arch/arm64/kvm/fpsimd.c              |  10 +-
 9 files changed, 204 insertions(+), 83 deletions(-)


base-commit: fe07bfda2fb9cdef8a4d4008a409bb02f35f1bd8
-- 
2.20.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel