Date: Wed, 10 Feb 2021 10:56:55 +0000
From: Dave Martin
To: Mark Brown
Cc: Julien Grall, Julien Grall, Catalin Marinas, Zhang Lei, Will Deacon,
    linux-arm-kernel@lists.infradead.org, Daniel Kiss
Subject: Re: [PATCH v7 1/2] arm64/sve: Split TIF_SVE into separate execute and register state flags
Message-ID: <20210210105650.GI21837@arm.com>
References: <20210201122901.11331-1-broonie@kernel.org> <20210201122901.11331-2-broonie@kernel.org>
In-Reply-To: <20210201122901.11331-2-broonie@kernel.org>

On Mon, Feb 01, 2021 at 12:29:00PM
+0000, Mark Brown wrote:
> Currently we have a single flag TIF_SVE which says that a task is
> allowed to execute SVE instructions without trapping and also that full
> SVE register state is stored for the task.  This results in us doing
> extra work storing and restoring the full SVE register state even in
> those cases where the ABI is that only the first 128 bits of the Z0-Z31
> registers which are shared with the FPSIMD V0-V31 are valid.
>
> In order to allow us to avoid these overheads split TIF_SVE up so that
> we have two separate flags, TIF_SVE_EXEC which allows execution of SVE
> instructions without trapping and TIF_SVE_FULL_REGS which indicates that
> the full SVE register state is stored.  If both are set the behaviour is
> as currently, if TIF_SVE_EXEC is set without TIF_SVE_FULL_REGS then we
> save and restore only the FPSIMD registers until we return to userspace
> with TIF_SVE_EXEC enabled at which point we convert the FPSIMD registers
> to SVE.  It is not meaningful to have TIF_SVE_FULL_REGS set without
> TIF_SVE_EXEC.
>
> This patch is intended only to split the flags, it does not take
> advantage of the ability to set the flags independently and the new
> state with TIF_SVE_EXEC only should not be observed.
>
> This is based on earlier work by Julien Grall implementing a slightly
> different approach.
>
> Signed-off-by: Mark Brown
> ---

[...]

> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c

[...]

> @@ -279,18 +327,37 @@ static void sve_free(struct task_struct *task)
>   * This function should be called only when the FPSIMD/SVE state in
>   * thread_struct is known to be up to date, when preparing to enter
>   * userspace.
> + *
> + * When TIF_SVE_EXEC is set but TIF_SVE_FULL_REGS is not set the SVE
> + * state will be restored from the FPSIMD state.
>   */
>  static void task_fpsimd_load(void)
>  {
> +	unsigned int vl;
> +
>  	WARN_ON(!system_supports_fpsimd());
>  	WARN_ON(!have_cpu_fpsimd_context());
>
> -	if (system_supports_sve() && test_thread_flag(TIF_SVE))
> -		sve_load_state(sve_pffr(&current->thread),
> -			       &current->thread.uw.fpsimd_state.fpsr,
> -			       sve_vq_from_vl(current->thread.sve_vl) - 1);
> -	else
> -		fpsimd_load_state(&current->thread.uw.fpsimd_state);
> +	if (test_thread_flag(TIF_SVE_EXEC)) {
> +		vl = sve_vq_from_vl(current->thread.sve_vl) - 1;

One more nit: because of the confusion that can arise from "vl" being a
somewhat overloaded term in the architecture, I was trying to avoid
using the name "vl" for anything that isn't the vector length in bytes.

Can this instead be renamed to vq_minus_1, to match the function
argument it's passed as?

(You could save a couple of lines by moving the declaration here and
combining it with this assignment too.)

[...]

Cheers
---Dave
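
For illustration only, here is a minimal userspace sketch of the naming
suggested above. It is not the patch's code: SVE_VQ_BYTES and
sve_vq_from_vl() are local stand-ins that mirror the kernel definitions
(one "vector quadword" is 16 bytes), and sve_vq_minus_1_from_vl() is a
hypothetical helper invented for this example. The point is only that
"vl" stays the vector length in bytes, while the derived value carries a
name that says what it is, declared and assigned in one statement.

```c
#include <assert.h>

/*
 * Stand-ins for the kernel definitions so the sketch builds in
 * userspace.  Assumption: one vector quadword (VQ) is 16 bytes.
 */
#define SVE_VQ_BYTES 16u

/* Convert a vector length in bytes to a count of 128-bit quadwords. */
static unsigned int sve_vq_from_vl(unsigned int vl)
{
	return vl / SVE_VQ_BYTES;
}

/*
 * Hypothetical helper (not part of the patch): derive the "VQ minus
 * one" value from a vector length.  Here "vl" means bytes and nothing
 * else; the derived quantity gets its own descriptive name.
 */
static unsigned int sve_vq_minus_1_from_vl(unsigned int vl)
{
	/* A valid SVE vector length is a non-zero multiple of 16 bytes. */
	assert(vl >= SVE_VQ_BYTES && vl % SVE_VQ_BYTES == 0);

	/* Declaration combined with the assignment, as suggested. */
	unsigned int vq_minus_1 = sve_vq_from_vl(vl) - 1;

	return vq_minus_1;
}
```

With these stand-ins, a 32-byte (256-bit) vector length is 2 quadwords,
so the helper yields 1.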