From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-10.2 required=3.0 tests=BAYES_00,
    HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,
    SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no
    version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 5AC71C433E0
    for ; Wed, 10 Feb 2021 07:24:19 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 0D0EB64E53
    for ; Wed, 10 Feb 2021 07:24:18 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S232386AbhBJHX6 (ORCPT ); Wed, 10 Feb 2021 02:23:58 -0500
Received: from helcar.hmeau.com ([216.24.177.18]:50248 "EHLO fornost.hmeau.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S232421AbhBJHX4 (ORCPT ); Wed, 10 Feb 2021 02:23:56 -0500
Received: from gwarestrin.arnor.me.apana.org.au ([192.168.103.7])
    by fornost.hmeau.com with smtp (Exim 4.92 #5 (Debian))
    id 1l9jq7-0001HR-KN; Wed, 10 Feb 2021 18:23:08 +1100
Received: by gwarestrin.arnor.me.apana.org.au (sSMTP sendmail emulation);
    Wed, 10 Feb 2021 18:23:07 +1100
Date: Wed, 10 Feb 2021 18:23:07 +1100
From: Herbert Xu
To: Ard Biesheuvel
Cc: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    will@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com,
    Dave Martin, Eric Biggers
Subject: Re: [PATCH v2 0/9] arm64: rework NEON yielding to avoid scheduling
    from asm code
Message-ID: <20210210072307.GA4617@gondor.apana.org.au>
References: <20210203113626.220151-1-ardb@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20210203113626.220151-1-ardb@kernel.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

On Wed, Feb 03, 2021 at 12:36:17PM +0100, Ard Biesheuvel wrote:
> Given how kernel mode NEON code disables preemption (to ensure that the
> FP/SIMD register state is protected without having to context switch it),
> we need to take care not to let those algorithms operate on unbounded
> input data, or we may end up with excessive scheduling blackouts on
> CONFIG_PREEMPT kernels.
>
> This is currently handled by the cond_yield_neon macros, which check the
> preempt count and the TIF_NEED_RESCHED flag from assembler code, and call
> into kernel_neon_end()+kernel_neon_begin(), triggering a reschedule.
> This works as expected, but is a bit messy, given how much of the state
> preserve/restore code in the algorithm needs to be duplicated, as well as
> causing the need to manage the stack frame explicitly. All of this is better
> handled by the compiler, especially now that we have enabled features such
> as the shadow call stack and BTI, and are working to improve call stack
> validation.
>
> In some cases, yielding is not necessary at all: algorithms that implement
> skciphers and use the skcipher walk API will be invoked at page granularity,
> which is granular enough for our purpose.
>
> In other cases, it is better to simply return early from the assembler
> routine if a reschedule is pending, and let the C code handle this, by
> retrying the call until it completes. This removes any voluntary schedule()
> calls from the call stack, making the code much easier to reason about in
> the context of stack validation, rcu_tasks synchronization, etc.
>
> Practical note: assuming there are no objections to these changes, it may
> be the most convenient to take patch #1 into the arm64 tree for v5.12,
> and postpone the rest for merging via the crypto tree. (Note that this
> series was created against the cryptodev tree, and so the arm64 maintainers
> are also welcome to take the whole set if it applies cleanly to the arm64
> tree)
>
> Will: if you stick #1 on a separate branch, please base it on v5.11-rc1
>
> Changes since v1:
> - use sub+cbz instead of cmp+b.eq to avoid clobbering the flags in cond_yield
>   (patch #1)
>
> Cc: Dave Martin
> Cc: Eric Biggers
>
> Ard Biesheuvel (9):
>   arm64: assembler: add cond_yield macro
>   crypto: arm64/sha1-ce - simplify NEON yield
>   crypto: arm64/sha2-ce - simplify NEON yield
>   crypto: arm64/sha3-ce - simplify NEON yield
>   crypto: arm64/sha512-ce - simplify NEON yield
>   crypto: arm64/aes-neonbs - remove NEON yield calls
>   crypto: arm64/aes-ce-mac - simplify NEON yield
>   crypto: arm64/crc-t10dif - move NEON yield to C code
>   arm64: assembler: remove conditional NEON yield macros
>
>  arch/arm64/crypto/aes-glue.c          | 21 +++--
>  arch/arm64/crypto/aes-modes.S         | 52 +++++-------
>  arch/arm64/crypto/aes-neonbs-core.S   |  8 +-
>  arch/arm64/crypto/crct10dif-ce-core.S | 43 +++-------
>  arch/arm64/crypto/crct10dif-ce-glue.c | 30 ++++++--
>  arch/arm64/crypto/sha1-ce-core.S      | 47 ++++--------
>  arch/arm64/crypto/sha1-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha2-ce-core.S      | 38 ++++-----
>  arch/arm64/crypto/sha2-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha3-ce-core.S      | 81 ++++++++------------
>  arch/arm64/crypto/sha3-ce-glue.c      | 14 ++--
>  arch/arm64/crypto/sha512-ce-core.S    | 29 ++-----
>  arch/arm64/crypto/sha512-ce-glue.c    | 53 +++++++------
>  arch/arm64/include/asm/assembler.h    | 78 +++---------------
>  14 files changed, 209 insertions(+), 329 deletions(-)

Patches 2-8 applied.  Thanks.
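
The "return early from the assembler routine and let the C code retry"
approach described in the quoted cover letter can be sketched in plain C
roughly as follows. This is a userspace illustration under stated
assumptions, not code from the series: simd_begin(), simd_end(),
resched_pending(), transform_blocks() and transform() are hypothetical
stand-ins. In the kernel the bracketing calls would be kernel_neon_begin()
and kernel_neon_end(), and the inner routine would be the assembler code;
the toy byte-summing "transform" exists only to make the control flow
visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64

/* Hypothetical stand-ins for kernel_neon_begin()/kernel_neon_end(). */
static int simd_sections;	/* number of begin/end pairs entered */
static void simd_begin(void) { simd_sections++; }
static void simd_end(void) { }

/*
 * Hypothetical stand-in for the "reschedule pending" check. Here we
 * simply pretend one becomes pending after every yield_every blocks.
 */
static int yield_every = 3;
static bool resched_pending(int blocks_done)
{
	return yield_every > 0 && blocks_done % yield_every == 0;
}

/*
 * Stand-in for the assembler routine: consumes whole blocks and returns
 * the number of blocks it did NOT process. It always handles at least
 * one block before checking whether it should return early.
 */
static int transform_blocks(uint32_t *state, const uint8_t *src, int blocks)
{
	int done = 0;

	while (blocks > 0) {
		for (size_t i = 0; i < BLOCK_SIZE; i++)
			*state += src[i];	/* toy "hash" update */
		src += BLOCK_SIZE;
		blocks--;
		done++;
		if (blocks > 0 && resched_pending(done))
			break;			/* return early, caller retries */
	}
	return blocks;
}

/*
 * C-side glue: retry until everything is consumed. Each iteration leaves
 * the SIMD section, which is where preemption could take place.
 */
static void transform(uint32_t *state, const uint8_t *src, int blocks)
{
	while (blocks > 0) {
		int rem;

		simd_begin();
		rem = transform_blocks(state, src, blocks);
		simd_end();

		assert(rem >= 0 && rem < blocks);	/* forward progress */
		src += (size_t)(blocks - rem) * BLOCK_SIZE;
		blocks = rem;
	}
}
```

With yield_every set to 3, a 7-block input is handled in three separate
SIMD sections (3 + 3 + 1 blocks), so the caller never stays inside one
section for more than three blocks, and no scheduling call appears inside
the block-processing routine itself.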
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt