From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BE82C2D0E4 for ; Tue, 17 Nov 2020 23:04:43 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5682F206A5 for ; Tue, 17 Nov 2020 23:04:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="F0YSSUhl"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="cnBXMXjF" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5682F206A5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Type:Cc: List-Subscribe:List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id: In-Reply-To:MIME-Version:References:Message-ID:Subject:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=ptCRSW8Tj1hKFsh+O+Q/ghn+NNjszPzbG6Wh0tI2m94=; b=F0YSSUhliC/BFh+lI/9F1Shvg Q81JE47Ac44N8ygJ6HbjMGIcewB8STAOA2BlG2nhb0lqo5S2N11c7WBoXm/jWrmjc10ZGU3s4mMtz K2mm2llIPsoS0QaBJEawQ/C2y0P4Z8/RvqBcQLvluSbKDOUCRyu3TRFbVZIsi54VY46m/O8UoRgqa u8tixntWfQkAo51NXphY9Xgl+aHRVAbRhCXDDLFUR4vASimQdCSB+LTpEHhymWp3l8x+Jb9Chmijg mlwP1RdLGAibNQs56mNI7DSapwOZ20Bmi7n/PrgM9HCcaeBATPFPWXlzT/eEfrDA0Xkb2UajIVrwR GRjDSPHAQ==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kfA14-0006cZ-84; Tue, 17 Nov 2020 23:04:02 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kfA11-0006cF-29 for linux-arm-kernel@lists.infradead.org; Tue, 17 Nov 2020 23:04:00 +0000 Received: from localhost (fw-tnat.cambridge.arm.com [217.140.96.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4FD7A206A5; Tue, 17 Nov 2020 23:03:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1605654237; bh=TGeRDDAC5nLZc+/Cku6f9yXlbCmhhjNnOZEPEt58qmE=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=cnBXMXjFK7RuEa+kRuzppQymhRVhQlLC2a5D1MpHd3Wl2WjXy1zvyLyVzsTXCa0X9 sPMQzlhQFqOYZYi8PDJsjUFpUQ76DTy5HovN2T4y+/kJDY2v8qR/w84hHnHgb0r1m4 0nY9aqpig47bZiFf9rs0fuMoAaECLPsn2QVLzz+s= Date: Tue, 17 Nov 2020 23:03:38 +0000 From: Mark Brown To: Dave Martin Subject: Re: [PATCH v5 1/2] arm64/sve: Don't disable SVE on syscalls return Message-ID: <20201117230338.GI5142@sirena.org.uk> References: <20201106193553.22946-1-broonie@kernel.org> <20201106193553.22946-2-broonie@kernel.org> <20201113184855.GF3212@gaia> <20201113201328.GG4828@sirena.org.uk> <20201116175938.GA6882@arm.com> <20201116194551.GG4739@sirena.org.uk> <20201117181618.GC6882@arm.com> MIME-Version: 1.0 In-Reply-To: <20201117181618.GC6882@arm.com> X-Cookie: Pause for storage relocation. User-Agent: Mutt/1.10.1 (2018-07-13) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20201117_180359_557460_7196CFB3 X-CRM114-Status: GOOD ( 54.97 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Julien Grall , Catalin Marinas , Zhang Lei , Julien Grall , Will Deacon , linux-arm-kernel@lists.infradead.org, Daniel Kiss Content-Type: multipart/mixed; boundary="===============3337679540871739143==" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org --===============3337679540871739143== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="GBuTPvBEOL0MYPgd" Content-Disposition: inline --GBuTPvBEOL0MYPgd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 17, 2020 at 06:16:20PM +0000, Dave Martin wrote: > I think that thinking of TIF_SVE_NEEDS_FLUSH as a distinct toggle may > not help helping here -- it feels more like an override or a special > case. Well, we need a flag somewhere - it's a question of where we put it. > I also wonder whether increasing the amount of commenting is actually a > trap here: there's a risk of piling new confusion on old. The actual > problems might be that there is too much commenting already, not too > little... It's an interesting balance. Having worked with the code it's definitely overwhelmingly more commented than the rest of the architecture, I think because it *is* a bit fiddly and has even fewer people who've looked at it in detail than the rest of it. This can make it look much more complicated than it actually appears to be but on the other hand it may be that other bits of the code should be more like the FP code rather than the other way around. > Taking a step back, this is how I think it was all supposed to work, > prior to this series. This time, I'll try to describe the flags > separately rather than enumerating all the combinations (which I think > may have been a misstep): I tend to agree that thinking about the combinations when you don't have to makes it harder - in particular with TIF_FOREIGN_FPSTATE. > At this level of abstraction, these two concepts are entirely > independent: > !SVE SVE > +---------------+---------------+ > !FFP | Track: FPSIMD | Track: SVE | > | Where: regs | Where: regs | > +---------------+---------------+ > FFP | Track: FPSIMD | Track: SVE | > | Where: memory | Where: memory | > +---------------+---------------+ Yes. > I think where we get in a tangle is that TIF_SVE conflates mechanism > (whether the kernel tracks the full SVE state) with policy (whether > userspace cares about the full SVE state). At present, we can't > describe the situation where we track SVE state but userspace doesn't > care, so we either have to do nothing at all and just leave SVE on all > the time, or we have to disable SVE at the first > opportunity. I think this is the current conflation that is causing issues but I wouldn't state it quite like that, I don't know that *care* is quite the right word for what userspace is doing here. There's two pieces, there's the SVE state in the registers and there's the ability to execute SVE instructions without trapping. When userspace executes a SVE instruction we enable execution and also start tracking the register state which we currently track in the single TIF_SVE flag. When userspace does a syscall the extra register state becomes undefined (realistically we'll probably have to continue resetting it to zero as we currently do) and we currently don't the ability to track the ability to execute the instructions separately to discarding that state so we just disable both at once. This means that the syscall overhead is amplified for any task that mixes SVE and syscalls. > So, what we need here is a way (i.e., a new state) to indicate that > userspace doesn't care about the extra SVE state. Obviously this not > orthogonal to TIF_SVE: if SVE state is not tracked, then there is > no state _to_ care about. > As I understand it, this is what the TIF_SVE_NEEDS_FLUSH flag > describes. It separates policy (userspace's policy in this case -- > i.e., whether it needs the full SVE state) from mechanism (whether the That's approximately the roles, yes. The implementation is currently different in that both SVE flags are separately wrapping in the execution permission. > kernel keeps track of the full SVE state). Perhaps we can come up with > another name, such as TIF_SVE_MAY_DISCARD. Naming is definitely a thing here - _MAY_DISCARD definitely doesn't seem right. > Whatever we call this new intermediate state, we have a something like > this. >=20 > !SVE ??? SVE > +---------------+---------------+---------------+ > | Need: FPSIMD | Need: FPSIMD | Need: SVE | > !FFP | Track: FPSIMD | Track: SVE | Track: SVE | > | Where: regs | Where: regs | Where: regs | > +---------------+---------------+---------------+ > | Need: FPSIMD | Need: FPSIMD | Need: SVE | > FFP | Track: FPSIMD | Track: SVE | Track: SVE | > | Where: memory | Where: memory | Where: memory | > +---------------+---------------+---------------+ >=20 > (FFP =3D TIF_FOREIGN_FPSTATE, SVE =3D TIF_SVE) I think I get what you mean here but I'm finding the "need" a bit confusing here and in the following discussion. I think instead of need and track it might be clearer to say valid registers and execute - it's which registers are have defined data in them and if userspace can execute SVE instructions without trapping. > However, the middle column is a bit special: we can't run in userspace > in this state, since once userspace touches the regs we no longer know > whether it cares about the data in them. Also, it's a bit pointless > representing this state in memory, since the main point of having it is > to _avoid_ dumping to memory. And we do need to do a bit of work to > sanitise the state -- unless we always do it on the syscall entry path. While we can't be in userspace in the middle column I'm not sure it's entirely true to say that we can't usefully represent things here in memory if we think about the execute part of things as well. If we schedule another task we can still track if we want to have SVE instruction execution on return to userspace even if the only data we're storing is the FPSIMD subset, avoiding a second SVE trap on the first SVE instruction after a syscall while also storing less data in memory. As noted in the cover letter for the series we're probably going to find we want to reset the execute permission intermittently on syscall workloads in order to ensure that tasks that execute SVE infrequently don't always have SVE enabled but I think we want to sort out what we're doing with the basic case first. > Effort needed > to load regs > !SVE ??? SVE > +---------------+ +---------------+ > | Need: FPSIMD | | Need: SVE | none > !FFP | Track: FPSIMD +---------------+ Track: SVE | ^ > | Where: regs | N: FPSIMD | Where: regs | : > +---------------+ T: undecided +---------------+ some > | Need: FPSIMD | W:regs+cleanup| Need: SVE | : > FFP | Track: FPSIMD +---------------+ Track: SVE | v > | Where: memory | | Where: memory | full reload > +---------------+ +---------------+ =2E.. > And finally, on sched-out (or any other fpsimd_save() event): > +---------------+ +---------------+ > | | | | > | : +---------------+ : | > | : | | : | > +--:------------+ +------------:--+ > | : <=3D=3D=3D=3D=3D=3D=3D [*] =3D=3D=3D=3D=3D=3D=3D> = : | > | V +---------------+ V | > | | | | > +---------------+ +---------------+ > ...at least in threory. I agree up to this final schedule case since here we'd either be taking a decision to force userspace to trap on the next SVE instruction or to store and reload full SVE state in order to avoid that neither of which seems ideal. With the current patch if TIF_SVE_NEEDS_FLUSH is set we only save the FPSIMD state and then restore it and switch to TIF_SVE when scheduling the task again - this is masked a bit in the patch since it does not update the existing code in fpsimd_save() as TIF_SVE is not set when setting TIF_SVE_NEEDS_FLUSH so there's no change in that path. We could potentially use this as a heuristic to decide that we want to drop SVE on a syscall that schedules rather than after some number of syscalls (which we don't currently do but was mentioned as future work in the cover letter and has precedents on other architectures) but that doesn't feel great, the syscall counter feels more robust. > While TIF_FOREIGN_FPSTATE and TIF_SVE remain orthogonal to each other, > this new state is not orthogonal to either. I don't think that's a bug, > providing that we understand its role. Since it's a 5th state, we are > going to burn a flag on it, but this doesn't mean that it needs to (or > even should) have a meaning that's fully independent of the others. > Rather it may be useful precisely because it's not independent -- it > decribes a situation which is currently an overlap between the other > states. > I'm pretty sure this is more or less what the proposed patches do, > but I'll review again with this picture in mind. It differs in the handling of scheduling while in a syscall but otherwise yes, I think that's pretty much it. With that in mind and thinking on the subthread with Catalin if we restructure the SVE flags such that we have separate SVE_EXEC and SVE_REGS flags (bad but hopefully comprehensible names short enough for drawing a table) for the execute and storage permissions we end up with: !SVE/SVE_REGS SVE_EXEC SVE_REGS+SVE_EXEC +---------------+---------------+---------------+ | Valid: FPSIMD | Valid: FPSIMD | Valid: SVE | !FFP | Trap: Yes | Trap: No | Trap: No | | Where: regs | Where: regs | Where: regs | +---------------+---------------+---------------+ | Valid: FPSIMD | Valid: FPSIMD | Valid: SVE | FFP | Trap: Yes | Trap: No | Trap: No | | Where: memory | Where: memory | Where: memory | +---------------+---------------+---------------+ (the first column being either no SVE flags set or only SVE_REGS set.) All entries should go to one of the !FFP cases. Entry from SVE traps or syscalls with SVE_EXEC set should go to SVE_EXEC, other entries with SVE_EXEC already set should go to SVE_REGS+SVE_EXEC and all other entries should go to !SVE/SVE_REGS. Currently TIF_SVE maps onto SVE_REGS+SVE_EXEC and TIF_SVE_NO_FLUSH maps onto SVE_EXEC. =20 > Note: Making things symmetric > ----------------------------- > An alternative behaviour for syscalls would be: > TIF_SVE > !TIF_SVE ,-------------------------. > ,--------------------------. >=20 > +---------------+ +---------------+ > | | | | > | +---------------+ | > | =3D=3D=3D=3D=3D=3D> <=3D=3D=3D=3D=3D=3D = | > +---------------+ +---------------+ > | | | | > | +---------------+ | > | | | | > +---------------+ +---------------+ > This shouldn't change the underlying behaviour other than way certain > if() conditions would be expressed. This model may make it clearer > that the TIF_SVE versus !TIF_SVE decision doesn't occur until moving > out of the middle state. This gives a natural place to turn SVE on > speculatively for example, when a syscall occurs with !TIF_SVE. > (Whether this is useful is we could ever make an accurate enough guess > is another matter... but this still might make the code a bit easier > to conceptualise.) I worry that this might make it harder to follow in that you'd have to understand why we're considering enabling SVE for tasks that never tried to do anything with it. --GBuTPvBEOL0MYPgd Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl+0VskACgkQJNaLcl1U h9BC2wf/QxJuLouIrqShNCaRe0LGUXp1L5bpca4/U1fMXPHvprpDALkcezEK7KGn PzqmZIHDFA4nAiAdnwNWgwObm86/rZpHhn3vwN5ftaLPwsHhHuTUFLEQyh9X9OBJ wYhuioI39eCztPXwCU08rxzo8Zm2lY7hcGUjTqsMePpxF48z8Tf6cowzb2tvQRi5 QhB6XRKpI7RaFySmbbxq0O6twLgBxJDbWmLZjgbVdHZtBHwerd1AaHCvBwu1ictc DlSzzdsAQg4exK18x8YiXIqqRFILS/VdVYXzTf81nd62ufAX6/6K8/gQ4HhHeFTF 6JkAcayJJAKG87flxdSsLneKgfzYKQ== =EPDB -----END PGP SIGNATURE----- --GBuTPvBEOL0MYPgd-- --===============3337679540871739143== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel --===============3337679540871739143==--