From: Len Brown <lenb@kernel.org>
To: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Borislav Petkov <bp@suse.de>, Andy Lutomirski <luto@kernel.org>,
    Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@kernel.org>,
    X86 ML <x86@kernel.org>, "Brown, Len" <len.brown@intel.com>,
    Dave Hansen <dave.hansen@intel.com>, "Liu, Jing2" <jing2.liu@intel.com>,
    "Ravi V. Shankar" <ravi.v.shankar@intel.com>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 15/28] x86/arch_prctl: Create ARCH_GET_XSTATE/ARCH_PUT_XSTATE
Date: Tue, 25 May 2021 20:38:16 -0400
Message-ID: <CAJvTdKmzN0VMyH8VU_fdzn2UZqmR=_aNrJW01a65BhyLm6YRPg@mail.gmail.com>
In-Reply-To: <CAJvTdKnrFSS0fvhNz5mb9v8epEVtphUesEUV0hhNErMBK5HNHQ@mail.gmail.com>

After today's discussion, I believe we are close to consensus on this plan:

1. Kernel sets XCR0.AMX=1 at boot, and leaves it set, always.

2. Kernel arms XFD for all tasks, by default.

3. New prctl() system call allows any task in a process to ask for AMX
   permission for all tasks in that process. Permission is granted for
   the lifetime of that process, and there is no interface for a process
   to "un-request" permission.

4. If a task touches AMX without permission, #NM/signal kills the process.

5. If a task touches AMX with permission, #NM handler will transparently
   allocate a context switch buffer, and disarm XFD for that task.
   (MSR_XFD is context switched per task.)

6. If the #NM handler cannot allocate the 8KB buffer, the task will
   receive a signal at the instruction that took the #NM fault, likely
   resulting in program exit.

7. In addition, a 2nd system call to request that buffers be
   pre-allocated is available. This is a per-task system call. This
   synchronous allocate system call will return an error code if it
   fails, which will also likely result in program exit.

8. NEW TODAY: Linux will exclude the AMX 8KB region from the XSTATE on
   the signal stack for tasks in processes that do not have AMX
   permission.

9. For tasks in processes that have requested and received AMX
   permission, Linux will XSAVE/XRSTOR directly to/from the signal
   stack, and the stack will always include the 8KB space for AMX.
   (We do have a manual optimization in place to skip writing zeros to
   the stack frame if AMX=INIT.)

10. Linux reserves the right to plumb the new permission syscall into
    the cgroup administrative interface in the future.

Comments:

Legacy software will not see signal stack growth on AMX hardware.

New AMX software will see AMX state on the signal stack.

If new AMX software uses an alternative signal stack, it should be
built using the signal.h ABI in glibc 2.34 or later, so that it can
calculate the appropriate size for the current hardware. Note that
non-AMX software that is newly built will get the same answer from the
ABI, which would also cover the case where it does use AMX.

Today it is possible for an application to calculate the uncompressed
XSTATE size from XCR0 and CPUID, allocate buffers of that size, and use
XSAVE and XRSTOR on those buffers in user-space. Applications can also
XRSTOR from (and XSAVE back to) the signal stack, if they choose. Now,
this capability can break for non-AMX programs, because their XSAVE
will be 8KB larger than the buffer that they XRSTOR. Andy L questions
whether such applications actually exist, and Thomas states that even
if they do, that is a much smaller problem than 8KB signal stack growth
would be for legacy applications.

It is unclear whether we have consensus on the need for a synchronous
allocation system call (#7 above). Observe that this system call does
not improve the likelihood of failure or the timing of failure. An
#NM-based allocation can be done at exactly the same spot by simply
touching a TMM register. The benefit of this system call is that it
returns an error code to the caller, versus the program being delivered
a SIGSEGV at the offending instruction pointer. Both will likely result
in the program exiting, and at the same point in execution.

A future mechanism to lazily harvest not-recently-used context switch
buffers has been discussed. E.g., the kernel under low memory could
re-arm XFD for all AMX tasks, and if their buffers are clean, free
them. If that mechanism is implemented, and we also implement the
synchronous allocation system call, that mechanism must respect the
guarantee made by that system call and not harvest
system-call-allocated buffers.

Len Brown, Intel Open Source Technology Center
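
On the comment above about alternative signal stacks: the following is a minimal sketch, not taken from the patch series, of how a program might size its alternate signal stack at run time rather than relying on a compile-time constant. It assumes that glibc 2.34 or later exposes the run-time size through sysconf(_SC_SIGSTKSZ). The helper names altstack_size() and install_altstack() and the LEGACY_SIGSTKSZ fallback value are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumption: fallback used only when the C library cannot report a
 * run-time size. 8192 is the historical compile-time SIGSTKSZ on x86. */
#define LEGACY_SIGSTKSZ 8192

/* Hypothetical helper: choose an alternate signal stack size at run time. */
static size_t altstack_size(void)
{
    long sz = -1;

#ifdef _SC_SIGSTKSZ
    /* glibc 2.34+: size suggested for the running system, which may be
     * larger than the old constant on hardware with big XSTATE. */
    sz = sysconf(_SC_SIGSTKSZ);
#endif
    if (sz <= 0)
        sz = LEGACY_SIGSTKSZ;
    return (size_t)sz;
}

/* Hypothetical helper: allocate and register the alternate stack.
 * Returns 0 on success, -1 on failure; on success the caller owns ss->ss_sp. */
static int install_altstack(stack_t *ss)
{
    ss->ss_size = altstack_size();
    ss->ss_flags = 0;
    ss->ss_sp = malloc(ss->ss_size);
    if (ss->ss_sp == NULL)
        return -1;

    if (sigaltstack(ss, NULL) != 0) {
        free(ss->ss_sp);
        ss->ss_sp = NULL;
        return -1;
    }
    return 0;
}
```

A caller would invoke install_altstack() once per thread before registering handlers with SA_ONSTACK. With an older C library the sketch falls back to the legacy constant, which is the case the comment above warns may be too small for a signal frame that carries AMX state.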