linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: "Brown, Len" <len.brown@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	"Bae, Chang Seok" <chang.seok.bae@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@suse.de>,
	X86 ML <x86@kernel.org>, "Hansen, Dave" <dave.hansen@intel.com>,
	"Liu, Jing2" <jing2.liu@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use
Date: Tue, 13 Oct 2020 18:47:13 -0700	[thread overview]
Message-ID: <89AB5807-E295-4AB2-9568-9B6306E896F8@amacapital.net> (raw)
In-Reply-To: <BYAPR11MB346371298609756EFD2DDA1EE0040@BYAPR11MB3463.namprd11.prod.outlook.com>


> On Oct 13, 2020, at 3:31 PM, Brown, Len <len.brown@intel.com> wrote:
> 
> 
>> 
>> From: Andy Lutomirski <luto@kernel.org> 
> 
>>> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
>>> @@ -518,3 +518,40 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr)
> ..
>>> +bool xfirstuse_event_handler(struct fpu *fpu)
>>> +{
>>> +       bool handled = false;
>>> +       u64 event_mask;
>>> +
>>> +       /* Check whether the first-use detection is running. */
>>> +       if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled())
>>> +               return handled;
>>> +
> 
>> MSR_IA32_XFD_ERR needs to be wired up in the exception handler, not in
>> some helper called farther down the stack
> 
> xfirstuse_event_handler() is called directly from the IDTENTRY exc_device_not_available():
> 
>>> @@ -1028,6 +1028,9 @@ DEFINE_IDTENTRY(exc_device_not_available)
>>> {
>>>    unsigned long cr0 = read_cr0();
>>> 
>>> +    if (xfirstuse_event_handler(&current->thread.fpu))
>>> +        return;
> 
> Are you suggesting we should instead open code it inside that routine?

MSR_IA32_XFD_ERR is like CR2 and DR6 — it’s functionally a part of the exception. Let’s handle it as such.  (And, a bit like DR6, it’s a bit broken.)

> 
>> But this raises an interesting point -- what happens if allocation
>> fails?  I think that, from kernel code, we simply cannot support this
>> exception mechanism.  If kernel code wants to use AMX (and that would
>> be very strange indeed), it should call x86_i_am_crazy_amx_begin() and
>> handle errors, not rely on exceptions.  From user code, I assume we
>> send a signal if allocation fails.
> 
> The XFD feature allows us to transparently expand the kernel context switch buffer
> for a user task, when that task first touches this associated hardware.
> It allows applications to operate as if the kernel had allocated the backing AMX
> context switch buffer at initialization time.  However, since we expect only
> a sub-set of tasks to actually use AMX, we instead defer allocation until
> we actually see first use for that task, rather than allocating for all tasks.

I sure hope that not all tasks use it. Context-switching it will be absurdly expensive.

> 
> While we currently don't propose AMX use inside the kernel, it is conceivable
> that could be done in the future, just like AVX is used by the RAID code;
> and it would be done the same way -- kernel_fpu_begin()/kernel_fpu_end().
> Such future Kernel AMX use would _not_ arm XFD, and would _not_ provoke this fault.
> (note that we context switch the XFD-armed state per task)

How expensive is *that*?  Can you give approximate cycle counts for saving, restoring, arming and disarming?

This reminds me of TS.  Writing TS was more expensive than saving the whole FPU state.  And, for kernel code, we can’t just “not arm” the XFD — we would have to disarm it.



> 
> vmalloc() does not fail, and does not return an error, and so there is no concept
> of returning a signal.  If we got to the point where vmalloc() sleeps, then the system
> has bigger OOM issues, and the OOM killer would be on the prowl.
> 
> If we were concerned about using vmalloc for a couple of pages in the task structure,
> Then we could implement a routine to harvest unused buffers and free them

Kind of like we vmalloc a couple pages for kernel stacks, and we carefully cache them.  And we handle failure gracefully.

  parent reply	other threads:[~2020-10-14  9:23 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-01 20:38 [RFC PATCH 00/22] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 01/22] x86/fpu/xstate: Modify area init helper prototypes to access all the possible areas Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 02/22] x86/fpu/xstate: Modify xstate copy " Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 03/22] x86/fpu/xstate: Modify address finder " Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 04/22] x86/fpu/xstate: Modify save and restore helper " Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 05/22] x86/fpu/xstate: Introduce a new variable for dynamic user states Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 06/22] x86/fpu/xstate: Outline dynamic xstate area size in the task context Chang S. Bae
2020-10-01 20:38 ` [RFC PATCH 07/22] x86/fpu/xstate: Introduce helpers to manage an xstate area dynamically Chang S. Bae
2020-10-01 23:41   ` Andy Lutomirski
2020-10-13 22:00     ` Brown, Len
2020-10-01 20:38 ` [RFC PATCH 08/22] x86/fpu/xstate: Define the scope of the initial xstate data Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 09/22] x86/fpu/xstate: Introduce wrapper functions for organizing xstate area access Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 10/22] x86/fpu/xstate: Update xstate save function for supporting dynamic user xstate Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 11/22] x86/fpu/xstate: Update xstate area address finder " Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 12/22] x86/fpu/xstate: Update xstate context copy function for supporting dynamic area Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use Chang S. Bae
2020-10-01 23:39   ` Andy Lutomirski
2020-10-13 22:31     ` Brown, Len
2020-10-13 22:43       ` Dave Hansen
2020-10-14  1:11         ` Andy Lutomirski
2020-10-14  6:03           ` Dave Hansen
2020-10-14 16:10             ` Andy Lutomirski
2020-10-14 16:29               ` Dave Hansen
2020-11-03 21:32                 ` Bae, Chang Seok
2020-11-03 21:41                   ` Dave Hansen
2020-11-03 21:53                     ` Bae, Chang Seok
2020-10-14 10:41         ` Peter Zijlstra
2020-10-14  1:47       ` Andy Lutomirski [this message]
2020-10-01 20:39 ` [RFC PATCH 14/22] x86/fpu/xstate: Inherit dynamic user state when used in the parent Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 15/22] x86/fpu/xstate: Support ptracer-induced xstate area expansion Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 16/22] x86/fpu/xstate: Support dynamic user state in the signal handling path Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 17/22] x86/fpu/xstate: Extend the table for mapping xstate components with features Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 18/22] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 19/22] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 20/22] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 21/22] selftest/x86/amx: Include test cases for the AMX state management Chang S. Bae
2020-10-01 20:39 ` [RFC PATCH 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support Chang S. Bae
2020-10-02  2:09   ` Randy Dunlap
2020-10-13 23:00     ` Brown, Len
2020-10-02 17:15   ` Andy Lutomirski
2020-10-02 17:26     ` Dave Hansen
2020-10-13 23:38     ` Brown, Len
2020-10-02 17:25   ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=89AB5807-E295-4AB2-9568-9B6306E896F8@amacapital.net \
    --to=luto@amacapital.net \
    --cc=bp@suse.de \
    --cc=chang.seok.bae@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).