linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Lutomirski <luto@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Borislav Petkov <bp@suse.de>
Subject: Re: [PATCH RFC 2/5] x86,fpu: delay FPU register loading until switch to userspace
Date: Sat, 01 Oct 2016 20:08:20 -0400	[thread overview]
Message-ID: <1475366900.21644.8.camel@redhat.com> (raw)
In-Reply-To: <CALCETrWnhA4Gd5AxDEJz7d-VQNDN7O8ayyna7fz3=HNtNnARxQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4087 bytes --]

On Sat, 2016-10-01 at 16:44 -0700, Andy Lutomirski wrote:
> On Sat, Oct 1, 2016 at 1:31 PM,  <riel@redhat.com> wrote:
> > 
> >  #define CREATE_TRACE_POINTS
> >  #include <trace/events/syscalls.h>
> > @@ -197,6 +198,9 @@ __visible inline void
> > prepare_exit_to_usermode(struct pt_regs *regs)
> >         if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
> >                 exit_to_usermode_loop(regs, cached_flags);
> > 
> > +       if (unlikely(test_and_clear_thread_flag(TIF_LOAD_FPU)))
> > +               switch_fpu_return();
> > +
> 
> How about:
> 
> if (unlikely(...)) {
>   exit_to_usermode_loop(regs, cached_flags);
>   cached_flags = READ_ONCE(ti->flags);
> }
> 
> if (ti->flags & _TIF_LOAD_FPU) {
>   clear_thread_flag(TIF_LOAD_FPU);
>   switch_fpu_return();
> }

It looks like test_thread_flag should be fine for avoiding
atomics, too?

  if (unlikely(test_thread_flag(TIF_LOAD_FPU))) {
      clear_thread_flag(TIF_LOAD_FPU);
      switch_fpu_return();
  }


> > +static inline void switch_fpu_finish(void)
> >  {
> > +       set_thread_flag(TIF_LOAD_FPU);
> >  }
> 
> I can imagine this causing problems with kernel code that accesses
> current's FPU state, e.g. get_xsave_field_ptr().  I wonder if it
> would
> make sense to make your changes deeper into the FPU core innards so
> that, for example, we'd have explicit functions that cause the

Whereabouts do you have in mind?

I like your general idea, but am not sure quite where
to put it.

> in-memory state for current to be up-to-date and readable, to cause
> the in-memory state to be up-to-date and writable (which is the same
> thing + TIF_FPU_LOAD + whatever other bookkeeping), and causing the
> in-CPU state to be up-to-date (possibly readable and writable).
> TIF_LOAD_FPU would trigger the latter.
> 
> I've often found it confusing that fpu__save by itself has somewhat
> ill-defined effects.

Fully agreed on both. I guess that means we want a few
different functions:

1) initialize FPU state, both in memory and in registers

2) cause FPU state in registers to be updated from memory,
   if necessary

3) cause FPU state in memory to be updated from registers,
   if necessary

The latter functions could use fpu->fpregs_active to see
whether any action is required, and be called from various
places in the code.

The signal code in particular is doing some strange things
that should probably be hidden away in FPU specific functions.

> 
> > 
> > +/*
> > + * Set up the userspace FPU context before returning to userspace.
> > + */
> > +void switch_fpu_return(void)
> > +{
> > +       struct fpu *fpu = &current->thread.fpu;
> > +       bool preload;
> > +       /*
> > +        * If the task has used the math, pre-load the FPU on xsave
> > processors
> > +        * or if the past 5 consecutive context-switches used math.
> > +        */
> > +       preload = static_cpu_has(X86_FEATURE_FPU) &&
> > +                 fpu->fpstate_active &&
> > +                 (use_eager_fpu() || fpu->counter > 5);
> > +
> > +       if (preload) {
> > +               prefetch(&fpu->state);
> > +               fpu->counter++;
> > +               __fpregs_activate(fpu);
> > +               trace_x86_fpu_regs_activated(fpu);
> > +
> > +               /* Don't change CR0.TS if we just switch! */
> > +               if (!__this_cpu_read(fpu_active)) {
> > +                       __fpregs_activate_hw();
> > +                       __this_cpu_write(fpu_active, true);
> > +               }
> 
> We should just finish getting rid of all TS uses.

Agreed. I will rebase on top of your FPU changes, that
will make things easier for everybody.

-- 
All Rights Reversed.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  reply	other threads:[~2016-10-02  0:08 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-01 20:31 [PATCH RFC 0/5] x86,fpu: make FPU context switching much lazier riel
2016-10-01 20:31 ` [PATCH RFC 1/5] x86,fpu: split prev/next task fpu state handling riel
2016-10-01 23:26   ` Andy Lutomirski
2016-10-02  0:02     ` Rik van Riel
2016-10-01 20:31 ` [PATCH RFC 2/5] x86,fpu: delay FPU register loading until switch to userspace riel
2016-10-01 23:44   ` Andy Lutomirski
2016-10-02  0:08     ` Rik van Riel [this message]
2016-10-03 20:54       ` Andy Lutomirski
2016-10-03 21:21         ` Rik van Riel
2016-10-03 21:36           ` Andy Lutomirski
2016-10-04  1:29             ` Rik van Riel
2016-10-04  2:09               ` Andy Lutomirski
2016-10-04  2:47                 ` Rik van Riel
2016-10-04  3:02                   ` Andy Lutomirski
2016-10-04  6:35                 ` Ingo Molnar
2016-10-04 12:48                   ` Rik van Riel
2016-10-04  2:11             ` Rik van Riel
2016-10-04  3:02               ` Andy Lutomirski
2016-10-02  0:42     ` Rik van Riel
2016-10-03 16:23       ` Dave Hansen
2016-10-01 20:31 ` [PATCH RFC 3/5] x86,fpu: add kernel fpu argument to __kernel_fpu_begin riel
2016-10-01 20:31 ` [PATCH RFC 4/5] x86,fpu: lazily skip FPU restore when still loaded riel
2016-10-03 20:04   ` Dave Hansen
2016-10-03 20:22     ` Rik van Riel
2016-10-03 20:49       ` Dave Hansen
2016-10-03 21:02         ` Rik van Riel
2016-10-01 20:31 ` [PATCH RFC 5/5] x86,fpu: kinda sorta fix up signal path riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1475366900.21644.8.camel@redhat.com \
    --to=riel@redhat.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).