* [PATCH 000/208] big x86 FPU code rewrite
@ 2015-05-05 16:23 Ingo Molnar
  2015-05-05 16:23 ` [PATCH 001/208] x86/fpu: Rename unlazy_fpu() to fpu__save() Ingo Molnar
                   ` (79 more replies)
  0 siblings, 80 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

[Take #2 - sorry!]

Over the past 10 years the x86 FPU code has organically grown into
something of a spaghetti monster that few (if any) kernel
developers fully understand and whose code few people enjoy hacking on.

Many people suggested over the years that it needs a major cleanup,
and some time ago I went "what the heck" and started doing it step
by step to see where it leads - it cannot be that hard!

Three weeks and 200+ patches later I think I have to admit that I
seriously underestimated the magnitude of the project! ;-)

This work-in-progress series is large, but I think it makes the
code maintainable and hackable again. It's pretty complete, as
per the 9 high-level goals laid out further below. The individual
patches are all fine-grained, so they should be easy to review -
Boris Petkov has already reviewed most of them, so they are not
entirely raw.

The individual patches have been tested heavily for bisectability:
they were both built and booted on a relatively wide range of x86
hardware that I have access to. Nevertheless the changes are pretty
invasive, so I'd expect there to be test failures.

This is the only time I intend to post them to lkml in their entirety,
so as not to spam lkml too much.  (Future additions will be posted as
delta series.)

I'd like to ask interested people to test this tree, and to comment
on the patches. The changes can be found in the following Git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tmp.fpu

(The tree might be rebased, depending on feedback.)

Here are the main themes that motivated most of the changes:

1)

I collected all FPU code into arch/x86/kernel/fpu/*.c and split it
all up into the following, topically organized source code files:

  -rw-rw-r-- 1 mingo mingo  1423 May  5 16:36 arch/x86/kernel/fpu/bugs.c
  -rw-rw-r-- 1 mingo mingo 12206 May  5 16:36 arch/x86/kernel/fpu/core.c
  -rw-rw-r-- 1 mingo mingo  7342 May  5 16:36 arch/x86/kernel/fpu/init.c
  -rw-rw-r-- 1 mingo mingo 10909 May  5 16:36 arch/x86/kernel/fpu/measure.c
  -rw-rw-r-- 1 mingo mingo  9012 May  5 16:36 arch/x86/kernel/fpu/regset.c
  -rw-rw-r-- 1 mingo mingo 11188 May  5 16:36 arch/x86/kernel/fpu/signal.c
  -rw-rw-r-- 1 mingo mingo 10140 May  5 16:36 arch/x86/kernel/fpu/xstate.c

Similarly, I've collected and split up all FPU-related header files, and
organized them topically:

  -rw-rw-r-- 1 mingo mingo  1690 May  5 16:35 arch/x86/include/asm/fpu/api.h
  -rw-rw-r-- 1 mingo mingo 12937 May  5 16:36 arch/x86/include/asm/fpu/internal.h
  -rw-rw-r-- 1 mingo mingo   278 May  5 16:36 arch/x86/include/asm/fpu/measure.h
  -rw-rw-r-- 1 mingo mingo   596 May  5 16:35 arch/x86/include/asm/fpu/regset.h
  -rw-rw-r-- 1 mingo mingo  1013 May  5 16:35 arch/x86/include/asm/fpu/signal.h
  -rw-rw-r-- 1 mingo mingo  8137 May  5 16:36 arch/x86/include/asm/fpu/types.h
  -rw-rw-r-- 1 mingo mingo  5691 May  5 16:36 arch/x86/include/asm/fpu/xstate.h

<fpu/api.h> is the only 'public' API left, used in various drivers.

I decoupled drivers and non-FPU x86 code from various FPU internals.
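
For such driver code the usage pattern stays simple. The sketch below is
only an illustration of that pattern (example_checksum() is a made-up
function, not part of the series); kernel_fpu_begin()/kernel_fpu_end()
is the bracketing pair the public header provides for kernel-mode
FPU/SIMD use:

  #include <asm/fpu/api.h>

  static void example_checksum(void *data, unsigned int len)
  {
          /*
           * Kernel-mode use of FPU/SIMD registers has to be bracketed
           * so that user-space FPU register state is not clobbered:
           */
          kernel_fpu_begin();

          /* ... SSE/AVX accelerated work on 'data' ... */

          kernel_fpu_end();
  }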

2)

I renamed various internal data types, APIs and helpers, and reorganized
their support functions accordingly.

For example, all functions that deal with copying FPU registers into
and out of the FPU are now named consistently:

      copy_fxregs_to_kernel()         # was: fpu_fxsave()
      copy_xregs_to_kernel()          # was: xsave_state()

      copy_kernel_to_fregs()          # was: frstor_checking()
      copy_kernel_to_fxregs()         # was: fxrstor_checking()
      copy_kernel_to_xregs()          # was: fpu_xrstor_checking()
      copy_kernel_to_xregs_booting()  # was: xrstor_state_booting()

      copy_fregs_to_user()            # was: fsave_user()
      copy_fxregs_to_user()           # was: fxsave_user()
      copy_xregs_to_user()            # was: xsave_user()

      copy_user_to_fregs()            # was: frstor_user()
      copy_user_to_fxregs()           # was: fxrstor_user()
      copy_user_to_xregs()            # was: xrestore_user()
      copy_user_to_fpregs_zeroing()   # was: restore_user_xstate()

'xregs'  stands for registers supported by XSAVE
'fxregs' stands for registers supported by FXSAVE
'fregs'  stands for registers supported by FSAVE
'fpregs' stands for generic FPU registers.
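
As an illustration of the naming scheme above, the fxregs variants are
conceptually thin wrappers around the FXSAVE/FXRSTOR instructions -
roughly like the sketch below (simplified: the real helpers also deal
with the 32-bit vs. 64-bit instruction forms, exception fixups and
alternatives patching):

  /* Save the FPU/SSE registers into the task's in-kernel buffer: */
  static inline void copy_fxregs_to_kernel(struct fpu *fpu)
  {
          asm volatile("fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
  }

  /* Load the FPU/SSE registers from an in-kernel buffer: */
  static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
  {
          asm volatile("fxrstor %[fx]" : : [fx] "m" (*fx));
  }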

The high-level FPU functions got reorganized as well:

    extern void fpu__activate_curr(struct fpu *fpu);
    extern void fpu__activate_stopped(struct fpu *fpu);
    extern void fpu__save(struct fpu *fpu);
    extern void fpu__restore(struct fpu *fpu);
    extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
    extern void fpu__drop(struct fpu *fpu);
    extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
    extern void fpu__clear(struct fpu *fpu);
    extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);

The functions that used to take a task_struct argument now take the
narrower 'struct fpu' argument, and their naming is consistent and
logical as well.
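
At the call sites this mostly amounts to picking the FPU context out of
the task and passing it around directly - a minimal sketch of the
pattern (example_handle_math_error() is a made-up name, used purely for
illustration):

  static void example_handle_math_error(struct task_struct *tsk)
  {
          struct fpu *fpu = &tsk->thread.fpu;

          /* Save the task's FPU registers into its fpstate: */
          fpu__save(fpu);                 /* was: unlazy_fpu(tsk) */
  }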

Likewise, the FP state data types are now consistently named:

    struct fregs_state;
    struct fxregs_state;
    struct swregs_state;
    struct xregs_state;

    union fpregs_state;
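
The per-format register image types are the members of that union -
roughly like this sketch (see fpu/types.h in the tree for the exact
definition and padding):

  union fpregs_state {
          struct fregs_state      fsave;          /* legacy FSAVE format  */
          struct fxregs_state     fxsave;         /* FXSAVE format        */
          struct swregs_state     soft;           /* math-emu soft state  */
          struct xregs_state      xsave;          /* XSAVE format         */
  };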

3)

Various core data types got streamlined into four small status fields in 'struct fpu':

  fpu->fpstate_active          # was: tsk->flags & PF_USED_MATH
  fpu->fpregs_active           # was: fpu->has_fpu
  fpu->last_cpu
  fpu->counter

which now fit into a single word.
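
The two flags answer two different questions about an FPU context, which
the sketch below tries to make explicit (example_needs_fpregs_save() is
a made-up helper, only meant to illustrate how the fields are tested):

  static bool example_needs_fpregs_save(struct fpu *fpu)
  {
          /* Has this task ever used the FPU? (was: PF_USED_MATH) */
          if (!fpu->fpstate_active)
                  return false;

          /* Are its registers currently live in the CPU? (was: fpu->has_fpu) */
          return fpu->fpregs_active;
  }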

4)

The FPU state, previously reached via a pointer (task->thread.fpu.state->...),
got embedded again, as task->thread.fpu.state. This eliminated a lot of
awkward late dynamic memory allocation of FPU state and the problematic
handling of allocation failures.

Note that while the allocation is static right now, this is a WIP interim
state: we can still do dynamic allocation of FPU state, by moving the FPU
state last in task_struct and then allocating task_struct accordingly.
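
The practical effect at the usage sites is that the extra pointer
dereference and the allocation-failure paths disappear. A sketch
(example_task_fxregs() is a made-up helper used purely for illustration):

  static struct fxregs_state *example_task_fxregs(struct task_struct *tsk)
  {
          /* Plain member access - nothing to allocate, nothing to fail: */
          return &tsk->thread.fpu.state.fxsave;   /* was: ...fpu.state->fxsave */
  }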

5)

The amazingly convoluted init dependencies got sorted out into two
cleanly separated families of initialization functions: the
fpu__init_system_*() functions and the fpu__init_cpu_*() functions.

This allowed the removal of various __init annotation hacks and
obscure boot-time checks.
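
Schematically, one family runs once per boot to set up global state and
the other runs on every CPU that is brought up - along these lines (a
sketch of the call structure only; the exact ordering and arguments are
as in the tree, not as written here):

  /* Once per boot, on the boot CPU: */
  void fpu__init_system(struct cpuinfo_x86 *c)
  {
          fpu__init_system_early_generic(c);
          fpu__init_system_generic();
          fpu__init_system_xstate();
          fpu__init_system_ctx_switch();
  }

  /* On every CPU that comes online (boot CPU included): */
  void fpu__init_cpu(void)
  {
          fpu__init_cpu_generic();
          fpu__init_cpu_xstate();
          fpu__init_cpu_ctx_switch();
  }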

6)

Decoupled the FPU core from the xsave code: xsave.c and xsave.h got
shrunk quite a bit and now host only XSAVE/etc. related
functionality, not generic FPU handling functions.

7)

Added a ton of comments explaining how things work and why, hopefully
making this code accessible to everyone interested.

8)

Added FPU debugging code (CONFIG_X86_DEBUG_FPU=y) and an FPU hardware
benchmarking subsystem (CONFIG_X86_DEBUG_FPU_MEASUREMENTS=y), which
performs boot-time measurements like:

  x86/fpu:##################################################################
  x86/fpu: Running FPU performance measurement suite (cache hot):
  x86/fpu: Cost of: null                                      :   108 cycles
  x86/fpu:########  CPU instructions:           ############################
  x86/fpu: Cost of: NOP                         insn          :     0 cycles
  x86/fpu: Cost of: RDTSC                       insn          :    12 cycles
  x86/fpu: Cost of: RDMSR                       insn          :   100 cycles
  x86/fpu: Cost of: WRMSR                       insn          :   396 cycles
  x86/fpu: Cost of: CLI                         insn  same-IF :     0 cycles
  x86/fpu: Cost of: CLI                         insn  flip-IF :     0 cycles
  x86/fpu: Cost of: STI                         insn  same-IF :     0 cycles
  x86/fpu: Cost of: STI                         insn  flip-IF :     0 cycles
  x86/fpu: Cost of: PUSHF                       insn          :     0 cycles
  x86/fpu: Cost of: POPF                        insn  same-IF :    20 cycles
  x86/fpu: Cost of: POPF                        insn  flip-IF :    28 cycles
  x86/fpu:########  IRQ save/restore APIs:      ############################
  x86/fpu: Cost of: local_irq_save()            fn            :    20 cycles
  x86/fpu: Cost of: local_irq_restore()         fn    same-IF :    24 cycles
  x86/fpu: Cost of: local_irq_restore()         fn    flip-IF :    28 cycles
  x86/fpu: Cost of: irq_save()+restore()        fn    same-IF :    48 cycles
  x86/fpu: Cost of: irq_save()+restore()        fn    flip-IF :    48 cycles
  x86/fpu:########  locking APIs:               ############################
  x86/fpu: Cost of: smp_mb()                    fn            :    40 cycles
  x86/fpu: Cost of: cpu_relax()                 fn            :     8 cycles
  x86/fpu: Cost of: spin_lock()+unlock()        fn            :    64 cycles
  x86/fpu: Cost of: read_lock()+unlock()        fn            :    76 cycles
  x86/fpu: Cost of: write_lock()+unlock()       fn            :    52 cycles
  x86/fpu: Cost of: rcu_read_lock()+unlock()    fn            :    16 cycles
  x86/fpu: Cost of: preempt_disable()+enable()  fn            :    20 cycles
  x86/fpu: Cost of: mutex_lock()+unlock()       fn            :    56 cycles
  x86/fpu:########  MM instructions:            ############################
  x86/fpu: Cost of: __flush_tlb()               fn            :   132 cycles
  x86/fpu: Cost of: __flush_tlb_global()        fn            :   920 cycles
  x86/fpu: Cost of: __flush_tlb_one()           fn            :   288 cycles
  x86/fpu: Cost of: __flush_tlb_range()         fn            :   412 cycles
  x86/fpu:########  FPU instructions:           ############################
  x86/fpu: Cost of: CR0                         read          :     4 cycles
  x86/fpu: Cost of: CR0                         write         :   208 cycles
  x86/fpu: Cost of: CR0::TS                     fault         :  1156 cycles
  x86/fpu: Cost of: FNINIT                      insn          :    76 cycles
  x86/fpu: Cost of: FWAIT                       insn          :     0 cycles
  x86/fpu: Cost of: FSAVE                       insn          :   168 cycles
  x86/fpu: Cost of: FRSTOR                      insn          :   160 cycles
  x86/fpu: Cost of: FXSAVE                      insn          :    84 cycles
  x86/fpu: Cost of: FXRSTOR                     insn          :    44 cycles
  x86/fpu: Cost of: FXRSTOR                     fault         :   688 cycles
  x86/fpu: Cost of: XSAVE                       insn          :   104 cycles
  x86/fpu: Cost of: XRSTOR                      insn          :    80 cycles
  x86/fpu: Cost of: XRSTOR                      fault         :   884 cycles
  x86/fpu:##################################################################

Based on such measurements we'll be able to do performance tuning,
set default policies and do optimizations in a more informed fashion,
as the speed of various x86 hardware varies a lot.
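
The general pattern behind such a measurement is to time a tight loop of
the operation under test with the TSC and report the per-iteration cost.
The sketch below only illustrates that pattern (example_measure_fninit()
is a made-up function; the real fpu/measure.c code also measures a
'null' baseline, visible in the output above, and is presumably more
careful about serialization and outliers):

  #include <linux/kernel.h>
  #include <asm/timex.h>          /* get_cycles() */

  static void example_measure_fninit(void)
  {
          unsigned long i, loops = 1000;
          cycles_t t0, t1;

          t0 = get_cycles();
          for (i = 0; i < loops; i++)
                  asm volatile("fninit");
          t1 = get_cycles();

          pr_info("x86/fpu: Cost of: FNINIT insn: %ld cycles\n",
                  (long)(t1 - t0) / (long)loops);
  }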

9)

Reworked many ancient inlining and uninlining decisions based on
modern principles.


Any feedback is welcome!

Thanks,

    Ingo

=====
Ingo Molnar (208):
  x86/fpu: Rename unlazy_fpu() to fpu__save()
  x86/fpu: Add comments to fpu__save() and restrict its export
  x86/fpu: Add debugging check to fpu__save()
  x86/fpu: Rename fpu_detect() to fpu__detect()
  x86/fpu: Remove stale init_fpu() prototype
  x86/fpu: Split an fpstate_alloc_init() function out of init_fpu()
  x86/fpu: Make init_fpu() static
  x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check
  x86/fpu: Optimize fpu__unlazy_stopped()
  x86/fpu: Simplify fpu__unlazy_stopped()
  x86/fpu: Remove fpu_allocated()
  x86/fpu: Move fpu_alloc() out of line
  x86/fpu: Rename fpu_alloc() to fpstate_alloc()
  x86/fpu: Rename fpu_free() to fpstate_free()
  x86/fpu: Rename fpu_finit() to fpstate_init()
  x86/fpu: Rename fpu_init() to fpu__cpu_init()
  x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size()
  x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
  x86/fpu: Improve the comment for the fpu::counter field
  x86/fpu: Move FPU data structures to asm/fpu_types.h
  x86/fpu: Clean up asm/fpu/types.h
  x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/
  x86/fpu: Fix header file dependencies of fpu-internal.h
  x86/fpu: Split out the boot time FPU init code into fpu/init.c
  x86/fpu: Remove unnecessary includes from core.c
  x86/fpu: Move the no_387 handling and FPU detection code into init.c
  x86/fpu: Remove the free_thread_xstate() complication
  x86/fpu: Factor out fpu__flush_thread() from flush_thread()
  x86/fpu: Move math_state_restore() to fpu/core.c
  x86/fpu: Rename math_state_restore() to fpu__restore()
  x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs()
  x86/fpu: Simplify the xsave_state*() methods
  x86/fpu: Remove fpu_xsave()
  x86/fpu: Move task_xstate_cachep handling to core.c
  x86/fpu: Factor out fpu__copy()
  x86/fpu: Uninline fpstate_free() and move it next to the allocation function
  x86/fpu: Make task_xstate_cachep static
  x86/fpu: Make kernel_fpu_disable/enable() static
  x86/fpu: Add debug check to kernel_fpu_disable()
  x86/fpu: Add kernel_fpu_disabled()
  x86/fpu: Remove __save_init_fpu()
  x86/fpu: Move fpu_copy() to fpu/core.c
  x86/fpu: Add debugging check to fpu_copy()
  x86/fpu: Print out whether we are doing lazy/eager FPU context switches
  x86/fpu: Eliminate the __thread_has_fpu() wrapper
  x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter
  x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c
  x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx
  x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu()
  x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end()
  x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin()
  x86/fpu: Open code PF_USED_MATH usages
  x86/fpu: Document fpu__unlazy_stopped()
  x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
  x86/fpu: Remove 'struct task_struct' usage from drop_fpu()
  x86/fpu: Remove task_disable_lazy_fpu_restore()
  x86/fpu: Use 'struct fpu' in fpu_lazy_restore()
  x86/fpu: Use 'struct fpu' in restore_fpu_checking()
  x86/fpu: Use 'struct fpu' in fpu_reset_state()
  x86/fpu: Use 'struct fpu' in switch_fpu_prepare()
  x86/fpu: Use 'struct fpu' in switch_fpu_finish()
  x86/fpu: Move __save_fpu() into fpu/core.c
  x86/fpu: Use 'struct fpu' in __fpu_save()
  x86/fpu: Use 'struct fpu' in fpu__save()
  x86/fpu: Use 'struct fpu' in fpu_copy()
  x86/fpu: Use 'struct fpu' in fpu__copy()
  x86/fpu: Use 'struct fpu' in fpstate_alloc_init()
  x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped()
  x86/fpu: Rename fpu__flush_thread() to fpu__clear()
  x86/fpu: Clean up fpu__clear() a bit
  x86/fpu: Rename i387.h to fpu/api.h
  x86/fpu: Move xsave.h to fpu/xsave.h
  x86/fpu: Rename fpu-internal.h to fpu/internal.h
  x86/fpu: Move MXCSR_DEFAULT to fpu/internal.h
  x86/fpu: Remove xsave_init() __init obfuscation
  x86/fpu: Remove assembly guard from asm/fpu/api.h
  x86/fpu: Improve FPU detection kernel messages
  x86/fpu: Print supported xstate features in human readable way
  x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask'
  x86/fpu: Rename 'xstate_features' to 'xfeatures_nr'
  x86/fpu: Move XCR0 manipulation to the FPU code proper
  x86/fpu: Clean up regset functions
  x86/fpu: Rename 'xsave_hdr' to 'header'
  x86/fpu: Rename xsave.header::xstate_bv to 'xfeatures'
  x86/fpu: Clean up and fix MXCSR handling
  x86/fpu: Rename regset FPU register accessors
  x86/fpu: Explain the AVX register layout in the xsave area
  x86/fpu: Improve the __sanitize_i387_state() documentation
  x86/fpu: Rename fpu->has_fpu to fpu->fpregs_active
  x86/fpu: Rename __thread_set_has_fpu() to __fpregs_activate()
  x86/fpu: Rename __thread_clear_has_fpu() to __fpregs_deactivate()
  x86/fpu: Rename __thread_fpu_begin() to fpregs_activate()
  x86/fpu: Rename __thread_fpu_end() to fpregs_deactivate()
  x86/fpu: Remove fpstate_xstate_init_size() boot quirk
  x86/fpu: Remove xsave_init() bootmem allocations
  x86/fpu: Make setup_init_fpu_buf() run-once explicitly
  x86/fpu: Remove 'init_xstate_buf' bootmem allocation
  x86/fpu: Split fpu__cpu_init() into early-boot and cpu-boot parts
  x86/fpu: Make the system/cpu init distinction clear in the xstate code as well
  x86/fpu: Move CPU capability check into fpu__init_cpu_xstate()
  x86/fpu: Move legacy check to fpu__init_system_xstate()
  x86/fpu: Propagate once per boot quirk into fpu__init_system_xstate()
  x86/fpu: Remove xsave_init()
  x86/fpu: Do fpu__init_system_xstate only from fpu__init_system()
  x86/fpu: Set up the legacy FPU init image from fpu__init_system()
  x86/fpu: Remove setup_init_fpu_buf() call from eager_fpu_init()
  x86/fpu: Move all eager-fpu setup code to eager_fpu_init()
  x86/fpu: Move eager_fpu_init() to fpu/init.c
  x86/fpu: Clean up eager_fpu_init() and rename it to fpu__ctx_switch_init()
  x86/fpu: Split fpu__ctx_switch_init() into _cpu() and _system() portions
  x86/fpu: Do CLTS fpu__init_system()
  x86/fpu: Move the fpstate_xstate_init_size() call into fpu__init_system()
  x86/fpu: Call fpu__init_cpu_ctx_switch() from fpu__init_cpu()
  x86/fpu: Do system-wide setup from fpu__detect()
  x86/fpu: Remove fpu__init_cpu_ctx_switch() call from fpu__init_system()
  x86/fpu: Simplify fpu__cpu_init()
  x86/fpu: Factor out fpu__init_cpu_generic()
  x86/fpu: Factor out fpu__init_system_generic()
  x86/fpu: Factor out fpu__init_system_early_generic()
  x86/fpu: Move !FPU check ingo fpu__init_system_early_generic()
  x86/fpu: Factor out FPU bug checks into fpu/bugs.c
  x86/fpu: Make check_fpu() init ordering independent
  x86/fpu: Move fpu__init_system_early_generic() out of fpu__detect()
  x86/fpu: Remove the extra fpu__detect() layer
  x86/fpu: Rename fpstate_xstate_init_size() to fpu__init_system_xstate_size_legacy()
  x86/fpu: Reorder init methods
  x86/fpu: Add more comments to the FPU init code
  x86/fpu: Move fpu__save() to fpu/internals.h
  x86/fpu: Uninline kernel_fpu_begin()/end()
  x86/fpu: Move various internal function prototypes to fpu/internal.h
  x86/fpu: Uninline the irq_ts_save()/restore() functions
  x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate()
  x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions
  x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)
  x86/fpu: Remove failure paths from fpstate-alloc low level functions
  x86/fpu: Remove failure return from fpstate_alloc_init()
  x86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr()
  x86/fpu: Simplify fpu__unlazy_stopped() error handling
  x86/fpu, kvm: Simplify fx_init()
  x86/fpu: Simplify fpstate_init_curr() usage
  x86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped()
  x86/fpu: Factor out FPU hw activation/deactivation
  x86/fpu: Simplify __save_fpu()
  x86/fpu: Eliminate __save_fpu()
  x86/fpu: Simplify fpu__save()
  x86/fpu: Optimize fpu__save()
  x86/fpu: Optimize fpu_copy()
  x86/fpu: Optimize fpu_copy() some more on lazy switching systems
  x86/fpu: Rename fpu/xsave.h to fpu/xstate.h
  x86/fpu: Rename fpu/xsave.c to fpu/xstate.c
  x86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name)
  x86/fpu: Simplify print_xstate_features()
  x86/fpu: Enumerate xfeature bits
  x86/fpu: Move xfeature type enumeration to fpu/types.h
  x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks
  x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks
  x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks
  x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks
  x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks
  x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks
  x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks
  x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c
  x86/fpu: Move asm/xcr.h to asm/fpu/internal.h
  x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()
  x86/fpu: Simplify fpstate_sanitize_xstate() calls
  x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()
  x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()
  x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()
  x86/fpu: Clarify ancient comments in fpu__restore()
  x86/fpu: Rename user_has_fpu() to fpregs_active()
  x86/fpu: Initialize fpregs in fpu__init_cpu_generic()
  x86/fpu: Clean up fpu__clear() state handling
  x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()
  x86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()
  x86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()
  x86/fpu: Move all the fpu__*() high level methods closer to each other
  x86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing
  x86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()
  x86/fpu: Move the signal frame handling code closer to each other
  x86/fpu: Merge fpu__reset() and fpu__clear()
  x86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h
  x86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions
  x86/fpu: Factor out fpu/regset.h from fpu/internal.h
  x86/fpu: Remove run-once init quirks
  x86/fpu: Factor out the exception error code handling code
  x86/fpu: Harmonize the names of the fpstate_init() helper functions
  x86/fpu: Create 'union thread_xstate' helper for fpstate_init()
  x86/fpu: Generalize 'init_xstate_ctx'
  x86/fpu: Move restore_init_xstate() out of fpu/internal.h
  x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions
  x86/fpu: Factor out fpu/signal.c
  x86/fpu: Factor out the FPU regset code into fpu/regset.c
  x86/fpu: Harmonize FPU register state types
  x86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments
  x86/fpu: Document the various fpregs state formats
  x86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()
  x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end
  x86/fpu: Clean up xstate feature reservation
  x86/fpu/xstate: Clean up setup_xstate_comp() call
  x86/fpu/init: Propagate __init annotations
  x86/fpu: Pass 'struct fpu' to fpu__restore()
  x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
  x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code
  x86/fpu: Add FPU performance measurement subsystem
  x86/fpu: Reorganize fpu/internal.h

 Documentation/preempt-locking.txt              |   2 +-
 arch/x86/Kconfig.debug                         |  27 ++
 arch/x86/crypto/aesni-intel_glue.c             |   2 +-
 arch/x86/crypto/camellia_aesni_avx2_glue.c     |  15 +-
 arch/x86/crypto/camellia_aesni_avx_glue.c      |  15 +-
 arch/x86/crypto/cast5_avx_glue.c               |  15 +-
 arch/x86/crypto/cast6_avx_glue.c               |  15 +-
 arch/x86/crypto/crc32-pclmul_glue.c            |   2 +-
 arch/x86/crypto/crc32c-intel_glue.c            |   3 +-
 arch/x86/crypto/crct10dif-pclmul_glue.c        |   2 +-
 arch/x86/crypto/fpu.c                          |   2 +-
 arch/x86/crypto/ghash-clmulni-intel_glue.c     |   2 +-
 arch/x86/crypto/serpent_avx2_glue.c            |  15 +-
 arch/x86/crypto/serpent_avx_glue.c             |  15 +-
 arch/x86/crypto/sha-mb/sha1_mb.c               |   5 +-
 arch/x86/crypto/sha1_ssse3_glue.c              |  16 +-
 arch/x86/crypto/sha256_ssse3_glue.c            |  16 +-
 arch/x86/crypto/sha512_ssse3_glue.c            |  16 +-
 arch/x86/crypto/twofish_avx_glue.c             |  16 +-
 arch/x86/ia32/ia32_signal.c                    |  13 +-
 arch/x86/include/asm/alternative.h             |   6 +
 arch/x86/include/asm/crypto/glue_helper.h      |   2 +-
 arch/x86/include/asm/efi.h                     |   2 +-
 arch/x86/include/asm/fpu-internal.h            | 626 ---------------------------------------
 arch/x86/include/asm/fpu/api.h                 |  48 +++
 arch/x86/include/asm/fpu/internal.h            | 488 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/fpu/measure.h             |  13 +
 arch/x86/include/asm/fpu/regset.h              |  21 ++
 arch/x86/include/asm/fpu/signal.h              |  33 +++
 arch/x86/include/asm/fpu/types.h               | 293 ++++++++++++++++++
 arch/x86/include/asm/{xsave.h => fpu/xstate.h} |  60 ++--
 arch/x86/include/asm/i387.h                    | 108 -------
 arch/x86/include/asm/kvm_host.h                |   2 -
 arch/x86/include/asm/mpx.h                     |   8 +-
 arch/x86/include/asm/processor.h               | 141 +--------
 arch/x86/include/asm/simd.h                    |   2 +-
 arch/x86/include/asm/stackprotector.h          |   2 +
 arch/x86/include/asm/suspend_32.h              |   2 +-
 arch/x86/include/asm/suspend_64.h              |   2 +-
 arch/x86/include/asm/user.h                    |  12 +-
 arch/x86/include/asm/xcr.h                     |  49 ---
 arch/x86/include/asm/xor.h                     |   2 +-
 arch/x86/include/asm/xor_32.h                  |   2 +-
 arch/x86/include/asm/xor_avx.h                 |   2 +-
 arch/x86/include/uapi/asm/sigcontext.h         |   8 +-
 arch/x86/kernel/Makefile                       |   2 +-
 arch/x86/kernel/alternative.c                  |   5 +
 arch/x86/kernel/cpu/bugs.c                     |  57 +---
 arch/x86/kernel/cpu/bugs_64.c                  |   2 +
 arch/x86/kernel/cpu/common.c                   |  29 +-
 arch/x86/kernel/fpu/Makefile                   |  11 +
 arch/x86/kernel/fpu/bugs.c                     |  71 +++++
 arch/x86/kernel/fpu/core.c                     | 509 +++++++++++++++++++++++++++++++
 arch/x86/kernel/fpu/init.c                     | 288 ++++++++++++++++++
 arch/x86/kernel/fpu/measure.c                  | 509 +++++++++++++++++++++++++++++++
 arch/x86/kernel/fpu/regset.c                   | 356 ++++++++++++++++++++++
 arch/x86/kernel/fpu/signal.c                   | 404 +++++++++++++++++++++++++
 arch/x86/kernel/fpu/xstate.c                   | 406 +++++++++++++++++++++++++
 arch/x86/kernel/i387.c                         | 656 ----------------------------------------
 arch/x86/kernel/process.c                      |  52 +---
 arch/x86/kernel/process_32.c                   |  15 +-
 arch/x86/kernel/process_64.c                   |  13 +-
 arch/x86/kernel/ptrace.c                       |  12 +-
 arch/x86/kernel/signal.c                       |  38 ++-
 arch/x86/kernel/smpboot.c                      |   3 +-
 arch/x86/kernel/traps.c                        | 120 ++------
 arch/x86/kernel/xsave.c                        | 724 ---------------------------------------------
 arch/x86/kvm/cpuid.c                           |   2 +-
 arch/x86/kvm/vmx.c                             |   5 +-
 arch/x86/kvm/x86.c                             |  68 ++---
 arch/x86/lguest/boot.c                         |   2 +-
 arch/x86/lib/mmx_32.c                          |   2 +-
 arch/x86/math-emu/fpu_aux.c                    |   4 +-
 arch/x86/math-emu/fpu_entry.c                  |  20 +-
 arch/x86/math-emu/fpu_system.h                 |   2 +-
 arch/x86/mm/mpx.c                              |  15 +-
 arch/x86/power/cpu.c                           |  11 +-
 arch/x86/xen/enlighten.c                       |   2 +-
 drivers/char/hw_random/via-rng.c               |   2 +-
 drivers/crypto/padlock-aes.c                   |   2 +-
 drivers/crypto/padlock-sha.c                   |   2 +-
 drivers/lguest/x86/core.c                      |  12 +-
 lib/raid6/x86.h                                |   2 +-
 83 files changed, 3742 insertions(+), 2841 deletions(-)
 delete mode 100644 arch/x86/include/asm/fpu-internal.h
 create mode 100644 arch/x86/include/asm/fpu/api.h
 create mode 100644 arch/x86/include/asm/fpu/internal.h
 create mode 100644 arch/x86/include/asm/fpu/measure.h
 create mode 100644 arch/x86/include/asm/fpu/regset.h
 create mode 100644 arch/x86/include/asm/fpu/signal.h
 create mode 100644 arch/x86/include/asm/fpu/types.h
 rename arch/x86/include/asm/{xsave.h => fpu/xstate.h} (77%)
 delete mode 100644 arch/x86/include/asm/i387.h
 delete mode 100644 arch/x86/include/asm/xcr.h
 create mode 100644 arch/x86/kernel/fpu/Makefile
 create mode 100644 arch/x86/kernel/fpu/bugs.c
 create mode 100644 arch/x86/kernel/fpu/core.c
 create mode 100644 arch/x86/kernel/fpu/init.c
 create mode 100644 arch/x86/kernel/fpu/measure.c
 create mode 100644 arch/x86/kernel/fpu/regset.c
 create mode 100644 arch/x86/kernel/fpu/signal.c
 create mode 100644 arch/x86/kernel/fpu/xstate.c
 delete mode 100644 arch/x86/kernel/i387.c
 delete mode 100644 arch/x86/kernel/xsave.c

-- 
2.1.0



* [PATCH 001/208] x86/fpu: Rename unlazy_fpu() to fpu__save()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 002/208] x86/fpu: Add comments to fpu__save() and restrict its export Ingo Molnar
                   ` (78 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This function is a misnomer on two levels:

1) it doesn't really manipulate TS on modern CPUs anymore; its
   primary purpose is to save FPU state, used:

      - when executing fork()/clone(): to copy current FPU state
        to the child's FPU state.

      - when handling math exceptions: to generate the math error
        si_code in the signal frame.

2) even on legacy CPUs it doesn't actually 'unlazy'; if anything
   it lazies the FPU state: as a side effect of the old FNSAVE
   instruction, which clears (destroys) FPU state, it's necessary
   to set CR0::TS.

So rename it to fpu__save() to better reflect its purpose.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 2 +-
 arch/x86/include/asm/i387.h         | 2 +-
 arch/x86/kernel/i387.c              | 6 +++---
 arch/x86/kernel/traps.c             | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index da5e96756570..49db0440b0ed 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -602,7 +602,7 @@ static inline void fpu_copy(struct task_struct *dst, struct task_struct *src)
 		struct fpu *dfpu = &dst->thread.fpu;
 		struct fpu *sfpu = &src->thread.fpu;
 
-		unlazy_fpu(src);
+		fpu__save(src);
 		memcpy(dfpu->state, sfpu->state, xstate_size);
 	}
 }
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 6eb6fcb83f63..d4419da9b210 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -101,7 +101,7 @@ static inline int user_has_fpu(void)
 	return current->thread.fpu.has_fpu;
 }
 
-extern void unlazy_fpu(struct task_struct *tsk);
+extern void fpu__save(struct task_struct *tsk);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 009183276bb7..ec1a744eb853 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -117,7 +117,7 @@ void __kernel_fpu_end(void)
 }
 EXPORT_SYMBOL(__kernel_fpu_end);
 
-void unlazy_fpu(struct task_struct *tsk)
+void fpu__save(struct task_struct *tsk)
 {
 	preempt_disable();
 	if (__thread_has_fpu(tsk)) {
@@ -130,7 +130,7 @@ void unlazy_fpu(struct task_struct *tsk)
 	}
 	preempt_enable();
 }
-EXPORT_SYMBOL(unlazy_fpu);
+EXPORT_SYMBOL(fpu__save);
 
 unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
 unsigned int xstate_size;
@@ -251,7 +251,7 @@ int init_fpu(struct task_struct *tsk)
 
 	if (tsk_used_math(tsk)) {
 		if (cpu_has_fpu && tsk == current)
-			unlazy_fpu(tsk);
+			fpu__save(tsk);
 		task_disable_lazy_fpu_restore(tsk);
 		return 0;
 	}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 324ab5247687..12f29f9907cd 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -731,7 +731,7 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
 	/*
 	 * Save the info for the exception handler and clear the error.
 	 */
-	unlazy_fpu(task);
+	fpu__save(task);
 	task->thread.trap_nr = trapnr;
 	task->thread.error_code = error_code;
 	info.si_signo = SIGFPE;
-- 
2.1.0



* [PATCH 002/208] x86/fpu: Add comments to fpu__save() and restrict its export
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
  2015-05-05 16:23 ` [PATCH 001/208] x86/fpu: Rename unlazy_fpu() to fpu__save() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 003/208] x86/fpu: Add debugging check to fpu__save() Ingo Molnar
                   ` (77 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Add an explanation to fpu__save() and also don't export it to
random modules - we don't want them to futz around with deep kernel
internals.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index ec1a744eb853..ac47278cde71 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -117,6 +117,9 @@ void __kernel_fpu_end(void)
 }
 EXPORT_SYMBOL(__kernel_fpu_end);
 
+/*
+ * Save the FPU state (initialize it if necessary):
+ */
 void fpu__save(struct task_struct *tsk)
 {
 	preempt_disable();
@@ -130,7 +133,7 @@ void fpu__save(struct task_struct *tsk)
 	}
 	preempt_enable();
 }
-EXPORT_SYMBOL(fpu__save);
+EXPORT_SYMBOL_GPL(fpu__save);
 
 unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
 unsigned int xstate_size;
-- 
2.1.0



* [PATCH 003/208] x86/fpu: Add debugging check to fpu__save()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
  2015-05-05 16:23 ` [PATCH 001/208] x86/fpu: Rename unlazy_fpu() to fpu__save() Ingo Molnar
  2015-05-05 16:23 ` [PATCH 002/208] x86/fpu: Add comments to fpu__save() and restrict its export Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 004/208] x86/fpu: Rename fpu_detect() to fpu__detect() Ingo Molnar
                   ` (76 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Document the function a bit more and add a debugging check that it is
only ever run for the current task.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index ac47278cde71..66f1053ae2cd 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -119,9 +119,13 @@ EXPORT_SYMBOL(__kernel_fpu_end);
 
 /*
  * Save the FPU state (initialize it if necessary):
+ *
+ * This only ever gets called for the current task.
  */
 void fpu__save(struct task_struct *tsk)
 {
+	WARN_ON(tsk != current);
+
 	preempt_disable();
 	if (__thread_has_fpu(tsk)) {
 		if (use_eager_fpu()) {
-- 
2.1.0



* [PATCH 004/208] x86/fpu: Rename fpu_detect() to fpu__detect()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (2 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 003/208] x86/fpu: Add debugging check to fpu__save() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 005/208] x86/fpu: Remove stale init_fpu() prototype Ingo Molnar
                   ` (75 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Use the fpu__*() namespace to organize FPU ops better.

Also document fpu__detect() a bit.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/kernel/cpu/common.c     | 2 +-
 arch/x86/kernel/i387.c           | 6 +++++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 23ba6765b718..2dc08c231a9a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -166,7 +166,7 @@ extern const struct seq_operations cpuinfo_op;
 #define cache_line_size()	(boot_cpu_data.x86_cache_alignment)
 
 extern void cpu_detect(struct cpuinfo_x86 *c);
-extern void fpu_detect(struct cpuinfo_x86 *c);
+extern void fpu__detect(struct cpuinfo_x86 *c);
 
 extern void early_cpu_init(void);
 extern void identify_boot_cpu(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a62cf04dac8a..60a29d290e2a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -759,7 +759,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	cpu_detect(c);
 	get_cpu_vendor(c);
 	get_cpu_cap(c);
-	fpu_detect(c);
+	fpu__detect(c);
 
 	if (this_cpu->c_early_init)
 		this_cpu->c_early_init(c);
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 66f1053ae2cd..29251f5668b1 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -640,7 +640,11 @@ static int __init no_387(char *s)
 
 __setup("no387", no_387);
 
-void fpu_detect(struct cpuinfo_x86 *c)
+/*
+ * Set the X86_FEATURE_FPU CPU-capability bit based on
+ * trying to execute an actual sequence of FPU instructions:
+ */
+void fpu__detect(struct cpuinfo_x86 *c)
 {
 	unsigned long cr0;
 	u16 fsw, fcw;
-- 
2.1.0



* [PATCH 005/208] x86/fpu: Remove stale init_fpu() prototype
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (3 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 004/208] x86/fpu: Rename fpu_detect() to fpu__detect() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 006/208] x86/fpu: Split an fpstate_alloc_init() function out of init_fpu() Ingo Molnar
                   ` (74 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

We are going to split up init_fpu(), so keep only a single prototype, in i387.h.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/xsave.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index c9a6d68b8d62..58ed0ca5a11e 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -51,7 +51,6 @@ extern struct xsave_struct *init_xstate_buf;
 
 extern void xsave_init(void);
 extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
-extern int init_fpu(struct task_struct *child);
 
 /* These macros all use (%edi)/(%rdi) as the single memory argument. */
 #define XSAVE		".byte " REX_PREFIX "0x0f,0xae,0x27"
-- 
2.1.0



* [PATCH 006/208] x86/fpu: Split an fpstate_alloc_init() function out of init_fpu()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (4 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 005/208] x86/fpu: Remove stale init_fpu() prototype Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 007/208] x86/fpu: Make init_fpu() static Ingo Molnar
                   ` (73 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Most init_fpu() users don't want the register-saving aspect of the
function; they call it for 'current', when FPU registers are not
allocated and initialized yet.

Split out a simplified API that does just that (and add debug-checks
for these conditions): fpstate_alloc_init().

Use it where appropriate.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h   |  3 +++
 arch/x86/kernel/i387.c        | 31 +++++++++++++++++++++++++++++++
 arch/x86/kernel/process.c     |  2 +-
 arch/x86/kernel/traps.c       |  2 +-
 arch/x86/kernel/xsave.c       |  2 +-
 arch/x86/kvm/x86.c            |  2 +-
 arch/x86/math-emu/fpu_entry.c |  2 +-
 7 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index d4419da9b210..1a896b4533c4 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -18,7 +18,10 @@
 struct pt_regs;
 struct user_i387_struct;
 
+extern int fpstate_alloc_init(struct task_struct *curr);
+
 extern int init_fpu(struct task_struct *child);
+
 extern void fpu_finit(struct fpu *fpu);
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void math_state_restore(void);
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 29251f5668b1..56b6e726fb60 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -247,6 +247,37 @@ void fpu_finit(struct fpu *fpu)
 EXPORT_SYMBOL_GPL(fpu_finit);
 
 /*
+ * Allocate the backing store for the current task's FPU registers
+ * and initialize the registers themselves as well.
+ *
+ * Can fail.
+ */
+int fpstate_alloc_init(struct task_struct *curr)
+{
+	int ret;
+
+	if (WARN_ON_ONCE(curr != current))
+		return -EINVAL;
+	if (WARN_ON_ONCE(curr->flags & PF_USED_MATH))
+		return -EINVAL;
+
+	/*
+	 * Memory allocation at the first usage of the FPU and other state.
+	 */
+	ret = fpu_alloc(&curr->thread.fpu);
+	if (ret)
+		return ret;
+
+	fpu_finit(&curr->thread.fpu);
+
+	/* Safe to do for the current task: */
+	curr->flags |= PF_USED_MATH;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fpstate_alloc_init);
+
+/*
  * The _current_ task is using the FPU for the first time
  * so initialize it and set the mxcsr to its default
  * value at reset if we support XMM instructions and then
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8213da62b1b7..e2220e38ff6d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -158,7 +158,7 @@ void flush_thread(void)
 		free_thread_xstate(tsk);
 	} else if (!used_math()) {
 		/* kthread execs. TODO: cleanup this horror. */
-		if (WARN_ON(init_fpu(tsk)))
+		if (WARN_ON(fpstate_alloc_init(tsk)))
 			force_sig(SIGKILL, tsk);
 		user_fpu_begin();
 		restore_init_xstate();
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 12f29f9907cd..cf9c9627be19 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -846,7 +846,7 @@ void math_state_restore(void)
 		/*
 		 * does a slab alloc which can sleep
 		 */
-		if (init_fpu(tsk)) {
+		if (fpstate_alloc_init(tsk)) {
 			/*
 			 * ran out of memory!
 			 */
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 87a815b85f3e..a977cdd03825 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -349,7 +349,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(VERIFY_READ, buf, size))
 		return -EACCES;
 
-	if (!used_math() && init_fpu(tsk))
+	if (!used_math() && fpstate_alloc_init(tsk))
 		return -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c73efcd03e29..bfc396632ee8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6600,7 +6600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int r;
 	sigset_t sigsaved;
 
-	if (!tsk_used_math(current) && init_fpu(current))
+	if (!tsk_used_math(current) && fpstate_alloc_init(current))
 		return -ENOMEM;
 
 	if (vcpu->sigset_active)
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index 9b868124128d..c9ff09a02385 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -149,7 +149,7 @@ void math_emulate(struct math_emu_info *info)
 	struct desc_struct code_descriptor;
 
 	if (!used_math()) {
-		if (init_fpu(current)) {
+		if (fpstate_alloc_init(current)) {
 			do_group_exit(SIGKILL);
 			return;
 		}
-- 
2.1.0



* [PATCH 007/208] x86/fpu: Make init_fpu() static
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (5 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 006/208] x86/fpu: Split an fpstate_alloc_init() function out of init_fpu() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 008/208] x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check Ingo Molnar
                   ` (72 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Now that the allocation users have been split off into a separate
function, init_fpu() has become local to i387.c: make it static.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h | 2 --
 arch/x86/kernel/i387.c      | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 1a896b4533c4..0367d17371f5 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -20,8 +20,6 @@ struct user_i387_struct;
 
 extern int fpstate_alloc_init(struct task_struct *curr);
 
-extern int init_fpu(struct task_struct *child);
-
 extern void fpu_finit(struct fpu *fpu);
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void math_state_restore(void);
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 56b6e726fb60..95079026c386 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -283,7 +283,7 @@ EXPORT_SYMBOL_GPL(fpstate_alloc_init);
  * value at reset if we support XMM instructions and then
  * remember the current task has used the FPU.
  */
-int init_fpu(struct task_struct *tsk)
+static int init_fpu(struct task_struct *tsk)
 {
 	int ret;
 
@@ -306,7 +306,6 @@ int init_fpu(struct task_struct *tsk)
 	set_stopped_child_used_math(tsk);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(init_fpu);
 
 /*
  * The xstateregs_active() routine is the same as the fpregs_active() routine,
-- 
2.1.0



* [PATCH 008/208] x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (6 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 007/208] x86/fpu: Make init_fpu() static Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 009/208] x86/fpu: Optimize fpu__unlazy_stopped() Ingo Molnar
                   ` (71 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This function name is a misnomer now that we've split out all the
other users from it. Rename it accordingly: it's used to save
the FPU state of (ptrace-)stopped child tasks.

Add a debugging check to double-check this intended usage: that this
function is only called for non-current, stopped child tasks.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 95079026c386..909d10d2fb70 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -283,27 +283,30 @@ EXPORT_SYMBOL_GPL(fpstate_alloc_init);
  * value at reset if we support XMM instructions and then
  * remember the current task has used the FPU.
  */
-static int init_fpu(struct task_struct *tsk)
+static int fpu__unlazy_stopped(struct task_struct *child)
 {
 	int ret;
 
-	if (tsk_used_math(tsk)) {
-		if (cpu_has_fpu && tsk == current)
-			fpu__save(tsk);
-		task_disable_lazy_fpu_restore(tsk);
+	if (WARN_ON_ONCE(child == current))
+		return -EINVAL;
+
+	if (tsk_used_math(child)) {
+		if (cpu_has_fpu && child == current)
+			fpu__save(child);
+		task_disable_lazy_fpu_restore(child);
 		return 0;
 	}
 
 	/*
 	 * Memory allocation at the first usage of the FPU and other state.
 	 */
-	ret = fpu_alloc(&tsk->thread.fpu);
+	ret = fpu_alloc(&child->thread.fpu);
 	if (ret)
 		return ret;
 
-	fpu_finit(&tsk->thread.fpu);
+	fpu_finit(&child->thread.fpu);
 
-	set_stopped_child_used_math(tsk);
+	set_stopped_child_used_math(child);
 	return 0;
 }
 
@@ -331,7 +334,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_fxsr)
 		return -ENODEV;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
@@ -350,7 +353,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_fxsr)
 		return -ENODEV;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
@@ -384,7 +387,7 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
@@ -414,7 +417,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
@@ -577,7 +580,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
@@ -608,7 +611,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	ret = init_fpu(target);
+	ret = fpu__unlazy_stopped(target);
 	if (ret)
 		return ret;
 
-- 
2.1.0



* [PATCH 009/208] x86/fpu: Optimize fpu__unlazy_stopped()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (7 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 008/208] x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 010/208] x86/fpu: Simplify fpu__unlazy_stopped() Ingo Molnar
                   ` (70 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This function is only called for stopped child tasks, so the
fpu__save() branch will never get called - remove it.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 909d10d2fb70..76006a701dbb 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -291,8 +291,6 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 		return -EINVAL;
 
 	if (tsk_used_math(child)) {
-		if (cpu_has_fpu && child == current)
-			fpu__save(child);
 		task_disable_lazy_fpu_restore(child);
 		return 0;
 	}
-- 
2.1.0



* [PATCH 010/208] x86/fpu: Simplify fpu__unlazy_stopped()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (8 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 009/208] x86/fpu: Optimize fpu__unlazy_stopped() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 011/208] x86/fpu: Remove fpu_allocated() Ingo Molnar
                   ` (69 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Open code the PF_USED_MATH logic, to make it more obvious.

(We'll slowly convert the other users of *_used_math() methods as well.)
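
For reference, the helpers being open-coded here are thin wrappers
around the PF_USED_MATH task flag - roughly like this from-memory
sketch of the <linux/sched.h> definitions of that era (shown for
illustration only, not part of this patch):

	/* NOTE: tsk_used_math() returns 0 or PF_USED_MATH, never 1: */
	#define tsk_used_math(p)	((p)->flags & PF_USED_MATH)
	#define used_math()		tsk_used_math(current)

	#define set_stopped_child_used_math(child) \
		do { (child)->flags |= PF_USED_MATH; } while (0)

So replacing tsk_used_math(child) with 'child->flags & PF_USED_MATH'
and set_stopped_child_used_math(child) with
'child->flags |= PF_USED_MATH' is a purely mechanical transformation.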

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 76006a701dbb..5e4dae70ffa5 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -290,7 +290,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	if (WARN_ON_ONCE(child == current))
 		return -EINVAL;
 
-	if (tsk_used_math(child)) {
+	if (child->flags & PF_USED_MATH) {
 		task_disable_lazy_fpu_restore(child);
 		return 0;
 	}
@@ -304,7 +304,9 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 
 	fpu_finit(&child->thread.fpu);
 
-	set_stopped_child_used_math(child);
+	/* Safe to do for stopped child tasks: */
+	child->flags |= PF_USED_MATH;
+
 	return 0;
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 011/208] x86/fpu: Remove fpu_allocated()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (9 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 010/208] x86/fpu: Simplify fpu__unlazy_stopped() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 012/208] x86/fpu: Move fpu_alloc() out of line Ingo Molnar
                   ` (68 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

It's an unnecessary obfuscation of a very simple allocation pattern.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 49db0440b0ed..21000f0f0ae1 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -569,14 +569,9 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 	}
 }
 
-static bool fpu_allocated(struct fpu *fpu)
-{
-	return fpu->state != NULL;
-}
-
 static inline int fpu_alloc(struct fpu *fpu)
 {
-	if (fpu_allocated(fpu))
+	if (fpu->state)
 		return 0;
 	fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
 	if (!fpu->state)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 012/208] x86/fpu: Move fpu_alloc() out of line
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (10 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 011/208] x86/fpu: Remove fpu_allocated() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 013/208] x86/fpu: Rename fpu_alloc() to fpstate_alloc() Ingo Molnar
                   ` (67 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This is not a small function, and it's used in several places,
one of them a popular module (KVM).

Move the function out of line. This saves a bit of text,
even with the symbol export overhead:

   text    data     bss     dec     hex filename
   12566052        1619504 1089536 15275092         e91454 vmlinux.before
   12566046        1619504 1089536 15275086         e9144e vmlinux.after

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 11 +----------
 arch/x86/kernel/i387.c              | 12 ++++++++++++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 21000f0f0ae1..bdbba1a4de69 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -569,16 +569,7 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 	}
 }
 
-static inline int fpu_alloc(struct fpu *fpu)
-{
-	if (fpu->state)
-		return 0;
-	fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
-	if (!fpu->state)
-		return -ENOMEM;
-	WARN_ON((unsigned long)fpu->state & 15);
-	return 0;
-}
+extern int fpu_alloc(struct fpu *fpu);
 
 static inline void fpu_free(struct fpu *fpu)
 {
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 5e4dae70ffa5..05fcc90087b0 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -246,6 +246,18 @@ void fpu_finit(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpu_finit);
 
+int fpu_alloc(struct fpu *fpu)
+{
+	if (fpu->state)
+		return 0;
+	fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
+	if (!fpu->state)
+		return -ENOMEM;
+	WARN_ON((unsigned long)fpu->state & 15);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fpu_alloc);
+
 /*
  * Allocate the backing store for the current task's FPU registers
  * and initialize the registers themselves as well.
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 013/208] x86/fpu: Rename fpu_alloc() to fpstate_alloc()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (11 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 012/208] x86/fpu: Move fpu_alloc() out of line Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 014/208] x86/fpu: Rename fpu_free() to fpstate_free() Ingo Molnar
                   ` (66 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Use the fpu__*() namespace for fpstate_alloc() as well.

Also add a comment about FPU state alignment.
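
( The CPU-required 16-byte alignment that the WARN_ON() checks is
  provided by the slab cache the state is allocated from. A from-memory
  sketch of how that cache was set up in arch/x86/kernel/process.c in
  this era, shown purely for illustration and not part of this patch:

	task_xstate_cachep =
		kmem_cache_create("task_xstate", xstate_size,
				  __alignof__(union thread_xstate),
				  SLAB_PANIC | SLAB_NOTRACK, NULL);

  Since 'union thread_xstate' embeds the 64-byte aligned xsave area,
  __alignof__() evaluates to well above 16 here, so the debug check
  should never trigger on a correctly set up cache. )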

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  2 +-
 arch/x86/kernel/i387.c              | 12 ++++++++----
 arch/x86/kernel/process.c           |  2 +-
 arch/x86/kvm/x86.c                  |  2 +-
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index bdbba1a4de69..88f584ab3463 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -569,7 +569,7 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 	}
 }
 
-extern int fpu_alloc(struct fpu *fpu);
+extern int fpstate_alloc(struct fpu *fpu);
 
 static inline void fpu_free(struct fpu *fpu)
 {
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 05fcc90087b0..5f2feb63b72a 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -246,17 +246,21 @@ void fpu_finit(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpu_finit);
 
-int fpu_alloc(struct fpu *fpu)
+int fpstate_alloc(struct fpu *fpu)
 {
 	if (fpu->state)
 		return 0;
+
 	fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
 	if (!fpu->state)
 		return -ENOMEM;
+
+	/* The CPU requires the FPU state to be aligned to 16 byte boundaries: */
 	WARN_ON((unsigned long)fpu->state & 15);
+
 	return 0;
 }
-EXPORT_SYMBOL_GPL(fpu_alloc);
+EXPORT_SYMBOL_GPL(fpstate_alloc);
 
 /*
  * Allocate the backing store for the current task's FPU registers
@@ -276,7 +280,7 @@ int fpstate_alloc_init(struct task_struct *curr)
 	/*
 	 * Memory allocation at the first usage of the FPU and other state.
 	 */
-	ret = fpu_alloc(&curr->thread.fpu);
+	ret = fpstate_alloc(&curr->thread.fpu);
 	if (ret)
 		return ret;
 
@@ -310,7 +314,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	/*
 	 * Memory allocation at the first usage of the FPU and other state.
 	 */
-	ret = fpu_alloc(&child->thread.fpu);
+	ret = fpstate_alloc(&child->thread.fpu);
 	if (ret)
 		return ret;
 
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index e2220e38ff6d..7b54c81403d5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -92,7 +92,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	dst->thread.fpu.state = NULL;
 	task_disable_lazy_fpu_restore(dst);
 	if (tsk_used_math(src)) {
-		int err = fpu_alloc(&dst->thread.fpu);
+		int err = fpstate_alloc(&dst->thread.fpu);
 		if (err)
 			return err;
 		fpu_copy(dst, src);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bfc396632ee8..28fa733bb1c6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7007,7 +7007,7 @@ int fx_init(struct kvm_vcpu *vcpu)
 {
 	int err;
 
-	err = fpu_alloc(&vcpu->arch.guest_fpu);
+	err = fpstate_alloc(&vcpu->arch.guest_fpu);
 	if (err)
 		return err;
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 014/208] x86/fpu: Rename fpu_free() to fpstate_free()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (12 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 013/208] x86/fpu: Rename fpu_alloc() to fpstate_alloc() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 015/208] x86/fpu: Rename fpu_finit() to fpstate_init() Ingo Molnar
                   ` (65 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Use the fpu__*() namespace.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 2 +-
 arch/x86/kernel/process.c           | 2 +-
 arch/x86/kvm/x86.c                  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 88f584ab3463..b68e8f04c38a 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -571,7 +571,7 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 
 extern int fpstate_alloc(struct fpu *fpu);
 
-static inline void fpu_free(struct fpu *fpu)
+static inline void fpstate_free(struct fpu *fpu)
 {
 	if (fpu->state) {
 		kmem_cache_free(task_xstate_cachep, fpu->state);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 7b54c81403d5..d9a02e6392d8 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -102,7 +102,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 
 void free_thread_xstate(struct task_struct *tsk)
 {
-	fpu_free(&tsk->thread.fpu);
+	fpstate_free(&tsk->thread.fpu);
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 28fa733bb1c6..80a411c83083 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7029,7 +7029,7 @@ EXPORT_SYMBOL_GPL(fx_init);
 
 static void fx_free(struct kvm_vcpu *vcpu)
 {
-	fpu_free(&vcpu->arch.guest_fpu);
+	fpstate_free(&vcpu->arch.guest_fpu);
 }
 
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 015/208] x86/fpu: Rename fpu_finit() to fpstate_init()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (13 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 014/208] x86/fpu: Rename fpu_free() to fpstate_free() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 016/208] x86/fpu: Rename fpu_init() to fpu__cpu_init() Ingo Molnar
                   ` (64 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Make it clear that we are initializing the in-memory FPU context area,
not the FPU registers.

Also move it to the fpu__*() namespace.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h | 2 +-
 arch/x86/kernel/i387.c      | 8 ++++----
 arch/x86/kernel/xsave.c     | 2 +-
 arch/x86/kvm/x86.c          | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 0367d17371f5..6552a16e0e38 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -19,8 +19,8 @@ struct pt_regs;
 struct user_i387_struct;
 
 extern int fpstate_alloc_init(struct task_struct *curr);
+extern void fpstate_init(struct fpu *fpu);
 
-extern void fpu_finit(struct fpu *fpu);
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void math_state_restore(void);
 
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 5f2feb63b72a..e0c16e86deb0 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -225,7 +225,7 @@ void fpu_init(void)
 	eager_fpu_init();
 }
 
-void fpu_finit(struct fpu *fpu)
+void fpstate_init(struct fpu *fpu)
 {
 	if (!cpu_has_fpu) {
 		finit_soft_fpu(&fpu->state->soft);
@@ -244,7 +244,7 @@ void fpu_finit(struct fpu *fpu)
 		fp->fos = 0xffff0000u;
 	}
 }
-EXPORT_SYMBOL_GPL(fpu_finit);
+EXPORT_SYMBOL_GPL(fpstate_init);
 
 int fpstate_alloc(struct fpu *fpu)
 {
@@ -284,7 +284,7 @@ int fpstate_alloc_init(struct task_struct *curr)
 	if (ret)
 		return ret;
 
-	fpu_finit(&curr->thread.fpu);
+	fpstate_init(&curr->thread.fpu);
 
 	/* Safe to do for the current task: */
 	curr->flags |= PF_USED_MATH;
@@ -318,7 +318,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	if (ret)
 		return ret;
 
-	fpu_finit(&child->thread.fpu);
+	fpstate_init(&child->thread.fpu);
 
 	/* Safe to do for stopped child tasks: */
 	child->flags |= PF_USED_MATH;
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index a977cdd03825..163b5cc582ef 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -395,7 +395,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 
 		if (__copy_from_user(&fpu->state->xsave, buf_fx, state_size) ||
 		    __copy_from_user(&env, buf, sizeof(env))) {
-			fpu_finit(fpu);
+			fpstate_init(fpu);
 			err = -1;
 		} else {
 			sanitize_restored_xstate(tsk, &env, xstate_bv, fx_only);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 80a411c83083..26b1f89fc608 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7011,7 +7011,7 @@ int fx_init(struct kvm_vcpu *vcpu)
 	if (err)
 		return err;
 
-	fpu_finit(&vcpu->arch.guest_fpu);
+	fpstate_init(&vcpu->arch.guest_fpu);
 	if (cpu_has_xsaves)
 		vcpu->arch.guest_fpu.state->xsave.xsave_hdr.xcomp_bv =
 			host_xcr0 | XSTATE_COMPACTION_ENABLED;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 016/208] x86/fpu: Rename fpu_init() to fpu__cpu_init()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (14 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 015/208] x86/fpu: Rename fpu_finit() to fpstate_init() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 017/208] x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size() Ingo Molnar
                   ` (63 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

fpu_init() is a bit of a misnomer in that it (falsely) creates the
impression that it's related to the (old) fpu_finit() function,
which initializes FPU ctx state.

Rename it to fpu__cpu_init() to make its boot-time initialization
role clear, and to move it to the fpu__*() namespace.

Also fix and extend its comment block to point out that it's
called not only on the boot CPU, but on secondary CPUs as well.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  2 +-
 arch/x86/kernel/cpu/common.c        |  4 ++--
 arch/x86/kernel/i387.c              | 10 ++++++----
 arch/x86/xen/enlighten.c            |  2 +-
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index b68e8f04c38a..02e0e97d8be7 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -39,7 +39,7 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
 #endif
 
 extern unsigned int mxcsr_feature_mask;
-extern void fpu_init(void);
+extern void fpu__cpu_init(void);
 extern void eager_fpu_init(void);
 
 DECLARE_PER_CPU(struct task_struct *, fpu_owner_task);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 60a29d290e2a..b8035b8dd186 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1439,7 +1439,7 @@ void cpu_init(void)
 	clear_all_debug_regs();
 	dbg_restore_debug_regs();
 
-	fpu_init();
+	fpu__cpu_init();
 
 	if (is_uv_system())
 		uv_cpu_init();
@@ -1495,7 +1495,7 @@ void cpu_init(void)
 	clear_all_debug_regs();
 	dbg_restore_debug_regs();
 
-	fpu_init();
+	fpu__cpu_init();
 }
 #endif
 
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index e0c16e86deb0..5b4672584e65 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -183,11 +183,13 @@ static void init_thread_xstate(void)
 }
 
 /*
- * Called at bootup to set up the initial FPU state that is later cloned
- * into all processes.
+ * Called on the boot CPU at bootup to set up the initial FPU state that
+ * is later cloned into all processes.
+ *
+ * Also called on secondary CPUs to set up the FPU state of their
+ * idle threads.
  */
-
-void fpu_init(void)
+void fpu__cpu_init(void)
 {
 	unsigned long cr0;
 	unsigned long cr4_mask = 0;
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 94578efd3067..64715168b2b6 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1423,7 +1423,7 @@ static void xen_pvh_set_cr_flags(int cpu)
 		return;
 	/*
 	 * For BSP, PSE PGE are set in probe_page_size_mask(), for APs
-	 * set them here. For all, OSFXSR OSXMMEXCPT are set in fpu_init.
+	 * set them here. For all, OSFXSR OSXMMEXCPT are set in fpu__cpu_init().
 	*/
 	if (cpu_has_pse)
 		cr4_set_bits_and_update_boot(X86_CR4_PSE);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 017/208] x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (15 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 016/208] x86/fpu: Rename fpu_init() to fpu__cpu_init() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 018/208] x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter Ingo Molnar
                   ` (62 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

So init_thread_xstate() is a misnomer in that it's not really related to a specific
thread - it determines, once during initial bootup, the size of the xstate context.

Also improve the comments.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/i387.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 5b4672584e65..01101553c6c1 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -158,7 +158,7 @@ static void mxcsr_feature_mask_init(void)
 	mxcsr_feature_mask &= mask;
 }
 
-static void init_thread_xstate(void)
+static void fpstate_xstate_init_size(void)
 {
 	/*
 	 * Note that xstate_size might be overwriten later during
@@ -216,11 +216,11 @@ void fpu__cpu_init(void)
 	write_cr0(cr0);
 
 	/*
-	 * init_thread_xstate is only called once to avoid overriding
-	 * xstate_size during boot time or during CPU hotplug.
+	 * fpstate_xstate_init_size() is only called once, to avoid overriding
+	 * 'xstate_size' during (secondary CPU) bootup or during CPU hotplug.
 	 */
 	if (xstate_size == 0)
-		init_thread_xstate();
+		fpstate_xstate_init_size();
 
 	mxcsr_feature_mask_init();
 	xsave_init();
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 018/208] x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (16 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 017/208] x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size() Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:23 ` [PATCH 019/208] x86/fpu: Improve the comment for the fpu::counter field Ingo Molnar
                   ` (61 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This field is kept separate from the main FPU state structure for
no good reason.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 10 +++++-----
 arch/x86/include/asm/processor.h    | 18 +++++++++---------
 arch/x86/kernel/process.c           |  2 +-
 arch/x86/kernel/traps.c             |  2 +-
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 02e0e97d8be7..f85d21b68901 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -384,7 +384,7 @@ static inline void drop_fpu(struct task_struct *tsk)
 	 * Forget coprocessor state..
 	 */
 	preempt_disable();
-	tsk->thread.fpu_counter = 0;
+	tsk->thread.fpu.counter = 0;
 
 	if (__thread_has_fpu(tsk)) {
 		/* Ignore delayed exceptions from user space */
@@ -441,7 +441,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 	 * or if the past 5 consecutive context-switches used math.
 	 */
 	fpu.preload = tsk_used_math(new) &&
-		      (use_eager_fpu() || new->thread.fpu_counter > 5);
+		      (use_eager_fpu() || new->thread.fpu.counter > 5);
 
 	if (__thread_has_fpu(old)) {
 		if (!__save_init_fpu(old))
@@ -454,16 +454,16 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 
 		/* Don't change CR0.TS if we just switch! */
 		if (fpu.preload) {
-			new->thread.fpu_counter++;
+			new->thread.fpu.counter++;
 			__thread_set_has_fpu(new);
 			prefetch(new->thread.fpu.state);
 		} else if (!use_eager_fpu())
 			stts();
 	} else {
-		old->thread.fpu_counter = 0;
+		old->thread.fpu.counter = 0;
 		task_disable_lazy_fpu_restore(old);
 		if (fpu.preload) {
-			new->thread.fpu_counter++;
+			new->thread.fpu.counter++;
 			if (fpu_lazy_restore(new, cpu))
 				fpu.preload = 0;
 			else
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2dc08c231a9a..64d6b5d97ce9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -433,6 +433,15 @@ struct fpu {
 	unsigned int last_cpu;
 	unsigned int has_fpu;
 	union thread_xstate *state;
+	/*
+	 * This counter contains the number of consecutive context switches
+	 * that the FPU is used. If this is over a threshold, the lazy fpu
+	 * saving becomes unlazy to save the trap. This is an unsigned char
+	 * so that after 256 times the counter wraps and the behavior turns
+	 * lazy again; this to deal with bursty apps that only use FPU for
+	 * a short time
+	 */
+	unsigned char counter;
 };
 
 #ifdef CONFIG_X86_64
@@ -535,15 +544,6 @@ struct thread_struct {
 	unsigned long		iopl;
 	/* Max allowed port in the bitmap, in bytes: */
 	unsigned		io_bitmap_max;
-	/*
-	 * fpu_counter contains the number of consecutive context switches
-	 * that the FPU is used. If this is over a threshold, the lazy fpu
-	 * saving becomes unlazy to save the trap. This is an unsigned char
-	 * so that after 256 times the counter wraps and the behavior turns
-	 * lazy again; this to deal with bursty apps that only use FPU for
-	 * a short time
-	 */
-	unsigned char fpu_counter;
 };
 
 /*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index d9a02e6392d8..999485ab6b3c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -87,7 +87,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
 	*dst = *src;
 
-	dst->thread.fpu_counter = 0;
+	dst->thread.fpu.counter = 0;
 	dst->thread.fpu.has_fpu = 0;
 	dst->thread.fpu.state = NULL;
 	task_disable_lazy_fpu_restore(dst);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index cf9c9627be19..231aa579d9cd 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -863,7 +863,7 @@ void math_state_restore(void)
 		fpu_reset_state(tsk);
 		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
 	} else {
-		tsk->thread.fpu_counter++;
+		tsk->thread.fpu.counter++;
 	}
 	kernel_fpu_enable();
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 019/208] x86/fpu: Improve the comment for the fpu::counter field
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (17 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 018/208] x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter Ingo Molnar
@ 2015-05-05 16:23 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 020/208] x86/fpu: Move FPU data structures to asm/fpu/types.h Ingo Molnar
                   ` (60 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This was pretty hard to read - improve it.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64d6b5d97ce9..28df85561730 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -435,11 +435,11 @@ struct fpu {
 	union thread_xstate *state;
 	/*
 	 * This counter contains the number of consecutive context switches
-	 * that the FPU is used. If this is over a threshold, the lazy fpu
-	 * saving becomes unlazy to save the trap. This is an unsigned char
-	 * so that after 256 times the counter wraps and the behavior turns
-	 * lazy again; this to deal with bursty apps that only use FPU for
-	 * a short time
+	 * during which the FPU stays used. If this is over a threshold, the
+	 * lazy fpu saving logic becomes unlazy, to save the trap overhead.
+	 * This is an unsigned char so that after 256 iterations the counter
+	 * wraps and the context switch behavior turns lazy again; this is to
+	 * deal with bursty apps that only use the FPU for a short time:
 	 */
 	unsigned char counter;
 };
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 020/208] x86/fpu: Move FPU data structures to asm/fpu/types.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (18 preceding siblings ...)
  2015-05-05 16:23 ` [PATCH 019/208] x86/fpu: Improve the comment for the fpu::counter field Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 021/208] x86/fpu: Clean up asm/fpu/types.h Ingo Molnar
                   ` (59 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move the FPU details to asm/fpu/types.h, to further factor out the
FPU code.

( As an added bonus, the 'struct orig_ist' definition now moves
  next to its other data types - the FPU definitions were
  slapped in the middle of them for some mysterious reason. )

No code changed.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/types.h | 132 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/processor.h | 132 +----------------------------------------------------------
 2 files changed, 133 insertions(+), 131 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
new file mode 100644
index 000000000000..e996023380d3
--- /dev/null
+++ b/arch/x86/include/asm/fpu/types.h
@@ -0,0 +1,132 @@
+
+#define	MXCSR_DEFAULT		0x1f80
+
+struct i387_fsave_struct {
+	u32			cwd;	/* FPU Control Word		*/
+	u32			swd;	/* FPU Status Word		*/
+	u32			twd;	/* FPU Tag Word			*/
+	u32			fip;	/* FPU IP Offset		*/
+	u32			fcs;	/* FPU IP Selector		*/
+	u32			foo;	/* FPU Operand Pointer Offset	*/
+	u32			fos;	/* FPU Operand Pointer Selector	*/
+
+	/* 8*10 bytes for each FP-reg = 80 bytes:			*/
+	u32			st_space[20];
+
+	/* Software status information [not touched by FSAVE ]:		*/
+	u32			status;
+};
+
+struct i387_fxsave_struct {
+	u16			cwd; /* Control Word			*/
+	u16			swd; /* Status Word			*/
+	u16			twd; /* Tag Word			*/
+	u16			fop; /* Last Instruction Opcode		*/
+	union {
+		struct {
+			u64	rip; /* Instruction Pointer		*/
+			u64	rdp; /* Data Pointer			*/
+		};
+		struct {
+			u32	fip; /* FPU IP Offset			*/
+			u32	fcs; /* FPU IP Selector			*/
+			u32	foo; /* FPU Operand Offset		*/
+			u32	fos; /* FPU Operand Selector		*/
+		};
+	};
+	u32			mxcsr;		/* MXCSR Register State */
+	u32			mxcsr_mask;	/* MXCSR Mask		*/
+
+	/* 8*16 bytes for each FP-reg = 128 bytes:			*/
+	u32			st_space[32];
+
+	/* 16*16 bytes for each XMM-reg = 256 bytes:			*/
+	u32			xmm_space[64];
+
+	u32			padding[12];
+
+	union {
+		u32		padding1[12];
+		u32		sw_reserved[12];
+	};
+
+} __attribute__((aligned(16)));
+
+struct i387_soft_struct {
+	u32			cwd;
+	u32			swd;
+	u32			twd;
+	u32			fip;
+	u32			fcs;
+	u32			foo;
+	u32			fos;
+	/* 8*10 bytes for each FP-reg = 80 bytes: */
+	u32			st_space[20];
+	u8			ftop;
+	u8			changed;
+	u8			lookahead;
+	u8			no_update;
+	u8			rm;
+	u8			alimit;
+	struct math_emu_info	*info;
+	u32			entry_eip;
+};
+
+struct ymmh_struct {
+	/* 16 * 16 bytes for each YMMH-reg = 256 bytes */
+	u32 ymmh_space[64];
+};
+
+/* We don't support LWP yet: */
+struct lwp_struct {
+	u8 reserved[128];
+};
+
+struct bndreg {
+	u64 lower_bound;
+	u64 upper_bound;
+} __packed;
+
+struct bndcsr {
+	u64 bndcfgu;
+	u64 bndstatus;
+} __packed;
+
+struct xsave_hdr_struct {
+	u64 xstate_bv;
+	u64 xcomp_bv;
+	u64 reserved[6];
+} __attribute__((packed));
+
+struct xsave_struct {
+	struct i387_fxsave_struct i387;
+	struct xsave_hdr_struct xsave_hdr;
+	struct ymmh_struct ymmh;
+	struct lwp_struct lwp;
+	struct bndreg bndreg[4];
+	struct bndcsr bndcsr;
+	/* new processor state extensions will go here */
+} __attribute__ ((packed, aligned (64)));
+
+union thread_xstate {
+	struct i387_fsave_struct	fsave;
+	struct i387_fxsave_struct	fxsave;
+	struct i387_soft_struct		soft;
+	struct xsave_struct		xsave;
+};
+
+struct fpu {
+	unsigned int last_cpu;
+	unsigned int has_fpu;
+	union thread_xstate *state;
+	/*
+	 * This counter contains the number of consecutive context switches
+	 * during which the FPU stays used. If this is over a threshold, the
+	 * lazy fpu saving logic becomes unlazy, to save the trap overhead.
+	 * This is an unsigned char so that after 256 iterations the counter
+	 * wraps and the context switch behavior turns lazy again; this is to
+	 * deal with bursty apps that only use the FPU for a short time:
+	 */
+	unsigned char counter;
+};
+
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 28df85561730..6b75c4b927ec 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -21,6 +21,7 @@ struct mm_struct;
 #include <asm/desc_defs.h>
 #include <asm/nops.h>
 #include <asm/special_insns.h>
+#include <asm/fpu/types.h>
 
 #include <linux/personality.h>
 #include <linux/cpumask.h>
@@ -313,137 +314,6 @@ struct orig_ist {
 	unsigned long		ist[7];
 };
 
-#define	MXCSR_DEFAULT		0x1f80
-
-struct i387_fsave_struct {
-	u32			cwd;	/* FPU Control Word		*/
-	u32			swd;	/* FPU Status Word		*/
-	u32			twd;	/* FPU Tag Word			*/
-	u32			fip;	/* FPU IP Offset		*/
-	u32			fcs;	/* FPU IP Selector		*/
-	u32			foo;	/* FPU Operand Pointer Offset	*/
-	u32			fos;	/* FPU Operand Pointer Selector	*/
-
-	/* 8*10 bytes for each FP-reg = 80 bytes:			*/
-	u32			st_space[20];
-
-	/* Software status information [not touched by FSAVE ]:		*/
-	u32			status;
-};
-
-struct i387_fxsave_struct {
-	u16			cwd; /* Control Word			*/
-	u16			swd; /* Status Word			*/
-	u16			twd; /* Tag Word			*/
-	u16			fop; /* Last Instruction Opcode		*/
-	union {
-		struct {
-			u64	rip; /* Instruction Pointer		*/
-			u64	rdp; /* Data Pointer			*/
-		};
-		struct {
-			u32	fip; /* FPU IP Offset			*/
-			u32	fcs; /* FPU IP Selector			*/
-			u32	foo; /* FPU Operand Offset		*/
-			u32	fos; /* FPU Operand Selector		*/
-		};
-	};
-	u32			mxcsr;		/* MXCSR Register State */
-	u32			mxcsr_mask;	/* MXCSR Mask		*/
-
-	/* 8*16 bytes for each FP-reg = 128 bytes:			*/
-	u32			st_space[32];
-
-	/* 16*16 bytes for each XMM-reg = 256 bytes:			*/
-	u32			xmm_space[64];
-
-	u32			padding[12];
-
-	union {
-		u32		padding1[12];
-		u32		sw_reserved[12];
-	};
-
-} __attribute__((aligned(16)));
-
-struct i387_soft_struct {
-	u32			cwd;
-	u32			swd;
-	u32			twd;
-	u32			fip;
-	u32			fcs;
-	u32			foo;
-	u32			fos;
-	/* 8*10 bytes for each FP-reg = 80 bytes: */
-	u32			st_space[20];
-	u8			ftop;
-	u8			changed;
-	u8			lookahead;
-	u8			no_update;
-	u8			rm;
-	u8			alimit;
-	struct math_emu_info	*info;
-	u32			entry_eip;
-};
-
-struct ymmh_struct {
-	/* 16 * 16 bytes for each YMMH-reg = 256 bytes */
-	u32 ymmh_space[64];
-};
-
-/* We don't support LWP yet: */
-struct lwp_struct {
-	u8 reserved[128];
-};
-
-struct bndreg {
-	u64 lower_bound;
-	u64 upper_bound;
-} __packed;
-
-struct bndcsr {
-	u64 bndcfgu;
-	u64 bndstatus;
-} __packed;
-
-struct xsave_hdr_struct {
-	u64 xstate_bv;
-	u64 xcomp_bv;
-	u64 reserved[6];
-} __attribute__((packed));
-
-struct xsave_struct {
-	struct i387_fxsave_struct i387;
-	struct xsave_hdr_struct xsave_hdr;
-	struct ymmh_struct ymmh;
-	struct lwp_struct lwp;
-	struct bndreg bndreg[4];
-	struct bndcsr bndcsr;
-	/* new processor state extensions will go here */
-} __attribute__ ((packed, aligned (64)));
-
-union thread_xstate {
-	struct i387_fsave_struct	fsave;
-	struct i387_fxsave_struct	fxsave;
-	struct i387_soft_struct		soft;
-	struct xsave_struct		xsave;
-};
-
-struct fpu {
-	unsigned int last_cpu;
-	unsigned int has_fpu;
-	union thread_xstate *state;
-	/*
-	 * This counter contains the number of consecutive context switches
-	 * during which the FPU stays used. If this is over a threshold, the
-	 * lazy fpu saving logic becomes unlazy, to save the trap overhead.
-	 * This is an unsigned char so that after 256 iterations the counter
-	 * wraps and the context switch behavior turns lazy again; this is to
-	 * deal with bursty apps that only use the FPU for a short time:
-	 */
-	unsigned char counter;
-};
-
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(struct orig_ist, orig_ist);
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 021/208] x86/fpu: Clean up asm/fpu/types.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (19 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 020/208] x86/fpu: Move FPU data structures to asm/fpu/types.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 022/208] x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/ Ingo Molnar
                   ` (58 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

 - add header guards

 - standardize vertical alignment

 - add comments about MPX

No code changed.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/types.h | 50 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index e996023380d3..efb520dcf38e 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -1,3 +1,8 @@
+/*
+ * FPU data structures:
+ */
+#ifndef _ASM_X86_FPU_H
+#define _ASM_X86_FPU_H
 
 #define	MXCSR_DEFAULT		0x1f80
 
@@ -52,6 +57,9 @@ struct i387_fxsave_struct {
 
 } __attribute__((aligned(16)));
 
+/*
+ * Software based FPU emulation state:
+ */
 struct i387_soft_struct {
 	u32			cwd;
 	u32			swd;
@@ -74,38 +82,39 @@ struct i387_soft_struct {
 
 struct ymmh_struct {
 	/* 16 * 16 bytes for each YMMH-reg = 256 bytes */
-	u32 ymmh_space[64];
+	u32				ymmh_space[64];
 };
 
 /* We don't support LWP yet: */
 struct lwp_struct {
-	u8 reserved[128];
+	u8				reserved[128];
 };
 
+/* Intel MPX support: */
 struct bndreg {
-	u64 lower_bound;
-	u64 upper_bound;
+	u64				lower_bound;
+	u64				upper_bound;
 } __packed;
 
 struct bndcsr {
-	u64 bndcfgu;
-	u64 bndstatus;
+	u64				bndcfgu;
+	u64				bndstatus;
 } __packed;
 
 struct xsave_hdr_struct {
-	u64 xstate_bv;
-	u64 xcomp_bv;
-	u64 reserved[6];
+	u64				xstate_bv;
+	u64				xcomp_bv;
+	u64				reserved[6];
 } __attribute__((packed));
 
 struct xsave_struct {
-	struct i387_fxsave_struct i387;
-	struct xsave_hdr_struct xsave_hdr;
-	struct ymmh_struct ymmh;
-	struct lwp_struct lwp;
-	struct bndreg bndreg[4];
-	struct bndcsr bndcsr;
-	/* new processor state extensions will go here */
+	struct i387_fxsave_struct	i387;
+	struct xsave_hdr_struct		xsave_hdr;
+	struct ymmh_struct		ymmh;
+	struct lwp_struct		lwp;
+	struct bndreg			bndreg[4];
+	struct bndcsr			bndcsr;
+	/* New processor state extensions will go here. */
 } __attribute__ ((packed, aligned (64)));
 
 union thread_xstate {
@@ -116,9 +125,9 @@ union thread_xstate {
 };
 
 struct fpu {
-	unsigned int last_cpu;
-	unsigned int has_fpu;
-	union thread_xstate *state;
+	unsigned int			last_cpu;
+	unsigned int			has_fpu;
+	union thread_xstate		*state;
 	/*
 	 * This counter contains the number of consecutive context switches
 	 * during which the FPU stays used. If this is over a threshold, the
@@ -127,6 +136,7 @@ struct fpu {
 	 * wraps and the context switch behavior turns lazy again; this is to
 	 * deal with bursty apps that only use the FPU for a short time:
 	 */
-	unsigned char counter;
+	unsigned char			counter;
 };
 
+#endif /* _ASM_X86_FPU_H */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 022/208] x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (20 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 021/208] x86/fpu: Clean up asm/fpu/types.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 023/208] x86/fpu: Fix header file dependencies of fpu-internal.h Ingo Molnar
                   ` (57 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Create a new subdirectory for the FPU support code in arch/x86/kernel/fpu/.

Rename 'i387.c' to 'core.c' - as this file really collects the core FPU
support code, nothing i387-specific.

We'll better organize this directory in later patches.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/Makefile               | 2 +-
 arch/x86/kernel/fpu/Makefile           | 5 +++++
 arch/x86/kernel/{i387.c => fpu/core.c} | 0
 arch/x86/kernel/{ => fpu}/xsave.c      | 0
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9bcd0b56ca17..febaf180621b 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -44,7 +44,7 @@ obj-y			+= pci-iommu_table.o
 obj-y			+= resource.o
 
 obj-y				+= process.o
-obj-y				+= i387.o xsave.o
+obj-y				+= fpu/
 obj-y				+= ptrace.o
 obj-$(CONFIG_X86_32)		+= tls.o
 obj-$(CONFIG_IA32_EMULATION)	+= tls.o
diff --git a/arch/x86/kernel/fpu/Makefile b/arch/x86/kernel/fpu/Makefile
new file mode 100644
index 000000000000..89fd66a4b3a1
--- /dev/null
+++ b/arch/x86/kernel/fpu/Makefile
@@ -0,0 +1,5 @@
+#
+# Build rules for the FPU support code:
+#
+
+obj-y				+= core.o xsave.o
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/fpu/core.c
similarity index 100%
rename from arch/x86/kernel/i387.c
rename to arch/x86/kernel/fpu/core.c
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/fpu/xsave.c
similarity index 100%
rename from arch/x86/kernel/xsave.c
rename to arch/x86/kernel/fpu/xsave.c
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 023/208] x86/fpu: Fix header file dependencies of fpu-internal.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (21 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 022/208] x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/ Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 024/208] x86/fpu: Split out the boot time FPU init code into fpu/init.c Ingo Molnar
                   ` (56 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Fix a minor header file dependency bug in asm/fpu-internal.h: it
relies on i387.h but does not include it. All users of fpu-internal.h
included it explicitly.

Also remove unnecessary includes, to reduce compilation time.

This also makes it easier to use it as a standalone header file
for FPU internals, such as an upcoming C module in arch/x86/kernel/fpu/.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/crc32c-intel_glue.c | 1 -
 arch/x86/crypto/sha-mb/sha1_mb.c    | 1 -
 arch/x86/ia32/ia32_signal.c         | 1 -
 arch/x86/include/asm/fpu-internal.h | 9 ++-------
 arch/x86/kernel/cpu/common.c        | 1 -
 arch/x86/kernel/process.c           | 1 -
 arch/x86/kernel/process_32.c        | 1 -
 arch/x86/kernel/process_64.c        | 1 -
 arch/x86/kernel/ptrace.c            | 1 -
 arch/x86/kernel/signal.c            | 1 -
 arch/x86/kernel/smpboot.c           | 1 -
 arch/x86/kernel/traps.c             | 1 -
 arch/x86/kvm/x86.c                  | 2 +-
 arch/x86/mm/mpx.c                   | 1 -
 14 files changed, 3 insertions(+), 20 deletions(-)

diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 28640c3d6af7..470522cb042a 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -32,7 +32,6 @@
 
 #include <asm/cpufeature.h>
 #include <asm/cpu_device_id.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 
 #define CHKSUM_BLOCK_SIZE	1
diff --git a/arch/x86/crypto/sha-mb/sha1_mb.c b/arch/x86/crypto/sha-mb/sha1_mb.c
index e510b1c5d690..15373786494f 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha-mb/sha1_mb.c
@@ -65,7 +65,6 @@
 #include <crypto/mcryptd.h>
 #include <crypto/crypto_wq.h>
 #include <asm/byteorder.h>
-#include <asm/i387.h>
 #include <asm/xcr.h>
 #include <asm/xsave.h>
 #include <linux/hardirq.h>
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index c81d35e6c7f1..4bafd5b05aca 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -21,7 +21,6 @@
 #include <linux/binfmts.h>
 #include <asm/ucontext.h>
 #include <asm/uaccess.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/ptrace.h>
 #include <asm/ia32_unistd.h>
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index f85d21b68901..c3b7bd12f18f 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -10,18 +10,13 @@
 #ifndef _FPU_INTERNAL_H
 #define _FPU_INTERNAL_H
 
-#include <linux/kernel_stat.h>
 #include <linux/regset.h>
 #include <linux/compat.h>
 #include <linux/slab.h>
-#include <asm/asm.h>
-#include <asm/cpufeature.h>
-#include <asm/processor.h>
-#include <asm/sigcontext.h>
+
 #include <asm/user.h>
-#include <asm/uaccess.h>
+#include <asm/i387.h>
 #include <asm/xsave.h>
-#include <asm/smap.h>
 
 #ifdef CONFIG_X86_64
 # include <asm/sigcontext32.h>
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b8035b8dd186..220ad95e0e28 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -31,7 +31,6 @@
 #include <asm/setup.h>
 #include <asm/apic.h>
 #include <asm/desc.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/mtrr.h>
 #include <linux/numa.h>
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 999485ab6b3c..5daa6547fdc7 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -25,7 +25,6 @@
 #include <asm/idle.h>
 #include <asm/uaccess.h>
 #include <asm/mwait.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/debugreg.h>
 #include <asm/nmi.h>
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 8ed2106b06da..84d647d4b14d 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -39,7 +39,6 @@
 #include <asm/pgtable.h>
 #include <asm/ldt.h>
 #include <asm/processor.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/desc.h>
 #ifdef CONFIG_MATH_EMULATION
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ddfdbf74f174..ae6efeccb46e 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -38,7 +38,6 @@
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index a7bc79480719..69451b8965f7 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -28,7 +28,6 @@
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/debugreg.h>
 #include <asm/ldt.h>
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 1ea14fd53933..35f867aa597e 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -26,7 +26,6 @@
 
 #include <asm/processor.h>
 #include <asm/ucontext.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/vdso.h>
 #include <asm/mce.h>
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 50e547eac8cd..60e331ceb844 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,7 +68,6 @@
 #include <asm/mwait.h>
 #include <asm/apic.h>
 #include <asm/io_apic.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/setup.h>
 #include <asm/uv/uv.h>
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 231aa579d9cd..465b335e7491 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -54,7 +54,6 @@
 #include <asm/ftrace.h>
 #include <asm/traps.h>
 #include <asm/desc.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
 #include <asm/mce.h>
 #include <asm/fixmap.h>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26b1f89fc608..be276e0fe0ff 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -59,7 +59,7 @@
 #include <asm/desc.h>
 #include <asm/mtrr.h>
 #include <asm/mce.h>
-#include <asm/i387.h>
+#include <linux/kernel_stat.h>
 #include <asm/fpu-internal.h> /* Ugh! */
 #include <asm/xcr.h>
 #include <asm/pvclock.h>
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index c439ec478216..37ad432e7f16 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -10,7 +10,6 @@
 #include <linux/syscalls.h>
 #include <linux/sched/sysctl.h>
 
-#include <asm/i387.h>
 #include <asm/insn.h>
 #include <asm/mman.h>
 #include <asm/mmu_context.h>
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 024/208] x86/fpu: Split out the boot time FPU init code into fpu/init.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (22 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 023/208] x86/fpu: Fix header file dependencies of fpu-internal.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 025/208] x86/fpu: Remove unnecessary includes from core.c Ingo Molnar
                   ` (55 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move the boot-time FPU initialization code into init.c, to better
isolate it in its own domain.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/Makefile |  2 +-
 arch/x86/kernel/fpu/core.c   | 88 ------------------------------------------------------------
 arch/x86/kernel/fpu/init.c   | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+), 89 deletions(-)

diff --git a/arch/x86/kernel/fpu/Makefile b/arch/x86/kernel/fpu/Makefile
index 89fd66a4b3a1..50464a716b87 100644
--- a/arch/x86/kernel/fpu/Makefile
+++ b/arch/x86/kernel/fpu/Makefile
@@ -2,4 +2,4 @@
 # Build rules for the FPU support code:
 #
 
-obj-y				+= core.o xsave.o
+obj-y				+= init.o core.o xsave.o
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 01101553c6c1..9866a580952f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -139,94 +139,6 @@ void fpu__save(struct task_struct *tsk)
 }
 EXPORT_SYMBOL_GPL(fpu__save);
 
-unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
-unsigned int xstate_size;
-EXPORT_SYMBOL_GPL(xstate_size);
-static struct i387_fxsave_struct fx_scratch;
-
-static void mxcsr_feature_mask_init(void)
-{
-	unsigned long mask = 0;
-
-	if (cpu_has_fxsr) {
-		memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
-		asm volatile("fxsave %0" : "+m" (fx_scratch));
-		mask = fx_scratch.mxcsr_mask;
-		if (mask == 0)
-			mask = 0x0000ffbf;
-	}
-	mxcsr_feature_mask &= mask;
-}
-
-static void fpstate_xstate_init_size(void)
-{
-	/*
-	 * Note that xstate_size might be overwriten later during
-	 * xsave_init().
-	 */
-
-	if (!cpu_has_fpu) {
-		/*
-		 * Disable xsave as we do not support it if i387
-		 * emulation is enabled.
-		 */
-		setup_clear_cpu_cap(X86_FEATURE_XSAVE);
-		setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
-		xstate_size = sizeof(struct i387_soft_struct);
-		return;
-	}
-
-	if (cpu_has_fxsr)
-		xstate_size = sizeof(struct i387_fxsave_struct);
-	else
-		xstate_size = sizeof(struct i387_fsave_struct);
-}
-
-/*
- * Called on the boot CPU at bootup to set up the initial FPU state that
- * is later cloned into all processes.
- *
- * Also called on secondary CPUs to set up the FPU state of their
- * idle threads.
- */
-void fpu__cpu_init(void)
-{
-	unsigned long cr0;
-	unsigned long cr4_mask = 0;
-
-#ifndef CONFIG_MATH_EMULATION
-	if (!cpu_has_fpu) {
-		pr_emerg("No FPU found and no math emulation present\n");
-		pr_emerg("Giving up\n");
-		for (;;)
-			asm volatile("hlt");
-	}
-#endif
-	if (cpu_has_fxsr)
-		cr4_mask |= X86_CR4_OSFXSR;
-	if (cpu_has_xmm)
-		cr4_mask |= X86_CR4_OSXMMEXCPT;
-	if (cr4_mask)
-		cr4_set_bits(cr4_mask);
-
-	cr0 = read_cr0();
-	cr0 &= ~(X86_CR0_TS|X86_CR0_EM); /* clear TS and EM */
-	if (!cpu_has_fpu)
-		cr0 |= X86_CR0_EM;
-	write_cr0(cr0);
-
-	/*
-	 * fpstate_xstate_init_size() is only called once, to avoid overriding
-	 * 'xstate_size' during (secondary CPU) bootup or during CPU hotplug.
-	 */
-	if (xstate_size == 0)
-		fpstate_xstate_init_size();
-
-	mxcsr_feature_mask_init();
-	xsave_init();
-	eager_fpu_init();
-}
-
 void fpstate_init(struct fpu *fpu)
 {
 	if (!cpu_has_fpu) {
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
new file mode 100644
index 000000000000..0a666298abbd
--- /dev/null
+++ b/arch/x86/kernel/fpu/init.c
@@ -0,0 +1,93 @@
+/*
+ * x86 FPU boot time init code
+ */
+#include <asm/fpu-internal.h>
+#include <asm/tlbflush.h>
+
+unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int xstate_size;
+EXPORT_SYMBOL_GPL(xstate_size);
+static struct i387_fxsave_struct fx_scratch;
+
+static void mxcsr_feature_mask_init(void)
+{
+	unsigned long mask = 0;
+
+	if (cpu_has_fxsr) {
+		memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
+		asm volatile("fxsave %0" : "+m" (fx_scratch));
+		mask = fx_scratch.mxcsr_mask;
+		if (mask == 0)
+			mask = 0x0000ffbf;
+	}
+	mxcsr_feature_mask &= mask;
+}
+
+static void fpstate_xstate_init_size(void)
+{
+	/*
+	 * Note that xstate_size might be overwriten later during
+	 * xsave_init().
+	 */
+
+	if (!cpu_has_fpu) {
+		/*
+		 * Disable xsave as we do not support it if i387
+		 * emulation is enabled.
+		 */
+		setup_clear_cpu_cap(X86_FEATURE_XSAVE);
+		setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
+		xstate_size = sizeof(struct i387_soft_struct);
+		return;
+	}
+
+	if (cpu_has_fxsr)
+		xstate_size = sizeof(struct i387_fxsave_struct);
+	else
+		xstate_size = sizeof(struct i387_fsave_struct);
+}
+
+/*
+ * Called on the boot CPU at bootup to set up the initial FPU state that
+ * is later cloned into all processes.
+ *
+ * Also called on secondary CPUs to set up the FPU state of their
+ * idle threads.
+ */
+void fpu__cpu_init(void)
+{
+	unsigned long cr0;
+	unsigned long cr4_mask = 0;
+
+#ifndef CONFIG_MATH_EMULATION
+	if (!cpu_has_fpu) {
+		pr_emerg("No FPU found and no math emulation present\n");
+		pr_emerg("Giving up\n");
+		for (;;)
+			asm volatile("hlt");
+	}
+#endif
+	if (cpu_has_fxsr)
+		cr4_mask |= X86_CR4_OSFXSR;
+	if (cpu_has_xmm)
+		cr4_mask |= X86_CR4_OSXMMEXCPT;
+	if (cr4_mask)
+		cr4_set_bits(cr4_mask);
+
+	cr0 = read_cr0();
+	cr0 &= ~(X86_CR0_TS|X86_CR0_EM); /* clear TS and EM */
+	if (!cpu_has_fpu)
+		cr0 |= X86_CR0_EM;
+	write_cr0(cr0);
+
+	/*
+	 * fpstate_xstate_init_size() is only called once, to avoid overriding
+	 * 'xstate_size' during (secondary CPU) bootup or during CPU hotplug.
+	 */
+	if (xstate_size == 0)
+		fpstate_xstate_init_size();
+
+	mxcsr_feature_mask_init();
+	xsave_init();
+	eager_fpu_init();
+}
-- 
2.1.0



* [PATCH 025/208] x86/fpu: Remove unnecessary includes from core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (23 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 024/208] x86/fpu: Split out the boot time FPU init code into fpu/init.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 026/208] x86/fpu: Move the no_387 handling and FPU detection code into init.c Ingo Molnar
                   ` (54 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

fpu/core.c includes a lot of header files, mostly for historical reasons.

It only needs fpu-internal.h, which already includes all
the required headers.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9866a580952f..b05199fa168c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -5,20 +5,7 @@
  *  General FPU state handling cleanups
  *	Gareth Hughes <gareth@valinux.com>, May 2000
  */
-#include <linux/module.h>
-#include <linux/regset.h>
-#include <linux/sched.h>
-#include <linux/slab.h>
-
-#include <asm/sigcontext.h>
-#include <asm/processor.h>
-#include <asm/math_emu.h>
-#include <asm/tlbflush.h>
-#include <asm/uaccess.h>
-#include <asm/ptrace.h>
-#include <asm/i387.h>
 #include <asm/fpu-internal.h>
-#include <asm/user.h>
 
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
-- 
2.1.0



* [PATCH 026/208] x86/fpu: Move the no_387 handling and FPU detection code into init.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (24 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 025/208] x86/fpu: Remove unnecessary includes from core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 027/208] x86/fpu: Remove the free_thread_xstate() complication Ingo Molnar
                   ` (53 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Both no_387() and fpu__detect() run at boot time, so they belong
in init.c.
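
For context, the probe being moved works by executing FNINIT and then
reading back the FPU status and control words: on a present FPU the
status word reads back as 0 and the control word, masked with 0x103f,
reads back as 0x003f. A simplified restatement of the check (an
illustrative sketch only, mirroring the hunk below):

	u16 fsw = 0xffff, fcw = 0xffff;		/* poison values */

	asm volatile("fninit ; fnstsw %0 ; fnstcw %1"
		     : "+m" (fsw), "+m" (fcw));

	if (fsw == 0 && (fcw & 0x103f) == 0x003f)
		; /* FPU present */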

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 34 ----------------------------------
 arch/x86/kernel/fpu/init.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index b05199fa168c..9211582f5d3f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -581,37 +581,3 @@ int dump_fpu(struct pt_regs *regs, struct user_i387_struct *fpu)
 EXPORT_SYMBOL(dump_fpu);
 
 #endif	/* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
-
-static int __init no_387(char *s)
-{
-	setup_clear_cpu_cap(X86_FEATURE_FPU);
-	return 1;
-}
-
-__setup("no387", no_387);
-
-/*
- * Set the X86_FEATURE_FPU CPU-capability bit based on
- * trying to execute an actual sequence of FPU instructions:
- */
-void fpu__detect(struct cpuinfo_x86 *c)
-{
-	unsigned long cr0;
-	u16 fsw, fcw;
-
-	fsw = fcw = 0xffff;
-
-	cr0 = read_cr0();
-	cr0 &= ~(X86_CR0_TS | X86_CR0_EM);
-	write_cr0(cr0);
-
-	asm volatile("fninit ; fnstsw %0 ; fnstcw %1"
-		     : "+m" (fsw), "+m" (fcw));
-
-	if (fsw == 0 && (fcw & 0x103f) == 0x003f)
-		set_cpu_cap(c, X86_FEATURE_FPU);
-	else
-		clear_cpu_cap(c, X86_FEATURE_FPU);
-
-	/* The final cr0 value is set in fpu_init() */
-}
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 0a666298abbd..5e06aa6cc22e 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -91,3 +91,37 @@ void fpu__cpu_init(void)
 	xsave_init();
 	eager_fpu_init();
 }
+
+static int __init no_387(char *s)
+{
+	setup_clear_cpu_cap(X86_FEATURE_FPU);
+	return 1;
+}
+
+__setup("no387", no_387);
+
+/*
+ * Set the X86_FEATURE_FPU CPU-capability bit based on
+ * trying to execute an actual sequence of FPU instructions:
+ */
+void fpu__detect(struct cpuinfo_x86 *c)
+{
+	unsigned long cr0;
+	u16 fsw, fcw;
+
+	fsw = fcw = 0xffff;
+
+	cr0 = read_cr0();
+	cr0 &= ~(X86_CR0_TS | X86_CR0_EM);
+	write_cr0(cr0);
+
+	asm volatile("fninit ; fnstsw %0 ; fnstcw %1"
+		     : "+m" (fsw), "+m" (fcw));
+
+	if (fsw == 0 && (fcw & 0x103f) == 0x003f)
+		set_cpu_cap(c, X86_FEATURE_FPU);
+	else
+		clear_cpu_cap(c, X86_FEATURE_FPU);
+
+	/* The final cr0 value is set in fpu_init() */
+}
-- 
2.1.0



* [PATCH 027/208] x86/fpu: Remove the free_thread_xstate() complication
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (25 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 026/208] x86/fpu: Move the no_387 handling and FPU detection code into init.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 028/208] x86/fpu: Factor out fpu__flush_thread() from flush_thread() Ingo Molnar
                   ` (52 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Use fpstate_free() directly to manage FPU state.

Only process.c was using this wrapper, so this is a small speedup as
well, as it removes the extra function call and the related register
clobbers.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 1 -
 arch/x86/kernel/process.c        | 9 ++-------
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6b75c4b927ec..fef8db024ece 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -362,7 +362,6 @@ DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
 extern unsigned int xstate_size;
-extern void free_thread_xstate(struct task_struct *);
 extern struct kmem_cache *task_xstate_cachep;
 
 struct perf_event;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 5daa6547fdc7..a9bff373f7eb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -99,14 +99,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
-void free_thread_xstate(struct task_struct *tsk)
-{
-	fpstate_free(&tsk->thread.fpu);
-}
-
 void arch_release_task_struct(struct task_struct *tsk)
 {
-	free_thread_xstate(tsk);
+	fpstate_free(&tsk->thread.fpu);
 }
 
 void arch_task_cache_init(void)
@@ -154,7 +149,7 @@ void flush_thread(void)
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
 		drop_fpu(tsk);
-		free_thread_xstate(tsk);
+		fpstate_free(&tsk->thread.fpu);
 	} else if (!used_math()) {
 		/* kthread execs. TODO: cleanup this horror. */
 		if (WARN_ON(fpstate_alloc_init(tsk)))
-- 
2.1.0



* [PATCH 028/208] x86/fpu: Factor out fpu__flush_thread() from flush_thread()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (26 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 027/208] x86/fpu: Remove the free_thread_xstate() complication Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 029/208] x86/fpu: Move math_state_restore() to fpu/core.c Ingo Molnar
                   ` (51 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

flush_thread() open-codes a lot of FPU internals - create a separate
function for it in fpu/core.c.

Turns out that this does not hurt performance:

   text    data     bss     dec     hex filename
   11843039        1884440 1130496 14857975         e2b6f7 vmlinux.before
   11843039        1884440 1130496 14857975         e2b6f7 vmlinux.after

and since this is a slowpath, clarity comes first anyway.

We can reconsider inlining decisions after the FPU code has been cleaned up.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h |  1 +
 arch/x86/kernel/fpu/core.c  | 15 +++++++++++++++
 arch/x86/kernel/process.c   | 12 +-----------
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 6552a16e0e38..d6fc84440b73 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -20,6 +20,7 @@ struct user_i387_struct;
 
 extern int fpstate_alloc_init(struct task_struct *curr);
 extern void fpstate_init(struct fpu *fpu);
+extern void fpu__flush_thread(struct task_struct *tsk);
 
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void math_state_restore(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9211582f5d3f..d31812d973a3 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -227,6 +227,21 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	return 0;
 }
 
+void fpu__flush_thread(struct task_struct *tsk)
+{
+	if (!use_eager_fpu()) {
+		/* FPU state will be reallocated lazily at the first use. */
+		drop_fpu(tsk);
+		fpstate_free(&tsk->thread.fpu);
+	} else if (!used_math()) {
+		/* kthread execs. TODO: cleanup this horror. */
+		if (WARN_ON(fpstate_alloc_init(tsk)))
+			force_sig(SIGKILL, tsk);
+		user_fpu_begin();
+		restore_init_xstate();
+	}
+}
+
 /*
  * The xstateregs_active() routine is the same as the fpregs_active() routine,
  * as the "regset->n" for the xstate regset will be updated based on the feature
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index a9bff373f7eb..0a4c35c3fb2f 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -146,17 +146,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	if (!use_eager_fpu()) {
-		/* FPU state will be reallocated lazily at the first use. */
-		drop_fpu(tsk);
-		fpstate_free(&tsk->thread.fpu);
-	} else if (!used_math()) {
-		/* kthread execs. TODO: cleanup this horror. */
-		if (WARN_ON(fpstate_alloc_init(tsk)))
-			force_sig(SIGKILL, tsk);
-		user_fpu_begin();
-		restore_init_xstate();
-	}
+	fpu__flush_thread(tsk);
 }
 
 static void hard_disable_TSC(void)
-- 
2.1.0



* [PATCH 029/208] x86/fpu: Move math_state_restore() to fpu/core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (27 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 028/208] x86/fpu: Factor out fpu__flush_thread() from flush_thread() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 030/208] x86/fpu: Rename math_state_restore() to fpu__restore() Ingo Molnar
                   ` (50 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

math_state_restore() is a piece of FPU internals that is better off
living close to the rest of the FPU code in fpu/core.c.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/traps.c    | 42 ------------------------------------------
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d31812d973a3..0451e3074b55 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -227,6 +227,48 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	return 0;
 }
 
+/*
+ * 'math_state_restore()' saves the current math information in the
+ * old math state array, and gets the new ones from the current task
+ *
+ * Careful.. There are problems with IBM-designed IRQ13 behaviour.
+ * Don't touch unless you *really* know how it works.
+ *
+ * Must be called with kernel preemption disabled (eg with local
+ * local interrupts as in the case of do_device_not_available).
+ */
+void math_state_restore(void)
+{
+	struct task_struct *tsk = current;
+
+	if (!tsk_used_math(tsk)) {
+		local_irq_enable();
+		/*
+		 * does a slab alloc which can sleep
+		 */
+		if (fpstate_alloc_init(tsk)) {
+			/*
+			 * ran out of memory!
+			 */
+			do_group_exit(SIGKILL);
+			return;
+		}
+		local_irq_disable();
+	}
+
+	/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
+	kernel_fpu_disable();
+	__thread_fpu_begin(tsk);
+	if (unlikely(restore_fpu_checking(tsk))) {
+		fpu_reset_state(tsk);
+		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
+	} else {
+		tsk->thread.fpu.counter++;
+	}
+	kernel_fpu_enable();
+}
+EXPORT_SYMBOL_GPL(math_state_restore);
+
 void fpu__flush_thread(struct task_struct *tsk)
 {
 	if (!use_eager_fpu()) {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 465b335e7491..63c7fc3677b4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -826,48 +826,6 @@ asmlinkage __visible void __attribute__((weak)) smp_threshold_interrupt(void)
 {
 }
 
-/*
- * 'math_state_restore()' saves the current math information in the
- * old math state array, and gets the new ones from the current task
- *
- * Careful.. There are problems with IBM-designed IRQ13 behaviour.
- * Don't touch unless you *really* know how it works.
- *
- * Must be called with kernel preemption disabled (eg with local
- * local interrupts as in the case of do_device_not_available).
- */
-void math_state_restore(void)
-{
-	struct task_struct *tsk = current;
-
-	if (!tsk_used_math(tsk)) {
-		local_irq_enable();
-		/*
-		 * does a slab alloc which can sleep
-		 */
-		if (fpstate_alloc_init(tsk)) {
-			/*
-			 * ran out of memory!
-			 */
-			do_group_exit(SIGKILL);
-			return;
-		}
-		local_irq_disable();
-	}
-
-	/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
-	kernel_fpu_disable();
-	__thread_fpu_begin(tsk);
-	if (unlikely(restore_fpu_checking(tsk))) {
-		fpu_reset_state(tsk);
-		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
-	} else {
-		tsk->thread.fpu.counter++;
-	}
-	kernel_fpu_enable();
-}
-EXPORT_SYMBOL_GPL(math_state_restore);
-
 dotraplinkage void
 do_device_not_available(struct pt_regs *regs, long error_code)
 {
-- 
2.1.0



* [PATCH 030/208] x86/fpu: Rename math_state_restore() to fpu__restore()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (28 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 029/208] x86/fpu: Move math_state_restore() to fpu/core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 031/208] x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs() Ingo Molnar
                   ` (49 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move to the new fpu__*() namespace.
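
As the Documentation/preempt-locking.txt hunk below notes, the renamed
function must still be called with preemption disabled. A typical call
site (matching the __restore_xstate_sig() hunk in this patch) looks
like:

	preempt_disable();
	fpu__restore();			/* restore current's FPU registers */
	preempt_enable();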

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/preempt-locking.txt | 2 +-
 arch/x86/include/asm/i387.h       | 2 +-
 arch/x86/kernel/fpu/core.c        | 6 +++---
 arch/x86/kernel/fpu/xsave.c       | 2 +-
 arch/x86/kernel/process_32.c      | 2 +-
 arch/x86/kernel/process_64.c      | 2 +-
 arch/x86/kernel/traps.c           | 2 +-
 drivers/lguest/x86/core.c         | 4 ++--
 8 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt
index 57883ca2498b..e89ce6624af2 100644
--- a/Documentation/preempt-locking.txt
+++ b/Documentation/preempt-locking.txt
@@ -48,7 +48,7 @@ preemption must be disabled around such regions.
 
 Note, some FPU functions are already explicitly preempt safe.  For example,
 kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
-However, math_state_restore must be called with preemption disabled.
+However, fpu__restore() must be called with preemption disabled.
 
 
 RULE #3: Lock acquire and release must be performed by same task
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index d6fc84440b73..c8ee395dd6c6 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -23,7 +23,7 @@ extern void fpstate_init(struct fpu *fpu);
 extern void fpu__flush_thread(struct task_struct *tsk);
 
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
-extern void math_state_restore(void);
+extern void fpu__restore(void);
 
 extern bool irq_fpu_usable(void);
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 0451e3074b55..1896344dd8e6 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -228,7 +228,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 }
 
 /*
- * 'math_state_restore()' saves the current math information in the
+ * 'fpu__restore()' saves the current math information in the
  * old math state array, and gets the new ones from the current task
  *
  * Careful.. There are problems with IBM-designed IRQ13 behaviour.
@@ -237,7 +237,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
  * Must be called with kernel preemption disabled (eg with local
  * local interrupts as in the case of do_device_not_available).
  */
-void math_state_restore(void)
+void fpu__restore(void)
 {
 	struct task_struct *tsk = current;
 
@@ -267,7 +267,7 @@ void math_state_restore(void)
 	}
 	kernel_fpu_enable();
 }
-EXPORT_SYMBOL_GPL(math_state_restore);
+EXPORT_SYMBOL_GPL(fpu__restore);
 
 void fpu__flush_thread(struct task_struct *tsk)
 {
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 163b5cc582ef..d913d5024901 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -404,7 +404,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 		set_used_math();
 		if (use_eager_fpu()) {
 			preempt_disable();
-			math_state_restore();
+			fpu__restore();
 			preempt_enable();
 		}
 
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 84d647d4b14d..1a0edce626b2 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -295,7 +295,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	 * Leave lazy mode, flushing any hypercalls made here.
 	 * This must be done before restoring TLS segments so
 	 * the GDT and LDT are properly updated, and must be
-	 * done before math_state_restore, so the TS bit is up
+	 * done before fpu__restore(), so the TS bit is up
 	 * to date.
 	 */
 	arch_end_context_switch(next_p);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ae6efeccb46e..99cc4b8589ad 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -298,7 +298,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	 * Leave lazy mode, flushing any hypercalls made here.  This
 	 * must be done after loading TLS entries in the GDT but before
 	 * loading segments that might reference them, and and it must
-	 * be done before math_state_restore, so the TS bit is up to
+	 * be done before fpu__restore(), so the TS bit is up to
 	 * date.
 	 */
 	arch_end_context_switch(next_p);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 63c7fc3677b4..22ad90a40dbf 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -846,7 +846,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 		return;
 	}
 #endif
-	math_state_restore(); /* interrupts still off */
+	fpu__restore(); /* interrupts still off */
 #ifdef CONFIG_X86_32
 	conditional_sti(regs);
 #endif
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index 30f2aef69d78..bcb534a5512d 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -297,12 +297,12 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
 	/*
 	 * Similarly, if we took a trap because the Guest used the FPU,
 	 * we have to restore the FPU it expects to see.
-	 * math_state_restore() may sleep and we may even move off to
+	 * fpu__restore() may sleep and we may even move off to
 	 * a different CPU. So all the critical stuff should be done
 	 * before this.
 	 */
 	else if (cpu->regs->trapnum == 7 && !user_has_fpu())
-		math_state_restore();
+		fpu__restore();
 }
 
 /*H:130
-- 
2.1.0



* [PATCH 031/208] x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (29 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 030/208] x86/fpu: Rename math_state_restore() to fpu__restore() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 032/208] x86/fpu: Simplify the xsave_state*() methods Ingo Molnar
                   ` (48 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move the boot-time FPU bug detection code next to the other boot-time
FPU init code in fpu/init.c.

No change in code size:

   text    data     bss     dec     hex filename
   13044568        1884440 1130496 16059504         f50c70 vmlinux.before
   13044568        1884440 1130496 16059504         f50c70 vmlinux.after
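
For reference, the check_fpu() sequence being moved computes the
difference between x and (x/y)*y entirely in x87 registers and stores
it as a 32-bit integer; any nonzero result sets X86_BUG_FDIV:

	4195835 - (4195835 / 3145727) * 3145727  =  0    on a correct FPU
	                                         =  256  on an FDIV-affected Pentium

(The 256 value is the classic published result for these operands,
quoted here only for illustration; the code cares solely about zero
vs. nonzero.)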

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h |  1 +
 arch/x86/kernel/cpu/bugs.c  | 53 +----------------------------------------------------
 arch/x86/kernel/fpu/init.c  | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index c8ee395dd6c6..89ae3e051741 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -24,6 +24,7 @@ extern void fpu__flush_thread(struct task_struct *tsk);
 
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void fpu__restore(void);
+extern void fpu__init_check_bugs(void);
 
 extern bool irq_fpu_usable(void);
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 03445346ee0a..eb8be0c5823b 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -17,52 +17,6 @@
 #include <asm/paravirt.h>
 #include <asm/alternative.h>
 
-static double __initdata x = 4195835.0;
-static double __initdata y = 3145727.0;
-
-/*
- * This used to check for exceptions..
- * However, it turns out that to support that,
- * the XMM trap handlers basically had to
- * be buggy. So let's have a correct XMM trap
- * handler, and forget about printing out
- * some status at boot.
- *
- * We should really only care about bugs here
- * anyway. Not features.
- */
-static void __init check_fpu(void)
-{
-	s32 fdiv_bug;
-
-	kernel_fpu_begin();
-
-	/*
-	 * trap_init() enabled FXSR and company _before_ testing for FP
-	 * problems here.
-	 *
-	 * Test for the divl bug: http://en.wikipedia.org/wiki/Fdiv_bug
-	 */
-	__asm__("fninit\n\t"
-		"fldl %1\n\t"
-		"fdivl %2\n\t"
-		"fmull %2\n\t"
-		"fldl %1\n\t"
-		"fsubp %%st,%%st(1)\n\t"
-		"fistpl %0\n\t"
-		"fwait\n\t"
-		"fninit"
-		: "=m" (*&fdiv_bug)
-		: "m" (*&x), "m" (*&y));
-
-	kernel_fpu_end();
-
-	if (fdiv_bug) {
-		set_cpu_bug(&boot_cpu_data, X86_BUG_FDIV);
-		pr_warn("Hmm, FPU with FDIV bug\n");
-	}
-}
-
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -85,10 +39,5 @@ void __init check_bugs(void)
 		'0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
 	alternative_instructions();
 
-	/*
-	 * kernel_fpu_begin/end() in check_fpu() relies on the patched
-	 * alternative instructions.
-	 */
-	if (cpu_has_fpu)
-		check_fpu();
+	fpu__init_check_bugs();
 }
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 5e06aa6cc22e..4eabb426e910 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -4,6 +4,69 @@
 #include <asm/fpu-internal.h>
 #include <asm/tlbflush.h>
 
+/*
+ * Boot time CPU/FPU FDIV bug detection code:
+ */
+
+static double __initdata x = 4195835.0;
+static double __initdata y = 3145727.0;
+
+/*
+ * This used to check for exceptions..
+ * However, it turns out that to support that,
+ * the XMM trap handlers basically had to
+ * be buggy. So let's have a correct XMM trap
+ * handler, and forget about printing out
+ * some status at boot.
+ *
+ * We should really only care about bugs here
+ * anyway. Not features.
+ */
+static void __init check_fpu(void)
+{
+	s32 fdiv_bug;
+
+	kernel_fpu_begin();
+
+	/*
+	 * trap_init() enabled FXSR and company _before_ testing for FP
+	 * problems here.
+	 *
+	 * Test for the divl bug: http://en.wikipedia.org/wiki/Fdiv_bug
+	 */
+	__asm__("fninit\n\t"
+		"fldl %1\n\t"
+		"fdivl %2\n\t"
+		"fmull %2\n\t"
+		"fldl %1\n\t"
+		"fsubp %%st,%%st(1)\n\t"
+		"fistpl %0\n\t"
+		"fwait\n\t"
+		"fninit"
+		: "=m" (*&fdiv_bug)
+		: "m" (*&x), "m" (*&y));
+
+	kernel_fpu_end();
+
+	if (fdiv_bug) {
+		set_cpu_bug(&boot_cpu_data, X86_BUG_FDIV);
+		pr_warn("Hmm, FPU with FDIV bug\n");
+	}
+}
+
+void fpu__init_check_bugs(void)
+{
+	/*
+	 * kernel_fpu_begin/end() in check_fpu() relies on the patched
+	 * alternative instructions.
+	 */
+	if (cpu_has_fpu)
+		check_fpu();
+}
+
+/*
+ * Boot time FPU feature detection code:
+ */
 unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
 unsigned int xstate_size;
 EXPORT_SYMBOL_GPL(xstate_size);
-- 
2.1.0



* [PATCH 032/208] x86/fpu: Simplify the xsave_state*() methods
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (30 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 031/208] x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 033/208] x86/fpu: Remove fpu_xsave() Ingo Molnar
                   ` (47 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

These functions (xsave_state() and xsave_state_booting()) have a 'mask'
argument that is always -1.

Propagate this into the functions instead and eliminate the extra argument.

This does not change the generated code, because these are inline functions.
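
With the argument gone, XSAVE is always asked to save every enabled
state component; the implicit all-ones mask splits into the EDX:EAX
register pair that the instruction takes (a sketch of the arithmetic,
mirroring the hunks below):

	u64 mask  = -1;			/* all components: 0xffffffffffffffff */
	u32 lmask = mask;		/* low  32 bits -> EAX: 0xffffffff    */
	u32 hmask = mask >> 32;		/* high 32 bits -> EDX: 0xffffffff    */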

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 4 ++--
 arch/x86/include/asm/xsave.h        | 8 +++++---
 arch/x86/kernel/fpu/xsave.c         | 3 ++-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c3b7bd12f18f..b1f5dd63cfeb 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -527,9 +527,9 @@ static inline void __save_fpu(struct task_struct *tsk)
 {
 	if (use_xsave()) {
 		if (unlikely(system_state == SYSTEM_BOOTING))
-			xsave_state_booting(&tsk->thread.fpu.state->xsave, -1);
+			xsave_state_booting(&tsk->thread.fpu.state->xsave);
 		else
-			xsave_state(&tsk->thread.fpu.state->xsave, -1);
+			xsave_state(&tsk->thread.fpu.state->xsave);
 	} else
 		fpu_fxsave(&tsk->thread.fpu);
 }
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 58ed0ca5a11e..61c951ce77fe 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -70,8 +70,9 @@ extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline int xsave_state_booting(struct xsave_struct *fx, u64 mask)
+static inline int xsave_state_booting(struct xsave_struct *fx)
 {
+	u64 mask = -1;
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err = 0;
@@ -123,8 +124,9 @@ static inline int xrstor_state_booting(struct xsave_struct *fx, u64 mask)
 /*
  * Save processor xstate to xsave area.
  */
-static inline int xsave_state(struct xsave_struct *fx, u64 mask)
+static inline int xsave_state(struct xsave_struct *fx)
 {
+	u64 mask = -1;
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err = 0;
@@ -189,7 +191,7 @@ static inline int xrstor_state(struct xsave_struct *fx, u64 mask)
  */
 static inline void fpu_xsave(struct fpu *fpu)
 {
-	xsave_state(&fpu->state->xsave, -1);
+	xsave_state(&fpu->state->xsave);
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index d913d5024901..a52205b87acb 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -558,11 +558,12 @@ static void __init setup_init_fpu_buf(void)
 	 * Init all the features state with header_bv being 0x0
 	 */
 	xrstor_state_booting(init_xstate_buf, -1);
+
 	/*
 	 * Dump the init state again. This is to identify the init state
 	 * of any feature which is not represented by all zero's.
 	 */
-	xsave_state_booting(init_xstate_buf, -1);
+	xsave_state_booting(init_xstate_buf);
 }
 
 static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
-- 
2.1.0



* [PATCH 033/208] x86/fpu: Remove fpu_xsave()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (31 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 032/208] x86/fpu: Simplify the xsave_state*() methods Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 034/208] x86/fpu: Move task_xstate_cachep handling to core.c Ingo Molnar
                   ` (46 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

It's a pointless wrapper now - use xsave_state().

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 2 +-
 arch/x86/include/asm/xsave.h        | 8 --------
 arch/x86/mm/mpx.c                   | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index b1f5dd63cfeb..95e04cb1ed2f 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -265,7 +265,7 @@ static inline void fpu_fxsave(struct fpu *fpu)
 static inline int fpu_save_init(struct fpu *fpu)
 {
 	if (use_xsave()) {
-		fpu_xsave(fpu);
+		xsave_state(&fpu->state->xsave);
 
 		/*
 		 * xsave header may indicate the init state of the FP.
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 61c951ce77fe..7c90ea93c54e 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -187,14 +187,6 @@ static inline int xrstor_state(struct xsave_struct *fx, u64 mask)
 }
 
 /*
- * Save xstate context for old process during context switch.
- */
-static inline void fpu_xsave(struct fpu *fpu)
-{
-	xsave_state(&fpu->state->xsave);
-}
-
-/*
  * Restore xstate context for new process during context switch.
  */
 static inline int fpu_xrstor_checking(struct xsave_struct *fx)
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 37ad432e7f16..412b5f81e547 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -389,7 +389,7 @@ int mpx_enable_management(struct task_struct *tsk)
 	 * directory into XSAVE/XRSTOR Save Area and enable MPX through
 	 * XRSTOR instruction.
 	 *
-	 * fpu_xsave() is expected to be very expensive. Storing the bounds
+	 * xsave_state() is expected to be very expensive. Storing the bounds
 	 * directory here means that we do not have to do xsave in the unmap
 	 * path; we can just use mm->bd_addr instead.
 	 */
-- 
2.1.0



* [PATCH 034/208] x86/fpu: Move task_xstate_cachep handling to core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (32 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 033/208] x86/fpu: Remove fpu_xsave() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 035/208] x86/fpu: Factor out fpu__copy() Ingo Molnar
                   ` (45 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This code was historically in process.c; now that the FPU core internals
live in fpu/core.c, move it there instead.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  2 ++
 arch/x86/kernel/fpu/core.c          | 15 +++++++++++++++
 arch/x86/kernel/process.c           |  9 +--------
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 95e04cb1ed2f..f41170c6d376 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -564,6 +564,8 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 	}
 }
 
+extern void fpstate_cache_init(void);
+
 extern int fpstate_alloc(struct fpu *fpu);
 
 static inline void fpstate_free(struct fpu *fpu)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1896344dd8e6..9dc4bb3f6f5a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -147,6 +147,21 @@ void fpstate_init(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
+/*
+ * FPU state allocation:
+ */
+struct kmem_cache *task_xstate_cachep;
+EXPORT_SYMBOL_GPL(task_xstate_cachep);
+
+void fpstate_cache_init(void)
+{
+	task_xstate_cachep =
+		kmem_cache_create("task_xstate", xstate_size,
+				  __alignof__(union thread_xstate),
+				  SLAB_PANIC | SLAB_NOTRACK, NULL);
+	setup_xstate_comp();
+}
+
 int fpstate_alloc(struct fpu *fpu)
 {
 	if (fpu->state)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0a4c35c3fb2f..fda613eeeebd 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -75,9 +75,6 @@ void idle_notifier_unregister(struct notifier_block *n)
 EXPORT_SYMBOL_GPL(idle_notifier_unregister);
 #endif
 
-struct kmem_cache *task_xstate_cachep;
-EXPORT_SYMBOL_GPL(task_xstate_cachep);
-
 /*
  * this gets called so that we can store lazy state into memory and copy the
  * current task into the new thread.
@@ -106,11 +103,7 @@ void arch_release_task_struct(struct task_struct *tsk)
 
 void arch_task_cache_init(void)
 {
-        task_xstate_cachep =
-        	kmem_cache_create("task_xstate", xstate_size,
-				  __alignof__(union thread_xstate),
-				  SLAB_PANIC | SLAB_NOTRACK, NULL);
-	setup_xstate_comp();
+	fpstate_cache_init();
 }
 
 /*
-- 
2.1.0



* [PATCH 035/208] x86/fpu: Factor out fpu__copy()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (33 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 034/208] x86/fpu: Move task_xstate_cachep handling to core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 036/208] x86/fpu: Uninline fpstate_free() and move it next to the allocation function Ingo Molnar
                   ` (44 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Introduce fpu__copy() and use it in arch_dup_task_struct(),
thus moving another chunk of FPU logic to fpu/core.c.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  1 +
 arch/x86/kernel/fpu/core.c          | 18 ++++++++++++++++++
 arch/x86/kernel/process.c           | 12 +-----------
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index f41170c6d376..6c1ceb7c3f9a 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -567,6 +567,7 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 extern void fpstate_cache_init(void);
 
 extern int fpstate_alloc(struct fpu *fpu);
+extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
 
 static inline void fpstate_free(struct fpu *fpu)
 {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9dc4bb3f6f5a..e02c42965f53 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -178,6 +178,24 @@ int fpstate_alloc(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpstate_alloc);
 
+int fpu__copy(struct task_struct *dst, struct task_struct *src)
+{
+	dst->thread.fpu.counter = 0;
+	dst->thread.fpu.has_fpu = 0;
+	dst->thread.fpu.state = NULL;
+
+	task_disable_lazy_fpu_restore(dst);
+
+	if (tsk_used_math(src)) {
+		int err = fpstate_alloc(&dst->thread.fpu);
+
+		if (err)
+			return err;
+		fpu_copy(dst, src);
+	}
+	return 0;
+}
+
 /*
  * Allocate the backing store for the current task's FPU registers
  * and initialize the registers themselves as well.
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index fda613eeeebd..1b4ea12e412d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -83,17 +83,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
 	*dst = *src;
 
-	dst->thread.fpu.counter = 0;
-	dst->thread.fpu.has_fpu = 0;
-	dst->thread.fpu.state = NULL;
-	task_disable_lazy_fpu_restore(dst);
-	if (tsk_used_math(src)) {
-		int err = fpstate_alloc(&dst->thread.fpu);
-		if (err)
-			return err;
-		fpu_copy(dst, src);
-	}
-	return 0;
+	return fpu__copy(dst, src);
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
-- 
2.1.0



* [PATCH 036/208] x86/fpu: Uninline fpstate_free() and move it next to the allocation function
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (34 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 035/208] x86/fpu: Factor out fpu__copy() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 037/208] x86/fpu: Make task_xstate_cachep static Ingo Molnar
                   ` (43 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 9 +--------
 arch/x86/kernel/fpu/core.c          | 9 +++++++++
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 6c1ceb7c3f9a..16a1c66cf4ee 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -567,16 +567,9 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
 extern void fpstate_cache_init(void);
 
 extern int fpstate_alloc(struct fpu *fpu);
+extern void fpstate_free(struct fpu *fpu);
 extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
 
-static inline void fpstate_free(struct fpu *fpu)
-{
-	if (fpu->state) {
-		kmem_cache_free(task_xstate_cachep, fpu->state);
-		fpu->state = NULL;
-	}
-}
-
 static inline void fpu_copy(struct task_struct *dst, struct task_struct *src)
 {
 	if (use_eager_fpu()) {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e02c42965f53..4209105d2854 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -178,6 +178,15 @@ int fpstate_alloc(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpstate_alloc);
 
+void fpstate_free(struct fpu *fpu)
+{
+	if (fpu->state) {
+		kmem_cache_free(task_xstate_cachep, fpu->state);
+		fpu->state = NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(fpstate_free);
+
 int fpu__copy(struct task_struct *dst, struct task_struct *src)
 {
 	dst->thread.fpu.counter = 0;
-- 
2.1.0



* [PATCH 037/208] x86/fpu: Make task_xstate_cachep static
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (35 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 036/208] x86/fpu: Uninline fpstate_free() and move it next to the allocation function Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 038/208] x86/fpu: Make kernel_fpu_disable/enable() static Ingo Molnar
                   ` (42 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

It's now local to fpu/core.c, so make it static.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 1 -
 arch/x86/kernel/fpu/core.c       | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index fef8db024ece..d50cc7f61559 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -362,7 +362,6 @@ DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
 extern unsigned int xstate_size;
-extern struct kmem_cache *task_xstate_cachep;
 
 struct perf_event;
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 4209105d2854..1896e96f1082 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -150,8 +150,7 @@ EXPORT_SYMBOL_GPL(fpstate_init);
 /*
  * FPU state allocation:
  */
-struct kmem_cache *task_xstate_cachep;
-EXPORT_SYMBOL_GPL(task_xstate_cachep);
+static struct kmem_cache *task_xstate_cachep;
 
 void fpstate_cache_init(void)
 {
-- 
2.1.0



* [PATCH 038/208] x86/fpu: Make kernel_fpu_disable/enable() static
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (36 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 037/208] x86/fpu: Make task_xstate_cachep static Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 039/208] x86/fpu: Add debug check to kernel_fpu_disable() Ingo Molnar
                   ` (41 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This allows the compiler to inline and eliminate them:

   arch/x86/kernel/fpu/core.o:

   text    data     bss     dec     hex filename
   6741       4       8    6753    1a61 core.o.before
   6716       4       8    6728    1a48 core.o.after

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h | 4 ----
 arch/x86/kernel/fpu/core.c  | 8 ++++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 89ae3e051741..e69989f95da5 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -54,10 +54,6 @@ static inline void kernel_fpu_end(void)
 	preempt_enable();
 }
 
-/* Must be called with preempt disabled */
-extern void kernel_fpu_disable(void);
-extern void kernel_fpu_enable(void);
-
 /*
  * Some instructions like VIA's padlock instructions generate a spurious
  * DNA fault but don't modify SSE registers. And these instructions
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1896e96f1082..587e4ab46f59 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -9,13 +9,13 @@
 
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
-void kernel_fpu_disable(void)
+static void kernel_fpu_disable(void)
 {
 	WARN_ON(this_cpu_read(in_kernel_fpu));
 	this_cpu_write(in_kernel_fpu, true);
 }
 
-void kernel_fpu_enable(void)
+static void kernel_fpu_enable(void)
 {
 	this_cpu_write(in_kernel_fpu, false);
 }
@@ -32,7 +32,7 @@ void kernel_fpu_enable(void)
  * Except for the eagerfpu case when we return true; in the likely case
  * the thread has FPU but we are not going to set/clear TS.
  */
-static inline bool interrupted_kernel_fpu_idle(void)
+static bool interrupted_kernel_fpu_idle(void)
 {
 	if (this_cpu_read(in_kernel_fpu))
 		return false;
@@ -52,7 +52,7 @@ static inline bool interrupted_kernel_fpu_idle(void)
  * in an interrupt context from user mode - we'll just
  * save the FPU state as required.
  */
-static inline bool interrupted_user_mode(void)
+static bool interrupted_user_mode(void)
 {
 	struct pt_regs *regs = get_irq_regs();
 	return regs && user_mode(regs);
-- 
2.1.0



* [PATCH 039/208] x86/fpu: Add debug check to kernel_fpu_disable()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (37 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 038/208] x86/fpu: Make kernel_fpu_disable/enable() static Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 040/208] x86/fpu: Add kernel_fpu_disabled() Ingo Molnar
                   ` (40 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

We are not supposed to call kernel_fpu_disable() if we have not
previously enabled it.

Also use kernel_fpu_disable()/enable() in the __kernel_fpu_begin/end()
primitives, instead of writing to in_kernel_fpu directly,
so that we get the debugging checks.
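
A hypothetical unbalanced sequence that the new WARN_ON_ONCE() would
flag (an illustrative sketch, not from this patch):

	kernel_fpu_begin();	/* __kernel_fpu_begin() -> kernel_fpu_disable() */
	kernel_fpu_end();	/* __kernel_fpu_end()   -> kernel_fpu_enable()  */
	kernel_fpu_end();	/* unbalanced: kernel_fpu_enable() now warns,   */
				/* because in_kernel_fpu is already false       */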

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 587e4ab46f59..bcf705751d02 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -17,6 +17,7 @@ static void kernel_fpu_disable(void)
 
 static void kernel_fpu_enable(void)
 {
+	WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu));
 	this_cpu_write(in_kernel_fpu, false);
 }
 
@@ -77,7 +78,7 @@ void __kernel_fpu_begin(void)
 {
 	struct task_struct *me = current;
 
-	this_cpu_write(in_kernel_fpu, true);
+	kernel_fpu_disable();
 
 	if (__thread_has_fpu(me)) {
 		__save_init_fpu(me);
@@ -100,7 +101,7 @@ void __kernel_fpu_end(void)
 		stts();
 	}
 
-	this_cpu_write(in_kernel_fpu, false);
+	kernel_fpu_enable();
 }
 EXPORT_SYMBOL(__kernel_fpu_end);
 
-- 
2.1.0



* [PATCH 040/208] x86/fpu: Add kernel_fpu_disabled()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (38 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 039/208] x86/fpu: Add debug check to kernel_fpu_disable() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 041/208] x86/fpu: Remove __save_init_fpu() Ingo Molnar
                   ` (39 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Instead of open-coding the in_kernel_fpu access, use
kernel_fpu_disabled() in interrupted_kernel_fpu_idle(), matching the
other kernel_fpu_*() methods.

Also add some documentation for in_kernel_fpu.
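
( As a reminder of what in_kernel_fpu ultimately protects, the typical
  caller pattern is something like the sketch below - my_simd_work() is
  a made-up illustration, not an existing function, while
  kernel_fpu_begin(), kernel_fpu_end() and irq_fpu_usable() are the real
  public API:

  static void my_simd_work(void)
  {
          if (!irq_fpu_usable())
                  return;                 /* fall back to scalar code */

          kernel_fpu_begin();
          /* ... use SSE/AVX registers here ... */
          kernel_fpu_end();
  }

  In IRQ context irq_fpu_usable() consults in_kernel_fpu via
  interrupted_kernel_fpu_idle(), so such sections cannot nest from
  interrupt context. )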

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index bcf705751d02..87f10b49da47 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -7,6 +7,17 @@
  */
 #include <asm/fpu-internal.h>
 
+/*
+ * Track whether the kernel is using the FPU state
+ * currently.
+ *
+ * This flag is used:
+ *
+ *   - by IRQ context code to potentially use the FPU
+ *     if it's unused.
+ *
+ *   - to debug kernel_fpu_begin()/end() correctness
+ */
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
 static void kernel_fpu_disable(void)
@@ -21,6 +32,11 @@ static void kernel_fpu_enable(void)
 	this_cpu_write(in_kernel_fpu, false);
 }
 
+static bool kernel_fpu_disabled(void)
+{
+	return this_cpu_read(in_kernel_fpu);
+}
+
 /*
  * Were we in an interrupt that interrupted kernel mode?
  *
@@ -35,7 +51,7 @@ static void kernel_fpu_enable(void)
  */
 static bool interrupted_kernel_fpu_idle(void)
 {
-	if (this_cpu_read(in_kernel_fpu))
+	if (kernel_fpu_disabled())
 		return false;
 
 	if (use_eager_fpu())
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 041/208] x86/fpu: Remove __save_init_fpu()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (39 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 040/208] x86/fpu: Add kernel_fpu_disabled() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 042/208] x86/fpu: Move fpu_copy() to fpu/core.c Ingo Molnar
                   ` (38 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

__save_init_fpu() is just a trivial wrapper around fpu_save_init().

Remove the extra layer of obfuscation.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 7 +------
 arch/x86/kernel/fpu/core.c          | 4 ++--
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 16a1c66cf4ee..1e2b6c67b1f1 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -295,11 +295,6 @@ static inline int fpu_save_init(struct fpu *fpu)
 	return 1;
 }
 
-static inline int __save_init_fpu(struct task_struct *tsk)
-{
-	return fpu_save_init(&tsk->thread.fpu);
-}
-
 static inline int fpu_restore_checking(struct fpu *fpu)
 {
 	if (use_xsave())
@@ -439,7 +434,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 		      (use_eager_fpu() || new->thread.fpu.counter > 5);
 
 	if (__thread_has_fpu(old)) {
-		if (!__save_init_fpu(old))
+		if (!fpu_save_init(&old->thread.fpu))
 			task_disable_lazy_fpu_restore(old);
 		else
 			old->thread.fpu.last_cpu = cpu;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 87f10b49da47..c99c79af48d2 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -97,7 +97,7 @@ void __kernel_fpu_begin(void)
 	kernel_fpu_disable();
 
 	if (__thread_has_fpu(me)) {
-		__save_init_fpu(me);
+		fpu_save_init(&me->thread.fpu);
 	} else {
 		this_cpu_write(fpu_owner_task, NULL);
 		if (!use_eager_fpu())
@@ -135,7 +135,7 @@ void fpu__save(struct task_struct *tsk)
 		if (use_eager_fpu()) {
 			__save_fpu(tsk);
 		} else {
-			__save_init_fpu(tsk);
+			fpu_save_init(&tsk->thread.fpu);
 			__thread_fpu_end(tsk);
 		}
 	}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 042/208] x86/fpu: Move fpu_copy() to fpu/core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (40 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 041/208] x86/fpu: Remove __save_init_fpu() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 043/208] x86/fpu: Add debugging check to fpu_copy() Ingo Molnar
                   ` (37 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move fpu_copy() where its only user is.

Beyond readability this also speeds up compilation, as fpu-internal.h
is included in over a dozen .c files.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 14 --------------
 arch/x86/kernel/fpu/core.c          | 14 ++++++++++++++
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 1e2b6c67b1f1..e180fb96dd0d 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -565,20 +565,6 @@ extern int fpstate_alloc(struct fpu *fpu);
 extern void fpstate_free(struct fpu *fpu);
 extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
 
-static inline void fpu_copy(struct task_struct *dst, struct task_struct *src)
-{
-	if (use_eager_fpu()) {
-		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
-		__save_fpu(dst);
-	} else {
-		struct fpu *dfpu = &dst->thread.fpu;
-		struct fpu *sfpu = &src->thread.fpu;
-
-		fpu__save(src);
-		memcpy(dfpu->state, sfpu->state, xstate_size);
-	}
-}
-
 static inline unsigned long
 alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx,
 		unsigned long *size)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index c99c79af48d2..3f3c79c75e37 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -203,6 +203,20 @@ void fpstate_free(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpstate_free);
 
+static void fpu_copy(struct task_struct *dst, struct task_struct *src)
+{
+	if (use_eager_fpu()) {
+		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
+		__save_fpu(dst);
+	} else {
+		struct fpu *dfpu = &dst->thread.fpu;
+		struct fpu *sfpu = &src->thread.fpu;
+
+		fpu__save(src);
+		memcpy(dfpu->state, sfpu->state, xstate_size);
+	}
+}
+
 int fpu__copy(struct task_struct *dst, struct task_struct *src)
 {
 	dst->thread.fpu.counter = 0;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 043/208] x86/fpu: Add debugging check to fpu_copy()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (41 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 042/208] x86/fpu: Move fpu_copy() to fpu/core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 044/208] x86/fpu: Print out whether we are doing lazy/eager FPU context switches Ingo Molnar
                   ` (36 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Add a debugging check to fpu_copy(): it must only ever be called with
'current' as the source task. Also add a bit of documentation.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3f3c79c75e37..7711539bcda5 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -203,8 +203,18 @@ void fpstate_free(struct fpu *fpu)
 }
 EXPORT_SYMBOL_GPL(fpstate_free);
 
+/*
+ * Copy the current task's FPU state to a new task's FPU context.
+ *
+ * In the 'eager' case we just save to the destination context.
+ *
+ * In the 'lazy' case we save to the source context, mark the FPU lazy
+ * via stts() and copy the source context into the destination context.
+ */
 static void fpu_copy(struct task_struct *dst, struct task_struct *src)
 {
+	WARN_ON(src != current);
+
 	if (use_eager_fpu()) {
 		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
 		__save_fpu(dst);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 044/208] x86/fpu: Print out whether we are doing lazy/eager FPU context switches
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (42 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 043/208] x86/fpu: Add debugging check to fpu_copy() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 045/208] x86/fpu: Eliminate the __thread_has_fpu() wrapper Ingo Molnar
                   ` (35 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Ever since the kernel started defaulting to eager FPU switches on modern
Intel CPUs, it has not been obvious whether a given system is using the
lazy or the eager FPU context switching logic.

So generate a boot message about which mode the FPU code is in:

  x86/fpu: Using 'lazy' FPU context switches.

or:

  x86/fpu: Using 'eager' FPU context switches.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/xsave.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index a52205b87acb..61696c5005eb 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -691,6 +691,8 @@ void __init_refok eager_fpu_init(void)
 	if (eagerfpu == ENABLE)
 		setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);
 
+	printk_once(KERN_INFO "x86/fpu: Using '%s' FPU context switches.\n", eagerfpu == ENABLE ? "eager" : "lazy");
+
 	if (!cpu_has_eager_fpu) {
 		stts();
 		return;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 045/208] x86/fpu: Eliminate the __thread_has_fpu() wrapper
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (43 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 044/208] x86/fpu: Print out whether we are doing lazy/eager FPU context switches Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 046/208] x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter Ingo Molnar
                   ` (34 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Start migrating FPU methods towards using 'struct fpu *fpu'
directly. __thread_has_fpu() is just a trivial wrapper around
fpu->has_fpu, so eliminate it.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 16 ++++------------
 arch/x86/kernel/fpu/core.c          | 17 ++++++++++-------
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index e180fb96dd0d..c005d1fc1247 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -323,16 +323,6 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 	return fpu_restore_checking(&tsk->thread.fpu);
 }
 
-/*
- * Software FPU state helpers. Careful: these need to
- * be preemption protection *and* they need to be
- * properly paired with the CR0.TS changes!
- */
-static inline int __thread_has_fpu(struct task_struct *tsk)
-{
-	return tsk->thread.fpu.has_fpu;
-}
-
 /* Must be paired with an 'stts' after! */
 static inline void __thread_clear_has_fpu(struct task_struct *tsk)
 {
@@ -370,13 +360,14 @@ static inline void __thread_fpu_begin(struct task_struct *tsk)
 
 static inline void drop_fpu(struct task_struct *tsk)
 {
+	struct fpu *fpu = &tsk->thread.fpu;
 	/*
 	 * Forget coprocessor state..
 	 */
 	preempt_disable();
 	tsk->thread.fpu.counter = 0;
 
-	if (__thread_has_fpu(tsk)) {
+	if (fpu->has_fpu) {
 		/* Ignore delayed exceptions from user space */
 		asm volatile("1: fwait\n"
 			     "2:\n"
@@ -424,6 +415,7 @@ typedef struct { int preload; } fpu_switch_t;
 
 static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct task_struct *new, int cpu)
 {
+	struct fpu *old_fpu = &old->thread.fpu;
 	fpu_switch_t fpu;
 
 	/*
@@ -433,7 +425,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 	fpu.preload = tsk_used_math(new) &&
 		      (use_eager_fpu() || new->thread.fpu.counter > 5);
 
-	if (__thread_has_fpu(old)) {
+	if (old_fpu->has_fpu) {
 		if (!fpu_save_init(&old->thread.fpu))
 			task_disable_lazy_fpu_restore(old);
 		else
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7711539bcda5..c0633ca8fece 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -57,8 +57,7 @@ static bool interrupted_kernel_fpu_idle(void)
 	if (use_eager_fpu())
 		return true;
 
-	return !__thread_has_fpu(current) &&
-		(read_cr0() & X86_CR0_TS);
+	return !current->thread.fpu.has_fpu && (read_cr0() & X86_CR0_TS);
 }
 
 /*
@@ -93,11 +92,12 @@ EXPORT_SYMBOL(irq_fpu_usable);
 void __kernel_fpu_begin(void)
 {
 	struct task_struct *me = current;
+	struct fpu *fpu = &me->thread.fpu;
 
 	kernel_fpu_disable();
 
-	if (__thread_has_fpu(me)) {
-		fpu_save_init(&me->thread.fpu);
+	if (fpu->has_fpu) {
+		fpu_save_init(fpu);
 	} else {
 		this_cpu_write(fpu_owner_task, NULL);
 		if (!use_eager_fpu())
@@ -109,8 +109,9 @@ EXPORT_SYMBOL(__kernel_fpu_begin);
 void __kernel_fpu_end(void)
 {
 	struct task_struct *me = current;
+	struct fpu *fpu = &me->thread.fpu;
 
-	if (__thread_has_fpu(me)) {
+	if (fpu->has_fpu) {
 		if (WARN_ON(restore_fpu_checking(me)))
 			fpu_reset_state(me);
 	} else if (!use_eager_fpu()) {
@@ -128,14 +129,16 @@ EXPORT_SYMBOL(__kernel_fpu_end);
  */
 void fpu__save(struct task_struct *tsk)
 {
+	struct fpu *fpu = &tsk->thread.fpu;
+
 	WARN_ON(tsk != current);
 
 	preempt_disable();
-	if (__thread_has_fpu(tsk)) {
+	if (fpu->has_fpu) {
 		if (use_eager_fpu()) {
 			__save_fpu(tsk);
 		} else {
-			fpu_save_init(&tsk->thread.fpu);
+			fpu_save_init(fpu);
 			__thread_fpu_end(tsk);
 		}
 	}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 046/208] x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (44 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 045/208] x86/fpu: Eliminate the __thread_has_fpu() wrapper Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 047/208] x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c Ingo Molnar
                   ` (33 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

We do this to make the code more readable, and also to be able to eliminate
task_struct usage from most of the FPU code.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c005d1fc1247..94c068b6238e 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -324,9 +324,9 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 }
 
 /* Must be paired with an 'stts' after! */
-static inline void __thread_clear_has_fpu(struct task_struct *tsk)
+static inline void __thread_clear_has_fpu(struct fpu *fpu)
 {
-	tsk->thread.fpu.has_fpu = 0;
+	fpu->has_fpu = 0;
 	this_cpu_write(fpu_owner_task, NULL);
 }
 
@@ -346,7 +346,7 @@ static inline void __thread_set_has_fpu(struct task_struct *tsk)
  */
 static inline void __thread_fpu_end(struct task_struct *tsk)
 {
-	__thread_clear_has_fpu(tsk);
+	__thread_clear_has_fpu(&tsk->thread.fpu);
 	if (!use_eager_fpu())
 		stts();
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 047/208] x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (45 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 046/208] x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 048/208] x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx Ingo Molnar
                   ` (32 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move it closer to other per-cpu FPU data structures.

This also unifies the 32-bit and 64-bit code.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/common.c | 3 ---
 arch/x86/kernel/fpu/core.c   | 5 +++++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 220ad95e0e28..88bb7a75f5c6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1182,8 +1182,6 @@ DEFINE_PER_CPU(unsigned int, irq_count) __visible = -1;
 DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
 EXPORT_PER_CPU_SYMBOL(__preempt_count);
 
-DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
-
 /*
  * Special IST stacks which the CPU switches to when it calls
  * an IST-marked descriptor entry. Up to 7 stacks (hardware
@@ -1274,7 +1272,6 @@ DEFINE_PER_CPU(struct task_struct *, current_task) = &init_task;
 EXPORT_PER_CPU_SYMBOL(current_task);
 DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
 EXPORT_PER_CPU_SYMBOL(__preempt_count);
-DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
 
 /*
  * On x86_32, vm86 modifies tss.sp0, so sp0 isn't a reliable way to find
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index c0633ca8fece..7942105a2d20 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -20,6 +20,11 @@
  */
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
+/*
+ * Track which task is using the FPU on the CPU:
+ */
+DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
+
 static void kernel_fpu_disable(void)
 {
 	WARN_ON(this_cpu_read(in_kernel_fpu));
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 048/208] x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (46 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 047/208] x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 049/208] x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu() Ingo Molnar
                   ` (31 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Track the owner FPU context instead of the owner task. Together with
other changes, this will allow subsequent patches to eliminate
'struct task_struct' usage from various parts of the FPU code, leaving
only 'struct fpu' in use.

There's no change in code size:

      text           data     bss      dec            hex filename
  13066467        2545248 1626112 17237827        1070743 vmlinux.before
  13066467        2545248 1626112 17237827        1070743 vmlinux.after
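
( I.e. the lazy-restore check becomes a pure context comparison - the
  per-CPU variable now points to the 'struct fpu' whose registers are
  live on this CPU:

  static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
  {
          return &new->thread.fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) &&
                  cpu == new->thread.fpu.last_cpu;
  }
)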

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 14 +++++++-------
 arch/x86/kernel/fpu/core.c          |  9 ++++-----
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 94c068b6238e..d0fe7bbb51d1 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -37,7 +37,7 @@ extern unsigned int mxcsr_feature_mask;
 extern void fpu__cpu_init(void);
 extern void eager_fpu_init(void);
 
-DECLARE_PER_CPU(struct task_struct *, fpu_owner_task);
+DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
 extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
 			      struct task_struct *tsk);
@@ -63,7 +63,7 @@ static inline void finit_soft_fpu(struct i387_soft_struct *soft) {}
 #endif
 
 /*
- * Must be run with preemption disabled: this clears the fpu_owner_task,
+ * Must be run with preemption disabled: this clears the fpu_fpregs_owner_ctx,
  * on this CPU.
  *
  * This will disable any lazy FPU state restore of the current FPU state,
@@ -71,7 +71,7 @@ static inline void finit_soft_fpu(struct i387_soft_struct *soft) {}
  */
 static inline void __cpu_disable_lazy_restore(unsigned int cpu)
 {
-	per_cpu(fpu_owner_task, cpu) = NULL;
+	per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
 }
 
 /*
@@ -86,7 +86,7 @@ static inline void task_disable_lazy_fpu_restore(struct task_struct *tsk)
 
 static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
 {
-	return new == this_cpu_read_stable(fpu_owner_task) &&
+	return &new->thread.fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) &&
 		cpu == new->thread.fpu.last_cpu;
 }
 
@@ -327,14 +327,14 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 static inline void __thread_clear_has_fpu(struct fpu *fpu)
 {
 	fpu->has_fpu = 0;
-	this_cpu_write(fpu_owner_task, NULL);
+	this_cpu_write(fpu_fpregs_owner_ctx, NULL);
 }
 
 /* Must be paired with a 'clts' before! */
 static inline void __thread_set_has_fpu(struct task_struct *tsk)
 {
 	tsk->thread.fpu.has_fpu = 1;
-	this_cpu_write(fpu_owner_task, tsk);
+	this_cpu_write(fpu_fpregs_owner_ctx, &tsk->thread.fpu);
 }
 
 /*
@@ -431,7 +431,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 		else
 			old->thread.fpu.last_cpu = cpu;
 
-		/* But leave fpu_owner_task! */
+		/* But leave fpu_fpregs_owner_ctx! */
 		old->thread.fpu.has_fpu = 0;
 
 		/* Don't change CR0.TS if we just switch! */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7942105a2d20..172315e29c25 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -21,9 +21,9 @@
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
 /*
- * Track which task is using the FPU on the CPU:
+ * Track which context is using the FPU on the CPU:
  */
-DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
+DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
 static void kernel_fpu_disable(void)
 {
@@ -96,15 +96,14 @@ EXPORT_SYMBOL(irq_fpu_usable);
 
 void __kernel_fpu_begin(void)
 {
-	struct task_struct *me = current;
-	struct fpu *fpu = &me->thread.fpu;
+	struct fpu *fpu = &current->thread.fpu;
 
 	kernel_fpu_disable();
 
 	if (fpu->has_fpu) {
 		fpu_save_init(fpu);
 	} else {
-		this_cpu_write(fpu_owner_task, NULL);
+		this_cpu_write(fpu_fpregs_owner_ctx, NULL);
 		if (!use_eager_fpu())
 			clts();
 	}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 049/208] x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (47 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 048/208] x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 050/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end() Ingo Molnar
                   ` (30 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index d0fe7bbb51d1..cf0d4124fb3d 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -331,10 +331,10 @@ static inline void __thread_clear_has_fpu(struct fpu *fpu)
 }
 
 /* Must be paired with a 'clts' before! */
-static inline void __thread_set_has_fpu(struct task_struct *tsk)
+static inline void __thread_set_has_fpu(struct fpu *fpu)
 {
-	tsk->thread.fpu.has_fpu = 1;
-	this_cpu_write(fpu_fpregs_owner_ctx, &tsk->thread.fpu);
+	fpu->has_fpu = 1;
+	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 }
 
 /*
@@ -355,7 +355,7 @@ static inline void __thread_fpu_begin(struct task_struct *tsk)
 {
 	if (!use_eager_fpu())
 		clts();
-	__thread_set_has_fpu(tsk);
+	__thread_set_has_fpu(&tsk->thread.fpu);
 }
 
 static inline void drop_fpu(struct task_struct *tsk)
@@ -416,6 +416,7 @@ typedef struct { int preload; } fpu_switch_t;
 static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct task_struct *new, int cpu)
 {
 	struct fpu *old_fpu = &old->thread.fpu;
+	struct fpu *new_fpu = &new->thread.fpu;
 	fpu_switch_t fpu;
 
 	/*
@@ -437,7 +438,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 		/* Don't change CR0.TS if we just switch! */
 		if (fpu.preload) {
 			new->thread.fpu.counter++;
-			__thread_set_has_fpu(new);
+			__thread_set_has_fpu(new_fpu);
 			prefetch(new->thread.fpu.state);
 		} else if (!use_eager_fpu())
 			stts();
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 050/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (48 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 049/208] x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 051/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin() Ingo Molnar
                   ` (29 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 6 +++---
 arch/x86/kernel/fpu/core.c          | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index cf0d4124fb3d..b1803a656651 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -344,9 +344,9 @@ static inline void __thread_set_has_fpu(struct fpu *fpu)
  * These generally need preemption protection to work,
  * do try to avoid using these on their own.
  */
-static inline void __thread_fpu_end(struct task_struct *tsk)
+static inline void __thread_fpu_end(struct fpu *fpu)
 {
-	__thread_clear_has_fpu(&tsk->thread.fpu);
+	__thread_clear_has_fpu(fpu);
 	if (!use_eager_fpu())
 		stts();
 }
@@ -372,7 +372,7 @@ static inline void drop_fpu(struct task_struct *tsk)
 		asm volatile("1: fwait\n"
 			     "2:\n"
 			     _ASM_EXTABLE(1b, 2b));
-		__thread_fpu_end(tsk);
+		__thread_fpu_end(fpu);
 	}
 
 	clear_stopped_child_used_math(tsk);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 172315e29c25..96d175101cd8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -143,7 +143,7 @@ void fpu__save(struct task_struct *tsk)
 			__save_fpu(tsk);
 		} else {
 			fpu_save_init(fpu);
-			__thread_fpu_end(tsk);
+			__thread_fpu_end(fpu);
 		}
 	}
 	preempt_enable();
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 051/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (49 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 050/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 052/208] x86/fpu: Open code PF_USED_MATH usages Ingo Molnar
                   ` (28 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 10 ++++++----
 arch/x86/kernel/fpu/core.c          |  3 ++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index b1803a656651..44516ad6c890 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -351,11 +351,11 @@ static inline void __thread_fpu_end(struct fpu *fpu)
 		stts();
 }
 
-static inline void __thread_fpu_begin(struct task_struct *tsk)
+static inline void __thread_fpu_begin(struct fpu *fpu)
 {
 	if (!use_eager_fpu())
 		clts();
-	__thread_set_has_fpu(&tsk->thread.fpu);
+	__thread_set_has_fpu(fpu);
 }
 
 static inline void drop_fpu(struct task_struct *tsk)
@@ -451,7 +451,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 				fpu.preload = 0;
 			else
 				prefetch(new->thread.fpu.state);
-			__thread_fpu_begin(new);
+			__thread_fpu_begin(new_fpu);
 		}
 	}
 	return fpu;
@@ -505,9 +505,11 @@ static inline int restore_xstate_sig(void __user *buf, int ia32_frame)
  */
 static inline void user_fpu_begin(void)
 {
+	struct fpu *fpu = &current->thread.fpu;
+
 	preempt_disable();
 	if (!user_has_fpu())
-		__thread_fpu_begin(current);
+		__thread_fpu_begin(fpu);
 	preempt_enable();
 }
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 96d175101cd8..275bdb768895 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -329,6 +329,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 void fpu__restore(void)
 {
 	struct task_struct *tsk = current;
+	struct fpu *fpu = &tsk->thread.fpu;
 
 	if (!tsk_used_math(tsk)) {
 		local_irq_enable();
@@ -347,7 +348,7 @@ void fpu__restore(void)
 
 	/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
 	kernel_fpu_disable();
-	__thread_fpu_begin(tsk);
+	__thread_fpu_begin(fpu);
 	if (unlikely(restore_fpu_checking(tsk))) {
 		fpu_reset_state(tsk);
 		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 052/208] x86/fpu: Open code PF_USED_MATH usages
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (50 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 051/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 053/208] x86/fpu: Document fpu__unlazy_stopped() Ingo Molnar
                   ` (27 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

PF_USED_MATH is used directly, but also in a handful of helper inlines.

To ease the elimination of PF_USED_MATH, convert all inline helpers
to open-coded PF_USED_MATH usage.
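
( For reference, the helpers being open-coded here are simple wrappers
  around task->flags - roughly:

  #define tsk_used_math(p)                  ((p)->flags & PF_USED_MATH)
  #define used_math()                       tsk_used_math(current)
  #define set_used_math()                   (current->flags |= PF_USED_MATH)
  #define clear_stopped_child_used_math(c)  ((c)->flags &= ~PF_USED_MATH)

  so the conversion is purely mechanical. )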

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/ia32/ia32_signal.c         |  2 +-
 arch/x86/include/asm/fpu-internal.h |  5 +++--
 arch/x86/kernel/fpu/core.c          | 14 ++++++++------
 arch/x86/kernel/fpu/xsave.c         | 10 +++++-----
 arch/x86/kernel/signal.c            |  6 +++---
 arch/x86/kvm/x86.c                  |  2 +-
 arch/x86/math-emu/fpu_entry.c       |  2 +-
 7 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 4bafd5b05aca..bffb2c49ceb6 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -321,7 +321,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 		 ksig->ka.sa.sa_restorer)
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;
 
-	if (used_math()) {
+	if (current->flags & PF_USED_MATH) {
 		unsigned long fx_aligned, math_size;
 
 		sp = alloc_mathframe(sp, 1, &fx_aligned, &math_size);
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 44516ad6c890..2cac49e3b4bd 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -375,7 +375,8 @@ static inline void drop_fpu(struct task_struct *tsk)
 		__thread_fpu_end(fpu);
 	}
 
-	clear_stopped_child_used_math(tsk);
+	tsk->flags &= ~PF_USED_MATH;
+
 	preempt_enable();
 }
 
@@ -423,7 +424,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 	 * If the task has used the math, pre-load the FPU on xsave processors
 	 * or if the past 5 consecutive context-switches used math.
 	 */
-	fpu.preload = tsk_used_math(new) &&
+	fpu.preload = (new->flags & PF_USED_MATH) &&
 		      (use_eager_fpu() || new->thread.fpu.counter > 5);
 
 	if (old_fpu->has_fpu) {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 275bdb768895..b8e3dbbcdc16 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -242,7 +242,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 
 	task_disable_lazy_fpu_restore(dst);
 
-	if (tsk_used_math(src)) {
+	if (src->flags & PF_USED_MATH) {
 		int err = fpstate_alloc(&dst->thread.fpu);
 
 		if (err)
@@ -331,7 +331,7 @@ void fpu__restore(void)
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 
-	if (!tsk_used_math(tsk)) {
+	if (!(tsk->flags & PF_USED_MATH)) {
 		local_irq_enable();
 		/*
 		 * does a slab alloc which can sleep
@@ -361,11 +361,13 @@ EXPORT_SYMBOL_GPL(fpu__restore);
 
 void fpu__flush_thread(struct task_struct *tsk)
 {
+	WARN_ON(tsk != current);
+
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
 		drop_fpu(tsk);
 		fpstate_free(&tsk->thread.fpu);
-	} else if (!used_math()) {
+	} else if (!(tsk->flags & PF_USED_MATH)) {
 		/* kthread execs. TODO: cleanup this horror. */
 		if (WARN_ON(fpstate_alloc_init(tsk)))
 			force_sig(SIGKILL, tsk);
@@ -381,12 +383,12 @@ void fpu__flush_thread(struct task_struct *tsk)
  */
 int fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	return tsk_used_math(target) ? regset->n : 0;
+	return (target->flags & PF_USED_MATH) ? regset->n : 0;
 }
 
 int xfpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	return (cpu_has_fxsr && tsk_used_math(target)) ? regset->n : 0;
+	return (cpu_has_fxsr && (target->flags & PF_USED_MATH)) ? regset->n : 0;
 }
 
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
@@ -717,7 +719,7 @@ int dump_fpu(struct pt_regs *regs, struct user_i387_struct *fpu)
 	struct task_struct *tsk = current;
 	int fpvalid;
 
-	fpvalid = !!used_math();
+	fpvalid = !!(tsk->flags & PF_USED_MATH);
 	if (fpvalid)
 		fpvalid = !fpregs_get(tsk, NULL,
 				      0, sizeof(struct user_i387_ia32_struct),
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 61696c5005eb..8cd127049c9b 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -349,7 +349,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(VERIFY_READ, buf, size))
 		return -EACCES;
 
-	if (!used_math() && fpstate_alloc_init(tsk))
+	if (!(tsk->flags & PF_USED_MATH) && fpstate_alloc_init(tsk))
 		return -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
@@ -384,12 +384,12 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 		int err = 0;
 
 		/*
-		 * Drop the current fpu which clears used_math(). This ensures
+		 * Drop the current fpu which clears PF_USED_MATH. This ensures
 		 * that any context-switch during the copy of the new state,
 		 * avoids the intermediate state from getting restored/saved.
 		 * Thus avoiding the new restored state from getting corrupted.
 		 * We will be ready to restore/save the state only after
-		 * set_used_math() is again set.
+		 * PF_USED_MATH is again set.
 		 */
 		drop_fpu(tsk);
 
@@ -401,7 +401,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 			sanitize_restored_xstate(tsk, &env, xstate_bv, fx_only);
 		}
 
-		set_used_math();
+		tsk->flags |= PF_USED_MATH;
 		if (use_eager_fpu()) {
 			preempt_disable();
 			fpu__restore();
@@ -685,7 +685,7 @@ void xsave_init(void)
  */
 void __init_refok eager_fpu_init(void)
 {
-	WARN_ON(used_math());
+	WARN_ON(current->flags & PF_USED_MATH);
 	current_thread_info()->status = 0;
 
 	if (eagerfpu == ENABLE)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 35f867aa597e..8e2529ebb8c6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -217,7 +217,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		}
 	}
 
-	if (used_math()) {
+	if (current->flags & PF_USED_MATH) {
 		sp = alloc_mathframe(sp, config_enabled(CONFIG_X86_32),
 				     &buf_fx, &math_size);
 		*fpstate = (void __user *)sp;
@@ -233,7 +233,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */
-	if (used_math() &&
+	if ((current->flags & PF_USED_MATH) &&
 	    save_xstate_sig(*fpstate, (void __user *)buf_fx, math_size) < 0)
 		return (void __user *)-1L;
 
@@ -664,7 +664,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 		/*
 		 * Ensure the signal handler starts with the new fpu state.
 		 */
-		if (used_math())
+		if (current->flags & PF_USED_MATH)
 			fpu_reset_state(current);
 	}
 	signal_setup_done(failed, ksig, stepping);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index be276e0fe0ff..0635a1fd43ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6600,7 +6600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int r;
 	sigset_t sigsaved;
 
-	if (!tsk_used_math(current) && fpstate_alloc_init(current))
+	if (!(current->flags & PF_USED_MATH) && fpstate_alloc_init(current))
 		return -ENOMEM;
 
 	if (vcpu->sigset_active)
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index c9ff09a02385..bf628804d67c 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -148,7 +148,7 @@ void math_emulate(struct math_emu_info *info)
 	unsigned long code_limit = 0;	/* Initialized to stop compiler warnings */
 	struct desc_struct code_descriptor;
 
-	if (!used_math()) {
+	if (!(current->flags & PF_USED_MATH)) {
 		if (fpstate_alloc_init(current)) {
 			do_group_exit(SIGKILL);
 			return;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 053/208] x86/fpu: Document fpu__unlazy_stopped()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (51 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 052/208] x86/fpu: Open code PF_USED_MATH usages Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active Ingo Molnar
                   ` (26 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Explain its usage and also document a TODO item.
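
( A typical caller is a ptrace regset handler that is about to read a
  stopped child's FPU context - roughly along the lines of this
  simplified sketch (the real handler also does CPU feature checks and
  state sanitizing, omitted here):

  int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
                  unsigned int pos, unsigned int count,
                  void *kbuf, void __user *ubuf)
  {
          int ret;

          ret = fpu__unlazy_stopped(target);
          if (ret)
                  return ret;

          return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
                                     &target->thread.fpu.state->fxsave, 0, -1);
  }
)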

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index b8e3dbbcdc16..0235df54cd48 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -284,10 +284,27 @@ int fpstate_alloc_init(struct task_struct *curr)
 EXPORT_SYMBOL_GPL(fpstate_alloc_init);
 
 /*
- * The _current_ task is using the FPU for the first time
- * so initialize it and set the mxcsr to its default
- * value at reset if we support XMM instructions and then
- * remember the current task has used the FPU.
+ * This function is called before we modify a stopped child's
+ * FPU state context.
+ *
+ * If the child has not used the FPU before then initialize its
+ * FPU context.
+ *
+ * If the child has used the FPU before then unlazy it.
+ *
+ * [ After this function call, after the context is modified and
+ *   the child task is woken up, the child task will restore
+ *   the modified FPU state from the modified context. If we
+ *   didn't clear its lazy status here then the lazy in-registers
+ *   state pending on its former CPU could be restored, losing
+ *   the modifications. ]
+ *
+ * This function is also called before we read a stopped child's
+ * FPU state - to make sure it's modified.
+ *
+ * TODO: A future optimization would be to skip the unlazying in
+ *       the read-only case, it's not strictly necessary for
+ *       read-only access to the context.
  */
 static int fpu__unlazy_stopped(struct task_struct *child)
 {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (52 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 053/208] x86/fpu: Document fpu__unlazy_stopped() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-06  0:51   ` Andy Lutomirski
  2015-05-05 16:24 ` [PATCH 055/208] x86/fpu: Remove 'struct task_struct' usage from drop_fpu() Ingo Molnar
                   ` (25 subsequent siblings)
  79 siblings, 1 reply; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Introduce a simple fpu->fpstate_active flag in the fpu context data structure
and use that instead of PF_USED_MATH in task->flags.

Testing for this flag byte should be slightly more efficient than
testing a bit in a bitmask, but the main advantage is that most
FPU functions can now be performed on a 'struct fpu' alone; they
no longer need access to 'struct task_struct'.

There's a slight linecount increase, mostly due to the 'fpu' local
variables and due to extra comments. The local variables will go away
once we move most of the FPU methods to pure 'struct fpu' parameters.
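
( The conversion pattern throughout is:

  before:  if (tsk->flags & PF_USED_MATH)
  after:   if (fpu->fpstate_active)       /* fpu == &tsk->thread.fpu */

  i.e. the check now only needs the FPU context, not the full task. )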

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/ia32/ia32_signal.c         |  3 ++-
 arch/x86/include/asm/fpu-internal.h |  4 ++--
 arch/x86/include/asm/fpu/types.h    |  6 ++++++
 arch/x86/include/asm/processor.h    |  6 ++++--
 arch/x86/kernel/fpu/core.c          | 38 +++++++++++++++++++++++++-------------
 arch/x86/kernel/fpu/xsave.c         | 11 ++++++-----
 arch/x86/kernel/signal.c            |  8 +++++---
 arch/x86/kvm/x86.c                  |  3 ++-
 arch/x86/math-emu/fpu_entry.c       |  3 ++-
 9 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index bffb2c49ceb6..e1ec6f90d09e 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -307,6 +307,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 				 size_t frame_size,
 				 void __user **fpstate)
 {
+	struct fpu *fpu = &current->thread.fpu;
 	unsigned long sp;
 
 	/* Default to using normal stack */
@@ -321,7 +322,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 		 ksig->ka.sa.sa_restorer)
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;
 
-	if (current->flags & PF_USED_MATH) {
+	if (fpu->fpstate_active) {
 		unsigned long fx_aligned, math_size;
 
 		sp = alloc_mathframe(sp, 1, &fx_aligned, &math_size);
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 2cac49e3b4bd..9311126571ab 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -375,7 +375,7 @@ static inline void drop_fpu(struct task_struct *tsk)
 		__thread_fpu_end(fpu);
 	}
 
-	tsk->flags &= ~PF_USED_MATH;
+	fpu->fpstate_active = 0;
 
 	preempt_enable();
 }
@@ -424,7 +424,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 	 * If the task has used the math, pre-load the FPU on xsave processors
 	 * or if the past 5 consecutive context-switches used math.
 	 */
-	fpu.preload = (new->flags & PF_USED_MATH) &&
+	fpu.preload = new_fpu->fpstate_active &&
 		      (use_eager_fpu() || new->thread.fpu.counter > 5);
 
 	if (old_fpu->has_fpu) {
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index efb520dcf38e..f6317d9aa808 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -137,6 +137,12 @@ struct fpu {
 	 * deal with bursty apps that only use the FPU for a short time:
 	 */
 	unsigned char			counter;
+	/*
+	 * This flag indicates whether this context is fpstate_active: if the task is
+	 * not running then we can restore from this context, if the task
+	 * is running then we should save into this context.
+	 */
+	unsigned char			fpstate_active;
 };
 
 #endif /* _ASM_X86_FPU_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d50cc7f61559..0f4add462697 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -385,6 +385,10 @@ struct thread_struct {
 	unsigned long		fs;
 #endif
 	unsigned long		gs;
+
+	/* Floating point and extended processor state */
+	struct fpu		fpu;
+
 	/* Save middle states of ptrace breakpoints */
 	struct perf_event	*ptrace_bps[HBP_NUM];
 	/* Debug status used for traps, single steps, etc... */
@@ -395,8 +399,6 @@ struct thread_struct {
 	unsigned long		cr2;
 	unsigned long		trap_nr;
 	unsigned long		error_code;
-	/* floating point and extended processor state */
-	struct fpu		fpu;
 #ifdef CONFIG_X86_32
 	/* Virtual 86 mode info */
 	struct vm86_struct __user *vm86_info;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 0235df54cd48..cc15461b9bee 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -236,14 +236,17 @@ static void fpu_copy(struct task_struct *dst, struct task_struct *src)
 
 int fpu__copy(struct task_struct *dst, struct task_struct *src)
 {
+	struct fpu *dst_fpu = &dst->thread.fpu;
+	struct fpu *src_fpu = &src->thread.fpu;
+
 	dst->thread.fpu.counter = 0;
 	dst->thread.fpu.has_fpu = 0;
 	dst->thread.fpu.state = NULL;
 
 	task_disable_lazy_fpu_restore(dst);
 
-	if (src->flags & PF_USED_MATH) {
-		int err = fpstate_alloc(&dst->thread.fpu);
+	if (src_fpu->fpstate_active) {
+		int err = fpstate_alloc(dst_fpu);
 
 		if (err)
 			return err;
@@ -260,11 +263,12 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
  */
 int fpstate_alloc_init(struct task_struct *curr)
 {
+	struct fpu *fpu = &curr->thread.fpu;
 	int ret;
 
 	if (WARN_ON_ONCE(curr != current))
 		return -EINVAL;
-	if (WARN_ON_ONCE(curr->flags & PF_USED_MATH))
+	if (WARN_ON_ONCE(fpu->fpstate_active))
 		return -EINVAL;
 
 	/*
@@ -277,7 +281,7 @@ int fpstate_alloc_init(struct task_struct *curr)
 	fpstate_init(&curr->thread.fpu);
 
 	/* Safe to do for the current task: */
-	curr->flags |= PF_USED_MATH;
+	fpu->fpstate_active = 1;
 
 	return 0;
 }
@@ -308,12 +312,13 @@ EXPORT_SYMBOL_GPL(fpstate_alloc_init);
  */
 static int fpu__unlazy_stopped(struct task_struct *child)
 {
+	struct fpu *child_fpu = &child->thread.fpu;
 	int ret;
 
 	if (WARN_ON_ONCE(child == current))
 		return -EINVAL;
 
-	if (child->flags & PF_USED_MATH) {
+	if (child_fpu->fpstate_active) {
 		task_disable_lazy_fpu_restore(child);
 		return 0;
 	}
@@ -328,7 +333,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 	fpstate_init(&child->thread.fpu);
 
 	/* Safe to do for stopped child tasks: */
-	child->flags |= PF_USED_MATH;
+	child_fpu->fpstate_active = 1;
 
 	return 0;
 }
@@ -348,7 +353,7 @@ void fpu__restore(void)
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 
-	if (!(tsk->flags & PF_USED_MATH)) {
+	if (!fpu->fpstate_active) {
 		local_irq_enable();
 		/*
 		 * does a slab alloc which can sleep
@@ -378,13 +383,15 @@ EXPORT_SYMBOL_GPL(fpu__restore);
 
 void fpu__flush_thread(struct task_struct *tsk)
 {
+	struct fpu *fpu = &tsk->thread.fpu;
+
 	WARN_ON(tsk != current);
 
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
 		drop_fpu(tsk);
 		fpstate_free(&tsk->thread.fpu);
-	} else if (!(tsk->flags & PF_USED_MATH)) {
+	} else if (!fpu->fpstate_active) {
 		/* kthread execs. TODO: cleanup this horror. */
 		if (WARN_ON(fpstate_alloc_init(tsk)))
 			force_sig(SIGKILL, tsk);
@@ -400,12 +407,16 @@ void fpu__flush_thread(struct task_struct *tsk)
  */
 int fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	return (target->flags & PF_USED_MATH) ? regset->n : 0;
+	struct fpu *target_fpu = &target->thread.fpu;
+
+	return target_fpu->fpstate_active ? regset->n : 0;
 }
 
 int xfpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	return (cpu_has_fxsr && (target->flags & PF_USED_MATH)) ? regset->n : 0;
+	struct fpu *target_fpu = &target->thread.fpu;
+
+	return (cpu_has_fxsr && target_fpu->fpstate_active) ? regset->n : 0;
 }
 
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
@@ -731,16 +742,17 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
  * struct user_i387_struct) but is in fact only used for 32-bit
  * dumps, so on 64-bit it is really struct user_i387_ia32_struct.
  */
-int dump_fpu(struct pt_regs *regs, struct user_i387_struct *fpu)
+int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
 {
 	struct task_struct *tsk = current;
+	struct fpu *fpu = &tsk->thread.fpu;
 	int fpvalid;
 
-	fpvalid = !!(tsk->flags & PF_USED_MATH);
+	fpvalid = fpu->fpstate_active;
 	if (fpvalid)
 		fpvalid = !fpregs_get(tsk, NULL,
 				      0, sizeof(struct user_i387_ia32_struct),
-				      fpu, NULL);
+				      ufpu, NULL);
 
 	return fpvalid;
 }
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 8cd127049c9b..dc346e19c0df 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -334,6 +334,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 {
 	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
+	struct fpu *fpu = &tsk->thread.fpu;
 	int state_size = xstate_size;
 	u64 xstate_bv = 0;
 	int fx_only = 0;
@@ -349,7 +350,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(VERIFY_READ, buf, size))
 		return -EACCES;
 
-	if (!(tsk->flags & PF_USED_MATH) && fpstate_alloc_init(tsk))
+	if (!fpu->fpstate_active && fpstate_alloc_init(tsk))
 		return -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
@@ -384,12 +385,12 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 		int err = 0;
 
 		/*
-		 * Drop the current fpu which clears PF_USED_MATH. This ensures
+		 * Drop the current fpu which clears fpu->fpstate_active. This ensures
 		 * that any context-switch during the copy of the new state,
 		 * avoids the intermediate state from getting restored/saved.
 		 * Thus avoiding the new restored state from getting corrupted.
 		 * We will be ready to restore/save the state only after
-		 * PF_USED_MATH is again set.
+		 * fpu->fpstate_active is again set.
 		 */
 		drop_fpu(tsk);
 
@@ -401,7 +402,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 			sanitize_restored_xstate(tsk, &env, xstate_bv, fx_only);
 		}
 
-		tsk->flags |= PF_USED_MATH;
+		fpu->fpstate_active = 1;
 		if (use_eager_fpu()) {
 			preempt_disable();
 			fpu__restore();
@@ -685,7 +686,7 @@ void xsave_init(void)
  */
 void __init_refok eager_fpu_init(void)
 {
-	WARN_ON(current->flags & PF_USED_MATH);
+	WARN_ON(current->thread.fpu.fpstate_active);
 	current_thread_info()->status = 0;
 
 	if (eagerfpu == ENABLE)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8e2529ebb8c6..20a9d355af59 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -198,6 +198,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
 	int onsigstack = on_sig_stack(sp);
+	struct fpu *fpu = &current->thread.fpu;
 
 	/* redzone */
 	if (config_enabled(CONFIG_X86_64))
@@ -217,7 +218,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		}
 	}
 
-	if (current->flags & PF_USED_MATH) {
+	if (fpu->fpstate_active) {
 		sp = alloc_mathframe(sp, config_enabled(CONFIG_X86_32),
 				     &buf_fx, &math_size);
 		*fpstate = (void __user *)sp;
@@ -233,7 +234,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */
-	if ((current->flags & PF_USED_MATH) &&
+	if (fpu->fpstate_active &&
 	    save_xstate_sig(*fpstate, (void __user *)buf_fx, math_size) < 0)
 		return (void __user *)-1L;
 
@@ -616,6 +617,7 @@ static void
 handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 {
 	bool stepping, failed;
+	struct fpu *fpu = &current->thread.fpu;
 
 	/* Are we from a system call? */
 	if (syscall_get_nr(current, regs) >= 0) {
@@ -664,7 +666,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 		/*
 		 * Ensure the signal handler starts with the new fpu state.
 		 */
-		if (current->flags & PF_USED_MATH)
+		if (fpu->fpstate_active)
 			fpu_reset_state(current);
 	}
 	signal_setup_done(failed, ksig, stepping);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0635a1fd43ba..bab8afb61dc1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6597,10 +6597,11 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
+	struct fpu *fpu = &current->thread.fpu;
 	int r;
 	sigset_t sigsaved;
 
-	if (!(current->flags & PF_USED_MATH) && fpstate_alloc_init(current))
+	if (!fpu->fpstate_active && fpstate_alloc_init(current))
 		return -ENOMEM;
 
 	if (vcpu->sigset_active)
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index bf628804d67c..f1aac55d6a67 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -147,8 +147,9 @@ void math_emulate(struct math_emu_info *info)
 	unsigned long code_base = 0;
 	unsigned long code_limit = 0;	/* Initialized to stop compiler warnings */
 	struct desc_struct code_descriptor;
+	struct fpu *fpu = &current->thread.fpu;
 
-	if (!(current->flags & PF_USED_MATH)) {
+	if (!fpu->fpstate_active) {
 		if (fpstate_alloc_init(current)) {
 			do_group_exit(SIGKILL);
 			return;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 055/208] x86/fpu: Remove 'struct task_struct' usage from drop_fpu()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (53 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 056/208] x86/fpu: Remove task_disable_lazy_fpu_restore() Ingo Molnar
                   ` (24 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.
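
A minimal before/after sketch of a typical call site (exit_thread(),
per the arch/x86/kernel/process.c hunk below): callers simply pass in
the task's FPU context instead of the task pointer:

  /* before: */
  drop_fpu(me);

  /* after: */
  struct fpu *fpu = &t->fpu;	/* t == &me->thread */

  drop_fpu(fpu);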

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 9 +++++----
 arch/x86/kernel/fpu/core.c          | 2 +-
 arch/x86/kernel/fpu/xsave.c         | 2 +-
 arch/x86/kernel/process.c           | 3 ++-
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 9311126571ab..e8f7134f0ffb 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -358,14 +358,13 @@ static inline void __thread_fpu_begin(struct fpu *fpu)
 	__thread_set_has_fpu(fpu);
 }
 
-static inline void drop_fpu(struct task_struct *tsk)
+static inline void drop_fpu(struct fpu *fpu)
 {
-	struct fpu *fpu = &tsk->thread.fpu;
 	/*
 	 * Forget coprocessor state..
 	 */
 	preempt_disable();
-	tsk->thread.fpu.counter = 0;
+	fpu->counter = 0;
 
 	if (fpu->has_fpu) {
 		/* Ignore delayed exceptions from user space */
@@ -394,8 +393,10 @@ static inline void restore_init_xstate(void)
  */
 static inline void fpu_reset_state(struct task_struct *tsk)
 {
+	struct fpu *fpu = &tsk->thread.fpu;
+
 	if (!use_eager_fpu())
-		drop_fpu(tsk);
+		drop_fpu(fpu);
 	else
 		restore_init_xstate();
 }
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index cc15461b9bee..39b78f1cc93b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -389,7 +389,7 @@ void fpu__flush_thread(struct task_struct *tsk)
 
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
-		drop_fpu(tsk);
+		drop_fpu(fpu);
 		fpstate_free(&tsk->thread.fpu);
 	} else if (!fpu->fpstate_active) {
 		/* kthread execs. TODO: cleanup this horror. */
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index dc346e19c0df..049dc619481d 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -392,7 +392,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 		 * We will be ready to restore/save the state only after
 		 * fpu->fpstate_active is again set.
 		 */
-		drop_fpu(tsk);
+		drop_fpu(fpu);
 
 		if (__copy_from_user(&fpu->state->xsave, buf_fx, state_size) ||
 		    __copy_from_user(&env, buf, sizeof(env))) {
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1b4ea12e412d..40bc28624628 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -104,6 +104,7 @@ void exit_thread(void)
 	struct task_struct *me = current;
 	struct thread_struct *t = &me->thread;
 	unsigned long *bp = t->io_bitmap_ptr;
+	struct fpu *fpu = &t->fpu;
 
 	if (bp) {
 		struct tss_struct *tss = &per_cpu(cpu_tss, get_cpu());
@@ -119,7 +120,7 @@ void exit_thread(void)
 		kfree(bp);
 	}
 
-	drop_fpu(me);
+	drop_fpu(fpu);
 }
 
 void flush_thread(void)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 056/208] x86/fpu: Remove task_disable_lazy_fpu_restore()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (54 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 055/208] x86/fpu: Remove 'struct task_struct' usage from drop_fpu() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 057/208] x86/fpu: Use 'struct fpu' in fpu_lazy_restore() Ingo Molnar
                   ` (23 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Replace task_disable_lazy_fpu_restore() with easier-to-read
open-coded updates: we already set the fpu->last_cpu field
explicitly in other places.

(This also removes yet another FPU method that operates on
task_struct.)

Also better document the fpu::last_cpu field in the structure
definition.
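
For reference, the lazy-restore check that consumes this field (the
fpu_lazy_restore() helper, renamed to fpu_want_lazy_restore() later
in this series) boils down to the sketch below: a ->last_cpu of -1
can never match a real CPU number, so the FPU state is reloaded from
memory on the next switch to the task:

  static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
  {
  	/* Do the FPU registers on this CPU still belong to this context? */
  	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) &&
  	       cpu == fpu->last_cpu;	/* never true when last_cpu == -1 */
  }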

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 14 ++------------
 arch/x86/include/asm/fpu/types.h    | 11 +++++++++++
 arch/x86/kernel/fpu/core.c          |  5 ++---
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index e8f7134f0ffb..76a1f3529881 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -74,16 +74,6 @@ static inline void __cpu_disable_lazy_restore(unsigned int cpu)
 	per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
 }
 
-/*
- * Used to indicate that the FPU state in memory is newer than the FPU
- * state in registers, and the FPU state should be reloaded next time the
- * task is run. Only safe on the current task, or non-running tasks.
- */
-static inline void task_disable_lazy_fpu_restore(struct task_struct *tsk)
-{
-	tsk->thread.fpu.last_cpu = ~0;
-}
-
 static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
 {
 	return &new->thread.fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) &&
@@ -430,7 +420,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 
 	if (old_fpu->has_fpu) {
 		if (!fpu_save_init(&old->thread.fpu))
-			task_disable_lazy_fpu_restore(old);
+			old->thread.fpu.last_cpu = -1;
 		else
 			old->thread.fpu.last_cpu = cpu;
 
@@ -446,7 +436,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 			stts();
 	} else {
 		old->thread.fpu.counter = 0;
-		task_disable_lazy_fpu_restore(old);
+		old->thread.fpu.last_cpu = -1;
 		if (fpu.preload) {
 			new->thread.fpu.counter++;
 			if (fpu_lazy_restore(new, cpu))
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f6317d9aa808..cad1c37d9ea2 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -125,7 +125,18 @@ union thread_xstate {
 };
 
 struct fpu {
+	/*
+	 * Records the last CPU on which this context was loaded into
+	 * FPU registers. (In the lazy-switching case we might be
+	 * able to reuse FPU registers across multiple context switches
+	 * this way, if no intermediate task used the FPU.)
+	 *
+	 * A value of -1 is used to indicate that the FPU state in context
+	 * memory is newer than the FPU state in registers, and that the
+	 * FPU state should be reloaded next time the task is run.
+	 */
 	unsigned int			last_cpu;
+
 	unsigned int			has_fpu;
 	union thread_xstate		*state;
 	/*
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 39b78f1cc93b..59378e36b2ce 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -242,8 +242,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 	dst->thread.fpu.counter = 0;
 	dst->thread.fpu.has_fpu = 0;
 	dst->thread.fpu.state = NULL;
-
-	task_disable_lazy_fpu_restore(dst);
+	dst->thread.fpu.last_cpu = -1;
 
 	if (src_fpu->fpstate_active) {
 		int err = fpstate_alloc(dst_fpu);
@@ -319,7 +318,7 @@ static int fpu__unlazy_stopped(struct task_struct *child)
 		return -EINVAL;
 
 	if (child_fpu->fpstate_active) {
-		task_disable_lazy_fpu_restore(child);
+		child->thread.fpu.last_cpu = -1;
 		return 0;
 	}
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 057/208] x86/fpu: Use 'struct fpu' in fpu_lazy_restore()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (55 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 056/208] x86/fpu: Remove task_disable_lazy_fpu_restore() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 058/208] x86/fpu: Use 'struct fpu' in restore_fpu_checking() Ingo Molnar
                   ` (22 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Also rename it to fpu_want_lazy_restore(), to better indicate that
this function just tests whether we can do a lazy restore. (The old
name suggested that it was doing the lazy restore, which is not
the case.)
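
At the switch_fpu_prepare() call site the check then reads roughly as
below (shown in the fully converted form that the switch_fpu_prepare()
patch later in this series arrives at):

  if (fpu.preload) {
  	new_fpu->counter++;
  	if (fpu_want_lazy_restore(new_fpu, cpu))
  		fpu.preload = 0;	/* registers are still live on this CPU */
  	else
  		prefetch(new_fpu->state);
  	__thread_fpu_begin(new_fpu);
  }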

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 76a1f3529881..3f6d36c6ffce 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -74,10 +74,9 @@ static inline void __cpu_disable_lazy_restore(unsigned int cpu)
 	per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
 }
 
-static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
+static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
 {
-	return &new->thread.fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) &&
-		cpu == new->thread.fpu.last_cpu;
+	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
 }
 
 static inline int is_ia32_compat_frame(void)
@@ -439,7 +438,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 		old->thread.fpu.last_cpu = -1;
 		if (fpu.preload) {
 			new->thread.fpu.counter++;
-			if (fpu_lazy_restore(new, cpu))
+			if (fpu_want_lazy_restore(new_fpu, cpu))
 				fpu.preload = 0;
 			else
 				prefetch(new->thread.fpu.state);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 058/208] x86/fpu: Use 'struct fpu' in restore_fpu_checking()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (56 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 057/208] x86/fpu: Use 'struct fpu' in fpu_lazy_restore() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 059/208] x86/fpu: Use 'struct fpu' in fpu_reset_state() Ingo Molnar
                   ` (21 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 10 ++++++----
 arch/x86/kernel/fpu/core.c          |  4 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 3f6d36c6ffce..2d7934e4e394 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -294,7 +294,7 @@ static inline int fpu_restore_checking(struct fpu *fpu)
 		return frstor_checking(&fpu->state->fsave);
 }
 
-static inline int restore_fpu_checking(struct task_struct *tsk)
+static inline int restore_fpu_checking(struct fpu *fpu)
 {
 	/*
 	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -306,10 +306,10 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 			"fnclex\n\t"
 			"emms\n\t"
 			"fildl %P[addr]"	/* set F?P to defined value */
-			: : [addr] "m" (tsk->thread.fpu.has_fpu));
+			: : [addr] "m" (fpu->has_fpu));
 	}
 
-	return fpu_restore_checking(&tsk->thread.fpu);
+	return fpu_restore_checking(fpu);
 }
 
 /* Must be paired with an 'stts' after! */
@@ -456,8 +456,10 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
  */
 static inline void switch_fpu_finish(struct task_struct *new, fpu_switch_t fpu)
 {
+	struct fpu *new_fpu = &new->thread.fpu;
+
 	if (fpu.preload) {
-		if (unlikely(restore_fpu_checking(new)))
+		if (unlikely(restore_fpu_checking(new_fpu)))
 			fpu_reset_state(new);
 	}
 }
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 59378e36b2ce..79e92cf02e31 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -116,7 +116,7 @@ void __kernel_fpu_end(void)
 	struct fpu *fpu = &me->thread.fpu;
 
 	if (fpu->has_fpu) {
-		if (WARN_ON(restore_fpu_checking(me)))
+		if (WARN_ON(restore_fpu_checking(fpu)))
 			fpu_reset_state(me);
 	} else if (!use_eager_fpu()) {
 		stts();
@@ -370,7 +370,7 @@ void fpu__restore(void)
 	/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
 	kernel_fpu_disable();
 	__thread_fpu_begin(fpu);
-	if (unlikely(restore_fpu_checking(tsk))) {
+	if (unlikely(restore_fpu_checking(fpu))) {
 		fpu_reset_state(tsk);
 		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
 	} else {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 059/208] x86/fpu: Use 'struct fpu' in fpu_reset_state()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (57 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 058/208] x86/fpu: Use 'struct fpu' in restore_fpu_checking() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 060/208] x86/fpu: Use 'struct fpu' in switch_fpu_prepare() Ingo Molnar
                   ` (20 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 6 ++----
 arch/x86/kernel/fpu/core.c          | 7 +++----
 arch/x86/kernel/fpu/xsave.c         | 4 ++--
 arch/x86/kernel/signal.c            | 2 +-
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 2d7934e4e394..579f7d0a399d 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -380,10 +380,8 @@ static inline void restore_init_xstate(void)
  * Reset the FPU state in the eager case and drop it in the lazy case (later use
  * will reinit it).
  */
-static inline void fpu_reset_state(struct task_struct *tsk)
+static inline void fpu_reset_state(struct fpu *fpu)
 {
-	struct fpu *fpu = &tsk->thread.fpu;
-
 	if (!use_eager_fpu())
 		drop_fpu(fpu);
 	else
@@ -460,7 +458,7 @@ static inline void switch_fpu_finish(struct task_struct *new, fpu_switch_t fpu)
 
 	if (fpu.preload) {
 		if (unlikely(restore_fpu_checking(new_fpu)))
-			fpu_reset_state(new);
+			fpu_reset_state(new_fpu);
 	}
 }
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 79e92cf02e31..2da02fcc0e35 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -112,12 +112,11 @@ EXPORT_SYMBOL(__kernel_fpu_begin);
 
 void __kernel_fpu_end(void)
 {
-	struct task_struct *me = current;
-	struct fpu *fpu = &me->thread.fpu;
+	struct fpu *fpu = &current->thread.fpu;
 
 	if (fpu->has_fpu) {
 		if (WARN_ON(restore_fpu_checking(fpu)))
-			fpu_reset_state(me);
+			fpu_reset_state(fpu);
 	} else if (!use_eager_fpu()) {
 		stts();
 	}
@@ -371,7 +370,7 @@ void fpu__restore(void)
 	kernel_fpu_disable();
 	__thread_fpu_begin(fpu);
 	if (unlikely(restore_fpu_checking(fpu))) {
-		fpu_reset_state(tsk);
+		fpu_reset_state(fpu);
 		force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
 	} else {
 		tsk->thread.fpu.counter++;
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 049dc619481d..3953cbf8d7e7 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -343,7 +343,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 			 config_enabled(CONFIG_IA32_EMULATION));
 
 	if (!buf) {
-		fpu_reset_state(tsk);
+		fpu_reset_state(fpu);
 		return 0;
 	}
 
@@ -417,7 +417,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 		 */
 		user_fpu_begin();
 		if (restore_user_xstate(buf_fx, xstate_bv, fx_only)) {
-			fpu_reset_state(tsk);
+			fpu_reset_state(fpu);
 			return -1;
 		}
 	}
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 20a9d355af59..bcb853e44d30 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -667,7 +667,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 		 * Ensure the signal handler starts with the new fpu state.
 		 */
 		if (fpu->fpstate_active)
-			fpu_reset_state(current);
+			fpu_reset_state(fpu);
 	}
 	signal_setup_done(failed, ksig, stepping);
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 060/208] x86/fpu: Use 'struct fpu' in switch_fpu_prepare()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (58 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 059/208] x86/fpu: Use 'struct fpu' in fpu_reset_state() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 061/208] x86/fpu: Use 'struct fpu' in switch_fpu_finish() Ingo Molnar
                   ` (19 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 27 +++++++++++++--------------
 arch/x86/kernel/process_32.c        |  2 +-
 arch/x86/kernel/process_64.c        |  2 +-
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 579f7d0a399d..60d2c6f376f3 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -402,10 +402,9 @@ static inline void fpu_reset_state(struct fpu *fpu)
  */
 typedef struct { int preload; } fpu_switch_t;
 
-static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct task_struct *new, int cpu)
+static inline fpu_switch_t
+switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
 {
-	struct fpu *old_fpu = &old->thread.fpu;
-	struct fpu *new_fpu = &new->thread.fpu;
 	fpu_switch_t fpu;
 
 	/*
@@ -413,33 +412,33 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
 	 * or if the past 5 consecutive context-switches used math.
 	 */
 	fpu.preload = new_fpu->fpstate_active &&
-		      (use_eager_fpu() || new->thread.fpu.counter > 5);
+		      (use_eager_fpu() || new_fpu->counter > 5);
 
 	if (old_fpu->has_fpu) {
-		if (!fpu_save_init(&old->thread.fpu))
-			old->thread.fpu.last_cpu = -1;
+		if (!fpu_save_init(old_fpu))
+			old_fpu->last_cpu = -1;
 		else
-			old->thread.fpu.last_cpu = cpu;
+			old_fpu->last_cpu = cpu;
 
 		/* But leave fpu_fpregs_owner_ctx! */
-		old->thread.fpu.has_fpu = 0;
+		old_fpu->has_fpu = 0;
 
 		/* Don't change CR0.TS if we just switch! */
 		if (fpu.preload) {
-			new->thread.fpu.counter++;
+			new_fpu->counter++;
 			__thread_set_has_fpu(new_fpu);
-			prefetch(new->thread.fpu.state);
+			prefetch(new_fpu->state);
 		} else if (!use_eager_fpu())
 			stts();
 	} else {
-		old->thread.fpu.counter = 0;
-		old->thread.fpu.last_cpu = -1;
+		old_fpu->counter = 0;
+		old_fpu->last_cpu = -1;
 		if (fpu.preload) {
-			new->thread.fpu.counter++;
+			new_fpu->counter++;
 			if (fpu_want_lazy_restore(new_fpu, cpu))
 				fpu.preload = 0;
 			else
-				prefetch(new->thread.fpu.state);
+				prefetch(new_fpu->state);
 			__thread_fpu_begin(new_fpu);
 		}
 	}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 1a0edce626b2..5b0ed71dde60 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -248,7 +248,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
-	fpu = switch_fpu_prepare(prev_p, next_p, cpu);
+	fpu = switch_fpu_prepare(&prev_p->thread.fpu, &next_p->thread.fpu, cpu);
 
 	/*
 	 * Save away %gs. No need to save %fs, as it was saved on the
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 99cc4b8589ad..fefe65efd9d6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -278,7 +278,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	unsigned fsindex, gsindex;
 	fpu_switch_t fpu;
 
-	fpu = switch_fpu_prepare(prev_p, next_p, cpu);
+	fpu = switch_fpu_prepare(&prev_p->thread.fpu, &next_p->thread.fpu, cpu);
 
 	/* We must save %fs and %gs before load_TLS() because
 	 * %fs and %gs may be cleared by load_TLS().
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 061/208] x86/fpu: Use 'struct fpu' in switch_fpu_finish()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (59 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 060/208] x86/fpu: Use 'struct fpu' in switch_fpu_prepare() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 062/208] x86/fpu: Move __save_fpu() into fpu/core.c Ingo Molnar
                   ` (18 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  6 ++----
 arch/x86/kernel/process_32.c        | 10 ++++++----
 arch/x86/kernel/process_64.c        |  8 +++++---
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 60d2c6f376f3..5fd9b3f9be0f 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -451,11 +451,9 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
  * state - all we need to do is to conditionally restore the register
  * state itself.
  */
-static inline void switch_fpu_finish(struct task_struct *new, fpu_switch_t fpu)
+static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switch)
 {
-	struct fpu *new_fpu = &new->thread.fpu;
-
-	if (fpu.preload) {
+	if (fpu_switch.preload) {
 		if (unlikely(restore_fpu_checking(new_fpu)))
 			fpu_reset_state(new_fpu);
 	}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 5b0ed71dde60..7adc314b5075 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -241,14 +241,16 @@ __visible __notrace_funcgraph struct task_struct *
 __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 {
 	struct thread_struct *prev = &prev_p->thread,
-				 *next = &next_p->thread;
+			     *next = &next_p->thread;
+	struct fpu *prev_fpu = &prev->fpu;
+	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
 	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
-	fpu_switch_t fpu;
+	fpu_switch_t fpu_switch;
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
-	fpu = switch_fpu_prepare(&prev_p->thread.fpu, &next_p->thread.fpu, cpu);
+	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);
 
 	/*
 	 * Save away %gs. No need to save %fs, as it was saved on the
@@ -318,7 +320,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	if (prev->gs | next->gs)
 		lazy_load_gs(next->gs);
 
-	switch_fpu_finish(next_p, fpu);
+	switch_fpu_finish(next_fpu, fpu_switch);
 
 	this_cpu_write(current_task, next_p);
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index fefe65efd9d6..4504569c6c4e 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -273,12 +273,14 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 {
 	struct thread_struct *prev = &prev_p->thread;
 	struct thread_struct *next = &next_p->thread;
+	struct fpu *prev_fpu = &prev->fpu;
+	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
 	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
 	unsigned fsindex, gsindex;
-	fpu_switch_t fpu;
+	fpu_switch_t fpu_switch;
 
-	fpu = switch_fpu_prepare(&prev_p->thread.fpu, &next_p->thread.fpu, cpu);
+	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);
 
 	/* We must save %fs and %gs before load_TLS() because
 	 * %fs and %gs may be cleared by load_TLS().
@@ -390,7 +392,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 		wrmsrl(MSR_KERNEL_GS_BASE, next->gs);
 	prev->gsindex = gsindex;
 
-	switch_fpu_finish(next_p, fpu);
+	switch_fpu_finish(next_fpu, fpu_switch);
 
 	/*
 	 * Switch the PDA and FPU contexts.
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 062/208] x86/fpu: Move __save_fpu() into fpu/core.c
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (60 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 061/208] x86/fpu: Use 'struct fpu' in switch_fpu_finish() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 063/208] x86/fpu: Use 'struct fpu' in __fpu_save() Ingo Molnar
                   ` (17 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This helper function is only used in fpu/core.c, so move it there.

This slightly speeds up compilation.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h | 11 -----------
 arch/x86/kernel/fpu/core.c          | 12 ++++++++++++
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 5fd9b3f9be0f..6b84399c8839 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -501,17 +501,6 @@ static inline void user_fpu_begin(void)
 	preempt_enable();
 }
 
-static inline void __save_fpu(struct task_struct *tsk)
-{
-	if (use_xsave()) {
-		if (unlikely(system_state == SYSTEM_BOOTING))
-			xsave_state_booting(&tsk->thread.fpu.state->xsave);
-		else
-			xsave_state(&tsk->thread.fpu.state->xsave);
-	} else
-		fpu_fxsave(&tsk->thread.fpu);
-}
-
 /*
  * i387 state interaction
  */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2da02fcc0e35..93bf90d48ded 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -125,6 +125,18 @@ void __kernel_fpu_end(void)
 }
 EXPORT_SYMBOL(__kernel_fpu_end);
 
+static void __save_fpu(struct task_struct *tsk)
+{
+	if (use_xsave()) {
+		if (unlikely(system_state == SYSTEM_BOOTING))
+			xsave_state_booting(&tsk->thread.fpu.state->xsave);
+		else
+			xsave_state(&tsk->thread.fpu.state->xsave);
+	} else {
+		fpu_fxsave(&tsk->thread.fpu);
+	}
+}
+
 /*
  * Save the FPU state (initialize it if necessary):
  *
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 063/208] x86/fpu: Use 'struct fpu' in __save_fpu()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (61 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 062/208] x86/fpu: Move __save_fpu() into fpu/core.c Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 064/208] x86/fpu: Use 'struct fpu' in fpu__save() Ingo Molnar
                   ` (16 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 93bf90d48ded..1ed2fc695e54 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -125,15 +125,15 @@ void __kernel_fpu_end(void)
 }
 EXPORT_SYMBOL(__kernel_fpu_end);
 
-static void __save_fpu(struct task_struct *tsk)
+static void __save_fpu(struct fpu *fpu)
 {
 	if (use_xsave()) {
 		if (unlikely(system_state == SYSTEM_BOOTING))
-			xsave_state_booting(&tsk->thread.fpu.state->xsave);
+			xsave_state_booting(&fpu->state->xsave);
 		else
-			xsave_state(&tsk->thread.fpu.state->xsave);
+			xsave_state(&fpu->state->xsave);
 	} else {
-		fpu_fxsave(&tsk->thread.fpu);
+		fpu_fxsave(fpu);
 	}
 }
 
@@ -151,7 +151,7 @@ void fpu__save(struct task_struct *tsk)
 	preempt_disable();
 	if (fpu->has_fpu) {
 		if (use_eager_fpu()) {
-			__save_fpu(tsk);
+			__save_fpu(fpu);
 		} else {
 			fpu_save_init(fpu);
 			__thread_fpu_end(fpu);
@@ -231,17 +231,17 @@ EXPORT_SYMBOL_GPL(fpstate_free);
  */
 static void fpu_copy(struct task_struct *dst, struct task_struct *src)
 {
+	struct fpu *dst_fpu = &dst->thread.fpu;
+	struct fpu *src_fpu = &src->thread.fpu;
+
 	WARN_ON(src != current);
 
 	if (use_eager_fpu()) {
 		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
-		__save_fpu(dst);
+		__save_fpu(dst_fpu);
 	} else {
-		struct fpu *dfpu = &dst->thread.fpu;
-		struct fpu *sfpu = &src->thread.fpu;
-
 		fpu__save(src);
-		memcpy(dfpu->state, sfpu->state, xstate_size);
+		memcpy(dst_fpu->state, src_fpu->state, xstate_size);
 	}
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 064/208] x86/fpu: Use 'struct fpu' in fpu__save()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (62 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 063/208] x86/fpu: Use 'struct fpu' in __save_fpu() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 065/208] x86/fpu: Use 'struct fpu' in fpu_copy() Ingo Molnar
                   ` (15 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h | 2 +-
 arch/x86/kernel/fpu/core.c  | 8 +++-----
 arch/x86/kernel/traps.c     | 2 +-
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index e69989f95da5..e3b42c5379bc 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -100,7 +100,7 @@ static inline int user_has_fpu(void)
 	return current->thread.fpu.has_fpu;
 }
 
-extern void fpu__save(struct task_struct *tsk);
+extern void fpu__save(struct fpu *fpu);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1ed2fc695e54..92cee0c18dc6 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -142,11 +142,9 @@ static void __save_fpu(struct fpu *fpu)
  *
  * This only ever gets called for the current task.
  */
-void fpu__save(struct task_struct *tsk)
+void fpu__save(struct fpu *fpu)
 {
-	struct fpu *fpu = &tsk->thread.fpu;
-
-	WARN_ON(tsk != current);
+	WARN_ON(fpu != &current->thread.fpu);
 
 	preempt_disable();
 	if (fpu->has_fpu) {
@@ -240,7 +238,7 @@ static void fpu_copy(struct task_struct *dst, struct task_struct *src)
 		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
 		__save_fpu(dst_fpu);
 	} else {
-		fpu__save(src);
+		fpu__save(src_fpu);
 		memcpy(dst_fpu->state, src_fpu->state, xstate_size);
 	}
 }
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 22ad90a40dbf..8abcd6a6f3dc 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -730,7 +730,7 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
 	/*
 	 * Save the info for the exception handler and clear the error.
 	 */
-	fpu__save(task);
+	fpu__save(&task->thread.fpu);
 	task->thread.trap_nr = trapnr;
 	task->thread.error_code = error_code;
 	info.si_signo = SIGFPE;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 065/208] x86/fpu: Use 'struct fpu' in fpu_copy()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (63 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 064/208] x86/fpu: Use 'struct fpu' in fpu__save() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 066/208] x86/fpu: Use 'struct fpu' in fpu__copy() Ingo Molnar
                   ` (14 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 92cee0c18dc6..cfc2af98bcde 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -227,15 +227,12 @@ EXPORT_SYMBOL_GPL(fpstate_free);
  * In the 'lazy' case we save to the source context, mark the FPU lazy
  * via stts() and copy the source context into the destination context.
  */
-static void fpu_copy(struct task_struct *dst, struct task_struct *src)
+static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
-	struct fpu *dst_fpu = &dst->thread.fpu;
-	struct fpu *src_fpu = &src->thread.fpu;
-
-	WARN_ON(src != current);
+	WARN_ON(src_fpu != &current->thread.fpu);
 
 	if (use_eager_fpu()) {
-		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
+		memset(&dst_fpu->state->xsave, 0, xstate_size);
 		__save_fpu(dst_fpu);
 	} else {
 		fpu__save(src_fpu);
@@ -258,7 +255,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 
 		if (err)
 			return err;
-		fpu_copy(dst, src);
+		fpu_copy(dst_fpu, src_fpu);
 	}
 	return 0;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 066/208] x86/fpu: Use 'struct fpu' in fpu__copy()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (64 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 065/208] x86/fpu: Use 'struct fpu' in fpu_copy() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 067/208] x86/fpu: Use 'struct fpu' in fpstate_alloc_init() Ingo Molnar
                   ` (13 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu-internal.h |  2 +-
 arch/x86/kernel/fpu/core.c          | 13 +++++--------
 arch/x86/kernel/process.c           |  2 +-
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 6b84399c8839..21ad68179454 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -535,7 +535,7 @@ extern void fpstate_cache_init(void);
 
 extern int fpstate_alloc(struct fpu *fpu);
 extern void fpstate_free(struct fpu *fpu);
-extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
+extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
 
 static inline unsigned long
 alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx,
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index cfc2af98bcde..04a8322df8b5 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -240,15 +240,12 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	}
 }
 
-int fpu__copy(struct task_struct *dst, struct task_struct *src)
+int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
-	struct fpu *dst_fpu = &dst->thread.fpu;
-	struct fpu *src_fpu = &src->thread.fpu;
-
-	dst->thread.fpu.counter = 0;
-	dst->thread.fpu.has_fpu = 0;
-	dst->thread.fpu.state = NULL;
-	dst->thread.fpu.last_cpu = -1;
+	dst_fpu->counter = 0;
+	dst_fpu->has_fpu = 0;
+	dst_fpu->state = NULL;
+	dst_fpu->last_cpu = -1;
 
 	if (src_fpu->fpstate_active) {
 		int err = fpstate_alloc(dst_fpu);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 40bc28624628..f2cd1df00b40 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -83,7 +83,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
 	*dst = *src;
 
-	return fpu__copy(dst, src);
+	return fpu__copy(&dst->thread.fpu, &src->thread.fpu);
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 067/208] x86/fpu: Use 'struct fpu' in fpstate_alloc_init()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (65 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 066/208] x86/fpu: Use 'struct fpu' in fpu__copy() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 068/208] x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped() Ingo Molnar
                   ` (12 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h   |  2 +-
 arch/x86/kernel/fpu/core.c    | 13 ++++++-------
 arch/x86/kernel/fpu/xsave.c   |  2 +-
 arch/x86/kvm/x86.c            |  2 +-
 arch/x86/math-emu/fpu_entry.c |  2 +-
 5 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index e3b42c5379bc..38376cdf297c 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -18,7 +18,7 @@
 struct pt_regs;
 struct user_i387_struct;
 
-extern int fpstate_alloc_init(struct task_struct *curr);
+extern int fpstate_alloc_init(struct fpu *fpu);
 extern void fpstate_init(struct fpu *fpu);
 extern void fpu__flush_thread(struct task_struct *tsk);
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 04a8322df8b5..76a6b1faa91f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -263,12 +263,11 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
  *
  * Can fail.
  */
-int fpstate_alloc_init(struct task_struct *curr)
+int fpstate_alloc_init(struct fpu *fpu)
 {
-	struct fpu *fpu = &curr->thread.fpu;
 	int ret;
 
-	if (WARN_ON_ONCE(curr != current))
+	if (WARN_ON_ONCE(fpu != &current->thread.fpu))
 		return -EINVAL;
 	if (WARN_ON_ONCE(fpu->fpstate_active))
 		return -EINVAL;
@@ -276,11 +275,11 @@ int fpstate_alloc_init(struct task_struct *curr)
 	/*
 	 * Memory allocation at the first usage of the FPU and other state.
 	 */
-	ret = fpstate_alloc(&curr->thread.fpu);
+	ret = fpstate_alloc(fpu);
 	if (ret)
 		return ret;
 
-	fpstate_init(&curr->thread.fpu);
+	fpstate_init(fpu);
 
 	/* Safe to do for the current task: */
 	fpu->fpstate_active = 1;
@@ -360,7 +359,7 @@ void fpu__restore(void)
 		/*
 		 * does a slab alloc which can sleep
 		 */
-		if (fpstate_alloc_init(tsk)) {
+		if (fpstate_alloc_init(fpu)) {
 			/*
 			 * ran out of memory!
 			 */
@@ -395,7 +394,7 @@ void fpu__flush_thread(struct task_struct *tsk)
 		fpstate_free(&tsk->thread.fpu);
 	} else if (!fpu->fpstate_active) {
 		/* kthread execs. TODO: cleanup this horror. */
-		if (WARN_ON(fpstate_alloc_init(tsk)))
+		if (WARN_ON(fpstate_alloc_init(fpu)))
 			force_sig(SIGKILL, tsk);
 		user_fpu_begin();
 		restore_init_xstate();
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 3953cbf8d7e7..80b0c8fa50c5 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -350,7 +350,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(VERIFY_READ, buf, size))
 		return -EACCES;
 
-	if (!fpu->fpstate_active && fpstate_alloc_init(tsk))
+	if (!fpu->fpstate_active && fpstate_alloc_init(fpu))
 		return -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bab8afb61dc1..479d4ce25081 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6601,7 +6601,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int r;
 	sigset_t sigsaved;
 
-	if (!fpu->fpstate_active && fpstate_alloc_init(current))
+	if (!fpu->fpstate_active && fpstate_alloc_init(fpu))
 		return -ENOMEM;
 
 	if (vcpu->sigset_active)
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index f1aac55d6a67..e394bcb4275d 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -150,7 +150,7 @@ void math_emulate(struct math_emu_info *info)
 	struct fpu *fpu = &current->thread.fpu;
 
 	if (!fpu->fpstate_active) {
-		if (fpstate_alloc_init(current)) {
+		if (fpstate_alloc_init(fpu)) {
 			do_group_exit(SIGKILL);
 			return;
 		}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 068/208] x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (66 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 067/208] x86/fpu: Use 'struct fpu' in fpstate_alloc_init() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 069/208] x86/fpu: Rename fpu__flush_thread() to fpu__clear() Ingo Molnar
                   ` (11 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Migrate this function to pure 'struct fpu' usage.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 76a6b1faa91f..7045eff05292 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -311,27 +311,26 @@ EXPORT_SYMBOL_GPL(fpstate_alloc_init);
  *       the read-only case, it's not strictly necessary for
  *       read-only access to the context.
  */
-static int fpu__unlazy_stopped(struct task_struct *child)
+static int fpu__unlazy_stopped(struct fpu *child_fpu)
 {
-	struct fpu *child_fpu = &child->thread.fpu;
 	int ret;
 
-	if (WARN_ON_ONCE(child == current))
+	if (WARN_ON_ONCE(child_fpu == &current->thread.fpu))
 		return -EINVAL;
 
 	if (child_fpu->fpstate_active) {
-		child->thread.fpu.last_cpu = -1;
+		child_fpu->last_cpu = -1;
 		return 0;
 	}
 
 	/*
 	 * Memory allocation at the first usage of the FPU and other state.
 	 */
-	ret = fpstate_alloc(&child->thread.fpu);
+	ret = fpstate_alloc(child_fpu);
 	if (ret)
 		return ret;
 
-	fpstate_init(&child->thread.fpu);
+	fpstate_init(child_fpu);
 
 	/* Safe to do for stopped child tasks: */
 	child_fpu->fpstate_active = 1;
@@ -424,12 +423,13 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 		unsigned int pos, unsigned int count,
 		void *kbuf, void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	int ret;
 
 	if (!cpu_has_fxsr)
 		return -ENODEV;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
@@ -443,12 +443,13 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 		unsigned int pos, unsigned int count,
 		const void *kbuf, const void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	int ret;
 
 	if (!cpu_has_fxsr)
 		return -ENODEV;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
@@ -476,13 +477,14 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		unsigned int pos, unsigned int count,
 		void *kbuf, void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	struct xsave_struct *xsave;
 	int ret;
 
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
@@ -506,13 +508,14 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 		  unsigned int pos, unsigned int count,
 		  const void *kbuf, const void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	struct xsave_struct *xsave;
 	int ret;
 
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
@@ -672,10 +675,11 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
 	       unsigned int pos, unsigned int count,
 	       void *kbuf, void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
@@ -703,10 +707,11 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 	       unsigned int pos, unsigned int count,
 	       const void *kbuf, const void __user *ubuf)
 {
+	struct fpu *fpu = &target->thread.fpu;
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	ret = fpu__unlazy_stopped(target);
+	ret = fpu__unlazy_stopped(fpu);
 	if (ret)
 		return ret;
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH 069/208] x86/fpu: Rename fpu__flush_thread() to fpu__clear()
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (67 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 068/208] x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 070/208] x86/fpu: Clean up fpu__clear() a bit Ingo Molnar
                   ` (10 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

The primary purpose of this function is to clear the current task's
FPU before an exec(), to not leak information from the previous task,
and to allow the new task to start with freshly initialized FPU
registers.

Rename the function to reflect this primary purpose.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/i387.h | 2 +-
 arch/x86/kernel/fpu/core.c  | 4 ++--
 arch/x86/kernel/process.c   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 38376cdf297c..b8f7d76ac066 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -20,7 +20,7 @@ struct user_i387_struct;
 
 extern int fpstate_alloc_init(struct fpu *fpu);
 extern void fpstate_init(struct fpu *fpu);
-extern void fpu__flush_thread(struct task_struct *tsk);
+extern void fpu__clear(struct task_struct *tsk);
 
 extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
 extern void fpu__restore(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7045eff05292..e24f477f9113 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -381,11 +381,11 @@ void fpu__restore(void)
 }
 EXPORT_SYMBOL_GPL(fpu__restore);
 
-void fpu__flush_thread(struct task_struct *tsk)
+void fpu__clear(struct task_struct *tsk)
 {
 	struct fpu *fpu = &tsk->thread.fpu;
 
-	WARN_ON(tsk != current);
+	WARN_ON_ONCE(tsk != current); /* Almost certainly an anomaly */
 
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index f2cd1df00b40..04ac5901dbee 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -130,7 +130,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	fpu__flush_thread(tsk);
+	fpu__clear(tsk);
 }
 
 static void hard_disable_TSC(void)
-- 
2.1.0



* [PATCH 070/208] x86/fpu: Clean up fpu__clear() a bit
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (68 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 069/208] x86/fpu: Rename fpu__flush_thread() to fpu__clear() Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 071/208] x86/fpu: Rename i387.h to fpu/api.h Ingo Molnar
                   ` (9 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e24f477f9113..7d69d784d064 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -390,13 +390,15 @@ void fpu__clear(struct task_struct *tsk)
 	if (!use_eager_fpu()) {
 		/* FPU state will be reallocated lazily at the first use. */
 		drop_fpu(fpu);
-		fpstate_free(&tsk->thread.fpu);
-	} else if (!fpu->fpstate_active) {
-		/* kthread execs. TODO: cleanup this horror. */
-		if (WARN_ON(fpstate_alloc_init(fpu)))
-			force_sig(SIGKILL, tsk);
-		user_fpu_begin();
-		restore_init_xstate();
+		fpstate_free(fpu);
+	} else {
+		 if (!fpu->fpstate_active) {
+			/* kthread execs. TODO: cleanup this horror. */
+			if (WARN_ON(fpstate_alloc_init(fpu)))
+				force_sig(SIGKILL, tsk);
+			user_fpu_begin();
+			restore_init_xstate();
+		}
 	}
 }
 
-- 
2.1.0



* [PATCH 071/208] x86/fpu: Rename i387.h to fpu/api.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (69 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 070/208] x86/fpu: Clean up fpu__clear() a bit Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 072/208] x86/fpu: Move xsave.h to fpu/xsave.h Ingo Molnar
                   ` (8 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

We already have fpu/types.h, so move i387.h to fpu/api.h.

The file name has become a misnomer anyway: it offers generic FPU APIs,
but is not limited to i387 functionality.
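
To make the split concrete, here is a minimal sketch (not part of the patch)
of how a driver is expected to use only the public API. The helpers
kernel_fpu_begin()/kernel_fpu_end() and irq_fpu_usable() are the existing
exported functions; the example function itself is made up:

  #include <linux/string.h>
  #include <asm/fpu/api.h>

  static void example_simd_memclear(void *dst, size_t len)
  {
          /* Fall back to plain code if FPU/SIMD use is not safe here: */
          if (!irq_fpu_usable()) {
                  memset(dst, 0, len);
                  return;
          }

          kernel_fpu_begin();     /* save current FPU state, allow SIMD use */
          /* ... SSE/AVX accelerated clearing would go here ... */
          kernel_fpu_end();       /* done with SIMD, FPU state can be restored */
  }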

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/aesni-intel_glue.c         | 2 +-
 arch/x86/crypto/crc32-pclmul_glue.c        | 2 +-
 arch/x86/crypto/crct10dif-pclmul_glue.c    | 2 +-
 arch/x86/crypto/fpu.c                      | 2 +-
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 2 +-
 arch/x86/crypto/sha1_ssse3_glue.c          | 2 +-
 arch/x86/crypto/sha256_ssse3_glue.c        | 2 +-
 arch/x86/crypto/sha512_ssse3_glue.c        | 2 +-
 arch/x86/crypto/twofish_avx_glue.c         | 2 +-
 arch/x86/include/asm/crypto/glue_helper.h  | 2 +-
 arch/x86/include/asm/efi.h                 | 2 +-
 arch/x86/include/asm/fpu-internal.h        | 2 +-
 arch/x86/include/asm/{i387.h => fpu/api.h} | 6 +++---
 arch/x86/include/asm/simd.h                | 2 +-
 arch/x86/include/asm/suspend_32.h          | 2 +-
 arch/x86/include/asm/suspend_64.h          | 2 +-
 arch/x86/include/asm/xor.h                 | 2 +-
 arch/x86/include/asm/xor_32.h              | 2 +-
 arch/x86/include/asm/xor_avx.h             | 2 +-
 arch/x86/kernel/cpu/bugs.c                 | 2 +-
 arch/x86/kernel/fpu/xsave.c                | 2 +-
 arch/x86/kvm/vmx.c                         | 2 +-
 arch/x86/lguest/boot.c                     | 2 +-
 arch/x86/lib/mmx_32.c                      | 2 +-
 arch/x86/math-emu/fpu_entry.c              | 2 +-
 drivers/char/hw_random/via-rng.c           | 2 +-
 drivers/crypto/padlock-aes.c               | 2 +-
 drivers/crypto/padlock-sha.c               | 2 +-
 drivers/lguest/x86/core.c                  | 2 +-
 lib/raid6/x86.h                            | 2 +-
 30 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 112cefacf2af..b419f43ce0c5 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -32,7 +32,7 @@
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
 #include <asm/cpu_device_id.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/crypto/aes.h>
 #include <crypto/ablk_helper.h>
 #include <crypto/scatterwalk.h>
diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 1937fc1d8763..07d2c6c86a54 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -35,7 +35,7 @@
 
 #include <asm/cpufeature.h>
 #include <asm/cpu_device_id.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 #define CHKSUM_BLOCK_SIZE	1
 #define CHKSUM_DIGEST_SIZE	4
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index b6c67bf30fdf..a3fcfc97a311 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -29,7 +29,7 @@
 #include <linux/init.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/cpufeature.h>
 #include <asm/cpu_device_id.h>
 
diff --git a/arch/x86/crypto/fpu.c b/arch/x86/crypto/fpu.c
index f368ba261739..5a2f30f9f52d 100644
--- a/arch/x86/crypto/fpu.c
+++ b/arch/x86/crypto/fpu.c
@@ -18,7 +18,7 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/crypto.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 struct crypto_fpu_ctx {
 	struct crypto_blkcipher *child;
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 2079baf06bdd..64d7cf1b50e1 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -19,7 +19,7 @@
 #include <crypto/cryptd.h>
 #include <crypto/gf128mul.h>
 #include <crypto/internal/hash.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/cpu_device_id.h>
 
 #define GHASH_BLOCK_SIZE	16
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 33d1b9dc14cc..cb3bf19dca5a 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -29,7 +29,7 @@
 #include <linux/types.h>
 #include <crypto/sha.h>
 #include <crypto/sha1_base.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xcr.h>
 #include <asm/xsave.h>
 
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index ccc338881ee8..9eaf7abaf4dc 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -37,7 +37,7 @@
 #include <linux/types.h>
 #include <crypto/sha.h>
 #include <crypto/sha256_base.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xcr.h>
 #include <asm/xsave.h>
 #include <linux/string.h>
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index d9fa4c1e063f..e0d6a67f567d 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -35,7 +35,7 @@
 #include <linux/types.h>
 #include <crypto/sha.h>
 #include <crypto/sha512_base.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xcr.h>
 #include <asm/xsave.h>
 
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index b5e2d5651851..1a66e6110f4b 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -36,7 +36,7 @@
 #include <crypto/ctr.h>
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xcr.h>
 #include <asm/xsave.h>
 #include <asm/crypto/twofish.h>
diff --git a/arch/x86/include/asm/crypto/glue_helper.h b/arch/x86/include/asm/crypto/glue_helper.h
index 1eef55596e82..03bb1065c335 100644
--- a/arch/x86/include/asm/crypto/glue_helper.h
+++ b/arch/x86/include/asm/crypto/glue_helper.h
@@ -7,7 +7,7 @@
 
 #include <linux/kernel.h>
 #include <linux/crypto.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <crypto/b128ops.h>
 
 typedef void (*common_glue_func_t)(void *ctx, u8 *dst, const u8 *src);
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 3738b138b843..155162ea0e00 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -1,7 +1,7 @@
 #ifndef _ASM_X86_EFI_H
 #define _ASM_X86_EFI_H
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/pgtable.h>
 
 /*
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 21ad68179454..d68b349b4247 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -15,7 +15,7 @@
 #include <linux/slab.h>
 
 #include <asm/user.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xsave.h>
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/fpu/api.h
similarity index 96%
rename from arch/x86/include/asm/i387.h
rename to arch/x86/include/asm/fpu/api.h
index b8f7d76ac066..9d3a6f3cfc1b 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -7,8 +7,8 @@
  * x86-64 work by Andi Kleen 2002
  */
 
-#ifndef _ASM_X86_I387_H
-#define _ASM_X86_I387_H
+#ifndef _ASM_X86_FPU_API_H
+#define _ASM_X86_FPU_API_H
 
 #ifndef __ASSEMBLY__
 
@@ -104,4 +104,4 @@ extern void fpu__save(struct fpu *fpu);
 
 #endif /* __ASSEMBLY__ */
 
-#endif /* _ASM_X86_I387_H */
+#endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/simd.h b/arch/x86/include/asm/simd.h
index ee80b92f0096..6c8a7ed13365 100644
--- a/arch/x86/include/asm/simd.h
+++ b/arch/x86/include/asm/simd.h
@@ -1,5 +1,5 @@
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h
index 552d6c90a6d4..d1793f06854d 100644
--- a/arch/x86/include/asm/suspend_32.h
+++ b/arch/x86/include/asm/suspend_32.h
@@ -7,7 +7,7 @@
 #define _ASM_X86_SUSPEND_32_H
 
 #include <asm/desc.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 /* image of the saved processor state */
 struct saved_context {
diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
index bc6232834bab..7ebf0ebe4e68 100644
--- a/arch/x86/include/asm/suspend_64.h
+++ b/arch/x86/include/asm/suspend_64.h
@@ -7,7 +7,7 @@
 #define _ASM_X86_SUSPEND_64_H
 
 #include <asm/desc.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 /*
  * Image of the saved processor state, used by the low level ACPI suspend to
diff --git a/arch/x86/include/asm/xor.h b/arch/x86/include/asm/xor.h
index d8829751b3f8..1f5c5161ead6 100644
--- a/arch/x86/include/asm/xor.h
+++ b/arch/x86/include/asm/xor.h
@@ -36,7 +36,7 @@
  * no advantages to be gotten from x86-64 here anyways.
  */
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 #ifdef CONFIG_X86_32
 /* reduce register pressure */
diff --git a/arch/x86/include/asm/xor_32.h b/arch/x86/include/asm/xor_32.h
index ce05722e3c68..5a08bc8bff33 100644
--- a/arch/x86/include/asm/xor_32.h
+++ b/arch/x86/include/asm/xor_32.h
@@ -26,7 +26,7 @@
 #define XO3(x, y)	"       pxor   8*("#x")(%4), %%mm"#y"   ;\n"
 #define XO4(x, y)	"       pxor   8*("#x")(%5), %%mm"#y"   ;\n"
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 static void
 xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
diff --git a/arch/x86/include/asm/xor_avx.h b/arch/x86/include/asm/xor_avx.h
index 492b29802f57..7c0a517ec751 100644
--- a/arch/x86/include/asm/xor_avx.h
+++ b/arch/x86/include/asm/xor_avx.h
@@ -18,7 +18,7 @@
 #ifdef CONFIG_AS_AVX
 
 #include <linux/compiler.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 #define BLOCK4(i) \
 		BLOCK(32 * i, 0) \
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index eb8be0c5823b..29dd74318ec6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -12,7 +12,7 @@
 #include <asm/bugs.h>
 #include <asm/processor.h>
 #include <asm/processor-flags.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/msr.h>
 #include <asm/paravirt.h>
 #include <asm/alternative.h>
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 80b0c8fa50c5..8aa3b864a2e0 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -9,7 +9,7 @@
 #include <linux/bootmem.h>
 #include <linux/compat.h>
 #include <linux/cpu.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/fpu-internal.h>
 #include <asm/sigframe.h>
 #include <asm/tlbflush.h>
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f7b61687bd79..5cb738a18ca3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -40,7 +40,7 @@
 #include <asm/vmx.h>
 #include <asm/virtext.h>
 #include <asm/mce.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/xcr.h>
 #include <asm/perf_event.h>
 #include <asm/debugreg.h>
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 8f9a133cc099..27f8eea0d6eb 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -70,7 +70,7 @@
 #include <asm/e820.h>
 #include <asm/mce.h>
 #include <asm/io.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/stackprotector.h>
 #include <asm/reboot.h>		/* for struct machine_ops */
 #include <asm/kvm_para.h>
diff --git a/arch/x86/lib/mmx_32.c b/arch/x86/lib/mmx_32.c
index c9f2d9ba8dd8..e5e3ed8dc079 100644
--- a/arch/x86/lib/mmx_32.c
+++ b/arch/x86/lib/mmx_32.c
@@ -22,7 +22,7 @@
 #include <linux/sched.h>
 #include <linux/types.h>
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/asm.h>
 
 void *_mmx_memcpy(void *to, const void *from, size_t len)
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index e394bcb4275d..3bb4c6a24ea5 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -31,7 +31,7 @@
 #include <asm/traps.h>
 #include <asm/desc.h>
 #include <asm/user.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 #include "fpu_system.h"
 #include "fpu_emu.h"
diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index a3bebef255ad..0c98a9d51a24 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -33,7 +33,7 @@
 #include <asm/io.h>
 #include <asm/msr.h>
 #include <asm/cpufeature.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 
 
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index c178ed8c3908..da2d6777bd09 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -22,7 +22,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/byteorder.h>
 #include <asm/processor.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 /*
  * Number of data blocks actually fetched for each xcrypt insn.
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 95f7d27ce491..4e154c9b9206 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -23,7 +23,7 @@
 #include <linux/kernel.h>
 #include <linux/scatterlist.h>
 #include <asm/cpu_device_id.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 struct padlock_sha_desc {
 	struct shash_desc fallback;
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index bcb534a5512d..fce5989e66d9 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -46,7 +46,7 @@
 #include <asm/setup.h>
 #include <asm/lguest.h>
 #include <asm/uaccess.h>
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 #include <asm/tlbflush.h>
 #include "../lg.h"
 
diff --git a/lib/raid6/x86.h b/lib/raid6/x86.h
index b7595484a815..8fe9d9662abb 100644
--- a/lib/raid6/x86.h
+++ b/lib/raid6/x86.h
@@ -23,7 +23,7 @@
 
 #ifdef __KERNEL__ /* Real code */
 
-#include <asm/i387.h>
+#include <asm/fpu/api.h>
 
 #else /* Dummy code for user space testing */
 
-- 
2.1.0



* [PATCH 072/208] x86/fpu: Move xsave.h to fpu/xsave.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (70 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 071/208] x86/fpu: Rename i387.h to fpu/api.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 073/208] x86/fpu: Rename fpu-internal.h to fpu/internal.h Ingo Molnar
                   ` (7 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Move the xsave.h header file to the FPU directory as well.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 2 +-
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 2 +-
 arch/x86/crypto/cast5_avx_glue.c           | 2 +-
 arch/x86/crypto/cast6_avx_glue.c           | 2 +-
 arch/x86/crypto/serpent_avx2_glue.c        | 2 +-
 arch/x86/crypto/serpent_avx_glue.c         | 2 +-
 arch/x86/crypto/sha-mb/sha1_mb.c           | 2 +-
 arch/x86/crypto/sha1_ssse3_glue.c          | 2 +-
 arch/x86/crypto/sha256_ssse3_glue.c        | 2 +-
 arch/x86/crypto/sha512_ssse3_glue.c        | 2 +-
 arch/x86/crypto/twofish_avx_glue.c         | 2 +-
 arch/x86/include/asm/fpu-internal.h        | 2 +-
 arch/x86/include/asm/{ => fpu}/xsave.h     | 0
 arch/x86/kvm/cpuid.c                       | 2 +-
 14 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index baf0ac21ace5..004acd7bb4e0 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -20,7 +20,7 @@
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/camellia.h>
 #include <asm/crypto/glue_helper.h>
 
diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c
index 78818a1e73e3..2f7ead8caf53 100644
--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -20,7 +20,7 @@
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/camellia.h>
 #include <asm/crypto/glue_helper.h>
 
diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c
index 236c80974457..2c3360be6fc8 100644
--- a/arch/x86/crypto/cast5_avx_glue.c
+++ b/arch/x86/crypto/cast5_avx_glue.c
@@ -32,7 +32,7 @@
 #include <crypto/cryptd.h>
 #include <crypto/ctr.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/glue_helper.h>
 
 #define CAST5_PARALLEL_BLOCKS 16
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index f448810ca4ac..a2ec18a56e4f 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -37,7 +37,7 @@
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/glue_helper.h>
 
 #define CAST6_PARALLEL_BLOCKS 8
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 2f63dc89e7a9..206ec57725a3 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -21,7 +21,7 @@
 #include <crypto/xts.h>
 #include <crypto/serpent.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/serpent-avx.h>
 #include <asm/crypto/glue_helper.h>
 
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index c8d478af8456..4feb68c9a41f 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -37,7 +37,7 @@
 #include <crypto/lrw.h>
 #include <crypto/xts.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/serpent-avx.h>
 #include <asm/crypto/glue_helper.h>
 
diff --git a/arch/x86/crypto/sha-mb/sha1_mb.c b/arch/x86/crypto/sha-mb/sha1_mb.c
index 15373786494f..02b64bbc1d48 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha-mb/sha1_mb.c
@@ -66,7 +66,7 @@
 #include <crypto/crypto_wq.h>
 #include <asm/byteorder.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <linux/hardirq.h>
 #include <asm/fpu-internal.h>
 #include "sha_mb_ctx.h"
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index cb3bf19dca5a..71ab2b35d5e0 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -31,7 +31,7 @@
 #include <crypto/sha1_base.h>
 #include <asm/fpu/api.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 
 
 asmlinkage void sha1_transform_ssse3(u32 *digest, const char *data,
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 9eaf7abaf4dc..dcbd8ea6eaaf 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -39,7 +39,7 @@
 #include <crypto/sha256_base.h>
 #include <asm/fpu/api.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <linux/string.h>
 
 asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index e0d6a67f567d..e8836e0c1098 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -37,7 +37,7 @@
 #include <crypto/sha512_base.h>
 #include <asm/fpu/api.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 
 #include <linux/string.h>
 
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 1a66e6110f4b..3b6c8ba64f81 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -38,7 +38,7 @@
 #include <crypto/xts.h>
 #include <asm/fpu/api.h>
 #include <asm/xcr.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include <asm/crypto/twofish.h>
 #include <asm/crypto/glue_helper.h>
 #include <crypto/scatterwalk.h>
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index d68b349b4247..20690a14c73a 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -16,7 +16,7 @@
 
 #include <asm/user.h>
 #include <asm/fpu/api.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 
 #ifdef CONFIG_X86_64
 # include <asm/sigcontext32.h>
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/fpu/xsave.h
similarity index 100%
rename from arch/x86/include/asm/xsave.h
rename to arch/x86/include/asm/fpu/xsave.h
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 59b69f6a2844..0ce4c4f87332 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -17,7 +17,7 @@
 #include <linux/vmalloc.h>
 #include <linux/uaccess.h>
 #include <asm/user.h>
-#include <asm/xsave.h>
+#include <asm/fpu/xsave.h>
 #include "cpuid.h"
 #include "lapic.h"
 #include "mmu.h"
-- 
2.1.0



* [PATCH 073/208] x86/fpu: Rename fpu-internal.h to fpu/internal.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (71 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 072/208] x86/fpu: Move xsave.h to fpu/xsave.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 074/208] x86/fpu: Move MXCSR_DEFAULT " Ingo Molnar
                   ` (6 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

This unifies all the FPU related header files under a single, hierarchical
naming scheme:

 - asm/fpu/types.h:      FPU related data types, needed for 'struct task_struct',
                         widely included in almost all kernel code, and hence kept
                         as small as possible.

 - asm/fpu/api.h:        FPU related 'public' methods exported to other subsystems.

 - asm/fpu/internal.h:   FPU subsystem internal methods

 - asm/fpu/xsave.h:      XSAVE support internal methods

(Also standardize the header guard in asm/fpu/internal.h.)
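
To make the intended usage of the hierarchy above concrete, here is a rough
sketch (not part of the patch) of which header each kind of code is expected
to include from now on:

  /* Random driver code: only the public API: */
  #include <asm/fpu/api.h>

  /* Code that only needs the FPU data types (e.g. via 'struct task_struct'): */
  #include <asm/fpu/types.h>

  /* FPU subsystem internals, i.e. arch/x86/kernel/fpu/*.c: */
  #include <asm/fpu/internal.h>

  /* XSAVE support internals: */
  #include <asm/fpu/xsave.h>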

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/crypto/crc32c-intel_glue.c                     | 2 +-
 arch/x86/crypto/sha-mb/sha1_mb.c                        | 2 +-
 arch/x86/ia32/ia32_signal.c                             | 2 +-
 arch/x86/include/asm/{fpu-internal.h => fpu/internal.h} | 6 +++---
 arch/x86/kernel/cpu/common.c                            | 2 +-
 arch/x86/kernel/fpu/core.c                              | 2 +-
 arch/x86/kernel/fpu/init.c                              | 2 +-
 arch/x86/kernel/fpu/xsave.c                             | 2 +-
 arch/x86/kernel/process.c                               | 2 +-
 arch/x86/kernel/process_32.c                            | 2 +-
 arch/x86/kernel/process_64.c                            | 2 +-
 arch/x86/kernel/ptrace.c                                | 2 +-
 arch/x86/kernel/signal.c                                | 2 +-
 arch/x86/kernel/smpboot.c                               | 2 +-
 arch/x86/kernel/traps.c                                 | 2 +-
 arch/x86/kvm/x86.c                                      | 2 +-
 arch/x86/mm/mpx.c                                       | 2 +-
 arch/x86/power/cpu.c                                    | 2 +-
 18 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 470522cb042a..81a595d75cf5 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -32,7 +32,7 @@
 
 #include <asm/cpufeature.h>
 #include <asm/cpu_device_id.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 
 #define CHKSUM_BLOCK_SIZE	1
 #define CHKSUM_DIGEST_SIZE	4
diff --git a/arch/x86/crypto/sha-mb/sha1_mb.c b/arch/x86/crypto/sha-mb/sha1_mb.c
index 02b64bbc1d48..03ffaf8c2244 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha-mb/sha1_mb.c
@@ -68,7 +68,7 @@
 #include <asm/xcr.h>
 #include <asm/fpu/xsave.h>
 #include <linux/hardirq.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include "sha_mb_ctx.h"
 
 #define FLUSH_INTERVAL 1000 /* in usec */
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index e1ec6f90d09e..d6d8f4ca5136 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -21,7 +21,7 @@
 #include <linux/binfmts.h>
 #include <asm/ucontext.h>
 #include <asm/uaccess.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/ptrace.h>
 #include <asm/ia32_unistd.h>
 #include <asm/user32.h>
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu/internal.h
similarity index 99%
rename from arch/x86/include/asm/fpu-internal.h
rename to arch/x86/include/asm/fpu/internal.h
index 20690a14c73a..386a8837c358 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -7,8 +7,8 @@
  * x86-64 work by Andi Kleen 2002
  */
 
-#ifndef _FPU_INTERNAL_H
-#define _FPU_INTERNAL_H
+#ifndef _ASM_X86_FPU_INTERNAL_H
+#define _ASM_X86_FPU_INTERNAL_H
 
 #include <linux/regset.h>
 #include <linux/compat.h>
@@ -553,4 +553,4 @@ alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx,
 	return sp;
 }
 
-#endif
+#endif /* _ASM_X86_FPU_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 88bb7a75f5c6..8f6a4ea39657 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -31,7 +31,7 @@
 #include <asm/setup.h>
 #include <asm/apic.h>
 #include <asm/desc.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/mtrr.h>
 #include <linux/numa.h>
 #include <asm/asm.h>
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7d69d784d064..3094b37b101e 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -5,7 +5,7 @@
  *  General FPU state handling cleanups
  *	Gareth Hughes <gareth@valinux.com>, May 2000
  */
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 
 /*
  * Track whether the kernel is using the FPU state
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 4eabb426e910..33df056b1624 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -1,7 +1,7 @@
 /*
  * x86 FPU boot time init code
  */
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/tlbflush.h>
 
 /*
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 8aa3b864a2e0..4ff726e4e29b 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -10,7 +10,7 @@
 #include <linux/compat.h>
 #include <linux/cpu.h>
 #include <asm/fpu/api.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/sigframe.h>
 #include <asm/tlbflush.h>
 #include <asm/xcr.h>
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 04ac5901dbee..2bd188501ac9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -25,7 +25,7 @@
 #include <asm/idle.h>
 #include <asm/uaccess.h>
 #include <asm/mwait.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/debugreg.h>
 #include <asm/nmi.h>
 #include <asm/tlbflush.h>
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 7adc314b5075..deff651835b4 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -39,7 +39,7 @@
 #include <asm/pgtable.h>
 #include <asm/ldt.h>
 #include <asm/processor.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/desc.h>
 #ifdef CONFIG_MATH_EMULATION
 #include <asm/math_emu.h>
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4504569c6c4e..c50e013b57d2 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -38,7 +38,7 @@
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
 #include <asm/desc.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 69451b8965f7..c14a00f54b61 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -28,7 +28,7 @@
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/debugreg.h>
 #include <asm/ldt.h>
 #include <asm/desc.h>
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index bcb853e44d30..c67f96c87938 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -26,7 +26,7 @@
 
 #include <asm/processor.h>
 #include <asm/ucontext.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/vdso.h>
 #include <asm/mce.h>
 #include <asm/sighandling.h>
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 60e331ceb844..29f105f0d9fb 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,7 +68,7 @@
 #include <asm/mwait.h>
 #include <asm/apic.h>
 #include <asm/io_apic.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/setup.h>
 #include <asm/uv/uv.h>
 #include <linux/mc146818rtc.h>
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8abcd6a6f3dc..a65586edbb57 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -54,7 +54,7 @@
 #include <asm/ftrace.h>
 #include <asm/traps.h>
 #include <asm/desc.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 #include <asm/mce.h>
 #include <asm/fixmap.h>
 #include <asm/mach_traps.h>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 479d4ce25081..91d7f3b1e50c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -60,7 +60,7 @@
 #include <asm/mtrr.h>
 #include <asm/mce.h>
 #include <linux/kernel_stat.h>
-#include <asm/fpu-internal.h> /* Ugh! */
+#include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/xcr.h>
 #include <asm/pvclock.h>
 #include <asm/div64.h>
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 412b5f81e547..5563be313fd6 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -15,7 +15,7 @@
 #include <asm/mmu_context.h>
 #include <asm/mpx.h>
 #include <asm/processor.h>
-#include <asm/fpu-internal.h>
+#include <asm/fpu/internal.h>
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
 {
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 757678fb26e1..edaf934c749e 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -21,7 +21,7 @@
 #include <asm/xcr.h>
 #include <asm/suspend.h>
 #include <asm/debugreg.h>
-#include <asm/fpu-internal.h> /* pcntxt_mask */
+#include <asm/fpu/internal.h> /* pcntxt_mask */
 #include <asm/cpu.h>
 
 #ifdef CONFIG_X86_32
-- 
2.1.0



* [PATCH 074/208] x86/fpu: Move MXCSR_DEFAULT to fpu/internal.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (72 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 073/208] x86/fpu: Rename fpu-internal.h to fpu/internal.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 075/208] x86/fpu: Remove xsave_init() __init obfuscation Ingo Molnar
                   ` (5 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

fpu/types.h gets included everywhere, so move MXCSR_DEFAULT to
fpu/internal.h, the place where it's used.
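
For reference, a quick decode of the value itself (not part of the patch,
just the standard MXCSR bit layout): 0x1f80 sets the six exception mask bits
and leaves the rounding control at its reset default:

  /*
   * MXCSR_DEFAULT == 0x1f80:
   *
   *   bit  7 (0x0080): IM - invalid operation exception masked
   *   bit  8 (0x0100): DM - denormal exception masked
   *   bit  9 (0x0200): ZM - divide-by-zero exception masked
   *   bit 10 (0x0400): OM - overflow exception masked
   *   bit 11 (0x0800): UM - underflow exception masked
   *   bit 12 (0x1000): PM - precision exception masked
   *
   *   0x0080+0x0100+0x0200+0x0400+0x0800+0x1000 == 0x1f80
   *
   *   bits 13-14 (rounding control) == 00: round to nearest (even)
   */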

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 2 ++
 arch/x86/include/asm/fpu/types.h    | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 386a8837c358..0e9a7a37801a 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -33,6 +33,8 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
 # define ia32_setup_rt_frame	__setup_rt_frame
 #endif
 
+#define	MXCSR_DEFAULT		0x1f80
+
 extern unsigned int mxcsr_feature_mask;
 extern void fpu__cpu_init(void);
 extern void eager_fpu_init(void);
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index cad1c37d9ea2..917d2e56426a 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -4,8 +4,6 @@
 #ifndef _ASM_X86_FPU_H
 #define _ASM_X86_FPU_H
 
-#define	MXCSR_DEFAULT		0x1f80
-
 struct i387_fsave_struct {
 	u32			cwd;	/* FPU Control Word		*/
 	u32			swd;	/* FPU Status Word		*/
-- 
2.1.0



* [PATCH 075/208] x86/fpu: Remove xsave_init() __init obfuscation
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (73 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 074/208] x86/fpu: Move MXCSR_DEFAULT " Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 076/208] x86/fpu: Remove assembly guard from asm/fpu/api.h Ingo Molnar
                   ` (4 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

So this code surprised me - and being surprised when reading FPU code
does not help maintainability of an already overly complex subsystem.

Remove the obfuscation and just don't use the __init annotation for now.
Anyone who wants to free these ~600 bytes of xstate_enable_boot_cpu()
should implement it cleanly.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/xsave.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 4ff726e4e29b..6c1cbb2487fe 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -606,8 +606,11 @@ static void __init init_xstate_size(void)
 
 /*
  * Enable and initialize the xsave feature.
+ *
+ * ( Not marked __init because of false positive section warnings
+ *   generated by xsave_init(). )
  */
-static void __init xstate_enable_boot_cpu(void)
+static void /* __init */ xstate_enable_boot_cpu(void)
 {
 	unsigned int eax, ebx, ecx, edx;
 
@@ -663,21 +666,20 @@ static void __init xstate_enable_boot_cpu(void)
 /*
  * For the very first instance, this calls xstate_enable_boot_cpu();
  * for all subsequent instances, this calls xstate_enable().
- *
- * This is somewhat obfuscated due to the lack of powerful enough
- * overrides for the section checks.
  */
 void xsave_init(void)
 {
-	static __refdata void (*next_func)(void) = xstate_enable_boot_cpu;
-	void (*this_func)(void);
+	static char on_boot_cpu = 1;
 
 	if (!cpu_has_xsave)
 		return;
 
-	this_func = next_func;
-	next_func = xstate_enable;
-	this_func();
+	if (on_boot_cpu) {
+		on_boot_cpu = 0;
+		xstate_enable_boot_cpu();
+	} else {
+		xstate_enable();
+	}
 }
 
 /*
-- 
2.1.0



* [PATCH 076/208] x86/fpu: Remove assembly guard from asm/fpu/api.h
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (74 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 075/208] x86/fpu: Remove xsave_init() __init obfuscation Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 077/208] x86/fpu: Improve FPU detection kernel messages Ingo Molnar
                   ` (3 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

asm/fpu/api.h does not contain any defines useful to assembly code,
and no assembly code includes asm/fpu/api.h. Remove the historic
 #ifndef __ASSEMBLY__ leftover guard.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/api.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 9d3a6f3cfc1b..f1eddcccba16 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -10,8 +10,6 @@
 #ifndef _ASM_X86_FPU_API_H
 #define _ASM_X86_FPU_API_H
 
-#ifndef __ASSEMBLY__
-
 #include <linux/sched.h>
 #include <linux/hardirq.h>
 
@@ -102,6 +100,4 @@ static inline int user_has_fpu(void)
 
 extern void fpu__save(struct fpu *fpu);
 
-#endif /* __ASSEMBLY__ */
-
 #endif /* _ASM_X86_FPU_API_H */
-- 
2.1.0



* [PATCH 077/208] x86/fpu: Improve FPU detection kernel messages
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (75 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 076/208] x86/fpu: Remove assembly guard from asm/fpu/api.h Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 078/208] x86/fpu: Print supported xstate features in human readable way Ingo Molnar
                   ` (2 subsequent siblings)
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Standardize the various boot time messages printed during FPU detection:

 - Use a common 'x86/fpu: ' prefix for consistency and to make it easy
   to grep boot logs for FPU related messages

 - Correct spelling errors

 - Add printout for the legacy FPU case as well

 - Clarify messages

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/xsave.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index 6c1cbb2487fe..bfe92f73bf86 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -3,9 +3,6 @@
  *
  * Author: Suresh Siddha <suresh.b.siddha@intel.com>
  */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
 #include <linux/bootmem.h>
 #include <linux/compat.h>
 #include <linux/cpu.h>
@@ -615,7 +612,7 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 	unsigned int eax, ebx, ecx, edx;
 
 	if (boot_cpu_data.cpuid_level < XSTATE_CPUID) {
-		WARN(1, KERN_ERR "XSTATE_CPUID missing\n");
+		WARN(1, "x86/fpu: XSTATE_CPUID missing!\n");
 		return;
 	}
 
@@ -623,8 +620,7 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 	pcntxt_mask = eax + ((u64)edx << 32);
 
 	if ((pcntxt_mask & XSTATE_FPSSE) != XSTATE_FPSSE) {
-		pr_err("FP/SSE not shown under xsave features 0x%llx\n",
-		       pcntxt_mask);
+		pr_err("x86/fpu: FP/SSE not present amongst the CPU's xstate features: 0x%llx.\n", pcntxt_mask);
 		BUG();
 	}
 
@@ -650,17 +646,18 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 
 	if (pcntxt_mask & XSTATE_EAGER) {
 		if (eagerfpu == DISABLE) {
-			pr_err("eagerfpu not present, disabling some xstate features: 0x%llx\n",
-					pcntxt_mask & XSTATE_EAGER);
+			pr_err("x86/fpu: eagerfpu switching disabled, disabling the following xstate features: 0x%llx.\n",
+			       pcntxt_mask & XSTATE_EAGER);
 			pcntxt_mask &= ~XSTATE_EAGER;
 		} else {
 			eagerfpu = ENABLE;
 		}
 	}
 
-	pr_info("enabled xstate_bv 0x%llx, cntxt size 0x%x using %s\n",
-		pcntxt_mask, xstate_size,
-		cpu_has_xsaves ? "compacted form" : "standard form");
+	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is 0x%x bytes, using '%s' format.\n",
+		pcntxt_mask,
+		xstate_size,
+		cpu_has_xsaves ? "compacted" : "standard");
 }
 
 /*
@@ -671,8 +668,13 @@ void xsave_init(void)
 {
 	static char on_boot_cpu = 1;
 
-	if (!cpu_has_xsave)
+	if (!cpu_has_xsave) {
+		if (on_boot_cpu) {
+			on_boot_cpu = 0;
+			pr_info("x86/fpu: Legacy x87 FPU detected.\n");
+		}
 		return;
+	}
 
 	if (on_boot_cpu) {
 		on_boot_cpu = 0;
-- 
2.1.0



* [PATCH 078/208] x86/fpu: Print supported xstate features in human readable way
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (76 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 077/208] x86/fpu: Improve FPU detection kernel messages Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 16:24 ` [PATCH 079/208] x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask' Ingo Molnar
  2015-05-05 17:14 ` [PATCH 000/208] big x86 FPU code rewrite Linus Torvalds
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

Inform the user/admin about which xstate features the kernel supports.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/xsave.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index bfe92f73bf86..f39882b7281b 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -482,6 +482,30 @@ static void __init setup_xstate_features(void)
 	} while (1);
 }
 
+static void print_xstate_feature(u64 xstate_mask, const char *desc)
+{
+	if (pcntxt_mask & xstate_mask) {
+		int xstate_feature = fls64(xstate_mask)-1;
+
+		pr_info("x86/fpu: Supporting XSAVE feature %2d: '%s'\n", xstate_feature, desc);
+	}
+}
+
+/*
+ * Print out all the supported xstate features:
+ */
+static void print_xstate_features(void)
+{
+	print_xstate_feature(XSTATE_FP,		"x87 floating point registers");
+	print_xstate_feature(XSTATE_SSE,	"SSE registers");
+	print_xstate_feature(XSTATE_YMM,	"AVX registers");
+	print_xstate_feature(XSTATE_BNDREGS,	"MPX bounds registers");
+	print_xstate_feature(XSTATE_BNDCSR,	"MPX CSR");
+	print_xstate_feature(XSTATE_OPMASK,	"AVX-512 opmask");
+	print_xstate_feature(XSTATE_ZMM_Hi256,	"AVX-512 Hi256");
+	print_xstate_feature(XSTATE_Hi16_ZMM,	"AVX-512 ZMM_Hi256");
+}
+
 /*
  * This function sets up offsets and sizes of all extended states in
  * xsave area. This supports both standard format and compacted format
@@ -545,6 +569,7 @@ static void __init setup_init_fpu_buf(void)
 		return;
 
 	setup_xstate_features();
+	print_xstate_features();
 
 	if (cpu_has_xsaves) {
 		init_xstate_buf->xsave_hdr.xcomp_bv =
-- 
2.1.0



* [PATCH 079/208] x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask'
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (77 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 078/208] x86/fpu: Print supported xstate features in human readable way Ingo Molnar
@ 2015-05-05 16:24 ` Ingo Molnar
  2015-05-05 17:14 ` [PATCH 000/208] big x86 FPU code rewrite Linus Torvalds
  79 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 16:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu,
	H. Peter Anvin, Linus Torvalds, Oleg Nesterov, Thomas Gleixner

So the 'pcntxt_mask' name is a misnomer: it's essentially meaningless to anyone
who doesn't already know what it does.

Name it more descriptively as 'xfeatures_mask'.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/xsave.h |  2 +-
 arch/x86/kernel/fpu/core.c       |  2 +-
 arch/x86/kernel/fpu/xsave.c      | 58 +++++++++++++++++++++++++++++-----------------------------
 arch/x86/power/cpu.c             |  4 ++--
 4 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xsave.h b/arch/x86/include/asm/fpu/xsave.h
index 7c90ea93c54e..400d5b2e42eb 100644
--- a/arch/x86/include/asm/fpu/xsave.h
+++ b/arch/x86/include/asm/fpu/xsave.h
@@ -45,7 +45,7 @@
 #endif
 
 extern unsigned int xstate_size;
-extern u64 pcntxt_mask;
+extern u64 xfeatures_mask;
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern struct xsave_struct *init_xstate_buf;
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3094b37b101e..7b98da7e1b55 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -528,7 +528,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	 * mxcsr reserved bits must be masked to zero for security reasons.
 	 */
 	xsave->i387.mxcsr &= mxcsr_feature_mask;
-	xsave->xsave_hdr.xstate_bv &= pcntxt_mask;
+	xsave->xsave_hdr.xstate_bv &= xfeatures_mask;
 	/*
 	 * These bits must be zero.
 	 */
diff --git a/arch/x86/kernel/fpu/xsave.c b/arch/x86/kernel/fpu/xsave.c
index f39882b7281b..c0e95538d689 100644
--- a/arch/x86/kernel/fpu/xsave.c
+++ b/arch/x86/kernel/fpu/xsave.c
@@ -13,9 +13,9 @@
 #include <asm/xcr.h>
 
 /*
- * Supported feature mask by the CPU and the kernel.
+ * Mask of xstate features supported by the CPU and the kernel:
  */
-u64 pcntxt_mask;
+u64 xfeatures_mask;
 
 /*
  * Represents init state for the supported extended state.
@@ -24,7 +24,7 @@ struct xsave_struct *init_xstate_buf;
 
 static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
 static unsigned int *xstate_offsets, *xstate_sizes;
-static unsigned int xstate_comp_offsets[sizeof(pcntxt_mask)*8];
+static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask)*8];
 static unsigned int xstate_features;
 
 /*
@@ -52,7 +52,7 @@ void __sanitize_i387_state(struct task_struct *tsk)
 	 * None of the feature bits are in init state. So nothing else
 	 * to do for us, as the memory layout is up to date.
 	 */
-	if ((xstate_bv & pcntxt_mask) == pcntxt_mask)
+	if ((xstate_bv & xfeatures_mask) == xfeatures_mask)
 		return;
 
 	/*
@@ -74,7 +74,7 @@ void __sanitize_i387_state(struct task_struct *tsk)
 	if (!(xstate_bv & XSTATE_SSE))
 		memset(&fx->xmm_space[0], 0, 256);
 
-	xstate_bv = (pcntxt_mask & ~xstate_bv) >> 2;
+	xstate_bv = (xfeatures_mask & ~xstate_bv) >> 2;
 
 	/*
 	 * Update all the other memory layouts for which the corresponding
@@ -291,7 +291,7 @@ sanitize_restored_xstate(struct task_struct *tsk,
 		if (fx_only)
 			xsave_hdr->xstate_bv = XSTATE_FPSSE;
 		else
-			xsave_hdr->xstate_bv &= (pcntxt_mask & xstate_bv);
+			xsave_hdr->xstate_bv &= (xfeatures_mask & xstate_bv);
 	}
 
 	if (use_fxsr()) {
@@ -312,11 +312,11 @@ static inline int restore_user_xstate(void __user *buf, u64 xbv, int fx_only)
 {
 	if (use_xsave()) {
 		if ((unsigned long)buf % 64 || fx_only) {
-			u64 init_bv = pcntxt_mask & ~XSTATE_FPSSE;
+			u64 init_bv = xfeatures_mask & ~XSTATE_FPSSE;
 			xrstor_state(init_xstate_buf, init_bv);
 			return fxrstor_user(buf);
 		} else {
-			u64 init_bv = pcntxt_mask & ~xbv;
+			u64 init_bv = xfeatures_mask & ~xbv;
 			if (unlikely(init_bv))
 				xrstor_state(init_xstate_buf, init_bv);
 			return xrestore_user(buf, xbv);
@@ -439,7 +439,7 @@ static void prepare_fx_sw_frame(void)
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
-	fx_sw_reserved.xstate_bv = pcntxt_mask;
+	fx_sw_reserved.xstate_bv = xfeatures_mask;
 	fx_sw_reserved.xstate_size = xstate_size;
 
 	if (config_enabled(CONFIG_IA32_EMULATION)) {
@@ -454,7 +454,7 @@ static void prepare_fx_sw_frame(void)
 static inline void xstate_enable(void)
 {
 	cr4_set_bits(X86_CR4_OSXSAVE);
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, pcntxt_mask);
+	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
 }
 
 /*
@@ -465,7 +465,7 @@ static void __init setup_xstate_features(void)
 {
 	int eax, ebx, ecx, edx, leaf = 0x2;
 
-	xstate_features = fls64(pcntxt_mask);
+	xstate_features = fls64(xfeatures_mask);
 	xstate_offsets = alloc_bootmem(xstate_features * sizeof(int));
 	xstate_sizes = alloc_bootmem(xstate_features * sizeof(int));
 
@@ -484,7 +484,7 @@ static void __init setup_xstate_features(void)
 
 static void print_xstate_feature(u64 xstate_mask, const char *desc)
 {
-	if (pcntxt_mask & xstate_mask) {
+	if (xfeatures_mask & xstate_mask) {
 		int xstate_feature = fls64(xstate_mask)-1;
 
 		pr_info("x86/fpu: Supporting XSAVE feature %2d: '%s'\n", xstate_feature, desc);
@@ -516,7 +516,7 @@ static void print_xstate_features(void)
  */
 void setup_xstate_comp(void)
 {
-	unsigned int xstate_comp_sizes[sizeof(pcntxt_mask)*8];
+	unsigned int xstate_comp_sizes[sizeof(xfeatures_mask)*8];
 	int i;
 
 	/*
@@ -529,7 +529,7 @@ void setup_xstate_comp(void)
 
 	if (!cpu_has_xsaves) {
 		for (i = 2; i < xstate_features; i++) {
-			if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
+			if (test_bit(i, (unsigned long *)&xfeatures_mask)) {
 				xstate_comp_offsets[i] = xstate_offsets[i];
 				xstate_comp_sizes[i] = xstate_sizes[i];
 			}
@@ -540,7 +540,7 @@ void setup_xstate_comp(void)
 	xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
 	for (i = 2; i < xstate_features; i++) {
-		if (test_bit(i, (unsigned long *)&pcntxt_mask))
+		if (test_bit(i, (unsigned long *)&xfeatures_mask))
 			xstate_comp_sizes[i] = xstate_sizes[i];
 		else
 			xstate_comp_sizes[i] = 0;
@@ -573,8 +573,8 @@ static void __init setup_init_fpu_buf(void)
 
 	if (cpu_has_xsaves) {
 		init_xstate_buf->xsave_hdr.xcomp_bv =
-						(u64)1 << 63 | pcntxt_mask;
-		init_xstate_buf->xsave_hdr.xstate_bv = pcntxt_mask;
+						(u64)1 << 63 | xfeatures_mask;
+		init_xstate_buf->xsave_hdr.xstate_bv = xfeatures_mask;
 	}
 
 	/*
@@ -604,7 +604,7 @@ __setup("eagerfpu=", eager_fpu_setup);
 
 
 /*
- * Calculate total size of enabled xstates in XCR0/pcntxt_mask.
+ * Calculate total size of enabled xstates in XCR0/xfeatures_mask.
  */
 static void __init init_xstate_size(void)
 {
@@ -619,7 +619,7 @@ static void __init init_xstate_size(void)
 
 	xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 	for (i = 2; i < 64; i++) {
-		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
+		if (test_bit(i, (unsigned long *)&xfeatures_mask)) {
 			cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
 			xstate_size += eax;
 		}
@@ -642,17 +642,17 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 	}
 
 	cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
-	pcntxt_mask = eax + ((u64)edx << 32);
+	xfeatures_mask = eax + ((u64)edx << 32);
 
-	if ((pcntxt_mask & XSTATE_FPSSE) != XSTATE_FPSSE) {
-		pr_err("x86/fpu: FP/SSE not present amongst the CPU's xstate features: 0x%llx.\n", pcntxt_mask);
+	if ((xfeatures_mask & XSTATE_FPSSE) != XSTATE_FPSSE) {
+		pr_err("x86/fpu: FP/SSE not present amongst the CPU's xstate features: 0x%llx.\n", xfeatures_mask);
 		BUG();
 	}
 
 	/*
 	 * Support only the state known to OS.
 	 */
-	pcntxt_mask = pcntxt_mask & XCNTXT_MASK;
+	xfeatures_mask = xfeatures_mask & XCNTXT_MASK;
 
 	xstate_enable();
 
@@ -661,7 +661,7 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 	 */
 	init_xstate_size();
 
-	update_regset_xstate_info(xstate_size, pcntxt_mask);
+	update_regset_xstate_info(xstate_size, xfeatures_mask);
 	prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 
@@ -669,18 +669,18 @@ static void /* __init */ xstate_enable_boot_cpu(void)
 	if (cpu_has_xsaveopt && eagerfpu != DISABLE)
 		eagerfpu = ENABLE;
 
-	if (pcntxt_mask & XSTATE_EAGER) {
+	if (xfeatures_mask & XSTATE_EAGER) {
 		if (eagerfpu == DISABLE) {
 			pr_err("x86/fpu: eagerfpu switching disabled, disabling the following xstate features: 0x%llx.\n",
-			       pcntxt_mask & XSTATE_EAGER);
-			pcntxt_mask &= ~XSTATE_EAGER;
+			       xfeatures_mask & XSTATE_EAGER);
+			xfeatures_mask &= ~XSTATE_EAGER;
 		} else {
 			eagerfpu = ENABLE;
 		}
 	}
 
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is 0x%x bytes, using '%s' format.\n",
-		pcntxt_mask,
+		xfeatures_mask,
 		xstate_size,
 		cpu_has_xsaves ? "compacted" : "standard");
 }
@@ -749,7 +749,7 @@ void __init_refok eager_fpu_init(void)
 void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
 {
 	int feature = fls64(xstate) - 1;
-	if (!test_bit(feature, (unsigned long *)&pcntxt_mask))
+	if (!test_bit(feature, (unsigned long *)&xfeatures_mask))
 		return NULL;
 
 	return (void *)xsave + xstate_comp_offsets[feature];
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index edaf934c749e..62054acbd0d8 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -21,7 +21,7 @@
 #include <asm/xcr.h>
 #include <asm/suspend.h>
 #include <asm/debugreg.h>
-#include <asm/fpu/internal.h> /* pcntxt_mask */
+#include <asm/fpu/internal.h> /* xfeatures_mask */
 #include <asm/cpu.h>
 
 #ifdef CONFIG_X86_32
@@ -225,7 +225,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
 	 * restore XCR0 for xsave capable cpu's.
 	 */
 	if (cpu_has_xsave)
-		xsetbv(XCR_XFEATURE_ENABLED_MASK, pcntxt_mask);
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
 
 	fix_processor_context();
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* Re: [PATCH 000/208] big x86 FPU code rewrite
  2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
                   ` (78 preceding siblings ...)
  2015-05-05 16:24 ` [PATCH 079/208] x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask' Ingo Molnar
@ 2015-05-05 17:14 ` Linus Torvalds
  2015-05-05 17:50   ` Ingo Molnar
  79 siblings, 1 reply; 85+ messages in thread
From: Linus Torvalds @ 2015-05-05 17:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linux Kernel Mailing List, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H. Peter Anvin, Oleg Nesterov,
	Thomas Gleixner

On Tue, May 5, 2015 at 9:23 AM, Ingo Molnar <mingo@kernel.org> wrote:
>  83 files changed, 3742 insertions(+), 2841 deletions(-)

How much of this is just the added instrumentation? Because that's
almost a thousand new lines, which makes me unhappy. The *last* thing
we want is to make this thing bigger. I'm not convinced it's worth it
adding some performance debug code that doesn't really add any new
information, and could be done outside the kernel as just an
independent module instead.

                       Linus

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 000/208] big x86 FPU code rewrite
  2015-05-05 17:14 ` [PATCH 000/208] big x86 FPU code rewrite Linus Torvalds
@ 2015-05-05 17:50   ` Ingo Molnar
  2015-07-17 23:52     ` Andy Lutomirski
  0 siblings, 1 reply; 85+ messages in thread
From: Ingo Molnar @ 2015-05-05 17:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H. Peter Anvin, Oleg Nesterov,
	Thomas Gleixner


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, May 5, 2015 at 9:23 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >  83 files changed, 3742 insertions(+), 2841 deletions(-)
> 
> How much of this is just the added instrumentation? [...]

Half of it is that, plus a lot of comments.

> [...] Because that's almost a thousand new lines, which makes me 
> unhappy. The *last* thing we want is to make this thing bigger. 
> [...]

So Boris suggested that I should move fpu/measure.c out of the FPU 
code anyway, which is fair enough, as it measures a lot of other 
low-level details as well. Consider it done.

With that taken out, the diffstat comes down to:

   81 files changed, 3409 insertions(+), 3055 deletions(-)

That's mostly 400 new lines of comments all around the FPU code, plus 
a bit of extra headers due to the split-up modules (50-100 lines 
maybe).

> [...] I'm not convinced it's worth it adding some performance debug 
> code that doesn't really add any new information, and could be done 
> outside the kernel as just an independent module instead.

Code size difference (with debugging off) on an x86-64 defconfig-ish 
kernel:

        text      data    bss     filename

    15030376   2574976   1634304 vmlinux.before
    15023690   2578648   1634304 vmlinux.after

The runtime size of the kernel got smaller by 7K.

Considering that arch/x86/kernel/fpu/built-in.o is only 13K, that's 
quite significant.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
  2015-05-05 16:24 ` [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active Ingo Molnar
@ 2015-05-06  0:51   ` Andy Lutomirski
  2015-05-06  3:24     ` Ingo Molnar
  0 siblings, 1 reply; 85+ messages in thread
From: Andy Lutomirski @ 2015-05-06  0:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, Fenghua Yu, Thomas Gleixner, Dave Hansen,
	Linus Torvalds, Oleg Nesterov, H. Peter Anvin, linux-kernel

On May 5, 2015 9:59 PM, "Ingo Molnar" <mingo@kernel.org> wrote:
>
> Introduce a simple fpu->fpstate_active flag in the fpu context data structure
> and use that instead of PF_USED_MATH in task->flags.
>
> Testing for this flag byte should be slightly more efficient than
> testing a bit in a bitmask, but the main advantage is that most
> FPU functions can now be performed on a 'struct fpu' alone, they
> don't need access to 'struct task_struct' anymore.
>
> There's a slight linecount increase, mostly due to the 'fpu' local
> variables and due to extra comments. The local variables will go away
> once we move most of the FPU methods to pure 'struct fpu' parameters.
>
> Reviewed-by: Borislav Petkov <bp@alien8.de>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/ia32/ia32_signal.c         |  3 ++-
>  arch/x86/include/asm/fpu-internal.h |  4 ++--
>  arch/x86/include/asm/fpu/types.h    |  6 ++++++
>  arch/x86/include/asm/processor.h    |  6 ++++--
>  arch/x86/kernel/fpu/core.c          | 38 +++++++++++++++++++++++++-------------
>  arch/x86/kernel/fpu/xsave.c         | 11 ++++++-----
>  arch/x86/kernel/signal.c            |  8 +++++---
>  arch/x86/kvm/x86.c                  |  3 ++-
>  arch/x86/math-emu/fpu_entry.c       |  3 ++-
>  9 files changed, 54 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
> index bffb2c49ceb6..e1ec6f90d09e 100644
> --- a/arch/x86/ia32/ia32_signal.c
> +++ b/arch/x86/ia32/ia32_signal.c
> @@ -307,6 +307,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
>                                  size_t frame_size,
>                                  void __user **fpstate)
>  {
> +       struct fpu *fpu = &current->thread.fpu;
>         unsigned long sp;
>
>         /* Default to using normal stack */
> @@ -321,7 +322,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
>                  ksig->ka.sa.sa_restorer)
>                 sp = (unsigned long) ksig->ka.sa.sa_restorer;
>
> -       if (current->flags & PF_USED_MATH) {
> +       if (fpu->fpstate_active) {
>                 unsigned long fx_aligned, math_size;
>
>                 sp = alloc_mathframe(sp, 1, &fx_aligned, &math_size);
> diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
> index 2cac49e3b4bd..9311126571ab 100644
> --- a/arch/x86/include/asm/fpu-internal.h
> +++ b/arch/x86/include/asm/fpu-internal.h
> @@ -375,7 +375,7 @@ static inline void drop_fpu(struct task_struct *tsk)
>                 __thread_fpu_end(fpu);
>         }
>
> -       tsk->flags &= ~PF_USED_MATH;
> +       fpu->fpstate_active = 0;
>
>         preempt_enable();
>  }
> @@ -424,7 +424,7 @@ static inline fpu_switch_t switch_fpu_prepare(struct task_struct *old, struct ta
>          * If the task has used the math, pre-load the FPU on xsave processors
>          * or if the past 5 consecutive context-switches used math.
>          */
> -       fpu.preload = (new->flags & PF_USED_MATH) &&
> +       fpu.preload = new_fpu->fpstate_active &&
>                       (use_eager_fpu() || new->thread.fpu.counter > 5);
>
>         if (old_fpu->has_fpu) {
> diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> index efb520dcf38e..f6317d9aa808 100644
> --- a/arch/x86/include/asm/fpu/types.h
> +++ b/arch/x86/include/asm/fpu/types.h
> @@ -137,6 +137,12 @@ struct fpu {
>          * deal with bursty apps that only use the FPU for a short time:
>          */
>         unsigned char                   counter;
> +       /*
> +        * This flag indicates whether this context is fpstate_active: if the task is
> +        * not running then we can restore from this context, if the task
> +        * is running then we should save into this context.
> +        */
> +       unsigned char                   fpstate_active;

I don't understand.  What does it mean if !fpstate_active?

--Andy

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
  2015-05-06  0:51   ` Andy Lutomirski
@ 2015-05-06  3:24     ` Ingo Molnar
  0 siblings, 0 replies; 85+ messages in thread
From: Ingo Molnar @ 2015-05-06  3:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Fenghua Yu, Thomas Gleixner, Dave Hansen,
	Linus Torvalds, Oleg Nesterov, H. Peter Anvin, linux-kernel


* Andy Lutomirski <luto@amacapital.net> wrote:

> > diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> > index efb520dcf38e..f6317d9aa808 100644
> > --- a/arch/x86/include/asm/fpu/types.h
> > +++ b/arch/x86/include/asm/fpu/types.h
> > @@ -137,6 +137,12 @@ struct fpu {
> >          * deal with bursty apps that only use the FPU for a short time:
> >          */
> >         unsigned char                   counter;
> > +       /*
> > +        * This flag indicates whether this context is fpstate_active: if the task is
> > +        * not running then we can restore from this context, if the task
> > +        * is running then we should save into this context.
> > +        */
> > +       unsigned char                   fpstate_active;
> 
> I don't understand.  What does it mean if !fpstate_active?

Yeah, so this was just the 'simple' migration patch from PF_USED_MATH 
to ->fpstate_active.

At the end of the series those fields get more love and a more 
detailed explanation:

        /*
         * @fpstate_active:
         *
         * This flag indicates whether this context is active: if the task
         * is not running then we can restore from this context, if the task
         * is running then we should save into this context.
         */
        unsigned char                   fpstate_active;

and its interaction with fpregs_active is explained as well:

        /*
         * @fpregs_active:
         *
         * This flag determines whether a given context is actively
         * loaded into the FPU's registers and that those registers
         * represent the task's current FPU state.
         *
         * Note the interaction with fpstate_active:
         *
         *   # task does not use the FPU:
         *   fpstate_active == 0
         *
         *   # task uses the FPU and regs are active:
         *   fpstate_active == 1 && fpregs_active == 1
         *
         *   # the regs are inactive but still match fpstate:
         *   fpstate_active == 1 && fpregs_active == 0 && fpregs_owner == fpu
         *
         * The third state is what we use for the lazy restore optimization
         * on lazy-switching CPUs.
         */
        unsigned char                   fpregs_active;

Basically the 'fpstate' is the in-memory FPU state, and if it's 
active, it means it can be copied to (saved to) and copied from 
(restored from). Whether this fpstate is the currently representative 
FPU state depends on the other state flag(s), as described.

Maybe I should have broken out the fourth state as well:

         *   # the fpstate holds all of a task's FPU state:
         *   fpstate_active == 1 && fpregs_active == 0 && fpregs_owner != fpu

?
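
Btw., to make those combinations a bit more concrete, here is a purely
illustrative sketch (not code from the series) that classifies a
context using the field names above plus the per-CPU
fpu_fpregs_owner_ctx pointer - classify_fpu_ctx() and the enum are
hypothetical names, made up just for this mail:

        /* Illustrative only: maps the flag combinations to the states above. */
        enum fpu_ctx_state {
                FPU_CTX_UNUSED,         /* task does not use the FPU */
                FPU_CTX_REGS_ACTIVE,    /* fpregs hold the task's current FPU state */
                FPU_CTX_LAZY_MATCH,     /* regs inactive but still match fpstate */
                FPU_CTX_MEMORY_ONLY,    /* fpstate holds all of the task's FPU state */
        };

        static enum fpu_ctx_state classify_fpu_ctx(struct fpu *fpu)
        {
                if (!fpu->fpstate_active)
                        return FPU_CTX_UNUSED;
                if (fpu->fpregs_active)
                        return FPU_CTX_REGS_ACTIVE;
                if (this_cpu_read(fpu_fpregs_owner_ctx) == fpu)
                        return FPU_CTX_LAZY_MATCH;
                return FPU_CTX_MEMORY_ONLY;
        }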

active/inactive was one idiom that I felt worked pretty well - but I 
considered others as well:

  - dirty/clean (didn't work so well and too MM-ish)
  - valid/invalid (likewise)
  - used/unused (yuck)

Note:

  There's a fifth valid state as well, but I did not want
  to complicate the description even more: kernel_fpu_begin()/end()
  users create this state with its own private in_kernel_fpu flag, in
  that they use FPU registers but don't touch these (user-)flags.
  kernel_fpu_begin()/end() is atomic and (beyond zapping pending lazy
  restore state in fpu_fpregs_owner_ctx) it restores the FPU to the
  previous state, so it's pretty orthogonal as far as the other states
  are concerned.
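
  ( To illustrate the usage pattern I mean, here is a minimal sketch -
    not from this series - of a typical in-kernel user of that API;
    do_simd_work() is a made-up name: )

        #include <asm/fpu/api.h>

        static void do_simd_work(void)
        {
                /*
                 * kernel_fpu_begin() disables preemption and saves any
                 * live user FPU state, so between begin and end we may
                 * use the FPU/SSE registers freely (but must not sleep).
                 */
                kernel_fpu_begin();

                /* ... SSE/AVX computation goes here ... */

                kernel_fpu_end();
        }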

Note2:

  I also considered renaming kernel_fpu_begin()/end() to the new 
  nomenclature, but it has a good name and I did not want too much
  churn with a well-established API, which also mirrors
  user_fpu_begin() conceptually. I also couldn't find a better name:
  maybe fpu__kernel_save()/restore(), but that felt a bit strained.

Does this make things clearer? I can work on it some more if I got it 
wrong or if the text is confusing somewhere - this is crucial IMHO.

Instead of binary states we could also unify them into a single state 
variable - didn't find any really convincing naming concept for that 
though, mostly because I think those states are fundamentally 
separate, just interrelated.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 000/208] big x86 FPU code rewrite
  2015-05-05 17:50   ` Ingo Molnar
@ 2015-07-17 23:52     ` Andy Lutomirski
  0 siblings, 0 replies; 85+ messages in thread
From: Andy Lutomirski @ 2015-07-17 23:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Linux Kernel Mailing List, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H. Peter Anvin, Oleg Nesterov,
	Thomas Gleixner

On Tue, May 5, 2015 at 10:50 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> On Tue, May 5, 2015 at 9:23 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> >  83 files changed, 3742 insertions(+), 2841 deletions(-)
>>
>> How much of this is just the added instrumentation? [...]
>
> Half of it is that, plus a lot of comments.
>
>> [...] Because that's almost a thousand new lines, which makes me
>> unhappy. The *last* thing we want is to make this thing bigger.
>> [...]
>
> So Boris suggested that I should move fpu/measure.c out of the FPU
> code anyway, which is fair enough, as it measures a lot of other low
> level details as well. Consider it done.

Where did the measurement code go?  Regardless of where it lives, I liked it.

--Andy

^ permalink raw reply	[flat|nested] 85+ messages in thread

end of thread, other threads:[~2015-07-17 23:52 UTC | newest]

Thread overview: 85+ messages
2015-05-05 16:23 [PATCH 000/208] big x86 FPU code rewrite Ingo Molnar
2015-05-05 16:23 ` [PATCH 001/208] x86/fpu: Rename unlazy_fpu() to fpu__save() Ingo Molnar
2015-05-05 16:23 ` [PATCH 002/208] x86/fpu: Add comments to fpu__save() and restrict its export Ingo Molnar
2015-05-05 16:23 ` [PATCH 003/208] x86/fpu: Add debugging check to fpu__save() Ingo Molnar
2015-05-05 16:23 ` [PATCH 004/208] x86/fpu: Rename fpu_detect() to fpu__detect() Ingo Molnar
2015-05-05 16:23 ` [PATCH 005/208] x86/fpu: Remove stale init_fpu() prototype Ingo Molnar
2015-05-05 16:23 ` [PATCH 006/208] x86/fpu: Split an fpstate_alloc_init() function out of init_fpu() Ingo Molnar
2015-05-05 16:23 ` [PATCH 007/208] x86/fpu: Make init_fpu() static Ingo Molnar
2015-05-05 16:23 ` [PATCH 008/208] x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check Ingo Molnar
2015-05-05 16:23 ` [PATCH 009/208] x86/fpu: Optimize fpu__unlazy_stopped() Ingo Molnar
2015-05-05 16:23 ` [PATCH 010/208] x86/fpu: Simplify fpu__unlazy_stopped() Ingo Molnar
2015-05-05 16:23 ` [PATCH 011/208] x86/fpu: Remove fpu_allocated() Ingo Molnar
2015-05-05 16:23 ` [PATCH 012/208] x86/fpu: Move fpu_alloc() out of line Ingo Molnar
2015-05-05 16:23 ` [PATCH 013/208] x86/fpu: Rename fpu_alloc() to fpstate_alloc() Ingo Molnar
2015-05-05 16:23 ` [PATCH 014/208] x86/fpu: Rename fpu_free() to fpstate_free() Ingo Molnar
2015-05-05 16:23 ` [PATCH 015/208] x86/fpu: Rename fpu_finit() to fpstate_init() Ingo Molnar
2015-05-05 16:23 ` [PATCH 016/208] x86/fpu: Rename fpu_init() to fpu__cpu_init() Ingo Molnar
2015-05-05 16:23 ` [PATCH 017/208] x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size() Ingo Molnar
2015-05-05 16:23 ` [PATCH 018/208] x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter Ingo Molnar
2015-05-05 16:23 ` [PATCH 019/208] x86/fpu: Improve the comment for the fpu::counter field Ingo Molnar
2015-05-05 16:24 ` [PATCH 020/208] x86/fpu: Move FPU data structures to asm/fpu_types.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 021/208] x86/fpu: Clean up asm/fpu/types.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 022/208] x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/ Ingo Molnar
2015-05-05 16:24 ` [PATCH 023/208] x86/fpu: Fix header file dependencies of fpu-internal.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 024/208] x86/fpu: Split out the boot time FPU init code into fpu/init.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 025/208] x86/fpu: Remove unnecessary includes from core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 026/208] x86/fpu: Move the no_387 handling and FPU detection code into init.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 027/208] x86/fpu: Remove the free_thread_xstate() complication Ingo Molnar
2015-05-05 16:24 ` [PATCH 028/208] x86/fpu: Factor out fpu__flush_thread() from flush_thread() Ingo Molnar
2015-05-05 16:24 ` [PATCH 029/208] x86/fpu: Move math_state_restore() to fpu/core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 030/208] x86/fpu: Rename math_state_restore() to fpu__restore() Ingo Molnar
2015-05-05 16:24 ` [PATCH 031/208] x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs() Ingo Molnar
2015-05-05 16:24 ` [PATCH 032/208] x86/fpu: Simplify the xsave_state*() methods Ingo Molnar
2015-05-05 16:24 ` [PATCH 033/208] x86/fpu: Remove fpu_xsave() Ingo Molnar
2015-05-05 16:24 ` [PATCH 034/208] x86/fpu: Move task_xstate_cachep handling to core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 035/208] x86/fpu: Factor out fpu__copy() Ingo Molnar
2015-05-05 16:24 ` [PATCH 036/208] x86/fpu: Uninline fpstate_free() and move it next to the allocation function Ingo Molnar
2015-05-05 16:24 ` [PATCH 037/208] x86/fpu: Make task_xstate_cachep static Ingo Molnar
2015-05-05 16:24 ` [PATCH 038/208] x86/fpu: Make kernel_fpu_disable/enable() static Ingo Molnar
2015-05-05 16:24 ` [PATCH 039/208] x86/fpu: Add debug check to kernel_fpu_disable() Ingo Molnar
2015-05-05 16:24 ` [PATCH 040/208] x86/fpu: Add kernel_fpu_disabled() Ingo Molnar
2015-05-05 16:24 ` [PATCH 041/208] x86/fpu: Remove __save_init_fpu() Ingo Molnar
2015-05-05 16:24 ` [PATCH 042/208] x86/fpu: Move fpu_copy() to fpu/core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 043/208] x86/fpu: Add debugging check to fpu_copy() Ingo Molnar
2015-05-05 16:24 ` [PATCH 044/208] x86/fpu: Print out whether we are doing lazy/eager FPU context switches Ingo Molnar
2015-05-05 16:24 ` [PATCH 045/208] x86/fpu: Eliminate the __thread_has_fpu() wrapper Ingo Molnar
2015-05-05 16:24 ` [PATCH 046/208] x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter Ingo Molnar
2015-05-05 16:24 ` [PATCH 047/208] x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 048/208] x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx Ingo Molnar
2015-05-05 16:24 ` [PATCH 049/208] x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu() Ingo Molnar
2015-05-05 16:24 ` [PATCH 050/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end() Ingo Molnar
2015-05-05 16:24 ` [PATCH 051/208] x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin() Ingo Molnar
2015-05-05 16:24 ` [PATCH 052/208] x86/fpu: Open code PF_USED_MATH usages Ingo Molnar
2015-05-05 16:24 ` [PATCH 053/208] x86/fpu: Document fpu__unlazy_stopped() Ingo Molnar
2015-05-05 16:24 ` [PATCH 054/208] x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active Ingo Molnar
2015-05-06  0:51   ` Andy Lutomirski
2015-05-06  3:24     ` Ingo Molnar
2015-05-05 16:24 ` [PATCH 055/208] x86/fpu: Remove 'struct task_struct' usage from drop_fpu() Ingo Molnar
2015-05-05 16:24 ` [PATCH 056/208] x86/fpu: Remove task_disable_lazy_fpu_restore() Ingo Molnar
2015-05-05 16:24 ` [PATCH 057/208] x86/fpu: Use 'struct fpu' in fpu_lazy_restore() Ingo Molnar
2015-05-05 16:24 ` [PATCH 058/208] x86/fpu: Use 'struct fpu' in restore_fpu_checking() Ingo Molnar
2015-05-05 16:24 ` [PATCH 059/208] x86/fpu: Use 'struct fpu' in fpu_reset_state() Ingo Molnar
2015-05-05 16:24 ` [PATCH 060/208] x86/fpu: Use 'struct fpu' in switch_fpu_prepare() Ingo Molnar
2015-05-05 16:24 ` [PATCH 061/208] x86/fpu: Use 'struct fpu' in switch_fpu_finish() Ingo Molnar
2015-05-05 16:24 ` [PATCH 062/208] x86/fpu: Move __save_fpu() into fpu/core.c Ingo Molnar
2015-05-05 16:24 ` [PATCH 063/208] x86/fpu: Use 'struct fpu' in __fpu_save() Ingo Molnar
2015-05-05 16:24 ` [PATCH 064/208] x86/fpu: Use 'struct fpu' in fpu__save() Ingo Molnar
2015-05-05 16:24 ` [PATCH 065/208] x86/fpu: Use 'struct fpu' in fpu_copy() Ingo Molnar
2015-05-05 16:24 ` [PATCH 066/208] x86/fpu: Use 'struct fpu' in fpu__copy() Ingo Molnar
2015-05-05 16:24 ` [PATCH 067/208] x86/fpu: Use 'struct fpu' in fpstate_alloc_init() Ingo Molnar
2015-05-05 16:24 ` [PATCH 068/208] x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped() Ingo Molnar
2015-05-05 16:24 ` [PATCH 069/208] x86/fpu: Rename fpu__flush_thread() to fpu__clear() Ingo Molnar
2015-05-05 16:24 ` [PATCH 070/208] x86/fpu: Clean up fpu__clear() a bit Ingo Molnar
2015-05-05 16:24 ` [PATCH 071/208] x86/fpu: Rename i387.h to fpu/api.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 072/208] x86/fpu: Move xsave.h to fpu/xsave.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 073/208] x86/fpu: Rename fpu-internal.h to fpu/internal.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 074/208] x86/fpu: Move MXCSR_DEFAULT " Ingo Molnar
2015-05-05 16:24 ` [PATCH 075/208] x86/fpu: Remove xsave_init() __init obfuscation Ingo Molnar
2015-05-05 16:24 ` [PATCH 076/208] x86/fpu: Remove assembly guard from asm/fpu/api.h Ingo Molnar
2015-05-05 16:24 ` [PATCH 077/208] x86/fpu: Improve FPU detection kernel messages Ingo Molnar
2015-05-05 16:24 ` [PATCH 078/208] x86/fpu: Print supported xstate features in human readable way Ingo Molnar
2015-05-05 16:24 ` [PATCH 079/208] x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask' Ingo Molnar
2015-05-05 17:14 ` [PATCH 000/208] big x86 FPU code rewrite Linus Torvalds
2015-05-05 17:50   ` Ingo Molnar
2015-07-17 23:52     ` Andy Lutomirski
