* [PATCH 0/2] More i387 state save/restore work
From: Linus Torvalds @ 2012-02-19 22:23 UTC
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin
  Cc: x86, Linux Kernel Mailing List


Ok, this is a series of two patches that continue my i387 state 
save/restore series, but aren't necessarily worth it for Linux-3.3.

That said, the first one is a bug-fix - but it's an old bug, and I'm not 
sure it can actually be triggered. The failure path for the FP state 
preload is bogus - and always was. But I'm not sure it really *can* fail.

The first one has another small bugfix in it too, and I think that one may 
be new to the rewritten FP state preloading - it doesn't update the 
fpu_counter, so once it starts preloading, it never stops.

I wrote a silly FPU task switch testing program, which basically starts 
two processes pinned to the same CPU, and then uses sched_yield() in both 
to switch back-and-forth between them. *One* of the processes uses the FPU 
between every yield, the other does not. It runs for two seconds, and 
counts how many loops it gets through.
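The test program itself is not included in this mail. A minimal sketch of
that kind of test might look like the following; it is a reconstruction
under assumptions, not the actual program. The names fpu_switch_bench(),
pin_to_one_cpu() and now_seconds() are made up for illustration, and the
timing via clock_gettime() is a guess at how the run is bounded.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Pin the caller to the first CPU it is allowed to run on. */
static int pin_to_one_cpu(void)
{
    cpu_set_t allowed, one;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof(one), &one);
        }
    }
    return -1;
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: run the yield ping-pong for 'seconds' and return
 * how many loops the FPU-using process completed, or -1 on error.
 * The caller stays pinned to one CPU after this returns.
 */
long fpu_switch_bench(double seconds)
{
    assert(seconds > 0.0);

    if (pin_to_one_cpu() != 0)
        return -1;

    /* The child inherits the affinity mask, so both share one CPU. */
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Non-FPU process: only yields, until the parent kills it. */
        for (;;)
            sched_yield();
    }

    /* FPU process: a little floating-point work between every yield. */
    volatile double x = 1.0;
    long loops = 0;
    double deadline = now_seconds() + seconds;

    while (now_seconds() < deadline) {
        x = x * 1.000001 + 0.5;
        sched_yield();
        loops++;
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return loops;
}
```

Calling fpu_switch_bench(2.0) from main() and printing the result would
give a "N loops in 2 seconds" line. The absolute numbers will not be
comparable to the ones in this mail, since they depend on the CPU and on
how the running kernel handles FPU state.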

With that test, I get:

 - Plain 3.3-rc4:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2216090 loops in 2 seconds
   2216922 loops in 2 seconds
   2217148 loops in 2 seconds
   2232191 loops in 2 seconds
   2186203 loops in 2 seconds
   2231614 loops in 2 seconds

 - With the first patch that fixes the FPU preloading to eventually stop:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4-00001-g704ed737bd3c
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2306667 loops in 2 seconds
   2295760 loops in 2 seconds
   2295494 loops in 2 seconds
   2296282 loops in 2 seconds
   2282229 loops in 2 seconds
   2301842 loops in 2 seconds

 - With the second patch that does the lazy preloading:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4-00002-g022899d937f9
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2466973 loops in 2 seconds
   2456168 loops in 2 seconds
   2449863 loops in 2 seconds
   2461588 loops in 2 seconds
   2478256 loops in 2 seconds
   2476844 loops in 2 seconds

so these things do make some difference. But it is also interesting to see 
from profiles just how expensive setting CR0.TS is (the write to CR0 is 
very expensive indeed), so even when you avoid the FP state restore 
lazily, just setting TS in between task switches is still a big cost of 
FPU save/restore.
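To put a rough number on "some difference", the six runs in each group
above can be averaged and compared against plain 3.3-rc4. The sketch below
does only that arithmetic; the function and array names are made up for
illustration. It works out to roughly 3.6% more loops with the first patch
and roughly 11.2% more with both.

```c
#include <assert.h>
#include <stddef.h>

#define RUNS 6

/* Loop counts copied from the three result groups in this mail. */
static const double plain_rc4[RUNS] = {
    2216090, 2216922, 2217148, 2232191, 2186203, 2231614
};
static const double first_patch[RUNS] = {
    2306667, 2295760, 2295494, 2296282, 2282229, 2301842
};
static const double lazy_patch[RUNS] = {
    2466973, 2456168, 2449863, 2461588, 2478256, 2476844
};

/* Arithmetic mean of n loop counts. */
double mean_loops(const double *runs, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += runs[i];
    return sum / (double)n;
}

/* Relative gain of 'patched' over 'baseline', in percent. */
double percent_gain(double baseline, double patched)
{
    return (patched - baseline) / baseline * 100.0;
}

/* Gain of the first patch over plain 3.3-rc4. */
double first_patch_gain(void)
{
    return percent_gain(mean_loops(plain_rc4, RUNS),
                        mean_loops(first_patch, RUNS));
}

/* Gain of both patches (lazy restore) over plain 3.3-rc4. */
double lazy_patch_gain(void)
{
    return percent_gain(mean_loops(plain_rc4, RUNS),
                        mean_loops(lazy_patch, RUNS));
}
```

Six runs per kernel is a small sample, so these percentages are best read
as an indication of the trend rather than a precise measurement.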

Linus Torvalds (2):
  i387: use 'restore_fpu_checking()' directly in task switching code
  i387: support lazy restore of FPU state

 arch/x86/include/asm/i387.h      |   48 +++++++++++++++++++++++++++----------
 arch/x86/include/asm/processor.h |    3 +-
 arch/x86/kernel/cpu/common.c     |    2 +
 arch/x86/kernel/process_32.c     |    2 +-
 arch/x86/kernel/process_64.c     |    2 +-
 arch/x86/kernel/traps.c          |   40 ++++++-------------------------
 6 files changed, 49 insertions(+), 48 deletions(-)

Comments? I feel confident enough about these that I think they might even
work in 3.3, especially the first one. But I want people to look at
them.

                     Linus

-- 
1.7.9.188.g12766.dirty

