linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Michael Neuling <mikey@neuling.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [RFC/PATCH 0/7] powerpc: Implement lazy save of FP, VMX and VSX state in SMP
Date: Tue, 07 Dec 2010 10:40:43 +1100	[thread overview]
Message-ID: <20101206234043.083045003@neuling.org> (raw)

This implements lazy save of FP, VMX and VSX state on SMP 64bit and 32
bit powerpc.

Currently we only do lazy save in UP, but this patch set extends this to
SMP.  We always do lazy restore.

For VMX, on a context switch we do the following:
 - if we are switching to a CPU that currently holds the new processes
   state, just turn on VMX in the MSR (this is the lazy/quick case)
 - if the new processes state is in the thread_struct, turn VMX off.
 - if the new processes state is in someone else's CPU, IPI that CPU to
   giveup it's state and turn VMX off in the MSR (slow IPI case).
We always start the new process at this point, irrespective of if we
have the state or not in the thread struct or current CPU.  

So in the slow case, we attempt to avoid the IPI latency by starting
the process immediately and only waiting for the state to be flushed
when the process actually needs VMX.  ie. when we take the VMX
unavailable exception after the context switch.

FP is implemented in a similar way.  VSX reuses the FP and VMX code as
it doesn't have any additional state over what FP and VMX used.

I've been benchmarking with Anton Blanchard's context_switch.c benchmark
found here: 
  http://ozlabs.org/~anton/junkcode/context_switch.c 
Using this benchmark as is gives no degradation in performance with these
patches applied.  

Inserting a simple FP instruction into one of the threads (gives the
nice save/restore lazy case), I get about a 4% improvement in context
switching rates with my patches applied.  I get similar results VMX.
With a simple VSX instruction (VSX state is 64x128bit registers) in 1
thread I get an 8% bump in performance with these patches.

With FP/VMX/VSX instructions in both threads, I get no degradation in
performance.

Running lmbench doesn't have any degradation in performance.

Most of my benchmarking and testing has been done on 64 bit systems.
I've tested 32 bit FP but I've not tested 32 bit VMX at all.

There is probably some optimisations to my asm code that can also be
made.  I've been concentrating on correctness, as opposed to speed
with the asm code, since if you get a lazy context switch, you skip
all the asm now anyway.

Whole series is bisectable to compile with various 64/32bit SMP/UP
FPU/VMX/VSX config options on and off.

I really hate the include file changes in this series.  Getting the
call_single_data in the powerpc threads_struct was a PITA :-)

Mikey

Signed-off-by: Michael Neuling <mikey@neuling.org>

             reply	other threads:[~2010-12-06 23:40 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-06 23:40 Michael Neuling [this message]
2010-12-06 23:40 ` [RFC/PATCH 1/7] Add csd_locked function Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 2/7] Rearrange include files to make struct call_single_data usable in more places Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 3/7] powerpc: Reorganise powerpc include files to make call_single_data Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 4/7] powerpc: Change fast_exception_return to restore r0, r7. r8, and CTR Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 5/7] powerpc: Enable lazy save VMX registers for SMP Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 6/7] powerpc: Enable lazy save FP " Michael Neuling
2010-12-06 23:40 ` [RFC/PATCH 7/7] powerpc: Enable lazy save VSX " Michael Neuling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101206234043.083045003@neuling.org \
    --to=mikey@neuling.org \
    --cc=benh@kernel.crashing.org \
    --cc=galak@kernel.crashing.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).