4x4 single-precision matrix product with SSE

* 4x4 single-precision matrix product with SSE
@ 2011-03-11 22:49 Nicolas Bock
  2011-03-12  8:32 ` Frederic Marmond
       [not found] ` <AANLkTimCWmanFU19admtg5q18HvCOxrdjm+9XWFT-0Zm@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Nicolas Bock @ 2011-03-11 22:49 UTC (permalink / raw)
  To: linux-assembly

[-- Attachment #1: Type: text/plain, Size: 469 bytes --]

Hello list,

I am writing an assembly function that multiplies 2 4x4 single precision
matrices. I wrote 2 versions, one using SSE the other using SSE4.1. What
surprised me is that the SSE4.1 version fails to beat the SSE version,
it is in fact slightly slower.

Is this the right place to ask for help? If anyone is interested I can
post some code which would maybe clarify the situation a bit.

If this is not the right place, please ignore me...

nick

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread