* [MODERATED] Some microperf tests
@ 2019-02-23 18:26 Andrew Cooper
  2019-02-23 19:30 ` [MODERATED] " Linus Torvalds
  2019-03-07 14:26 ` [MODERATED] Updated " Andrew Cooper
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Cooper @ 2019-02-23 18:26 UTC (permalink / raw)
  To: speck


Hello,

So I've finally got my Coffee Lake system and alpha microcode working.

All numbers are the deltas between two RDTSCP instructions, with the
single instruction under test and just enough compiler-inserted mov's to
preserve the output of the first RDTSCP for later calculations.

(Insert some disclaimer about these not being statistically rigorous,
but they do at least give a rough ballpark.)

Pre microcode:
* VERW of NUL   => 65-69 cycles
* VERW of %ds   => 33-37 cycles
* MSR_FLUSH_CMD => 925-980 cycles

Post microcode:
* VERW of NUL   => 512-520 cycles
* VERW of %ds   => 520-540 cycles
* MSR_FLUSH_CMD => 1300-1500 cycles


So, MSR_FLUSH_CMD has got slower, but by a smaller factor than VERW has.
Pre microcode, the "use %ds" advice is clearly a win, but post
microcode, it appears to be fractionally worse.

I've raised the selector question with Intel - it's possible it is a side
effect of this piece of alpha ucode being an early prototype, or that
this particular system is different to most older parts.

Thanks,

~Andrew

