* memcpy_to/fromio() is badly optimised on x86
@ 2017-11-22 10:43 David Laight
From: David Laight @ 2017-11-22 10:43 UTC (permalink / raw)
  To: 'linux-kernel@vger.kernel.org'

I believe that it is valid to use memcpy_to/fromio() to copy
data to/from memory BARs on PCIe cards.

However on x86 they are both aliases for memcpy().

The x86 kernel has several implementations of memcpy().
The 'best' one for the current cpu is selected during boot.

For more recent Intel cpus (probably Haswell and later) the
selected implementation is just 'rep movsb' relying on
the hardware to do all its 'clever' optimisations.

These optimisations are only applied to cached addresses;
for uncached ones (and definitely for PCIe ones) single
byte copies are used.
(Verified on 4.13 with a PCIe monitor (of sorts).)

With the typical large read latency of PCIe this makes
memcpy_fromio() particularly painful.

memcpy_to/fromio() should be using 'rep movsd' for
the bulk of the copy.
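
As a rough illustration of the idea (not a proposed patch), the sketch
below copies the bulk of a buffer with 32-bit reads and only falls back
to byte reads for an unaligned head and a short tail. It is portable
user-space C, so it does not use 'rep movsd' itself; the name
copy_fromio_32() is hypothetical, and a real kernel version would work
on __iomem pointers with the kernel's own accessors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: copy 'len' bytes from a
 * device-like source, using one 32-bit read per four bytes for the bulk.
 * The volatile qualifier is there so the compiler does not merge or split
 * the source accesses.
 */
void copy_fromio_32(void *dst, const volatile void *src, size_t len)
{
	uint8_t *d = dst;
	const volatile uint8_t *s = src;

	/* Head: byte reads until the source is 4-byte aligned. */
	while (len && ((uintptr_t)s & 3)) {
		*d++ = *s++;
		len--;
	}

	/* Bulk: aligned 32-bit reads from the source. */
	while (len >= 4) {
		uint32_t w = *(const volatile uint32_t *)s;

		/* The destination may be unaligned, so store via memcpy(). */
		memcpy(d, &w, sizeof(w));
		d += 4;
		s += 4;
		len -= 4;
	}

	/* Tail: at most three remaining bytes. */
	while (len) {
		*d++ = *s++;
		len--;
	}
}
```

With this shape a copy of N bytes from an aligned source issues about
N/4 reads instead of N, which is where the gain on a high-latency bus
would be expected to come from.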

	David
