[Qemu-devel] [v3 0/3] add avx2 instruction optimization

From: Liang Li @ 2015-12-08 12:08 UTC
  To: qemu-devel
  Cc: Liang Li, quintela, mst, dgilbert, stefanha, amit.shah, pbonzini, rth

buffer_find_nonzero_offset() is a hot function during live migration.
It currently uses SSE2 instructions for optimization. On platforms that
support AVX2, using AVX2 instructions instead improves its performance
by about 30% compared to SSE2.
The zero page check becomes faster with this optimization. Test results
show that for an idle guest with 8GB of RAM, this patch shortens the
total live migration time by about 6%.

This patch uses the ifunc mechanism to select the proper function at
run time: on platforms that support AVX2, the AVX2 implementation is
executed; otherwise, the original code is executed.

With this patch, the same QEMU binary can run on platforms both with
and without AVX2 support.

Compilers that don't support AVX2 or the ifunc attribute can still
build the source code successfully.
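
For readers unfamiliar with ifunc, the run-time selection described above
can be sketched roughly as follows. This is only an illustration and not
the code from the patches: the names buf_is_zero, buf_is_zero_generic,
buf_is_zero_avx2 and resolve_buf_is_zero are hypothetical, the check is a
simplified "is this buffer all zero" test rather than
buffer_find_nonzero_offset() itself, and the preprocessor guard stands in
for the configure-time detection. The series puts the AVX2 code in a
separate file (util/buffer-zero-avx2.c); the sketch instead assumes the
GCC target("avx2") function attribute so it fits in one file.

```c
#include <stddef.h>
#include <string.h>   /* on glibc systems this also defines __GLIBC__ */

/* Portable fallback: returns 1 if the first len bytes are all zero. */
static int buf_is_zero_generic(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i]) {
            return 0;
        }
    }
    return 1;
}

/* Assumption: this guard approximates "compiler and libc support ifunc
 * and AVX2"; a real build would decide this in configure. */
#if defined(__GNUC__) && defined(__x86_64__) && \
    defined(__linux__) && defined(__GLIBC__)
#include <immintrin.h>

/* AVX2 variant: compares 32 bytes at a time against zero. */
__attribute__((target("avx2")))
static int buf_is_zero_avx2(const unsigned char *buf, size_t len)
{
    const __m256i zero = _mm256_setzero_si256();
    size_t i = 0;

    for (; i + 32 <= len; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(buf + i));
        __m256i eq = _mm256_cmpeq_epi8(v, zero);
        /* All 32 mask bits set (-1) means every byte equalled zero. */
        if (_mm256_movemask_epi8(eq) != -1) {
            return 0;
        }
    }
    /* Handle the remaining tail (fewer than 32 bytes). */
    return buf_is_zero_generic(buf + i, len - i);
}

typedef int buf_is_zero_fn(const unsigned char *, size_t);

/* ifunc resolver: run once by the dynamic loader to pick a variant. */
static buf_is_zero_fn *resolve_buf_is_zero(void)
{
    __builtin_cpu_init();   /* required before __builtin_cpu_supports */
    return __builtin_cpu_supports("avx2") ? buf_is_zero_avx2
                                          : buf_is_zero_generic;
}

int buf_is_zero(const unsigned char *buf, size_t len)
    __attribute__((ifunc("resolve_buf_is_zero")));
#else
/* No ifunc/AVX2 support at build time: always use the generic code. */
int buf_is_zero(const unsigned char *buf, size_t len)
{
    return buf_is_zero_generic(buf, len);
}
#endif
```

Callers only ever see buf_is_zero(); which variant runs is decided once
when the symbol is resolved, so there is no per-call CPU feature check.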

v2 -> v3 changes:
  * Detect the ifunc attribute support (Paolo's suggestion) 
  * Use the ifunc attribute instead of the inline asm (Richard's suggestion)
  * Change the configure (Juan's suggestion)

Liang Li (3):
  cutils: add avx2 instruction optimization
  configure: detect ifunc attribute
  configure: add options to config avx2

 configure               | 50 +++++++++++++++++++++++++++++++++++++
 include/qemu-common.h   | 13 +++++-----
 util/Makefile.objs      |  2 ++
 util/buffer-zero-avx2.c | 54 ++++++++++++++++++++++++++++++++++++++++
 util/cutils.c           | 65 +++++++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 175 insertions(+), 9 deletions(-)
 create mode 100644 util/buffer-zero-avx2.c

-- 
1.9.1
