From: Richard Henderson
Date: Tue, 8 Dec 2015 08:09:46 -0800
Subject: Re: [Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization
To: Liang Li, qemu-devel@nongnu.org
Cc: quintela@redhat.com, mst@redhat.com, dgilbert@redhat.com, stefanha@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com
Message-ID: <566700CA.1080408@twiddle.net>
In-Reply-To: <1449576535-3369-2-git-send-email-liang.z.li@intel.com>
References: <1449576535-3369-1-git-send-email-liang.z.li@intel.com> <1449576535-3369-2-git-send-email-liang.z.li@intel.com>

On 12/08/2015 04:08 AM, Liang Li wrote:
> +++ b/util/buffer-zero-avx2.c
> @@ -0,0 +1,54 @@
> +#include "qemu-common.h"
> +
> +#if defined CONFIG_IFUNC && defined CONFIG_AVX2
> +#include
> +#define AVX2_VECTYPE        __m256i
> +#define AVX2_SPLAT(p)       _mm256_set1_epi8(*(p))
> +#define AVX2_ALL_EQ(v1, v2) \
> +    (_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0xFFFFFFFF)
> +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2))
> +
> +inline bool
> +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
> +{
> +    return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> +                   * sizeof(AVX2_VECTYPE)) == 0
> +            && ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0);
> +}

I'm not keen on adding a new file for this.  You ought to be able to use
__attribute__((target("avx2"))) on any compiler that supports the -mavx2
command-line option.  That means you can do all of this in one file with
static functions.

Nor am I keen on marking a function inline when we know it must be
out-of-line because of the ifunc usage.

r~
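
For illustration, the single-file approach suggested above might look
roughly like the sketch below.  It is not the patch's code: the names
buffer_is_zero_demo, buffer_is_zero_avx2 and buffer_is_zero_scalar are
hypothetical, and a plain runtime check with __builtin_cpu_supports()
stands in for the ifunc resolver to keep the example portable.  It
assumes GCC or a compatible compiler on x86; elsewhere only the scalar
path is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX2_TARGET 1

/* AVX2 is enabled for this one function only; the rest of the file is
 * compiled with the default instruction set.  Hypothetical helper. */
__attribute__((target("avx2")))
static bool buffer_is_zero_avx2(const void *buf, size_t len)
{
    const __m256i *p = buf;
    __m256i acc = _mm256_setzero_si256();
    size_t i;

    /* The caller is expected to have checked the length already. */
    assert(len % sizeof(__m256i) == 0);

    /* OR all 32-byte vectors together; unaligned loads keep it simple. */
    for (i = 0; i < len / sizeof(__m256i); i++) {
        acc = _mm256_or_si256(acc, _mm256_loadu_si256(p + i));
    }
    /* testz returns 1 when (acc & acc) has no bits set. */
    return _mm256_testz_si256(acc, acc);
}
#endif

/* Portable fallback, byte at a time.  Hypothetical helper. */
static bool buffer_is_zero_scalar(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/* Single entry point: pick the AVX2 version only when the CPU reports
 * support at run time and the length is a multiple of the vector size.
 * (An ifunc resolver would make this choice once, at load time.) */
bool buffer_is_zero_demo(const void *buf, size_t len)
{
#ifdef HAVE_AVX2_TARGET
    if (len % sizeof(__m256i) == 0 && __builtin_cpu_supports("avx2")) {
        return buffer_is_zero_avx2(buf, len);
    }
#endif
    return buffer_is_zero_scalar(buf, len);
}
```

Because the AVX2 function carries a different target than its caller,
the compiler will not inline it into the dispatcher, which is the same
reason an inline marker buys nothing on an ifunc-selected function.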