From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Robert Hoo <robert.hu@linux.intel.com>
Subject: [PULL 07/15] util/bufferiszero: improve avx2 accelerator
Date: Thu,  2 Apr 2020 15:06:32 -0400
Message-ID: <20200402190640.1693-8-pbonzini@redhat.com>
In-Reply-To: <20200402190640.1693-1-pbonzini@redhat.com>

From: Robert Hoo <robert.hu@linux.intel.com>

By raising the avx2 length_to_accel to 128, we can simplify its logic and
remove a branch.

The authorship of this patch actually belongs to Richard Henderson
<richard.henderson@linaro.org>; I just fixed a boundary case in his
original patch.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1585119021-46593-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/bufferiszero.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b8012532e4..695bb4ce28 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128.  */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128.  */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    } ;
 
     /* Finish the last block of 128 unaligned.  */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
-- 
2.18.2
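
The commit message says that a 128-byte minimum lets the routine drop a
branch. The following is a minimal scalar sketch of why that might hold,
not the QEMU code itself: it mirrors the control flow of the new loop using
plain 64-bit words in place of __m256i. The names or_block,
model_zero_blocks, model_is_zero and MIN_ACCEL_LEN are hypothetical and
exist only for illustration. The dispatch on length in model_is_zero is an
assumption about how length_to_accel is used by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 32                   /* bytes per block, the width of __m256i */
#define MIN_ACCEL_LEN (4 * BLOCK)  /* mirrors length_to_accel = 128 */

/* OR together the 32 bytes starting at p; p may be unaligned. */
static uint64_t or_block(const unsigned char *p)
{
    uint64_t w[BLOCK / 8];
    memcpy(w, p, BLOCK);
    return w[0] | w[1] | w[2] | w[3];
}

/* Scalar model of the simplified routine. Caller guarantees len >= 128. */
static bool model_zero_blocks(const unsigned char *buf, size_t len)
{
    const uintptr_t base = (uintptr_t)buf;
    const uintptr_t mask = ~(uintptr_t)(BLOCK - 1);
    /* Addresses stay integers so no out-of-range pointer is ever formed. */
    uintptr_t p = (base + 5 * BLOCK) & mask;
    uintptr_t e = (base + len) & mask;
    uint64_t t = or_block(buf);    /* first 32 bytes, unaligned */

    /* Aligned groups of four blocks; skipped entirely for short buffers. */
    while (p <= e) {
        if (t) {
            return false;
        }
        const unsigned char *q = buf + (p - 4 * BLOCK - base);
        t = or_block(q) | or_block(q + BLOCK)
          | or_block(q + 2 * BLOCK) | or_block(q + 3 * BLOCK);
        p += 4 * BLOCK;
    }

    /*
     * Last 128 bytes, unaligned. Because len >= 128, buf + len - 128 never
     * points before buf, so all four loads are safe without a special case.
     */
    const unsigned char *tail = buf + len - 4 * BLOCK;
    t |= or_block(tail) | or_block(tail + BLOCK)
       | or_block(tail + 2 * BLOCK) | or_block(tail + 3 * BLOCK);
    return t == 0;
}

/* Assumed dispatch: short buffers take a plain byte loop instead. */
static bool model_is_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    if (len < MIN_ACCEL_LEN) {
        for (size_t i = 0; i < len; i++) {
            if (b[i]) {
                return false;
            }
        }
        return true;
    }
    return model_zero_blocks(b, len);
}
```

With a 64-byte minimum, the four tail loads could start before buf for
lengths under 128, which is what the removed "goto last2" path guarded
against.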