From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Victor Kaplansky <victork@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"Li, Liang Z" <liang.z.li@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"amit.shah@redhat.com" <amit.shah@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Date: Thu, 7 Apr 2016 15:54:05 +0300
Message-ID: <20160407154040-mutt-send-email-mst@redhat.com>
In-Reply-To: <20160407110951.GB2240@work-vm>

On Thu, Apr 07, 2016 at 12:09:52PM +0100, Dr. David Alan Gilbert wrote:
> * Eric Blake (eblake@redhat.com) wrote:
> > On 11/12/2015 12:56 PM, Dr. David Alan Gilbert wrote:
> > 
> > >> One thing I still can't understand: why does the unit test in the host
> > >> environment show that 'memcmp()' has better performance?
> > 
> > Have you tried running under a profiler, to see if there are hotspots or
> > at least get an idea of where the time is being spent?
> > 
> > > 
> > > Are you aware of any program other than QEMU that also wants to do something
> > > similar?  Finding whether a block of memory is zero, sounds like something
> > > that would be useful in lots of places, I just can't think which ones.
> > 
> > At least dd, cp, and probably several other utilities.  It would be nice
> > to post an RFE to glibc to see if they can come up with a dedicated
> > interface that is faster than memcmp(), although that still only helps
> > us when targeting a system new enough to have that interface.
> 
> I've just posted that RFE:
> https://sourceware.org/bugzilla/show_bug.cgi?id=19920
> 
> Dave

Have you guys seen the discussion at
http://rusty.ozlabs.org/?p=560#respond ?

In particular it claims this is close to optimal:


char check_zero(char *p, int len)
{
    char res = 0;
    int i;

    for (i = 0; i < len; i++) {
        res = res | p[i];
    }

    return res;
}


That is, provided you compile this function with GCC's -ftree-vectorize
and -funroll-loops options.

Now, this version always scans the whole buffer, so it will be slower
than an early-exit loop when the buffer is *not* all zeroes.
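
One possible middle ground, sketched here only as an illustration and not
taken from the linked discussion: OR the bytes together in fixed-size
blocks and test the accumulator between blocks. The name is_zero_blocked
and the block size of 256 are assumptions made for this example. Note
that, unlike check_zero(), it returns 1 when the buffer is all zero.

```c
#include <stddef.h>

/* Assumed block size for this sketch; the right value depends on the workload. */
#define ZERO_CHECK_BLOCK 256

/* Hypothetical helper: returns 1 if all len bytes at p are zero, 0 otherwise. */
int is_zero_blocked(const char *p, size_t len)
{
    size_t i = 0;

    while (i < len) {
        /* End of the current block, clamped to the end of the buffer. */
        size_t end = (len - i > ZERO_CHECK_BLOCK) ? i + ZERO_CHECK_BLOCK : len;
        char acc = 0;

        /* Branch-free inner loop: the part a compiler may vectorize. */
        for (; i < end; i++) {
            acc |= p[i];
        }

        /* One test per block gives an early exit on non-zero data. */
        if (acc) {
            return 0;
        }
    }

    return 1;
}
```

With this shape a buffer that is non-zero near the start costs at most one
block of work, while an all-zero buffer pays one extra branch per block.
Whether that trade is a win would still have to be measured.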

This might indicate that you need to know your workload in order to
implement a compare-to-zero efficiently, and if that is the case,
it's not clear this is appropriate for libc.
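
To make the workload dependence concrete, here is a rough sketch of how
one might time a full scan against an early-exit scan on one's own
buffers. The names scan_all, scan_early_exit and time_checker are made up
for this example, and clock() is a coarse timer. An optimizing compiler
may also hoist or simplify the repeated calls, so treat any numbers as
indicative at best.

```c
#include <stddef.h>
#include <time.h>

/* Full scan, same idea as check_zero(): non-zero result means a non-zero byte. */
char scan_all(const char *p, size_t len)
{
    char res = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        res |= p[i];
    }
    return res;
}

/* Early-exit scan: stops at the first non-zero byte. */
char scan_early_exit(const char *p, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i]) {
            return p[i];
        }
    }
    return 0;
}

/* Hypothetical harness: CPU seconds spent on reps calls of fn over buf. */
double time_checker(char (*fn)(const char *, size_t),
                    const char *buf, size_t len, int reps)
{
    volatile char sink = 0;   /* keeps the results from being discarded */
    clock_t start, stop;
    int r;

    start = clock();
    for (r = 0; r < reps; r++) {
        sink = fn(buf, len);
    }
    stop = clock();

    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Running time_checker() with both functions on an all-zero buffer and on a
buffer whose first byte is non-zero should show the two extremes; real
migration data will fall somewhere in between.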


> > -- 
> > Eric Blake   eblake redhat com    +1-919-301-3266
> > Libvirt virtualization library http://libvirt.org
> > 
> 
> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 35+ messages
2015-11-10  2:51 [Qemu-devel] [v2 0/2] add avx2 instruction optimization Liang Li
2015-11-10  2:51 ` [Qemu-devel] [v2 1/2] cutils: " Liang Li
2015-11-12 10:08   ` Paolo Bonzini
2015-11-12 10:12     ` Li, Liang Z
2015-11-12 11:30     ` Juan Quintela
2015-11-13  2:49     ` Li, Liang Z
2015-11-13  9:30       ` Paolo Bonzini
2015-11-12 14:43   ` Richard Henderson
2015-11-10  2:51 ` [Qemu-devel] [v2 2/2] configure: add options to config avx2 Liang Li
2015-11-10  3:43 ` [Qemu-devel] [v2 0/2] add avx2 instruction optimization Eric Blake
2015-11-10  5:48   ` Li, Liang Z
2015-11-10  9:13     ` Juan Quintela
2015-11-10  9:26       ` Li, Liang Z
2015-11-10  9:35         ` Paolo Bonzini
2015-11-10  9:41           ` Li, Liang Z
2015-11-10  9:50             ` Paolo Bonzini
2015-11-10  9:56               ` Li, Liang Z
2015-11-10 10:00                 ` Paolo Bonzini
2015-11-10 10:04                   ` Li, Liang Z
2015-11-12  2:49           ` Li, Liang Z
2015-11-12  8:43             ` Paolo Bonzini
2015-11-12  8:53               ` Li, Liang Z
2015-11-12  9:04                 ` Paolo Bonzini
2015-11-12  9:40                   ` Li, Liang Z
2015-11-12  9:45                     ` Paolo Bonzini
2015-11-12  9:53                       ` Li, Liang Z
2015-11-12 11:34                         ` Juan Quintela
2015-11-12 11:42                           ` Li, Liang Z
2015-11-12 19:56                             ` Dr. David Alan Gilbert
2015-11-12 20:20                               ` Eric Blake
2016-04-07 11:09                                 ` Dr. David Alan Gilbert
2016-04-07 12:54                                   ` Michael S. Tsirkin [this message]
2016-04-07 13:42                                     ` Dr. David Alan Gilbert
2016-04-07 13:54                                     ` Paolo Bonzini
2015-11-10  9:30       ` Paolo Bonzini
