From: Paolo Bonzini <pbonzini@redhat.com>
To: Peter Lieven <pl@kamp.de>
Cc: Orit Wasserman <owasserm@redhat.com>,
	quintela@redhat.com, qemu-devel@nongnu.org,
	Stefan Hajnoczi <stefanha@gmail.com>
Subject: Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
Date: Mon, 25 Mar 2013 09:02:02 -0400 (EDT)
Message-ID: <972929461.13095041.1364216522903.JavaMail.root@redhat.com>
In-Reply-To: <CC5AC377-A8B1-4FD2-B249-8A27476D1ED6@kamp.de>

> Maybe I should have explained the output in more detail. The percentages
> are cumulative. 35.8% in the second-to-last column means that
> 35.8% of pages have a return value that is less than TARGET_PAGE_SIZE.
> This was meant to illustrate how many 64-bit chunks you have
> to look at to catch a certain percentage of non-zero pages.

OK, I had misread it as saying that many pages had 4088 zero bytes
followed by 8 non-zero bytes.  Now it's clearer, and more logical too. :)

> The third value, for example, means that looking at the first
> three 64-bit chunks will catch 34.0% of all pages.
> It turns out that the non-zeroness of a page can usually be detected
> by looking at the first 256 or so bits, and only a low
> percentage of pages turn out to be non-zero at a later position. So
> after having checked the first chunks one by one,
> there is no big penalty for looking at the remaining chunks with the
> vectorized loop.

I think it makes most sense to unroll the first four non-vectorized
iterations, i.e. not use SSE and use three or four ifs.  Either:

   if (foo[0]) return 0;
   if (foo[1]) return 8;
   if (foo[2]) return 16;
   if (foo[3]) return 24;

or

   if (foo[0]) return 0;
   if (foo[1] | foo[2] | foo[3]) return 8;

and then proceed on the remaining 4096-4*sizeof(long) bytes with
the vectorized loop.  foo+4 is aligned for SIMD operations on both
32- and 64-bit machines, which makes this a nice choice.
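As a minimal sketch of the first variant, assuming a page-aligned
4096-byte buffer: the function name and PAGE_SIZE_BYTES (standing in
for TARGET_PAGE_SIZE) are made up for illustration, and a plain scalar
loop over groups of four words stands in for the real vectorized loop.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for TARGET_PAGE_SIZE. */
#define PAGE_SIZE_BYTES 4096
#define PAGE_WORDS (PAGE_SIZE_BYTES / sizeof(unsigned long))

/*
 * Hypothetical helper, not the function from the patch series.
 * Returns a byte offset at or before the first non-zero word,
 * or PAGE_SIZE_BYTES if the whole page is zero.
 */
static size_t find_nonzero_offset_sketch(const void *buf)
{
    const unsigned long *foo = buf;
    size_t i;

    assert(buf != NULL);

    /* Unrolled prefix: most non-zero pages are caught right here. */
    if (foo[0]) return 0 * sizeof(unsigned long);
    if (foo[1]) return 1 * sizeof(unsigned long);
    if (foo[2]) return 2 * sizeof(unsigned long);
    if (foo[3]) return 3 * sizeof(unsigned long);

    /*
     * Remaining 4096 - 4*sizeof(long) bytes, starting at foo + 4.
     * Scalar placeholder for the vectorized loop: OR four words
     * together and report the offset of the group start on a hit.
     */
    for (i = 4; i < PAGE_WORDS; i += 4) {
        if (foo[i] | foo[i + 1] | foo[i + 2] | foo[i + 3]) {
            return i * sizeof(unsigned long);
        }
    }
    return PAGE_SIZE_BYTES;
}
```

With a 4-byte long, foo+4 is 16 bytes into the page; with an 8-byte
long it is 32 bytes in.  Either way it is a multiple of 16, which is
what aligned SSE loads need.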

Paolo

> Here is the distribution of return values for the Windows XP example:
> 
> 25.62% 0.49% 7.86% 0.12% 0.15% 0.05% 0.05% 0.04% 0.05% 0.02% 0.03%
> 0.02% 0.03% 0.02% 0.02% 0.01% 0.03% 0.02% 0.01% 0.02% 0.02% 0.01%
> 0.02% 0.01% 0.01% 0.01% 0.01% 0.01% 0.02% 0.00% 0.01% 0.02% 0.03%
> 0.01% 0.01% 0.01% 0.01% 0.01% 0.01% 0.00% 0.00% 0.01% 0.07% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.01% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.01% 0.02% 0.00% 0.00% 0.00%
> 0.00% 0.01% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.02% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.03% 0.00% 0.00% 0.00%
> 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.02% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.01%
> 0.00% 0.00% 0.00% 0.02% 0.00% 0.00% 0.02% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.01% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01%
> 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
> 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 64.23%
> 
> The last value is the percentage of pages with a return value of
> TARGET_PAGE_SIZE, meaning the page is all zero.
> 
> Peter
> 
> 
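As a cross-check on how the per-chunk distribution above relates to
the cumulative figures discussed earlier, here is a small sketch that
sums the first few entries.  The array holds only the first four
quoted values; the function and array names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* First four entries of the quoted Windows XP distribution, in percent. */
static const double first_chunks[] = { 25.62, 0.49, 7.86, 0.12 };
#define FIRST_CHUNKS_COUNT (sizeof(first_chunks) / sizeof(first_chunks[0]))

/*
 * Hypothetical helper: cumulative share of pages whose first non-zero
 * 64-bit chunk lies within the first k chunks.
 */
static double cumulative_share(const double *dist, size_t count, size_t k)
{
    double sum = 0.0;
    size_t i;

    assert(k <= count);
    for (i = 0; i < k; i++) {
        sum += dist[i];
    }
    return sum;
}
```

Summing the first three entries gives about 33.97, in line with the
34.0% quoted for the first three 64-bit chunks.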