From: Peter Lieven <pl@kamp.de>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	Orit Wasserman <owasserm@redhat.com>,
	qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
Date: Mon, 25 Mar 2013 22:37:44 +0100
Message-ID: <185F4D8B-1C5D-4008-AAD8-E063D6D2A7DD@kamp.de>
In-Reply-To: <51506068.5080103@redhat.com>


On 25.03.2013 at 15:34, Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 25/03/2013 14:32, Peter Lieven wrote:
>> 
>> On 25.03.2013 at 14:23, Peter Lieven <pl@kamp.de> wrote:
>> 
>>> 
>>> On 25.03.2013 at 14:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> 
>>>>> Maybe I should have explained the output in more detail. The percentages
>>>>> are cumulative: 35.8% in the second-to-last column means that
>>>>> 35.8% of pages have a return value that is less than TARGET_PAGE_SIZE.
>>>>> This was meant to illustrate how many 64-bit chunks you have
>>>>> to look at to catch a certain percentage of non-zero pages.
>>>> 
>>>> Ok, I wrongly understood that many pages had 4088 zero bytes but
>>>> the last 8 were not zero.  Now it's clearer, and more logical too. :)
>>>> 
>>>>> For example, the third value means that looking at the first
>>>>> three 64-bit chunks will catch 34.0% of all pages.
>>>>> It turns out that the non-zeroness of a page can usually be detected
>>>>> by looking at the first 256 or so bits, and only a small
>>>>> percentage of pages turn out to be non-zero at a later position. So after
>>>>> having checked the first chunks one by one,
>>>>> there is no big penalty for looking at the remaining chunks with the
>>>>> vectorized loop.
>>>> 
>>>> I think it makes most sense to unroll the first four non-vectorized
>>>> iterations, i.e. not use SSE and use three or four ifs.  Either:
>>>> 
>>>> if (foo[0]) return 0;
>>>> if (foo[1]) return 8;
>>>> if (foo[2]) return 16;
>>>> if (foo[3]) return 24;
>>>> 
>>>> or
>>>> 
>>>> if (foo[0]) return 0;
>>>> if (foo[1] | foo[2] | foo[3]) return 8;
>>>> 
>>>> and then proceed on the remaining 4096-4*sizeof(long) bytes with
>>>> the vectorized loop.  foo+4 is aligned for SIMD operations on both
>>>> 32- and 64-bit machines, which makes this a nice choice.
>>> 
>>> I can't start at foo+4 since the remaining X-4*sizeof(long) bytes
>>> are not divisible by 8*sizeof(VECTYPE).
> 
> 
> Hmm, right.  What about just processing the first few longs twice, i.e.
> the above followed by "for (i = 0; i < len / sizeof(VECTYPE); i
> += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR)"?

I will profile it tomorrow.

What is wrong with processing the first 8 vectors as described below?

>>    for (i = 0; i < BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; i++) {
>>        if (!ALL_EQ(p[i], zero)) {
>>            return i * sizeof(VECTYPE);
>>        }
>>    }


This way it would not be necessary to process them twice.
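
To make the proposal concrete, here is a minimal sketch of the two-phase
scan: the first BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR vectors are
checked one by one, and the unrolled loop then continues with the rest,
so nothing is read twice. It is only an illustration under assumptions:
a plain uint64_t stands in for the SIMD VECTYPE, ALL_EQ is an ordinary
comparison, and the function name find_nonzero_offset_sketch is
hypothetical, not the one from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: a 64-bit word replaces the real SIMD
 * vector type, and ALL_EQ is a scalar comparison. */
typedef uint64_t VECTYPE;
#define ALL_EQ(a, b) ((a) == (b))
#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8

/*
 * Returns a byte offset at or before the first non-zero data in buf,
 * or len if the whole buffer is zero. Within the first
 * UNROLL_FACTOR vectors the offset is exact to one vector; afterwards
 * it is the start of the group of UNROLL_FACTOR vectors that contains
 * non-zero data.
 */
static size_t find_nonzero_offset_sketch(const void *buf, size_t len)
{
    const VECTYPE *p = buf;
    const VECTYPE zero = 0;
    const size_t unroll = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
    size_t n = len / sizeof(VECTYPE);
    size_t i, j;

    /* The unrolled loop needs a whole number of groups and an aligned buffer. */
    assert(len % (unroll * sizeof(VECTYPE)) == 0);
    assert((uintptr_t)buf % sizeof(VECTYPE) == 0);

    if (len == 0) {
        return 0;
    }

    /* Phase 1: most non-zero pages are caught here, one vector at a time. */
    for (i = 0; i < unroll; i++) {
        if (!ALL_EQ(p[i], zero)) {
            return i * sizeof(VECTYPE);
        }
    }

    /* Phase 2: OR a whole group together and test it once. */
    for (i = unroll; i < n; i += unroll) {
        VECTYPE acc = zero;
        for (j = 0; j < unroll; j++) {
            acc |= p[i + j];
        }
        if (!ALL_EQ(acc, zero)) {
            break;
        }
    }
    return i * sizeof(VECTYPE);
}
```

The variant Paolo suggested would instead start the second loop at
i = 0 and accept that the first group is looked at twice.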

Peter
