From: Peter Lieven <pl@kamp.de>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Orit Wasserman <owasserm@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
Date: Fri, 22 Mar 2013 20:20:29 +0100
Message-ID: <514CAEFD.7090904@kamp.de>
In-Reply-To: <514C9413.8090902@redhat.com>

On 22.03.2013 18:25, Paolo Bonzini wrote:
> On 22/03/2013 13:46, Peter Lieven wrote:
>> this is v4 of my patch series with various optimizations in
>> zero buffer checking and migration tweaks.
>>
>> thanks especially to Eric Blake for reviewing.
>>
>> v4:
>> - do not inline buffer_find_nonzero_offset()
>> - inline can_use_buffer_find_nonzero_offset() correctly
>> - readd asserts in buffer_find_nonzero_offset() as profiling
>>   shows they do not hurt.
>> - change last occurrences of scalar 8 by
>>   BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>> - avoid dereferencing p already in patch 5 where we
>>   know that the page (p) is zero
>> - explicitly set bytes_sent = 0 if we skip a zero page.
>>   bytes_sent was 0 before, but it was not obvious.
>> - add accounting information for skipped zero pages
>> - fix errors reported by checkpatch.pl
>>
>> v3:
>> - remove asserts, inline functions and add a check
>>   function if buffer_find_nonzero_offset() can be used.
>> - use above check function in buffer_is_zero() and
>>   find_next_bit().
>> - use buffer_is_nonzero_offset() directly to find
>>   zero pages. we know that all requirements are met
>>   for memory pages.
>> - fix C89 violation in buffer_is_zero().
>> - avoid dereferencing p in ram_save_block() if we already
>>   know the page is zero.
>> - fix initialization of last_offset in reset_ram_globals().
>> - avoid skipping pages with offset == 0 in bulk stage in
>>   migration_bitmap_find_and_reset_dirty().
>> - compared to v1 check for zero pages also after bulk
>>   ram migration as there are guests (e.g. Windows) which
>>   zero out large amount of memory while running.
>>
>> v2:
>> - fix description, add trivial zero check and add asserts 
>>   to buffer_find_nonzero_offset.
>> - add a constant for the unroll factor of buffer_find_nonzero_offset
>> - replace is_dup_page() by buffer_is_zero()
>> - added test results to xbzrle patch
>> - optimize descriptions
>>
>> Peter Lieven (9):
>>   move vector definitions to qemu-common.h
>>   cutils: add a function to find non-zero content in a buffer
>>   buffer_is_zero: use vector optimizations if possible
>>   bitops: use vector algorithm to optimize find_next_bit()
>>   migration: search for zero instead of dup pages
>>   migration: add an indicator for bulk state of ram migration
>>   migration: do not sent zero pages in bulk stage
>>   migration: do not search dirty pages in bulk stage
>>   migration: use XBZRLE only after bulk stage
>>
>>  arch_init.c                   |   74 +++++++++++++++++++----------------------
>>  hmp.c                         |    2 ++
>>  include/migration/migration.h |    2 ++
>>  include/qemu-common.h         |   37 +++++++++++++++++++++
>>  migration.c                   |    3 +-
>>  qapi-schema.json              |    6 ++--
>>  qmp-commands.hx               |    3 +-
>>  util/bitops.c                 |   24 +++++++++++--
>>  util/cutils.c                 |   50 ++++++++++++++++++++++++++++
>>  9 files changed, 155 insertions(+), 46 deletions(-)
>>
> I think patch 4 is a bit overengineered.  I would prefer the simple
> patch you had using three/four non-vectorized accesses.  The setup cost
> of the vectorized buffer_is_zero is quite high, and 64 bits are just
> 256k RAM; if the host doesn't touch 256k RAM, it will incur the overhead.
I think you are right. I was a little too eager to use buffer_find_nonzero_offset()
as much as possible. The performance gain from unrolling was impressive enough.
The gain from the vector functions is not big enough to justify a possible
slowdown caused by the high setup cost. My testing showed that in most cases
buffer_find_nonzero_offset() returns 0 or a large offset. Every call that returns 0
would have paid the increased setup cost with the vectorized version of patch 4.
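
To illustrate the trade-off, here is a minimal scalar sketch of an unrolled
non-zero search with a cheap first-word check. It is not the code from the
series: sketch_find_nonzero_offset() and SKETCH_UNROLL_FACTOR are hypothetical
names, the value 8 is taken from the "scalar 8" mentioned in the changelog, and
plain unsigned long accesses stand in for the vector types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant for this sketch; mirrors the "scalar 8" in the changelog. */
#define SKETCH_UNROLL_FACTOR 8

/*
 * Return the byte offset of the first group of SKETCH_UNROLL_FACTOR words
 * that contains a non-zero word, or len if the whole buffer is zero.
 * Assumes buf is aligned to unsigned long and len is a multiple of the
 * group size, as the asserts check.
 */
static size_t sketch_find_nonzero_offset(const void *buf, size_t len)
{
    const unsigned long *p = buf;
    size_t words = len / sizeof(unsigned long);
    size_t i, j;

    assert(len % (SKETCH_UNROLL_FACTOR * sizeof(unsigned long)) == 0);
    assert((uintptr_t)buf % sizeof(unsigned long) == 0);

    /* Cheap early exit: most non-zero buffers differ in the first word. */
    if (words > 0 && p[0] != 0) {
        return 0;
    }

    for (i = 0; i < words; i += SKETCH_UNROLL_FACTOR) {
        unsigned long acc = 0;

        /* OR a whole group together so there is one branch per group. */
        for (j = 0; j < SKETCH_UNROLL_FACTOR; j++) {
            acc |= p[i + j];
        }
        if (acc != 0) {
            return i * sizeof(unsigned long);
        }
    }
    return len;
}
```

With a check like this, a call that returns 0 costs a single load and compare,
which is the case a vectorized setup would make more expensive.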

>
> I would prefer some more benchmarking for patch 5, but it looks ok.
What would you like to see? Statistics on how many pages of a real system
are non-zero but zero in their first sizeof(long) bytes?
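
Such a statistic could be gathered roughly as follows. This is only a sketch
under assumptions: the names sketch_classify(), struct sketch_page_stats and
SKETCH_PAGE_SIZE are hypothetical, a 4096-byte page is assumed, and the memory
is treated as one flat buffer rather than QEMU's RAM blocks.

```c
#include <stddef.h>
#include <string.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096

struct sketch_page_stats {
    size_t zero;          /* every byte of the page is zero */
    size_t nonzero_first; /* first sizeof(long) bytes already non-zero */
    size_t nonzero_late;  /* first sizeof(long) bytes zero, data later */
};

/* Plain byte-wise check; speed does not matter for gathering statistics. */
static int sketch_page_is_zero(const unsigned char *page)
{
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Classify npages consecutive pages starting at mem. */
static struct sketch_page_stats sketch_classify(const unsigned char *mem,
                                                size_t npages)
{
    struct sketch_page_stats s = { 0, 0, 0 };
    size_t n;

    for (n = 0; n < npages; n++) {
        const unsigned char *page = mem + n * SKETCH_PAGE_SIZE;
        unsigned long first;

        /* memcpy avoids any alignment assumption on mem. */
        memcpy(&first, page, sizeof(first));
        if (first != 0) {
            s.nonzero_first++;
        } else if (sketch_page_is_zero(page)) {
            s.zero++;
        } else {
            s.nonzero_late++;
        }
    }
    return s;
}
```

The nonzero_late counter is the interesting one: it counts the pages for which
a first-word check alone would not settle the question.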

>
> The rest are fine, thanks!
Thank you for reviewing. If we are done with these patches, I will continue with
the block migration optimizations next week.

Peter
