From: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
To: Peter Lieven <pl@kamp.de>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] broken incoming migration
Date: Sat, 08 Jun 2013 16:27:54 +0800
Message-ID: <51B2EB0A.7000704@linux.vnet.ibm.com>
In-Reply-To: <51AE035A.5070301@kamp.de>

> On 04.06.2013 16:40, Paolo Bonzini wrote:
>> Il 04/06/2013 16:38, Peter Lieven ha scritto:
>>> On 04.06.2013 16:14, Paolo Bonzini wrote:
>>>> Il 04/06/2013 15:52, Peter Lieven ha scritto:
>>>>> On 30.05.2013 16:41, Paolo Bonzini wrote:
>>>>>> Il 30/05/2013 16:38, Peter Lieven ha scritto:
>>>>>>>>> You could also scan the page for nonzero values before writing it.
>>>>>>> i had this in mind, but then chose the other approach.... turned
>>>>>>> out to be a bad idea.
>>>>>>>
>>>>>>> alexey: i will prepare a patch later today, could you then please
>>>>>>> verify it fixes your problem.
>>>>>>>
>>>>>>> paolo: would we still need the madvise or is it enough to not write
>>>>>>> the zeroes?
>>>>>> It should be enough to not write them.
>>>>> Problem: checking the pages for zero allocates them. even at the
>>>>> source.
>>>> It doesn't look like it.  I tried this program and top doesn't show an
>>>> increasing amount of reserved memory:
>>>>
>>>> #include <stdio.h>
>>>> #include <stdlib.h>
>>>> int main()
>>>> {
>>>>       char *x = malloc(500 << 20);
>>>>       int i, j;
>>>>       for (i = 0; i < 500; i += 10) {
>>>>           for (j = 0; j < 10 << 20; j += 4096) {
>>>>                *(volatile char*) (x + (i << 20) + j);
>>>>           }
>>>>           getchar();
>>>>       }
>>>> }
>>> strange. we are talking about RSS size, right?
>> None of the three top values change, and only VIRT is >500 MB.
>>
>>> is the malloc above using mmapped memory?
>> Yes.
>>
>>> which kernel version do you use?
>> 3.9.
>>
>>> what avoids allocating the memory for me is the following (with
>>> whatever side effects it has ;-))
>> This would also fail to migrate any page that is swapped out, breaking
>> overcommit in a more subtle way. :)
>>
>> Paolo
> the following also does not allocate memory, but qemu does...
>
Hi, Peter
   As the patch says:

"not sending zero pages breaks migration if a page is zero
at the source but not at the destination."

   I don't understand why that would be a problem. Shouldn't all pages
not received at the destination be treated as zero pages?
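
To make the question concrete, here is a minimal sketch of the "scan the
page for nonzero values before writing it" suggestion quoted above, as it
might look on the destination side. This is not the actual QEMU patch; the
function names page_is_zero() and dest_handle_zero_page() are hypothetical
and only illustrate the idea: a page reported as zero by the source is
cleared only if the destination copy is not already zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if all len bytes at buf are zero. */
static int page_is_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/*
 * Hypothetical destination-side handler for a page that the source
 * reported as zero. Returns 1 if the page had to be cleared, 0 if it
 * was already zero and was therefore left unwritten.
 */
static int dest_handle_zero_page(unsigned char *host, size_t len)
{
    assert(host != NULL && len > 0);

    if (page_is_zero(host, len)) {
        return 0;           /* already zero: skip the write */
    }
    memset(host, 0, len);   /* stale nonzero data: must be cleared */
    return 1;
}
```

The second branch is the case the patch description refers to: if the
destination page already holds nonzero data and the zero page is simply
never sent, that stale data stays in place.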

Also, do you mean that the following code is taken from qemu and that it
does not allocate memory when built with your gcc? Maybe it is related to
KVM. How about turning off KVM and retrying the following code in qemu?

> #include <stdio.h>
> #include <stdlib.h>
> #include <assert.h>
> #include <unistd.h>
> #include <sys/resource.h>
> #include <inttypes.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <errno.h>
>
> #if defined __SSE2__
> #include <emmintrin.h>
> #define VECTYPE        __m128i
> #define SPLAT(p)       _mm_set1_epi8(*(p))
> #define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
> #else
> #define VECTYPE        unsigned long
> #define SPLAT(p)       (*(p) * (~0UL / 255))
> #define ALL_EQ(v1, v2) ((v1) == (v2))
> #endif
>
> #define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
>
> /* Round number down to multiple */
> #define QEMU_ALIGN_DOWN(n, m) ((n) / (m) * (m))
>
> /* Round number up to multiple */
> #define QEMU_ALIGN_UP(n, m) QEMU_ALIGN_DOWN((n) + (m) - 1, (m))
>
> #define QEMU_VMALLOC_ALIGN (256 * 4096)
>
> /* alloc shared memory pages */
> void *qemu_anon_ram_alloc(size_t size)
> {
>      size_t align = QEMU_VMALLOC_ALIGN;
>      size_t total = size + align - getpagesize();
>      void *ptr = mmap(0, total, PROT_READ | PROT_WRITE,
>                       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>      size_t offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
>
>      if (ptr == MAP_FAILED) {
>          fprintf(stderr, "Failed to allocate %zu B: %s\n",
>                  size, strerror(errno));
>          abort();
>      }
>
>      ptr += offset;
>      total -= offset;
>
>      if (offset > 0) {
>          munmap(ptr - offset, offset);
>      }
>      if (total > size) {
>          munmap(ptr + size, total - size);
>      }
>
>      return ptr;
> }
>
> static inline int
> can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
> {
>      return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>                     * sizeof(VECTYPE)) == 0
>              && ((uintptr_t) buf) % sizeof(VECTYPE) == 0);
> }
>
> size_t buffer_find_nonzero_offset(const void *buf, size_t len)
> {
>      const VECTYPE *p = buf;
>      const VECTYPE zero = (VECTYPE){0};
>      size_t i;
>
>      if (!len) {
>          return 0;
>      }
>
>      assert(can_use_buffer_find_nonzero_offset(buf, len));
>
>      for (i = 0; i < BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; i++) {
>          if (!ALL_EQ(p[i], zero)) {
>              return i * sizeof(VECTYPE);
>          }
>      }
>
>      for (i = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
>           i < len / sizeof(VECTYPE);
>           i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
>          VECTYPE tmp0 = p[i + 0] | p[i + 1];
>          VECTYPE tmp1 = p[i + 2] | p[i + 3];
>          VECTYPE tmp2 = p[i + 4] | p[i + 5];
>          VECTYPE tmp3 = p[i + 6] | p[i + 7];
>          VECTYPE tmp01 = tmp0 | tmp1;
>          VECTYPE tmp23 = tmp2 | tmp3;
>          if (!ALL_EQ(tmp01 | tmp23, zero)) {
>              break;
>          }
>      }
>
>      return i * sizeof(VECTYPE);
> }
>
> int main()
> {
>       //char *x = malloc(1024 << 20);
>       char *x = qemu_anon_ram_alloc(1024 << 20);
>
>       int i, j;
>       int ret = 0;
>       struct rusage rusage;
>       for (i = 0; i < 500; i ++) {
>           for (j = 0; j < 10 << 20; j += 4096) {
>                ret += buffer_find_nonzero_offset((char*) (x + (i << 20) + j), 4096);
>           }
>           getrusage( RUSAGE_SELF, &rusage );
>           printf("read offset: %d kB, RSS size: %ld kB", ((i+1) << 10), rusage.ru_maxrss);
>           getchar();
>       }
>       printf("%d zero pages\n", ret);
> }
>
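
A note on reading the output of the program above: on Linux, ru_maxrss is
the peak resident set size (in kilobytes), so it can only grow and will not
show pages being released again. If the current value is wanted, one option
is to read it from /proc/self/statm. The sketch below assumes Linux; the
helper name current_rss_kb() is hypothetical and not part of qemu.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: current resident set size in kB, read from the
 * second field of /proc/self/statm (resident pages). Linux only.
 * Returns -1 if the file cannot be read or parsed.
 */
static long current_rss_kb(void)
{
    unsigned long size_pages, resident_pages;
    long page_kb = sysconf(_SC_PAGESIZE) / 1024;
    FILE *f = fopen("/proc/self/statm", "r");

    if (!f) {
        return -1;
    }
    if (fscanf(f, "%lu %lu", &size_pages, &resident_pages) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return (long)resident_pages * page_kb;
}
```

Calling current_rss_kb() next to getrusage() in the loop would show whether
the scan changes the current RSS or only the recorded peak.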


-- 
Best Regards

Wenchao Xia

