* [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
@ 2016-09-25  8:22 Chunguang Li
  2016-09-26 11:23 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-09-25  8:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, pbonzini, stefanha

Hi all!
I am confused about how the dirty bitmap is handled during migration. I have dug into the code and found that, every now and then during migration, the dirty bitmap is fetched from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to some NON-dirty pages being resent.

Take the first iteration of precopy, during which all the pages are sent, as an example. Before it starts, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment. When pages that have not been sent yet are written, the kernel marks them as dirty. I don't think this is correct, because these pages will be sent in this iteration and again in the next one with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent during the current iteration.


Am I right about this? If so, is there any advice on how to improve it?


Thanks,
Chunguang Li


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-25  8:22 [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent Chunguang Li
@ 2016-09-26 11:23 ` Dr. David Alan Gilbert
  2016-09-26 14:55   ` Chunguang Li
  0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-09-26 11:23 UTC (permalink / raw)
  To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> Hi all!
> I am confused about how the dirty bitmap is handled during migration. I have dug into the code and found that, every now and then during migration, the dirty bitmap is fetched from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to some NON-dirty pages being resent.
> 
> Take the first iteration of precopy, during which all the pages are sent, as an example. Before it starts, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment. When pages that have not been sent yet are written, the kernel marks them as dirty. I don't think this is correct, because these pages will be sent in this iteration and again in the next one with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent during the current iteration.
> 
> 
> Am I right about this? If so, is there any advice on how to improve it?

I think you're right that this can happen; to clarify I think the
case you're talking about is:

  Iteration 1
    sync bitmap
    start sending pages
    page 'n' is modified - but hasn't been sent yet
    page 'n' gets sent
  Iteration 2
    sync bitmap
       'page n is shown as modified'
    send page 'n' again

So you're right that is wasteful; I guess it's more wasteful
on big VMs with slow networks where the length of each iteration
is large.
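
The sequence above can be replayed with a toy model. This is only a sketch, not QEMU code: the kernel dirty log and the set of pages to send are plain Python sets, and the function name and the "v0"/"v1" contents are made up for illustration.

```python
def replay_scenario():
    """Replay the two-iteration scenario for a single page 'n'."""
    memory = {"n": "v0"}   # guest memory: page -> content
    kernel_log = set()     # pages the kernel has logged as written
    to_send = {"n"}        # iteration 1 sends every page
    sent = []              # (iteration, page, content) records

    # Iteration 1: sync bitmap (nothing logged yet), start sending.
    to_send |= kernel_log
    kernel_log.clear()

    # Page 'n' is modified, but has not been sent yet.
    memory["n"] = "v1"
    kernel_log.add("n")

    # Page 'n' gets sent; it already carries the new content.
    for page in sorted(to_send):
        sent.append((1, page, memory[page]))
    to_send.clear()

    # Iteration 2: sync bitmap; page 'n' is shown as modified.
    to_send |= kernel_log
    kernel_log.clear()

    # Page 'n' is sent again with exactly the same content.
    for page in sorted(to_send):
        sent.append((2, page, memory[page]))
    return sent


print(replay_scenario())
```

Both records carry the same content "v1", which is the redundant transfer being discussed.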

Fixing it is not easy, because you have to be really careful
never to miss a page modification, even if the page is sent
about the same time it's dirtied.

One way would be to sync the dirty log from the kernel
in smaller chunks.
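
One way to picture the smaller-chunks idea is a toy model of the first iteration. The sketch below is not QEMU code: the kernel dirty log and the migration bitmap are plain Python sets, one page is sent per tick, and the function name, parameters, and write pattern are all assumptions made for illustration. Before each chunk is sent, only that chunk's part of the log is fetched and cleared, so a write that lands before the chunk sync no longer shows up as dirty at the start of iteration 2.

```python
import random


def simulate_first_iteration(num_pages, num_chunks, write_prob, seed=0):
    """Toy model of precopy iteration 1 with the dirty log synced per chunk.

    Returns (reported_dirty, really_dirty, nothing_missed).
    """
    rng = random.Random(seed)
    kernel_log = set()              # stand-in for the kernel's dirty log
    bitmap = set(range(num_pages))  # iteration 1: every page is pending
    last_write = {}                 # page -> tick of its latest write
    sent_at = {}                    # page -> tick at which it was sent
    chunk_size = max(1, num_pages // num_chunks)
    tick = 0

    for start in range(0, num_pages, chunk_size):
        chunk = range(start, min(start + chunk_size, num_pages))
        # Chunk sync: fetch and clear the log for this chunk only.
        for page in chunk:
            if page in kernel_log:
                kernel_log.discard(page)
                bitmap.add(page)
        for page in chunk:
            # The guest may write one random page during this tick.
            if rng.random() < write_prob:
                written = rng.randrange(num_pages)
                kernel_log.add(written)
                last_write[written] = tick
            # Send the page with its current content and clear its bit.
            bitmap.discard(page)
            sent_at[page] = tick
            tick += 1

    reported = set(kernel_log)  # what the sync before iteration 2 sees
    really = {p for p, t in last_write.items() if t > sent_at[p]}
    return len(reported), len(really), really <= reported


for chunks in (1, 100):
    print(chunks, simulate_first_iteration(10000, chunks, 0.1))
```

In this model, a single whole-range sync (num_chunks=1) reports every page written during the iteration, while the chunked sync reports fewer pages and still covers every page written after it was sent. The real difficulty mentioned above, a page being dirtied at about the same time it is sent, is not modelled here.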

Dave


> 
> Thanks,
> Chunguang Li
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-26 11:23 ` Dr. David Alan Gilbert
@ 2016-09-26 14:55   ` Chunguang Li
  2016-09-26 18:52     ` Dr. David Alan Gilbert
  2016-09-30  5:46     ` Amit Shah
  0 siblings, 2 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-26 14:55 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela




> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent: Monday, September 26, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> I think you're right that this can happen; to clarify I think the
> case you're talking about is:
> 
>   Iteration 1
>     sync bitmap
>     start sending pages
>     page 'n' is modified - but hasn't been sent yet
>     page 'n' gets sent
>   Iteration 2
>     sync bitmap
>        'page n is shown as modified'
>     send page 'n' again
>

Yes, this is exactly the case I am talking about.
 
> So you're right that is wasteful; I guess it's more wasteful
> on big VMs with slow networks where the length of each iteration
> is large.

I think this is "very" wasteful. Assume the workload dirties pages uniformly at random within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the pages reported as dirty during Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as it would if only the really dirty pages were sent.
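
The "nearly half" intuition can be checked with a small Monte Carlo sketch. It only models the two assumptions stated above (uniformly random writes, constant transfer speed); the function name and parameter values are arbitrary choices for illustration. Page i is sent at time i / num_pages of the iteration, and a written page counts as not really dirty if its last write happened before it was sent.

```python
import random


def false_dirty_fraction(num_pages=20000, num_writes=2000, seed=0):
    """Fraction of pages reported dirty that were sent after their last write."""
    rng = random.Random(seed)
    last_write = {}  # page -> time of its latest write, in [0, 1)
    for _ in range(num_writes):
        page = rng.randrange(num_pages)   # uniformly random target page
        when = rng.random()               # uniformly random time in the iteration
        last_write[page] = max(when, last_write.get(page, 0.0))

    # Page i is sent at time i / num_pages (constant transfer speed).
    false_dirty = sum(
        1 for page, when in last_write.items() if when < page / num_pages
    )
    return false_dirty / len(last_write)


print(false_dirty_fraction())
```

With sparse writes the fraction comes out close to one half. In the same model it drops when each page is written many times during the iteration, because the last write is then more likely to land after the page was sent.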

Thanks,

Chunguang


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-26 14:55   ` Chunguang Li
@ 2016-09-26 18:52     ` Dr. David Alan Gilbert
  2016-09-27 12:28       ` Chunguang Li
  2016-09-30  5:46     ` Amit Shah
  1 sibling, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-09-26 18:52 UTC (permalink / raw)
  To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> I think this is "very" wasteful. Assume the workload dirties pages uniformly at random within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the pages reported as dirty during Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as it would if only the really dirty pages were sent.

Yes, it's probably pretty bad; and we really need to do something like
split the sync into smaller chunks; there are other suggestions
for how to improve it (e.g. there's the page-modification-logging
changes).

However, I don't think you usually get really random writes; if you
do, precopy rarely converges at all, because even without your
observation it changes lots and lots of pages.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-26 18:52     ` Dr. David Alan Gilbert
@ 2016-09-27 12:28       ` Chunguang Li
  0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-27 12:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent: Tuesday, September 27, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> Yes, it's probably pretty bad; and we really need to do something like
> split the sync into smaller chunks; there are other suggestions
> for how to improve it (e.g. there's the page-modification-logging
> changes).
> 
> However, I don't think you usually get really random writes; if you
> do, precopy rarely converges at all, because even without your
> observation it changes lots and lots of pages.
> 
> Dave
> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

I have read a little about page-modification logging. I think
it is only a more efficient way to do dirty logging, with better
performance than write protection, and it will not solve the problem
we are talking about.

The only idea I have come up with so far to handle this is the one
you mentioned: splitting the sync into smaller chunks.
Maybe I can start from this idea and try to fix it.

If you come up with any other ideas or suggestions, please let me know.
Thank you~

Chunguang

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-26 14:55   ` Chunguang Li
  2016-09-26 18:52     ` Dr. David Alan Gilbert
@ 2016-09-30  5:46     ` Amit Shah
  2016-09-30  8:18       ` Chunguang Li
  2016-10-08  7:55       ` Chunguang Li
  1 sibling, 2 replies; 21+ messages in thread
From: Amit Shah @ 2016-09-30  5:46 UTC (permalink / raw)
  To: Chunguang Li
  Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela

On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> I think this is "very" wasteful. Assume the workload dirties pages uniformly at random within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the pages reported as dirty during Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as it would if only the really dirty pages were sent.

It makes sense. Can you get some perf numbers to show what kinds of
workloads are impacted the most?  That would also help us figure
out what kinds of speed improvements we can expect.


		Amit


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-30  5:46     ` Amit Shah
@ 2016-09-30  8:18       ` Chunguang Li
  2016-10-08  7:55       ` Chunguang Li
  1 sibling, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-30  8:18 UTC (permalink / raw)
  To: Amit Shah
  Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela




> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Sent: Friday, September 30, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> It makes sense, can you get some perf numbers to show what kinds of
> workloads get impacted the most?  That would also help us to figure
> out what kinds of speed improvements we can expect.
> 
> 
> 		Amit

Yes, I can pick some workloads and get some perf numbers.
However, I don't know how to count the non-dirty pages we are
resending in each iteration. Instead, I can get the numbers below:

1. The duration of each iteration;
2. The number of pages transferred during each iteration;
3. The number of dirty pages (including not-really-dirty pages) produced
   during each iteration.

With these numbers, we can only estimate the number of not-really-dirty
pages to some extent. What do you think of this test plan? Any suggestions?

Chunguang

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-09-30  5:46     ` Amit Shah
  2016-09-30  8:18       ` Chunguang Li
@ 2016-10-08  7:55       ` Chunguang Li
  2016-10-14 11:15         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-10-08  7:55 UTC (permalink / raw)
  To: Amit Shah
  Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela




> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Sent: Friday, September 30, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> It makes sense, can you get some perf numbers to show what kinds of
> workloads get impacted the most?  That would also help us to figure
> out what kinds of speed improvements we can expect.
> 
> 
> 		Amit

I picked 6 workloads and collected the following statistics for
every iteration of precopy (except the final stop-and-copy one).
These numbers were obtained with basic precopy migration, without
capabilities such as xbzrle or compression. The migration has a
dedicated network, and the workloads use a separate one.
Both are gigabit Ethernet. I used qemu-2.5.1.

Three of them (booting, idle, web server) converged to the stop-and-copy phase
with the given bandwidth and the default downtime (300 ms), while the other
three (kernel compilation, zeusmp, memcached) did not.

A page is "not-really-dirty" if, during one iteration, it is written first and
sent later (and not written again after that). I guess this happens much less
often during the later iterations than during the 1st one, because all the
pages of the VM are sent to the destination node during the 1st iteration,
while only some of the pages are sent during the others.
So I think the "not-really-dirty" pages are produced mainly during
the 1st iteration, and probably very few during the other iterations.

If we could avoid resending the "not-really-dirty" pages, then intuitively I
think the time spent on Iteration 2 would be halved. This is a chain reaction:
the dirty pages produced during Iteration 2 are halved as well, which in turn
halves the time spent on Iteration 3, then Iteration 4, 5...

So I think "booting" and "kernel compilation" should benefit a lot from this
improvement. "Kernel compilation" would benefit because some of its
iterations take around 600 ms; if they were halved to 300 ms, precopy
might get the chance to enter the stop-and-copy phase.
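
As a rough cross-check of the 300 ms argument, the sketch below estimates how many dirty pages can remain when precopy switches to stop-and-copy. It rests on assumptions that are not stated in this thread: 4 KiB pages, "32MB/s" read as 32 MiB/s, and convergence decided by comparing the remaining data with bandwidth times allowed downtime (the real check presumably uses the measured bandwidth, so this is only an approximation). The function names are made up; the page counts are the lowest "remaining dirty pages" values in the logs below.

```python
PAGE_SIZE = 4096  # bytes; assumed guest page size


def max_pending_pages(bandwidth_bytes_per_s, downtime_ms, page_size=PAGE_SIZE):
    """Pages that can still be sent within the allowed downtime."""
    return int(bandwidth_bytes_per_s * downtime_ms / 1000 / page_size)


def would_converge(remaining_pages, bandwidth_bytes_per_s=32 * 1024 * 1024,
                   downtime_ms=300):
    """True if the remaining dirty pages fit into the downtime budget."""
    return remaining_pages <= max_pending_pages(bandwidth_bytes_per_s, downtime_ms)


# Lowest "remaining dirty pages" value per workload, taken from the logs.
lowest_remaining = {
    "booting": 2430,
    "idle": 1849,
    "kernel compilation": 7200,
}

print("budget in pages:", max_pending_pages(32 * 1024 * 1024, 300))
for name, pages in lowest_remaining.items():
    print(name, would_converge(pages))
```

Under these assumptions the budget is 2457 pages, which fits the observation that "booting" and "idle" converged while "kernel compilation" stayed above the budget.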

On the other hand, "idle" and "web server" would not benefit much, because
most of the time is spent on the 1st iteration and little on the others.

As for "zeusmp" and "memcached", although the time spent on every iteration
but the 1st might be halved, they still could not converge to stop-and-copy
with the 300 ms downtime.

--------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------

1. booting : begin to migrate when the VM is booting

Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
(note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into stop-copy phase now.
       "n" means "normal pages" and "d" means "duplicate (zero) pages".)

2. idle

Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849
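
To work with these per-iteration lines programmatically, a small parser is enough. This is a sketch under assumptions: the regular expression only targets the line format shown in this mail, the page size is assumed to be 4 KiB, and the function and field names are invented for illustration.

```python
import re

PAGE_SIZE = 4096  # bytes; assumed guest page size

LINE_RE = re.compile(
    r"Iteration\s+(?P<iteration>\d+)\s*,\s*duration:\s*(?P<ms>\d+)\s*ms\s*,"
    r"\s*transferred pages:\s*(?P<transferred>\d+)\s*"
    r"\(\s*n:\s*(?P<normal>\d+)\s*,\s*d:\s*(?P<dup>\d+)\s*\)\s*,"
    r"\s*new dirty pages:\s*(?P<new_dirty>\d+)\s*,"
    r"\s*remaining dirty pages:\s*(?P<remaining>\d+)"
)


def parse_iteration(line):
    """Parse one log line into integers and add two derived rates."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("unrecognized line: %r" % line)
    stats = {key: int(value) for key, value in match.groupdict().items()}
    seconds = stats["ms"] / 1000
    # Only normal pages carry a full page of data on the wire.
    stats["normal_mib_per_s"] = stats["normal"] * PAGE_SIZE / seconds / 2**20
    stats["dirty_pages_per_s"] = stats["new_dirty"] / seconds
    return stats


line = ("Iteration   1, duration:   6997 ms , transferred pages:   266450 "
        "(n:    57269, d:   209181 ) , new dirty pages:    56414 , "
        "remaining dirty pages:    56414")
print(parse_iteration(line))
```

For the first "booting" line, the derived throughput for normal pages is just under 32 MiB/s, in line with the default bandwidth limit stated above.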

3. kernel compilation (did not converge)

Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382

4. cpu2006.zeusmp (did not converge)

Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338

5. web server : An apache web server. The client is configured with 50 concurrent connections.

Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167

--------------------6 vcpu, 6 GB ram, max bandwidth (941.08 Mbps):------------------

6. memcached : 4 GB cache, memaslap: all write, concurrency = 5  (can not converge)

Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
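
As a rough cross-check of which workloads reach stop-copy, the sketch below estimates how many remaining dirty pages fit into the downtime budget. It is only an illustration under stated assumptions: 4096-byte pages, "32MB/s" read as 32 MiB/s, and the 300 ms default downtime mentioned in this thread. The helper name max_pages_within_downtime is hypothetical, and QEMU's own check is, as far as I understand it, based on the measured bandwidth rather than the configured limit, so the real threshold varies a little from run to run.

```python
PAGE_SIZE = 4096  # bytes per guest page (assumption)

def max_pages_within_downtime(bandwidth_bytes_per_s, downtime_ms, page_size=PAGE_SIZE):
    """Whole pages that can be sent within the allowed downtime (hypothetical helper)."""
    return bandwidth_bytes_per_s * downtime_ms // 1000 // page_size

# 32 MiB/s migration speed and 300 ms downtime, as used for the 1 GB VM runs.
threshold = max_pages_within_downtime(32 * 1024 * 1024, 300)

# "remaining dirty pages" from the last logged iteration of each 1 GB workload.
remaining = {
    "booting": 2430,
    "idle": 1849,
    "kernel compilation": 9382,
    "cpu2006.zeusmp": 124338,
    "web server": 2167,
}

for name, pages in remaining.items():
    verdict = "fits in the downtime budget" if pages <= threshold else "too many pages"
    print(f"{name}: {pages} remaining, {verdict}")
```

Under these assumptions the three workloads that converged end below the threshold, while kernel compilation and zeusmp stay above it.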


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-10-08  7:55       ` Chunguang Li
@ 2016-10-14 11:15         ` Dr. David Alan Gilbert
  2016-11-03  8:25           ` Chunguang Li
  0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-14 11:15 UTC (permalink / raw)
  To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> 
> 
> 
> > -----Original Messages-----
> > From: "Amit Shah" <amit.shah@redhat.com>
> > Sent Time: Friday, September 30, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > 
> > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > 
> > > 
> > > 
> > > > -----Original Messages-----
> > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > Sent Time: Monday, September 26, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > 
> > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > Hi all!
> > > > > I have some confusion about the dirty bitmap during migration. I have digged into the code. I figure out that every now and then during migration, the dirty bitmap will be grabbed from the kernel space through ioctl(KVM_GET_DIRTY_LOG), and then be used to update qemu's dirty bitmap. However I think this mechanism leads to resendness of some NON-dirty pages.
> > > > > 
> > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that during the migration setup, the ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When the pages "that haven't been sent" are written, the kernel space marks them as dirty. However I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark the pages which have already been sent during one iteration as dirty when they are written.
> > > > > 
> > > > > 
> > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > 
> > > > I think you're right that this can happen; to clarify I think the
> > > > case you're talking about is:
> > > > 
> > > >   Iteration 1
> > > >     sync bitmap
> > > >     start sending pages
> > > >     page 'n' is modified - but hasn't been sent yet
> > > >     page 'n' gets sent
> > > >   Iteration 2
> > > >     sync bitmap
> > > >        'page n is shown as modified'
> > > >     send page 'n' again
> > > >
> > > 
> > > Yes,this is right the case I am talking about.
> > >  
> > > > So you're right that is wasteful; I guess it's more wasteful
> > > > on big VMs with slow networks where the length of each iteration
> > > > is large.
> > > 
> > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 is not really dirty. This means the time of Iteration 2 is double of that to send only really dirty pages.
> > 
> > It makes sense, can you get some perf numbers to show what kinds of
> > workloads get impacted the most?  That would also help us to figure
> > out what kinds of speed improvements we can expect.
> > 
> > 
> > 		Amit
> 
> I have picked up 6 workloads and got the following statistics numbers 
> of every iteration (except the last stop-copy one) during precopy.
> These numbers are obtained with the basic precopy migration, without 
> the capabilities like xbzrle or compression, etc. The network for the 
> migration is exclusive, with a separate network for the workloads. 
> They are both gigabit ethernet. I use qemu-2.5.1.
> 
> Three (booting, idle, web server) of them converged to the stop-copy phase, 
> with the given bandwidth and default downtime (300ms), while the other
> three (kernel compilation, zeusmp, memcached) did not.
> 
> One page is "not-really-dirty", if it is written first and is sent later
> (and not written again after that) during one iteration. I guess this 
> would not happen so often during the other iterations as during the 1st 
> iteration. Because all the pages of the VM are sent to the dest node during 
> the 1st iteration, while during the others, only part of the pages are sent. 
> So I think the "not-really-dirty" pages should be produced mainly during 
> the 1st iteration , and maybe very little during the other iterations.
> 
> If we could avoid resending the "not-really-dirty" pages, intuitively, I
> think the time spent on Iteration 2 would be halved. This is a chain reaction,
> because the dirty pages produced during Iteration 2 is halved, which incurs
> that the time spent on Iteration 3 is halved, then Iteration 4, 5...

Yes; these numbers don't show how many of them are false dirty though.

One problem is thinking about pages that have been redirtied: if the page is dirtied
after the sync but before the network write, then it's the false-dirty that
you're describing.

However, if the page is being written a few times, so that it would also have been written
after the network write, then it isn't a false-dirty.
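
To make the distinction above concrete, here is a small simulation sketch. It is not taken from QEMU; it assumes a toy model in which pages are sent in order at a constant rate over one iteration and every page is written a fixed number of times at uniformly random moments. The names is_false_dirty and false_dirty_fraction are hypothetical.

```python
import random

def is_false_dirty(write_times, send_time):
    """A page flagged at the next sync is false dirty when all of its writes
    in this iteration happened before it was put on the wire."""
    return bool(write_times) and max(write_times) < send_time

def false_dirty_fraction(num_pages=100_000, writes_per_page=1, seed=0):
    """Fraction of dirty pages that are false dirty in the toy model."""
    rng = random.Random(seed)
    false_dirty = 0
    for page in range(num_pages):
        # Pages are sent in address order at a constant rate over [0, 1).
        send_time = (page + 0.5) / num_pages
        # Each page is written at uniformly random times within the iteration.
        writes = [rng.random() for _ in range(writes_per_page)]
        false_dirty += is_false_dirty(writes, send_time)
    return false_dirty / num_pages  # every page is dirty in this model

if __name__ == "__main__":
    for k in (1, 3):
        print(f"{k} write(s) per page: {false_dirty_fraction(writes_per_page=k):.3f}")
```

In this model the false-dirty share is about 1/(k+1) for k writes per page, so it is close to one half for pages written once and shrinks for pages that are written repeatedly, which is the redirtying case described above. Real workloads are not uniform, so this only indicates the direction of the effect.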

You might be able to figure that out with some kernel tracing of when the dirtying
happens, but it might be easier to write the fix!

Dave

> So I think "booting" and  "kernel compilation" should benefit a lot from this
> improvement. The reason of "kernel compilation" would benefit is that some 
> iterations take around 600ms, and if they are halved into 300ms, then the precopy
> may have the chance to step into stop and copy phase.
> 
> On the other hand, "idle" and "web server" would not benefit a lot, because
> most of the time are spent on the 1st iteration and little on the others.
> 
> As to the "zeusmp" and "memcached", although the time spent on the other iterations
> but the 1st one may be halved, they still could not converge to stop and copy 
> with the 300ms downtime.
> 
> --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> 
> 1. booting : begin to migrate when the VM is booting
> 
> Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
> Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
> Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
> Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
> Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
> (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into stop-copy phase now.
>        "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> 
> 2. idle
> 
> Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
> Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
> Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
> Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849
> 
> 3. kernel compilation (can not converge)
> 
> Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
> Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
> Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
> Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
> Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
> Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
> Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
> Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
> Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
> Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
> Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
> Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
> Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
> Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
> Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382
> 
> 4. cpu2006.zeusmp (can not converge)
> 
> Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
> Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
> Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
> Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
> Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
> Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
> Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
> Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
> Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
> Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
> Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
> Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
> Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
> Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
> Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338
> 
> 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> 
> Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
> Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
> Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
> Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
> Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
> Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167
> 
> --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 Mbps):------------------
> 
> 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5  (can not converge)
> 
> Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
> Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
> Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
> Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
> Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
> Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
> Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
> Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
> Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
> Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
> Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
> Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
> Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
> Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
> Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
> Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
> Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
> Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
> Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
> Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
> 
> 
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-10-14 11:15         ` Dr. David Alan Gilbert
@ 2016-11-03  8:25           ` Chunguang Li
  2016-11-03  9:59             ` Li, Liang Z
                               ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-03  8:25 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela




> -----Original Messages-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent Time: Friday, October 14, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > 
> > 
> > 
> > > -----Original Messages-----
> > > From: "Amit Shah" <amit.shah@redhat.com>
> > > Sent Time: Friday, September 30, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > 
> > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > 
> > > > 
> > > > 
> > > > > -----Original Messages-----
> > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > Sent Time: Monday, September 26, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > 
> > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > Hi all!
> > > > > > I have some confusion about the dirty bitmap during migration. I have digged into the code. I figure out that every now and then during migration, the dirty bitmap will be grabbed from the kernel space through ioctl(KVM_GET_DIRTY_LOG), and then be used to update qemu's dirty bitmap. However I think this mechanism leads to resendness of some NON-dirty pages.
> > > > > > 
> > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that during the migration setup, the ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When the pages "that haven't been sent" are written, the kernel space marks them as dirty. However I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark the pages which have already been sent during one iteration as dirty when they are written.
> > > > > > 
> > > > > > 
> > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > 
> > > > > I think you're right that this can happen; to clarify I think the
> > > > > case you're talking about is:
> > > > > 
> > > > >   Iteration 1
> > > > >     sync bitmap
> > > > >     start sending pages
> > > > >     page 'n' is modified - but hasn't been sent yet
> > > > >     page 'n' gets sent
> > > > >   Iteration 2
> > > > >     sync bitmap
> > > > >        'page n is shown as modified'
> > > > >     send page 'n' again
> > > > >
> > > > 
> > > > Yes,this is right the case I am talking about.
> > > >  
> > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > on big VMs with slow networks where the length of each iteration
> > > > > is large.
> > > > 
> > > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 is not really dirty. This means the time of Iteration 2 is double of that to send only really dirty pages.
> > > 
> > > It makes sense, can you get some perf numbers to show what kinds of
> > > workloads get impacted the most?  That would also help us to figure
> > > out what kinds of speed improvements we can expect.
> > > 
> > > 
> > > 		Amit
> > 
> > I have picked up 6 workloads and got the following statistics numbers 
> > of every iteration (except the last stop-copy one) during precopy.
> > These numbers are obtained with the basic precopy migration, without 
> > the capabilities like xbzrle or compression, etc. The network for the 
> > migration is exclusive, with a separate network for the workloads. 
> > They are both gigabit ethernet. I use qemu-2.5.1.
> > 
> > Three (booting, idle, web server) of them converged to the stop-copy phase, 
> > with the given bandwidth and default downtime (300ms), while the other
> > three (kernel compilation, zeusmp, memcached) did not.
> > 
> > One page is "not-really-dirty", if it is written first and is sent later
> > (and not written again after that) during one iteration. I guess this 
> > would not happen so often during the other iterations as during the 1st 
> > iteration. Because all the pages of the VM are sent to the dest node during 
> > the 1st iteration, while during the others, only part of the pages are sent. 
> > So I think the "not-really-dirty" pages should be produced mainly during 
> > the 1st iteration , and maybe very little during the other iterations.
> > 
> > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > think the time spent on Iteration 2 would be halved. This is a chain reaction,
> > because the dirty pages produced during Iteration 2 is halved, which incurs
> > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> 
> Yes; these numbers don't show how many of them are false dirty though.
> 
> One problem is thinking about pages that have been redirtied, if the page is dirtied
> after the sync but before the network write then it's the false-dirty that
> you're describing.
> 
> However, if the page is being written a few times, and so it would have been written
> after the network write then it isn't a false-dirty. 
> 
> You might be able to figure that out with some kernel tracing of when the dirtying
> happens, but it might be easier to write the fix!
> 
> Dave

Hi, I have made some new progress now.

To count exactly how many false dirty pages there are in each iteration, I malloc a
buffer as big as the whole VM memory. When a page is transferred to the dest node,
it is also copied to the buffer. During later iterations, whenever a page is
transferred again, it is compared to the old copy in the buffer; if it is really
dirty, the old copy is replaced for the next comparison. Thus, we are now able to
get the exact number of false dirty pages.
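
The counting scheme described above can be sketched roughly as follows. This is only an illustration, not the actual instrumentation: a Python dict keyed by page number stands in for the malloc'd buffer, and the class and method names (ShadowCopyCounter, on_send) are hypothetical.

```python
class ShadowCopyCounter:
    """Keeps the last transmitted content of every page and classifies resends."""

    def __init__(self):
        self.shadow = {}        # page number -> content as last sent
        self.false_dirty = 0    # resent with identical content
        self.really_dirty = 0   # resent with changed content

    def on_send(self, page_no, content):
        """Call whenever a page is about to be transferred."""
        old = self.shadow.get(page_no)
        if old is None:
            pass                          # first transmission, nothing to compare
        elif old == content:
            self.false_dirty += 1         # marked dirty, but content is unchanged
        else:
            self.really_dirty += 1        # content changed since the last send
        self.shadow[page_no] = bytes(content)  # keep a copy for the next comparison


counter = ShadowCopyCounter()
page_a, page_b = bytes(4096), b"\x01" * 4096
counter.on_send(0, page_a)   # iteration 1: first send of page 0
counter.on_send(0, page_a)   # iteration 2: resent unchanged
counter.on_send(0, page_b)   # iteration 3: resent after a real change
print(counter.false_dirty, counter.really_dirty)
```

The cost of this approach is a full second copy of guest memory, which is acceptable for measurement but not for production use.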

This time, I use 15 workloads to collect the statistics. They are:

  1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific 
     computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked
     these 11 benchmarks because, compared to the others, they have a bigger memory
     footprint and a higher memory dirty rate. Thus most of them could not converge
     to stop-and-copy using the default migration speed (32MB/s).
  2. kernel compilation
  3. idle VM 
  4. Apache web server which serves static content

  (the above workloads are all running in VM with 1 vcpu and 1GB memory, and the 
   migration speed is the default 32MB/s)

  5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB is used as the cache.
     After filling up the 4GB cache, a client writes to the cache at a constant speed
     during migration. This time, the migration speed is not limited, so it is bounded
     only by the capability of the 1Gbps Ethernet.

Summarize the results first: (and you can read the precise number below) 

  1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations)
     of false dirty pages out of all the dirty pages from iteration 2 onwards (and the big
     proportion persists in the following iterations). They are cpu2006.zeusmp,
     cpu2006.bzip2, cpu2006.mcf, and memcached.
  2. 2 workloads (idle, webserver) spend most of the migration time in iteration 1. Even
     though the proportion of false dirty pages is big from iteration 2 onwards, the room
     for optimization is small.
  3. 1 workload (kernel compilation) has a big proportion only during iteration 2, not
     in the other iterations.
  4. 8 workloads (the other 8 cpu2006 benchmarks) have a small proportion of false
     dirty pages from iteration 2 onwards, so the room for optimization is small.
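
For reference, the proportion quoted above can be recomputed from any iteration line
in the data further down. The sketch below assumes the proportion means
fd / (rd + fd), with "dup" pages left out as the legend of the data section says;
the function name and the regular expression are illustrative only.

```python
import re

# Matches the "(dup: ..., rd: ..., fd: ...)" part of one iteration line.
COUNTS = re.compile(r"dup:\s*(\d+), rd:\s*(\d+), fd:\s*(\d+)")


def false_dirty_ratio(line):
    """Return fd / (rd + fd) for one iteration line, or None if nothing was dirty."""
    m = COUNTS.search(line)
    if m is None:
        raise ValueError("not an iteration line with an fd field")
    _dup, rd, fd = (int(x) for x in m.groups())
    dirty = rd + fd
    return fd / dirty if dirty else None


# Iteration 2 of the original pre-copy memcached run, copied from the data below.
line = ("Iteration   2, duration:  17208 ms , transferred pages:   498946 "
        "(dup:     5456, rd:   160206, fd:   333284) , new dirty pages:   261237 , "
        "remaining dirty pages:   269688")
ratio = false_dirty_ratio(line)  # 333284 / 493490, a bit over two thirds
```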

Now I want to talk a little more about why false dirty pages are produced.
The first reason is what we have discussed before: the mechanism used to track the
dirty pages.
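
A toy model of this first reason, as a hedged sketch rather than QEMU's real code
path: pages are sent in index order, and every guest write during the iteration is
logged as dirty, even when the written page has not been sent yet in that iteration.
The function name and the `writes_during_send` parameter are hypothetical.

```python
def run_iteration(memory, dirty, writes_during_send):
    """Send every page whose dirty bit is set, in index order.

    writes_during_send maps "number of pages sent so far" to a list of
    (page, value) guest writes that happen at that moment.
    Returns the (page, content) pairs that were sent and the new dirty set.
    """
    to_send = sorted(dirty)
    new_dirty = set()
    sent = []
    for step, page in enumerate(to_send):
        for target, value in writes_during_send.get(step, []):
            memory[target] = value
            new_dirty.add(target)  # the tracker logs every write
        sent.append((page, memory[page]))
    return sent, new_dirty


memory = {0: "a", 1: "b", 2: "c"}
# Iteration 1: page 2 is written before it has been sent.
sent1, dirty1 = run_iteration(memory, {0, 1, 2}, {0: [(2, "C")]})
# Iteration 2: no writes at all, yet page 2 is sent again with the same content.
sent2, dirty2 = run_iteration(memory, dirty1, {})
```

Page 2 already went out with its new content "C" in iteration 1, so its second
transfer in iteration 2 is a false dirty page.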
I have also come up with another reason. Here is the situation: a write to a memory
page happens, but it doesn't change any content of the page. So it's "write but
not dirty", and the kernel still marks the page as dirty. A colleague in our lab has done
some experiments to figure out the proportion of "write but not dirty" operations, using
the cpu2006 benchmark suite. According to his results, most workloads have a small
proportion (<10%) of "write but not dirty" out of all the write operations, while a few
workloads have a higher proportion (one even as high as 50%). We are not yet sure why
"write but not dirty" happens; so far we have only observed it.
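
A tiny illustration of "write but not dirty", as a hypothetical model and not kernel
code: the tracker only observes that a write happened, not whether the stored bytes
changed.

```python
class TrackedPage:
    """A page with a write-triggered dirty bit (hypothetical model)."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.dirty = False

    def write(self, offset, value):
        self.data[offset] = value
        # Set on every write, whether or not the content changed.
        self.dirty = True


page = TrackedPage()
before = bytes(page.data)
page.write(10, 0)  # stores the value that is already there
unchanged = bytes(page.data) == before
```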

So these two reasons both contribute to the false dirty pages. To optimize, I compute and
store the SHA1 hash of each page before transferring it. Next time, if a page needs
retransmission, its SHA1 hash is computed again and compared to the old hash. If the hash is
the same, it's a false dirty page, and we just skip it. Otherwise, the page is transferred,
and the new hash replaces the old one for the next comparison.
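
The hash check described above could look roughly like this. It is a sketch using
Python's hashlib, not the actual patch; `Sha1PageFilter` and `should_send` are
made-up names.

```python
import hashlib


class Sha1PageFilter:
    """Remembers the SHA1 of each page as last sent (hypothetical helper)."""

    def __init__(self):
        # page index -> 20-byte SHA1 digest of the content last transferred
        self.last_hash = {}

    def should_send(self, index, data):
        """Return False for a false dirty page, True if the page must be sent."""
        digest = hashlib.sha1(data).digest()
        if self.last_hash.get(index) == digest:
            return False  # same hash as last time: skip the page
        self.last_hash[index] = digest  # new hash replaces the old one
        return True


page_filter = Sha1PageFilter()
content = b"\x01" * 4096
first = page_filter.should_send(0, content)           # never sent before
second = page_filter.should_send(0, content)          # unchanged content
third = page_filter.should_send(0, b"\x02" * 4096)    # changed content
```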
The reason to use the SHA1 hash instead of byte-by-byte comparison is the memory overhead.
One SHA1 hash is 20 bytes per 4096-byte page, so we need extra memory of only 20/4096
(<1/200) of the whole VM memory, which is relatively small.
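
As a worked example of that overhead, assuming 4096-byte pages and counting only the
digests themselves (no index or allocator overhead):

```python
SHA1_BYTES = 20
PAGE_BYTES = 4096
GIB = 1 << 30
MIB = 1 << 20


def hash_table_bytes(guest_bytes):
    """Bytes needed to keep one SHA1 digest per guest page."""
    pages = guest_bytes // PAGE_BYTES
    return pages * SHA1_BYTES


overhead_1g = hash_table_bytes(1 * GIB)  # 262144 pages * 20 bytes = 5 MiB
overhead_6g = hash_table_bytes(6 * GIB)  # 1572864 pages * 20 bytes = 30 MiB
```

A full shadow copy for byte-by-byte comparison would instead need as much memory as
the VM itself.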
As far as I know, SHA1 is widely used for deduplication in backup systems. That work
argues that the probability of a hash collision is far smaller than that of a disk hardware
fault, so in practice, if the hashes of two chunks are the same, the content can be treated
as the same. So I think the SHA1 hash could replace byte-by-byte comparison in the VM memory
scenario.

Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads which have
big proportions of false dirty pages, the improvement is remarkable. Without the optimization,
they either cannot converge to stop-and-copy, or take a very long time to complete. With the
SHA1 hash method, all of them now complete in a relatively short time.
For the reasons discussed above, the other workloads don't get notable improvements from the
optimization. So below, I show the numbers after optimization only for the 4 workloads with
remarkable improvements.

Any comments or suggestions? 


Below is the experiment data:
(
"dup" means zero pages; this kind of page takes very little migration time and network
resources, so they are never counted as dirty pages in my numbers;
"rd" means really dirty pages;
"fd" means false dirty pages;
The numbers are page counts.
)

------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------

1. memcached

-----  original pre-copy (can not converge):  -----
Iteration   1, duration:  42111 ms , transferred pages:  1568788 (dup:   416239, rd:  1152549, fd:        0) , new dirty pages:   499015 , remaining dirty pages:   507397
Iteration   2, duration:  17208 ms , transferred pages:   498946 (dup:     5456, rd:   160206, fd:   333284) , new dirty pages:   261237 , remaining dirty pages:   269688
Iteration   3, duration:   9134 ms , transferred pages:   262377 (dup:      519, rd:   111900, fd:   149958) , new dirty pages:   170281 , remaining dirty pages:   177592
Iteration   4, duration:   5920 ms , transferred pages:   169966 (dup:       87, rd:    82487, fd:    87392) , new dirty pages:   121154 , remaining dirty pages:   128780
Iteration   5, duration:   4239 ms , transferred pages:   121551 (dup:       81, rd:    64120, fd:    57350) , new dirty pages:   100976 , remaining dirty pages:   108205
Iteration   6, duration:   3495 ms , transferred pages:   100353 (dup:       90, rd:    56021, fd:    44242) , new dirty pages:    74547 , remaining dirty pages:    82399
Iteration   7, duration:   2583 ms , transferred pages:    74160 (dup:       56, rd:    38016, fd:    36088) , new dirty pages:    58209 , remaining dirty pages:    66448
Iteration   8, duration:   2039 ms , transferred pages:    58534 (dup:       81, rd:    26885, fd:    31568) , new dirty pages:    43511 , remaining dirty pages:    51425
Iteration   9, duration:   1513 ms , transferred pages:    43484 (dup:       55, rd:    26641, fd:    16788) , new dirty pages:    43722 , remaining dirty pages:    51663
Iteration  10, duration:   1521 ms , transferred pages:    43676 (dup:       62, rd:    26463, fd:    17151) , new dirty pages:    35347 , remaining dirty pages:    43334
Iteration  11, duration:   1230 ms , transferred pages:    35287 (dup:        0, rd:    21293, fd:    13994) , new dirty pages:    28851 , remaining dirty pages:    36898
Iteration  12, duration:   1031 ms , transferred pages:    29651 (dup:       82, rd:    18143, fd:    11426) , new dirty pages:    27062 , remaining dirty pages:    34309
Iteration  13, duration:    917 ms , transferred pages:    26385 (dup:       56, rd:    14149, fd:    12180) , new dirty pages:    22723 , remaining dirty pages:    30647
Iteration  14, duration:    762 ms , transferred pages:    21902 (dup:       55, rd:    16355, fd:     5492) , new dirty pages:    18208 , remaining dirty pages:    26953
Iteration  15, duration:    650 ms , transferred pages:    18636 (dup:        0, rd:    11943, fd:     6693) , new dirty pages:    16085 , remaining dirty pages:    24402
Iteration  16, duration:    554 ms , transferred pages:    15946 (dup:       56, rd:     9527, fd:     6363) , new dirty pages:    14766 , remaining dirty pages:    23222
Iteration  17, duration:    538 ms , transferred pages:    15434 (dup:        0, rd:     9779, fd:     5655) , new dirty pages:    13381 , remaining dirty pages:    21169
Iteration  18, duration:    487 ms , transferred pages:    14089 (dup:       81, rd:     7737, fd:     6271) , new dirty pages:    13325 , remaining dirty pages:    20405
Iteration  19, duration:    428 ms , transferred pages:    12232 (dup:        0, rd:     8488, fd:     3744) , new dirty pages:    10274 , remaining dirty pages:    18447
Iteration  20, duration:    377 ms , transferred pages:    10887 (dup:       56, rd:     6362, fd:     4469) , new dirty pages:     9708 , remaining dirty pages:    17268
Iteration  21, duration:    320 ms , transferred pages:     9222 (dup:        0, rd:     5789, fd:     3433) , new dirty pages:     8015 , remaining dirty pages:    16061
Iteration  22, duration:    268 ms , transferred pages:     7621 (dup:        0, rd:     6204, fd:     1417) , new dirty pages:     7227 , remaining dirty pages:    15667
Iteration  23, duration:    269 ms , transferred pages:     7813 (dup:       56, rd:     4410, fd:     3347) , new dirty pages:     7591 , remaining dirty pages:    15445
Iteration  24, duration:    271 ms , transferred pages:     7749 (dup:        0, rd:     4565, fd:     3184) , new dirty pages:    15126 , remaining dirty pages:    22822
Iteration  25, duration:    549 ms , transferred pages:    15818 (dup:       60, rd:    10545, fd:     5213) , new dirty pages:    14559 , remaining dirty pages:    21563
Iteration  26, duration:    499 ms , transferred pages:    14281 (dup:        3, rd:     8760, fd:     5518) , new dirty pages:    11947 , remaining dirty pages:    19229
Iteration  27, duration:    376 ms , transferred pages:    10823 (dup:       25, rd:     6550, fd:     4248) , new dirty pages:     8561 , remaining dirty pages:    16967
Iteration  28, duration:    324 ms , transferred pages:     9350 (dup:       31, rd:     5292, fd:     4027) , new dirty pages:     8655 , remaining dirty pages:    16272
Iteration  29, duration:    274 ms , transferred pages:     7813 (dup:        0, rd:     6088, fd:     1725) , new dirty pages:     6300 , remaining dirty pages:    14759
Iteration  30, duration:    218 ms , transferred pages:     6340 (dup:       45, rd:     3196, fd:     3099) , new dirty pages:     5143 , remaining dirty pages:    13562

-----  after optimization:  -----
Iteration   1, duration:  40664 ms , transferred pages:  1569037 (dup:   405940, rd:  1163097) , new dirty pages:   506846 , remaining dirty pages:   514979
Iteration   2, duration:   8032 ms , transferred pages:   161130 (dup:     4007, rd:   157123) , new dirty pages:   153479 , remaining dirty pages:   153479
Iteration   3, duration:   2620 ms , transferred pages:    65260 (dup:       20, rd:    65240) , new dirty pages:    64014 , remaining dirty pages:    67100
Iteration   4, duration:   1160 ms , transferred pages:    30227 (dup:       60, rd:    30167) , new dirty pages:    34031 , remaining dirty pages:    41414
Iteration   5, duration:    648 ms , transferred pages:    18700 (dup:       56, rd:    18644) , new dirty pages:    18375 , remaining dirty pages:    25536
Iteration   6, duration:    389 ms , transferred pages:    11399 (dup:       55, rd:    11344) , new dirty pages:    12536 , remaining dirty pages:    17516
Iteration   7, duration:    292 ms , transferred pages:     8197 (dup:        0, rd:     8197) , new dirty pages:     8387 , remaining dirty pages:    16802
Iteration   8, duration:    171 ms , transferred pages:     4931 (dup:       39, rd:     4892) , new dirty pages:     6182 , remaining dirty pages:    14060
Iteration   9, duration:    163 ms , transferred pages:     4355 (dup:       16, rd:     4339) , new dirty pages:     5530 , remaining dirty pages:    11973
Iteration  10, duration:    104 ms , transferred pages:     3266 (dup:        0, rd:     3266) , new dirty pages:     2893 , remaining dirty pages:    11014
Iteration  11, duration:     52 ms , transferred pages:     1153 (dup:        0, rd:     1153) , new dirty pages:     1586 , remaining dirty pages:    10516
Iteration  12, duration:     52 ms , transferred pages:     1921 (dup:       39, rd:     1882) , new dirty pages:     1619 , remaining dirty pages:     8842
Iteration  13, duration:     62 ms , transferred pages:     1537 (dup:        0, rd:     1537) , new dirty pages:     2052 , remaining dirty pages:     8871
Iteration  14, duration:     58 ms , transferred pages:     1665 (dup:        0, rd:     1665) , new dirty pages:     1947 , remaining dirty pages:     7989
Iteration  15, duration:      2 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     7989
total time: 54693 milliseconds

2. cpu2006.zeusmp

-----  original pre-copy (can not converge):  -----
Iteration   1, duration:  21112 ms , transferred pages:   266450 (dup:    93385, rd:   173065, fd:        0) , new dirty pages:   127866 , remaining dirty pages:   127866
Iteration   2, duration:   6192 ms , transferred pages:   125662 (dup:    75762, rd:    17389, fd:    32511) , new dirty pages:   131655 , remaining dirty pages:   133859
Iteration   3, duration:   6699 ms , transferred pages:   131937 (dup:    77298, rd:    20320, fd:    34319) , new dirty pages:   121027 , remaining dirty pages:   122949
Iteration   4, duration:   5999 ms , transferred pages:   122512 (dup:    73588, rd:    17236, fd:    31688) , new dirty pages:   122759 , remaining dirty pages:   123196
Iteration   5, duration:   5804 ms , transferred pages:   122717 (dup:    75436, rd:    19016, fd:    28265) , new dirty pages:   123697 , remaining dirty pages:   124176
Iteration   6, duration:   5698 ms , transferred pages:   123708 (dup:    77249, rd:    18022, fd:    28437) , new dirty pages:   121838 , remaining dirty pages:   122306
Iteration   7, duration:   5515 ms , transferred pages:   122306 (dup:    76727, rd:    14819, fd:    30760) , new dirty pages:   122382 , remaining dirty pages:   122382
Iteration   8, duration:   6086 ms , transferred pages:   120825 (dup:    71834, rd:    15987, fd:    33004) , new dirty pages:   121587 , remaining dirty pages:   123144
Iteration   9, duration:   5899 ms , transferred pages:   120964 (dup:    72860, rd:    18191, fd:    29913) , new dirty pages:   120391 , remaining dirty pages:   122571
Iteration  10, duration:   5801 ms , transferred pages:   121425 (dup:    74140, rd:    20722, fd:    26563) , new dirty pages:   122302 , remaining dirty pages:   123448
Iteration  11, duration:   5909 ms , transferred pages:   123448 (dup:    74735, rd:    19678, fd:    29035) , new dirty pages:   123258 , remaining dirty pages:   123258
Iteration  12, duration:   6293 ms , transferred pages:   121211 (dup:    70442, rd:    18128, fd:    32641) , new dirty pages:   123623 , remaining dirty pages:   125670
Iteration  13, duration:   6398 ms , transferred pages:   124897 (dup:    72701, rd:    21134, fd:    31062) , new dirty pages:   122355 , remaining dirty pages:   123128
Iteration  14, duration:   6301 ms , transferred pages:   121893 (dup:    70514, rd:    23470, fd:    27909) , new dirty pages:   120980 , remaining dirty pages:   122215
Iteration  15, duration:   6304 ms , transferred pages:   121389 (dup:    70005, rd:    21731, fd:    29653) , new dirty pages:   121628 , remaining dirty pages:   122454
Iteration  16, duration:   6398 ms , transferred pages:   122164 (dup:    69962, rd:    24376, fd:    27826) , new dirty pages:   122246 , remaining dirty pages:   122536
Iteration  17, duration:   6201 ms , transferred pages:   121548 (dup:    70984, rd:    23915, fd:    26649) , new dirty pages:   121460 , remaining dirty pages:   122448
Iteration  18, duration:   6401 ms , transferred pages:   122272 (dup:    70072, rd:    22261, fd:    29939) , new dirty pages:   123518 , remaining dirty pages:   123694
Iteration  19, duration:   7003 ms , transferred pages:   121873 (dup:    64754, rd:    27325, fd:    29794) , new dirty pages:   120568 , remaining dirty pages:   122389
Iteration  20, duration:   6400 ms , transferred pages:   121422 (dup:    69221, rd:    25300, fd:    26901) , new dirty pages:   121229 , remaining dirty pages:   122196
Iteration  21, duration:   6703 ms , transferred pages:   119895 (dup:    65232, rd:    25877, fd:    28786) , new dirty pages:   123284 , remaining dirty pages:   125585
Iteration  22, duration:   6902 ms , transferred pages:   123884 (dup:    67582, rd:    29020, fd:    27282) , new dirty pages:   122057 , remaining dirty pages:   123758
Iteration  23, duration:   6800 ms , transferred pages:   122010 (dup:    66529, rd:    30644, fd:    24837) , new dirty pages:   120916 , remaining dirty pages:   122664
Iteration  24, duration:   7202 ms , transferred pages:   121951 (dup:    63188, rd:    31105, fd:    27658) , new dirty pages:   122715 , remaining dirty pages:   123428
Iteration  25, duration:   7202 ms , transferred pages:   122919 (dup:    64161, rd:    32063, fd:    26695) , new dirty pages:   123180 , remaining dirty pages:   123689
Iteration  26, duration:   7404 ms , transferred pages:   123092 (dup:    62694, rd:    33459, fd:    26939) , new dirty pages:   122149 , remaining dirty pages:   122746
Iteration  27, duration:   7205 ms , transferred pages:   120427 (dup:    61664, rd:    34344, fd:    24419) , new dirty pages:   120299 , remaining dirty pages:   122618
Iteration  28, duration:   7100 ms , transferred pages:   121074 (dup:    63130, rd:    32403, fd:    25541) , new dirty pages:   122984 , remaining dirty pages:   124528
Iteration  29, duration:   7904 ms , transferred pages:   124060 (dup:    59564, rd:    35631, fd:    28865) , new dirty pages:   127080 , remaining dirty pages:   127548
Iteration  30, duration:   7906 ms , transferred pages:   127518 (dup:    63029, rd:    34416, fd:    30073) , new dirty pages:   125028 , remaining dirty pages:   125058

-----  after optimization:  -----
Iteration   1, duration:  21601 ms , transferred pages:   266450 (dup:    89731, rd:   176719) , new dirty pages:   139843 , remaining dirty pages:   139843
Iteration   2, duration:   1747 ms , transferred pages:    92077 (dup:    78364, rd:    13713) , new dirty pages:    90945 , remaining dirty pages:    90945
Iteration   3, duration:   1592 ms , transferred pages:    62253 (dup:    49435, rd:    12818) , new dirty pages:    76929 , remaining dirty pages:    76929
Iteration   4, duration:    992 ms , transferred pages:    44837 (dup:    37886, rd:     6951) , new dirty pages:    71331 , remaining dirty pages:    72916
Iteration   5, duration:    998 ms , transferred pages:    55229 (dup:    47150, rd:     8079) , new dirty pages:    21703 , remaining dirty pages:    23302
Iteration   6, duration:    211 ms , transferred pages:    20337 (dup:    18516, rd:     1821) , new dirty pages:    14500 , remaining dirty pages:    14500
Iteration   7, duration:     31 ms , transferred pages:    12933 (dup:    12627, rd:      306) , new dirty pages:     1520 , remaining dirty pages:     1520
Iteration   8, duration:     30 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        4 , remaining dirty pages:     1524
total time: 27225 milliseconds

3. cpu2006.bzip2

-----  original pre-copy:  -----
Iteration   1, duration:  18306 ms , transferred pages:   266450 (dup:   116569, rd:   149881, fd:        0) , new dirty pages:   106299 , remaining dirty pages:   106299
Iteration   2, duration:  10694 ms , transferred pages:   104611 (dup:    17550, rd:    10536, fd:    76525) , new dirty pages:    34394 , remaining dirty pages:    36082
Iteration   3, duration:   2998 ms , transferred pages:    34442 (dup:     9924, rd:    12254, fd:    12264) , new dirty pages:     6419 , remaining dirty pages:     8059
Iteration   4, duration:    699 ms , transferred pages:     5748 (dup:       22, rd:     2583, fd:     3143) , new dirty pages:     1226 , remaining dirty pages:     3537
Iteration   5, duration:    200 ms , transferred pages:     1636 (dup:        0, rd:     1194, fd:      442) , new dirty pages:      478 , remaining dirty pages:     2379
Iteration   6, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2379

-----  after optimization:  -----
Iteration   1, duration:  13995 ms , transferred pages:   266314 (dup:   152118, rd:   114196) , new dirty pages:    97009 , remaining dirty pages:    97145
Iteration   2, duration:   1215 ms , transferred pages:    33400 (dup:    26745, rd:     6655) , new dirty pages:    12866 , remaining dirty pages:    14017
Iteration   3, duration:    701 ms , transferred pages:     5774 (dup:       48, rd:     5726) , new dirty pages:     6342 , remaining dirty pages:     8761
Iteration   4, duration:    500 ms , transferred pages:     4111 (dup:       21, rd:     4090) , new dirty pages:     4311 , remaining dirty pages:     6485
Iteration   5, duration:    400 ms , transferred pages:     3273 (dup:        1, rd:     3272) , new dirty pages:     3034 , remaining dirty pages:     5431
Iteration   6, duration:    301 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2094 , remaining dirty pages:     4472
Iteration   7, duration:    299 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2066 , remaining dirty pages:     4082
Iteration   8, duration:    202 ms , transferred pages:     1636 (dup:        0, rd:     1636) , new dirty pages:     2881 , remaining dirty pages:     4648
Iteration   9, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     4775 , remaining dirty pages:     6778
Iteration  10, duration:    400 ms , transferred pages:     3281 (dup:        9, rd:     3272) , new dirty pages:     3757 , remaining dirty pages:     5576
Iteration  11, duration:    401 ms , transferred pages:     3279 (dup:        7, rd:     3272) , new dirty pages:     6980 , remaining dirty pages:     8906
Iteration  12, duration:    500 ms , transferred pages:     7118 (dup:     3035, rd:     4083) , new dirty pages:    10774 , remaining dirty pages:    11922
Iteration  13, duration:    116 ms , transferred pages:    11706 (dup:    10152, rd:     1554) , new dirty pages:     1326 , remaining dirty pages:     1326
Iteration  14, duration:    117 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     1326
total time: 19479 milliseconds

4. cpu2006.mcf

-----  original pre-copy:  -----
Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6925, rd:   259525, fd:        0) , new dirty pages:   244403 , remaining dirty pages:   244403
Iteration   2, duration:  29603 ms , transferred pages:   242275 (dup:      377, rd:   224001, fd:    17897) , new dirty pages:   227335 , remaining dirty pages:   229463
Iteration   3, duration:  27806 ms , transferred pages:   227573 (dup:      169, rd:    65681, fd:   161723) , new dirty pages:   195593 , remaining dirty pages:   197483
Iteration   4, duration:  23907 ms , transferred pages:   195543 (dup:       41, rd:    39838, fd:   155664) , new dirty pages:   215066 , remaining dirty pages:   217006
Iteration   5, duration:  26305 ms , transferred pages:   215289 (dup:      155, rd:    33082, fd:   182052) , new dirty pages:   111098 , remaining dirty pages:   112815
Iteration   6, duration:  13502 ms , transferred pages:   110452 (dup:       22, rd:    26793, fd:    83637) , new dirty pages:   161054 , remaining dirty pages:   163417
Iteration   7, duration:  19705 ms , transferred pages:   161266 (dup:      120, rd:    33818, fd:   127328) , new dirty pages:   220562 , remaining dirty pages:   222713
Iteration   8, duration:  27003 ms , transferred pages:   220881 (dup:       21, rd:   215721, fd:     5139) , new dirty pages:   219787 , remaining dirty pages:   221619
Iteration   9, duration:  26802 ms , transferred pages:   219248 (dup:       24, rd:    84648, fd:   134576) , new dirty pages:   207959 , remaining dirty pages:   210330
Iteration  10, duration:  25411 ms , transferred pages:   207916 (dup:      144, rd:    35842, fd:   171930) , new dirty pages:   144442 , remaining dirty pages:   146856
Iteration  11, duration:  17714 ms , transferred pages:   144804 (dup:       18, rd:    25414, fd:   119372) , new dirty pages:   205127 , remaining dirty pages:   207179
Iteration  12, duration:  25112 ms , transferred pages:   205446 (dup:      128, rd:    23197, fd:   182121) , new dirty pages:   167319 , remaining dirty pages:   169052
Iteration  13, duration:  20411 ms , transferred pages:   166886 (dup:       14, rd:    21960, fd:   144912) , new dirty pages:   221592 , remaining dirty pages:   223758
Iteration  14, duration:  27126 ms , transferred pages:   221800 (dup:      122, rd:    42368, fd:   179310) , new dirty pages:   233630 , remaining dirty pages:   235588
Iteration  15, duration:  28517 ms , transferred pages:   233321 (dup:      191, rd:   222528, fd:    10602) , new dirty pages:   224282 , remaining dirty pages:   226549
Iteration  16, duration:  27422 ms , transferred pages:   224187 (dup:       55, rd:    45773, fd:   178359) , new dirty pages:   209815 , remaining dirty pages:   212177
Iteration  17, duration:  25723 ms , transferred pages:   210260 (dup:       34, rd:    79405, fd:   130821) , new dirty pages:   220297 , remaining dirty pages:   222214
Iteration  18, duration:  26920 ms , transferred pages:   220056 (dup:       14, rd:   214128, fd:     5914) , new dirty pages:   192015 , remaining dirty pages:   194173
Iteration  19, duration:  23520 ms , transferred pages:   192239 (dup:        9, rd:    25140, fd:   167090) , new dirty pages:    96450 , remaining dirty pages:    98384
Iteration  20, duration:  11805 ms , transferred pages:    96538 (dup:       14, rd:     7424, fd:    89100) , new dirty pages:     6978 , remaining dirty pages:     8824
Iteration  21, duration:    799 ms , transferred pages:     6545 (dup:        1, rd:     1802, fd:     4742) , new dirty pages:      138 , remaining dirty pages:     2417
Iteration  22, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2417

-----  after optimization:  -----
Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6831, rd:   259619) , new dirty pages:   240209 , remaining dirty pages:   240209
Iteration   2, duration:   6250 ms , transferred pages:    51244 (dup:      211, rd:    51033) , new dirty pages:   226651 , remaining dirty pages:   228571
Iteration   3, duration:   4395 ms , transferred pages:    36008 (dup:       80, rd:    35928) , new dirty pages:   110719 , remaining dirty pages:   111478
Iteration   4, duration:   3390 ms , transferred pages:    28068 (dup:       28, rd:    28040) , new dirty pages:   185172 , remaining dirty pages:   185172
Iteration   5, duration:   2986 ms , transferred pages:    23780 (dup:       45, rd:    23735) , new dirty pages:    64357 , remaining dirty pages:    66305
Iteration   6, duration:   2727 ms , transferred pages:    22800 (dup:       12, rd:    22788) , new dirty pages:    61675 , remaining dirty pages:    61675
Iteration   7, duration:   2372 ms , transferred pages:    18943 (dup:       13, rd:    18930) , new dirty pages:    55144 , remaining dirty pages:    55265
Iteration   8, duration:   2100 ms , transferred pages:    17189 (dup:       11, rd:    17178) , new dirty pages:    55244 , remaining dirty pages:    55668
Iteration   9, duration:   2003 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:   107058 , remaining dirty pages:   108014
Iteration  10, duration:   2132 ms , transferred pages:    17825 (dup:       24, rd:    17801) , new dirty pages:   126214 , remaining dirty pages:   126214
Iteration  11, duration:   2229 ms , transferred pages:    18156 (dup:       22, rd:    18134) , new dirty pages:    65725 , remaining dirty pages:    65725
Iteration  12, duration:   2315 ms , transferred pages:    18651 (dup:       21, rd:    18630) , new dirty pages:    52575 , remaining dirty pages:    53903
Iteration  13, duration:   2147 ms , transferred pages:    17435 (dup:       16, rd:    17419) , new dirty pages:    46652 , remaining dirty pages:    47260
Iteration  14, duration:   2000 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:    42721 , remaining dirty pages:    43266
Iteration  15, duration:   1901 ms , transferred pages:    15552 (dup:       10, rd:    15542) , new dirty pages:    38593 , remaining dirty pages:    40792
Iteration  16, duration:   1801 ms , transferred pages:    14735 (dup:       11, rd:    14724) , new dirty pages:    54252 , remaining dirty pages:    55639
Iteration  17, duration:   1708 ms , transferred pages:    13860 (dup:        2, rd:    13858) , new dirty pages:    72379 , remaining dirty pages:    74170
Iteration  18, duration:   1923 ms , transferred pages:    15442 (dup:       12, rd:    15430) , new dirty pages:   101911 , remaining dirty pages:   103547
Iteration  19, duration:   2311 ms , transferred pages:    18823 (dup:        9, rd:    18814) , new dirty pages:    80534 , remaining dirty pages:    82521
Iteration  20, duration:   2081 ms , transferred pages:    17156 (dup:       34, rd:    17122) , new dirty pages:    36054 , remaining dirty pages:    36054
Iteration  21, duration:   1665 ms , transferred pages:    13777 (dup:       10, rd:    13767) , new dirty pages:    29624 , remaining dirty pages:    29624
Iteration  22, duration:   1657 ms , transferred pages:    13290 (dup:        7, rd:    13283) , new dirty pages:    25949 , remaining dirty pages:    28265
Iteration  23, duration:   1599 ms , transferred pages:    13088 (dup:        0, rd:    13088) , new dirty pages:    22356 , remaining dirty pages:    24813
Iteration  24, duration:   1500 ms , transferred pages:    12280 (dup:       10, rd:    12270) , new dirty pages:    21181 , remaining dirty pages:    22608
Iteration  25, duration:   1400 ms , transferred pages:    11457 (dup:        5, rd:    11452) , new dirty pages:    18657 , remaining dirty pages:    20311
Iteration  26, duration:   1200 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    15690 , remaining dirty pages:    17294
Iteration  27, duration:   1201 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    14810 , remaining dirty pages:    15936
Iteration  28, duration:   1000 ms , transferred pages:     8183 (dup:        3, rd:     8180) , new dirty pages:    15387 , remaining dirty pages:    16423
Iteration  29, duration:    900 ms , transferred pages:     7372 (dup:       10, rd:     7362) , new dirty pages:    13303 , remaining dirty pages:    15292
Iteration  30, duration:   1000 ms , transferred pages:     8181 (dup:        1, rd:     8180) , new dirty pages:    17879 , remaining dirty pages:    18457
Iteration  31, duration:    951 ms , transferred pages:     8140 (dup:        9, rd:     8131) , new dirty pages:    21738 , remaining dirty pages:    23304
Iteration  32, duration:    946 ms , transferred pages:     6946 (dup:        1, rd:     6945) , new dirty pages:    15815 , remaining dirty pages:    15815
Iteration  33, duration:    747 ms , transferred pages:     6192 (dup:        0, rd:     6192) , new dirty pages:     6249 , remaining dirty pages:     7670
Iteration  34, duration:    501 ms , transferred pages:     4090 (dup:        0, rd:     4090) , new dirty pages:     6163 , remaining dirty pages:     8422
Iteration  35, duration:    600 ms , transferred pages:     4910 (dup:        2, rd:     4908) , new dirty pages:     3673 , remaining dirty pages:     5222
Iteration  36, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2132 , remaining dirty pages:     4337
Iteration  37, duration:    200 ms , transferred pages:     1637 (dup:        1, rd:     1636) , new dirty pages:      544 , remaining dirty pages:     2251
Iteration  38, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     2251
total time: 97919 milliseconds

------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)-------------------

5. idle

Iteration   1, duration:  14702 ms , transferred pages:   266450 (dup:   146393, rd:   120057, fd:        0) , new dirty pages:    14595 , remaining dirty pages:    14595
Iteration   2, duration:   1592 ms , transferred pages:    12412 (dup:      103, rd:     3280, fd:     9029) , new dirty pages:      218 , remaining dirty pages:     2401
Iteration   3, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2401

6. kernel compilation (can not converge)

Iteration   1, duration:  20607 ms , transferred pages:   266450 (dup:    97552, rd:   168898, fd:        0) , new dirty pages:    19293 , remaining dirty pages:    19293
Iteration   2, duration:   2092 ms , transferred pages:    17176 (dup:      597, rd:     8625, fd:     7954) , new dirty pages:     8318 , remaining dirty pages:    10435
Iteration   3, duration:   1000 ms , transferred pages:     8484 (dup:      304, rd:     6256, fd:     1924) , new dirty pages:     8736 , remaining dirty pages:    10687
Iteration   4, duration:   1000 ms , transferred pages:     8435 (dup:      255, rd:     7089, fd:     1091) , new dirty pages:     7627 , remaining dirty pages:     9879
Iteration   5, duration:    900 ms , transferred pages:     7553 (dup:      191, rd:     5602, fd:     1760) , new dirty pages:     7287 , remaining dirty pages:     9613
Iteration   6, duration:    900 ms , transferred pages:     7620 (dup:      258, rd:     5761, fd:     1601) , new dirty pages:     8958 , remaining dirty pages:    10951
Iteration   7, duration:   1099 ms , transferred pages:     9309 (dup:      311, rd:     8051, fd:      947) , new dirty pages:     7189 , remaining dirty pages:     8831
Iteration   8, duration:    800 ms , transferred pages:     6832 (dup:      288, rd:     5717, fd:      827) , new dirty pages:     5782 , remaining dirty pages:     7781
Iteration   9, duration:    701 ms , transferred pages:     5875 (dup:      149, rd:     4005, fd:     1721) , new dirty pages:     4587 , remaining dirty pages:     6493
Iteration  10, duration:    500 ms , transferred pages:     4234 (dup:      144, rd:     3057, fd:     1033) , new dirty pages:     7352 , remaining dirty pages:     9611
Iteration  11, duration:    900 ms , transferred pages:     7759 (dup:      397, rd:     6563, fd:      799) , new dirty pages:     6686 , remaining dirty pages:     8538
Iteration  12, duration:    800 ms , transferred pages:     6808 (dup:      264, rd:     6017, fd:      527) , new dirty pages:     6871 , remaining dirty pages:     8601
Iteration  13, duration:    800 ms , transferred pages:     6775 (dup:      231, rd:     5722, fd:      822) , new dirty pages:     7540 , remaining dirty pages:     9366
Iteration  14, duration:    900 ms , transferred pages:     7507 (dup:      145, rd:     5900, fd:     1462) , new dirty pages:     7581 , remaining dirty pages:     9440
Iteration  15, duration:    900 ms , transferred pages:     7630 (dup:      268, rd:     6211, fd:     1151) , new dirty pages:     7268 , remaining dirty pages:     9078
Iteration  16, duration:    800 ms , transferred pages:     6759 (dup:      215, rd:     5763, fd:      781) , new dirty pages:     6861 , remaining dirty pages:     9180
Iteration  17, duration:    800 ms , transferred pages:     6838 (dup:      294, rd:     6037, fd:      507) , new dirty pages:     6196 , remaining dirty pages:     8538
Iteration  18, duration:    800 ms , transferred pages:     6852 (dup:      308, rd:     4905, fd:     1639) , new dirty pages:     5947 , remaining dirty pages:     7633
Iteration  19, duration:    700 ms , transferred pages:     5919 (dup:      193, rd:     4853, fd:      873) , new dirty pages:     5861 , remaining dirty pages:     7575
Iteration  20, duration:    600 ms , transferred pages:     5284 (dup:      376, rd:     4408, fd:      500) , new dirty pages:     5206 , remaining dirty pages:     7497
Iteration  21, duration:    600 ms , transferred pages:     5147 (dup:      239, rd:     4308, fd:      600) , new dirty pages:     5031 , remaining dirty pages:     7381
Iteration  22, duration:    599 ms , transferred pages:     5064 (dup:      156, rd:     4026, fd:      882) , new dirty pages:     5601 , remaining dirty pages:     7918
Iteration  23, duration:    702 ms , transferred pages:     5965 (dup:      239, rd:     5028, fd:      698) , new dirty pages:     6079 , remaining dirty pages:     8032
Iteration  24, duration:    700 ms , transferred pages:     6175 (dup:      449, rd:     5146, fd:      580) , new dirty pages:    10932 , remaining dirty pages:    12789
Iteration  25, duration:   1300 ms , transferred pages:    10936 (dup:      302, rd:     6205, fd:     4429) , new dirty pages:     8713 , remaining dirty pages:    10566
Iteration  26, duration:   1000 ms , transferred pages:     8282 (dup:      102, rd:     5662, fd:     2518) , new dirty pages:     5119 , remaining dirty pages:     7403
Iteration  27, duration:    600 ms , transferred pages:     5007 (dup:       99, rd:     4099, fd:      809) , new dirty pages:     2226 , remaining dirty pages:     4622
Iteration  28, duration:    300 ms , transferred pages:     2491 (dup:       37, rd:     1794, fd:      660) , new dirty pages:     6746 , remaining dirty pages:     8877
Iteration  29, duration:    800 ms , transferred pages:     6757 (dup:      213, rd:     5532, fd:     1012) , new dirty pages:     6070 , remaining dirty pages:     8190
Iteration  30, duration:    700 ms , transferred pages:     6052 (dup:      326, rd:     5107, fd:      619) , new dirty pages:     5177 , remaining dirty pages:     7315

7. web server

Iteration   1, duration:  20902 ms , transferred pages:   266450 (dup:    95497, rd:   170953, fd:        0) , new dirty pages:     8528 , remaining dirty pages:     8528
Iteration   2, duration:    796 ms , transferred pages:     6472 (dup:      131, rd:     1885, fd:     4456) , new dirty pages:      650 , remaining dirty pages:     2706
Iteration   3, duration:    100 ms , transferred pages:      818 (dup:        0, rd:      383, fd:      435) , new dirty pages:      328 , remaining dirty pages:     2216
Iteration   4, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2216


8. cpu2006.bwaves (can not converge)

Iteration   1, duration:  31715 ms , transferred pages:   266450 (dup:     6766, rd:   259684, fd:        0) , new dirty pages:   242702 , remaining dirty pages:   242702
Iteration   2, duration:  29397 ms , transferred pages:   240508 (dup:      405, rd:   225588, fd:    14515) , new dirty pages:   230889 , remaining dirty pages:   233083
Iteration   3, duration:  28205 ms , transferred pages:   230858 (dup:      182, rd:   214596, fd:    16080) , new dirty pages:   226998 , remaining dirty pages:   229223
Iteration   4, duration:  27805 ms , transferred pages:   227574 (dup:      170, rd:   217045, fd:    10359) , new dirty pages:   227360 , remaining dirty pages:   229009
Iteration   5, duration:  27703 ms , transferred pages:   226786 (dup:      200, rd:   212130, fd:    14456) , new dirty pages:   225885 , remaining dirty pages:   228108
Iteration   6, duration:  27600 ms , transferred pages:   225923 (dup:      155, rd:   215503, fd:    10265) , new dirty pages:   223555 , remaining dirty pages:   225740
Iteration   7, duration:  27309 ms , transferred pages:   223574 (dup:      260, rd:   215641, fd:     7673) , new dirty pages:   231975 , remaining dirty pages:   234141
Iteration   8, duration:  28403 ms , transferred pages:   232397 (dup:       85, rd:   214086, fd:    18226) , new dirty pages:   222170 , remaining dirty pages:   223914
Iteration   9, duration:  27105 ms , transferred pages:   221809 (dup:      131, rd:   214988, fd:     6690) , new dirty pages:   230065 , remaining dirty pages:   232170
Iteration  10, duration:  28104 ms , transferred pages:   230201 (dup:      343, rd:   213531, fd:    16327) , new dirty pages:   227590 , remaining dirty pages:   229559
Iteration  11, duration:  27801 ms , transferred pages:   227717 (dup:      313, rd:   221408, fd:     5996) , new dirty pages:   228457 , remaining dirty pages:   230299
Iteration  12, duration:  27916 ms , transferred pages:   228560 (dup:      338, rd:   219660, fd:     8562) , new dirty pages:   238326 , remaining dirty pages:   240065

9. cpu2006.lbm (can not converge)

Iteration   1, duration:  31012 ms , transferred pages:   266450 (dup:    12253, rd:   254197, fd:        0) , new dirty pages:   108960 , remaining dirty pages:   108960
Iteration   2, duration:  13095 ms , transferred pages:   106522 (dup:        3, rd:   102045, fd:     4474) , new dirty pages:   129292 , remaining dirty pages:   131730
Iteration   3, duration:  15802 ms , transferred pages:   129688 (dup:      444, rd:   110860, fd:    18384) , new dirty pages:   116682 , remaining dirty pages:   118724
Iteration   4, duration:  14204 ms , transferred pages:   116316 (dup:      160, rd:   104951, fd:    11205) , new dirty pages:   107246 , remaining dirty pages:   109654
Iteration   5, duration:  13208 ms , transferred pages:   107977 (dup:        1, rd:   101834, fd:     6142) , new dirty pages:   105371 , remaining dirty pages:   107048
Iteration   6, duration:  12804 ms , transferred pages:   104705 (dup:        1, rd:    99629, fd:     5075) , new dirty pages:   103841 , remaining dirty pages:   106184
Iteration   7, duration:  12709 ms , transferred pages:   103891 (dup:        5, rd:    99212, fd:     4674) , new dirty pages:   106692 , remaining dirty pages:   108985
Iteration   8, duration:  13105 ms , transferred pages:   107169 (dup:       11, rd:   100125, fd:     7033) , new dirty pages:   103132 , remaining dirty pages:   104948
Iteration   9, duration:  12607 ms , transferred pages:   103068 (dup:        0, rd:    99460, fd:     3608) , new dirty pages:   102511 , remaining dirty pages:   104391
Iteration  10, duration:  12514 ms , transferred pages:   102250 (dup:        0, rd:    99094, fd:     3156) , new dirty pages:   102888 , remaining dirty pages:   105029

10. cpu2006.astar (can not converge)

Iteration   1, duration:  28402 ms , transferred pages:   266450 (dup:    33770, rd:   232680, fd:        0) , new dirty pages:    62078 , remaining dirty pages:    62078
Iteration   2, duration:   7393 ms , transferred pages:    60107 (dup:       10, rd:    51722, fd:     8375) , new dirty pages:    48854 , remaining dirty pages:    50825
Iteration   3, duration:   6001 ms , transferred pages:    49094 (dup:       14, rd:    46540, fd:     2540) , new dirty pages:    48137 , remaining dirty pages:    49868
Iteration   4, duration:   5800 ms , transferred pages:    47444 (dup:        0, rd:    45389, fd:     2055) , new dirty pages:    49147 , remaining dirty pages:    51571
Iteration   5, duration:   6102 ms , transferred pages:    49912 (dup:       14, rd:    46216, fd:     3682) , new dirty pages:    55606 , remaining dirty pages:    57265
Iteration   6, duration:   6699 ms , transferred pages:    54949 (dup:      143, rd:    20745, fd:    34061) , new dirty pages:     9166 , remaining dirty pages:    11482
Iteration   7, duration:   1200 ms , transferred pages:     9830 (dup:       14, rd:     7011, fd:     2805) , new dirty pages:     8294 , remaining dirty pages:     9946
Iteration   8, duration:   1000 ms , transferred pages:     8194 (dup:       14, rd:     7178, fd:     1002) , new dirty pages:     5475 , remaining dirty pages:     7227
Iteration   9, duration:    600 ms , transferred pages:     4908 (dup:        0, rd:     3470, fd:     1438) , new dirty pages:     4175 , remaining dirty pages:     6494
Iteration  10, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3856, fd:      234) , new dirty pages:     4095 , remaining dirty pages:     6499
Iteration  11, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3313, fd:      777) , new dirty pages:     3371 , remaining dirty pages:     5780
Iteration  12, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3823, fd:      267) , new dirty pages:     7518 , remaining dirty pages:     9208
Iteration  13, duration:    899 ms , transferred pages:     7376 (dup:       14, rd:     6028, fd:     1334) , new dirty pages:     3931 , remaining dirty pages:     5763
Iteration  14, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     4346 , remaining dirty pages:     6019
Iteration  15, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3817, fd:      273) , new dirty pages:     3054 , remaining dirty pages:     4983
Iteration  16, duration:    400 ms , transferred pages:     3272 (dup:        0, rd:     3138, fd:      134) , new dirty pages:     3874 , remaining dirty pages:     5585
Iteration  17, duration:    399 ms , transferred pages:     3272 (dup:        0, rd:     3248, fd:       24) , new dirty pages:     5285 , remaining dirty pages:     7598
Iteration  18, duration:    701 ms , transferred pages:     5726 (dup:        0, rd:     4385, fd:     1341) , new dirty pages:     8903 , remaining dirty pages:    10775
Iteration  19, duration:   1101 ms , transferred pages:     9010 (dup:       12, rd:     5597, fd:     3401) , new dirty pages:     4199 , remaining dirty pages:     5964
Iteration  20, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     3829 , remaining dirty pages:     5703

11. cpu2006.xalancbmk (can not converge)

Iteration   1, duration:  30407 ms , transferred pages:   266450 (dup:    17700, rd:   248750, fd:        0) , new dirty pages:    96169 , remaining dirty pages:    96169
Iteration   2, duration:  11495 ms , transferred pages:    94164 (dup:      205, rd:    67068, fd:    26891) , new dirty pages:    61766 , remaining dirty pages:    63771
Iteration   3, duration:   7501 ms , transferred pages:    61471 (dup:      121, rd:    53587, fd:     7763) , new dirty pages:    56569 , remaining dirty pages:    58869
Iteration   4, duration:   6902 ms , transferred pages:    56461 (dup:       19, rd:    50553, fd:     5889) , new dirty pages:    52181 , remaining dirty pages:    54589
Iteration   5, duration:   6402 ms , transferred pages:    52459 (dup:      107, rd:    46986, fd:     5366) , new dirty pages:    54051 , remaining dirty pages:    56181
Iteration   6, duration:   6601 ms , transferred pages:    54003 (dup:       15, rd:    47566, fd:     6422) , new dirty pages:    50844 , remaining dirty pages:    53022
Iteration   7, duration:   6202 ms , transferred pages:    50723 (dup:        7, rd:    47143, fd:     3573) , new dirty pages:    64880 , remaining dirty pages:    67179
Iteration   8, duration:   8001 ms , transferred pages:    65447 (dup:        7, rd:    61159, fd:     4281) , new dirty pages:    67854 , remaining dirty pages:    69586
Iteration   9, duration:   8202 ms , transferred pages:    67444 (dup:      368, rd:    56357, fd:    10719) , new dirty pages:    65178 , remaining dirty pages:    67320
Iteration  10, duration:   8000 ms , transferred pages:    65455 (dup:       15, rd:    60581, fd:     4859) , new dirty pages:    52421 , remaining dirty pages:    54286

12. cpu2006.milc (can not converge)

Iteration   1, duration:  31410 ms , transferred pages:   266450 (dup:     9454, rd:   256996, fd:        0) , new dirty pages:   158860 , remaining dirty pages:   158860
Iteration   2, duration:  19193 ms , transferred pages:   157048 (dup:      150, rd:    96807, fd:    60091) , new dirty pages:   102238 , remaining dirty pages:   104050
Iteration   3, duration:  12504 ms , transferred pages:   102271 (dup:       21, rd:    95107, fd:     7143) , new dirty pages:    97944 , remaining dirty pages:    99723
Iteration   4, duration:  11905 ms , transferred pages:    97360 (dup:       18, rd:    93610, fd:     3732) , new dirty pages:    99150 , remaining dirty pages:   101513
Iteration   5, duration:  12105 ms , transferred pages:    99094 (dup:      116, rd:    94125, fd:     4853) , new dirty pages:    98589 , remaining dirty pages:   101008
Iteration   6, duration:  12101 ms , transferred pages:    98995 (dup:       17, rd:    94069, fd:     4909) , new dirty pages:   147403 , remaining dirty pages:   149416
Iteration   7, duration:  18001 ms , transferred pages:   147284 (dup:       44, rd:   135691, fd:    11549) , new dirty pages:   136445 , remaining dirty pages:   138577
Iteration   8, duration:  16702 ms , transferred pages:   136636 (dup:       30, rd:   130805, fd:     5801) , new dirty pages:   145481 , remaining dirty pages:   147422
Iteration   9, duration:  17800 ms , transferred pages:   145734 (dup:      130, rd:   133239, fd:    12365) , new dirty pages:    98032 , remaining dirty pages:    99720
Iteration  10, duration:  11902 ms , transferred pages:    97364 (dup:       22, rd:    93096, fd:     4246) , new dirty pages:    95391 , remaining dirty pages:    97747

13. cpu2006.cactusADM (can not converge)

Iteration   1, duration:  23508 ms , transferred pages:   266450 (dup:    73568, rd:   192882, fd:        0) , new dirty pages:   123869 , remaining dirty pages:   123869
Iteration   2, duration:  13989 ms , transferred pages:   121594 (dup:     7874, rd:    81653, fd:    32067) , new dirty pages:   112960 , remaining dirty pages:   115235
Iteration   3, duration:  13605 ms , transferred pages:   113276 (dup:     2028, rd:    83783, fd:    27465) , new dirty pages:   112314 , remaining dirty pages:   114273
Iteration   4, duration:  13509 ms , transferred pages:   111935 (dup:     1505, rd:    83535, fd:    26895) , new dirty pages:   114078 , remaining dirty pages:   116416
Iteration   5, duration:  13810 ms , transferred pages:   114262 (dup:     1378, rd:    84039, fd:    28845) , new dirty pages:   112271 , remaining dirty pages:   114425
Iteration   6, duration:  13604 ms , transferred pages:   112664 (dup:     1416, rd:    84300, fd:    26948) , new dirty pages:   112903 , remaining dirty pages:   114664
Iteration   7, duration:  13604 ms , transferred pages:   112655 (dup:     1407, rd:    84027, fd:    27221) , new dirty pages:   110943 , remaining dirty pages:   112952
Iteration   8, duration:  13406 ms , transferred pages:   110720 (dup:     1108, rd:    84075, fd:    25537) , new dirty pages:   109321 , remaining dirty pages:   111553
Iteration   9, duration:  13306 ms , transferred pages:   109726 (dup:      932, rd:    83652, fd:    25142) , new dirty pages:   113446 , remaining dirty pages:   115273
Iteration  10, duration:  13705 ms , transferred pages:   113121 (dup:     1055, rd:    84671, fd:    27395) , new dirty pages:   108776 , remaining dirty pages:   110928

14. cpu2006.GemsFDTD (can not converge)

Iteration   1, duration:  13303 ms , transferred pages:   266450 (dup:   157809, rd:   108641, fd:        0) , new dirty pages:   226802 , remaining dirty pages:   226802
Iteration   2, duration:  10797 ms , transferred pages:   226507 (dup:   138637, rd:    61818, fd:    26052) , new dirty pages:   200769 , remaining dirty pages:   201064
Iteration   3, duration:   8900 ms , transferred pages:   199717 (dup:   127187, rd:    69340, fd:     3190) , new dirty pages:   203436 , remaining dirty pages:   204783
Iteration   4, duration:  10904 ms , transferred pages:   204127 (dup:   115211, rd:    85767, fd:     3149) , new dirty pages:   198407 , remaining dirty pages:   199063
Iteration   5, duration:  12109 ms , transferred pages:   198206 (dup:    99435, rd:    96956, fd:     1815) , new dirty pages:   213719 , remaining dirty pages:   214576
Iteration   6, duration:  16307 ms , transferred pages:   213595 (dup:    80422, rd:   116885, fd:    16288) , new dirty pages:   199637 , remaining dirty pages:   200618
Iteration   7, duration:  16915 ms , transferred pages:   198289 (dup:    60169, rd:   134208, fd:     3912) , new dirty pages:   199343 , remaining dirty pages:   201672
Iteration   8, duration:  19518 ms , transferred pages:   200452 (dup:    41014, rd:   156083, fd:     3355) , new dirty pages:   222927 , remaining dirty pages:   224147

15. cpu2006.wrf (can not converge)

Iteration   1, duration:  18499 ms , transferred pages:   266380 (dup:   115285, rd:   151095, fd:        0) , new dirty pages:   112322 , remaining dirty pages:   112392
Iteration   2, duration:   9802 ms , transferred pages:   110025 (dup:    29917, rd:    65782, fd:    14326) , new dirty pages:    88855 , remaining dirty pages:    91222
Iteration   3, duration:   8199 ms , transferred pages:    89761 (dup:    22728, rd:    57262, fd:     9771) , new dirty pages:    58431 , remaining dirty pages:    59892
Iteration   4, duration:   5603 ms , transferred pages:    58502 (dup:    12716, rd:    41809, fd:     3977) , new dirty pages:    80556 , remaining dirty pages:    81946
Iteration   5, duration:   7101 ms , transferred pages:    79778 (dup:    21738, rd:    50896, fd:     7144) , new dirty pages:    62592 , remaining dirty pages:    64760
Iteration   6, duration:   5702 ms , transferred pages:    63388 (dup:    16793, rd:    42726, fd:     3869) , new dirty pages:    80747 , remaining dirty pages:    82119
Iteration   7, duration:   7000 ms , transferred pages:    80868 (dup:    23652, rd:    52194, fd:     5022) , new dirty pages:    84593 , remaining dirty pages:    85844
Iteration   8, duration:   7099 ms , transferred pages:    83799 (dup:    25769, rd:    51772, fd:     6258) , new dirty pages:    67951 , remaining dirty pages:    69996
Iteration   9, duration:   6303 ms , transferred pages:    68478 (dup:    16979, rd:    36490, fd:    15009) , new dirty pages:    81181 , remaining dirty pages:    82699
Iteration  10, duration:   7000 ms , transferred pages:    80724 (dup:    23503, rd:    52826, fd:     4395) , new dirty pages:    47930 , remaining dirty pages:    49905



> 
> > So I think "booting" and "kernel compilation" should benefit a lot from this
> > improvement. "Kernel compilation" would benefit because some
> > iterations take around 600ms, and if they are halved to 300ms, the precopy
> > may have the chance to step into the stop-and-copy phase.
> > 
> > On the other hand, "idle" and "web server" would not benefit much, because
> > most of the time is spent on the 1st iteration and little on the others.
> > 
> > As for "zeusmp" and "memcached", although the time spent on the iterations
> > other than the 1st one may be halved, they still could not converge to stop-and-copy
> > with the 300ms downtime.
> > 
> > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> > 
> > 1. booting : begin to migrate when the VM is booting
> > 
> > Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
> > Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
> > Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
> > Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
> > Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
> > (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into stop-copy phase now.
> >        "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> > 
> > 2. idle
> > 
> > Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
> > Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
> > Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
> > Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849
> > 
> > 3. kernel compilation (can not converge)
> > 
> > Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
> > Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
> > Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
> > Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
> > Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
> > Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
> > Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
> > Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
> > Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
> > Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
> > Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
> > Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
> > Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
> > Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
> > Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382
> > 
> > 4. cpu2006.zeusmp (can not converge)
> > 
> > Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
> > Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
> > Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
> > Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
> > Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
> > Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
> > Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
> > Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
> > Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
> > Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
> > Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
> > Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
> > Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
> > Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
> > Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338
> > 
> > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> > 
> > Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
> > Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
> > Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
> > Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
> > Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
> > Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167
> > 
> > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > 
> > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5  (can not converge)
> > 
> > Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
> > Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
> > Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
> > Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
> > Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
> > Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
> > Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
> > Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
> > Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
> > Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
> > Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
> > Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
> > Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
> > Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
> > Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
> > Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
> > Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
> > Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
> > Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
> > Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
> > 
> > 
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
> > 
> > 
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03  8:25           ` Chunguang Li
@ 2016-11-03  9:59             ` Li, Liang Z
  2016-11-03 10:13             ` Li, Liang Z
  2016-11-08 11:05             ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03  9:59 UTC (permalink / raw)
  To: Chunguang Li, Dr. David Alan Gilbert
  Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> pages will be sent. Before that during the migration setup, the
> ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce
> the dirty bitmap from this moment. When the pages "that haven't been
> sent" are written, the kernel space marks them as dirty. However I don't
> think this is correct, because these pages will be sent during this and the next
> iterations with the same content (if they are not written again after they are
> sent). It only makes sense to mark the pages which have already been sent
> during one iteration as dirty when they are written.
> > > > > > >
> > > > > > >
> > > > > > > Am I right about this consideration? If I am right, is there some
> advice to improve this?
> > > > > >
> > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > case you're talking about is:
> > > > > >
> > > > > >   Iteration 1
> > > > > >     sync bitmap
> > > > > >     start sending pages
> > > > > >     page 'n' is modified - but hasn't been sent yet
> > > > > >     page 'n' gets sent
> > > > > >   Iteration 2
> > > > > >     sync bitmap
> > > > > >        'page n is shown as modified'
> > > > > >     send page 'n' again
> > > > > >
> > > > >
> > > > > Yes,this is right the case I am talking about.
> > > > >
> > > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > is large.
> > > > >
> > > > > I think this is "very" wasteful. Assume the workload writes the pages
> dirty randomly within the guest address space, and the transfer speed is
> constant. Intuitively, I think nearly half of the dirty pages produced in
> Iteration 1 is not really dirty. This means the time of Iteration 2 is double of
> that to send only really dirty pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most?  That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > >
> > > > 		Amit
> > >
> > > I have picked up 6 workloads and got the following statistics numbers
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without
> > > the capabilities like xbzrle or compression, etc. The network for the
> > > migration is exclusive, with a separate network for the workloads.
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three (booting, idle, web server) of them converged to the stop-copy
> phase,
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty", if it is written first and is sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen so often during the other iterations as during the 1st
> > > iteration. Because all the pages of the VM are sent to the dest node
> during
> > > the 1st iteration, while during the others, only part of the pages are sent.
> > > So I think the "not-really-dirty" pages should be produced mainly during
> > > the 1st iteration , and maybe very little during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain
> reaction,
> > > because the dirty pages produced during Iteration 2 is halved, which
> incurs
> > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied, if the page is
> dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> >
> > However, if the page is being written a few times, and so it would have
> been written
> > after the network write then it isn't a false-dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the
> dirtying
> > happens, but it might be easier to write the fix!
> >
> > Dave
> 
> Hi, I have made some new progress now.
> 
> To tell how many false dirty pages there are exactly in each iteration, I malloc
> a
> buffer in memory as big as the size of the whole VM memory. When a page
> is
> transferred to the dest node, it is copied to the buffer; During the next
> iteration,
> if one page is transferred, it is compared to the old one in the buffer, and the
> old one will be replaced for next comparison if it is really dirty. Thus, we are
> now
> able to get the exact number of false dirty pages.
> 
> This time, I use 15 workloads to get the statistic number. They are:
> 
>   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all
> scientific
>      computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc.
> I pick
>      up these 11 benchmarks because compared to others, they have bigger
> memory
>      occupation and higher memory dirty rate. Thus most of them could not
> converge
>      to stop-and-copy using the default migration speed (32MB/s).
>   2. kernel compilation
>   3. idle VM
>   4. Apache web server which serves static content
> 
>   (the above workloads are all running in VM with 1 vcpu and 1GB memory,
> and the
>    migration speed is the default 32MB/s)
> 
>   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used
> as the cache.
>      After filling up the 4GB cache, a client writes the cache at a constant speed
>      during migration. This time, migration speed has no limit, and is up to the
>      capability of 1Gbps Ethernet.
> 
> Summarize the results first: (and you can read the precise number below)
> 
>   1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> some iterations)
>      of false dirty pages out of all the dirty pages since iteration 2 (and the big
>      proportion lasts during the following iterations). They are cpu2006.zeusmp,
>      cpu2006.bzip2, cpu2006.mcf, and memcached.
>   2. 2 workloads (idle, webserver) spend most of the migration time on
> iteration 1, even
>      though the proportion of false dirty pages is big since iteration 2, the space
> to
>      optimize is small.
>   3. 1 workload (kernel compilation) only have a big proportion during
> iteration 2, not
>      in the other iterations.
>   4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of
> false
>      dirty pages since iteration 2. So the spaces to optimize for them are small.
> 
> Now I want to talk a little more about the reasons why false dirty pages are
> produced.
> The first reason is what we have discussed before---the mechanism to track
> the dirty
> pages.
> And then I come up with another reason. Here is the situation: a write
> operation to one
> memory page happens, but it doesn't change any content of the page. So it's
> "write but
> not dirty", and kernel still marks it as dirty. One guy in our lab has done some
> experiments
> to figure out the proportion of "write but not dirty" operations, and he uses
> the cpu2006
> benchmark suit. According to his results, general workloads has a little
> proportion (<10%)
> of "write but not dirty" out of all the write operations, while few workloads
> have higher
> proportion (one even as high as 50%). Now we are not sure why "write but
> not dirty" would
> happen, it just happened.
> 
> So these two reasons contribute to the false dirty pages. To optimize, I
> compute and store
> the SHA1 hash before transferring each page. Next time, if one page needs
> retransmission, its
> SHA1 hash is computed again, and compared to the old hash. If the hash is
> the same, it's a
> false dirty page, and we just skip this page; Otherwise, the page is
> transferred, and the new
> hash replaces the old one for next comparison.
> The reason to use SHA1 hash but not byte-by-byte comparison is the
> memory overheads. One SHA1
> hash is 20 bytes. So we need extra 20/4096 (<1/200) memory space of the
> whole VM memory, which
> is relatively small.
> As far as I know, SHA1 hash is widely used in the scenes of deduplication for
> backup systems.
> They have proven that the probability of hash collision is far smaller than disk
> hardware fault,
> so it's secure hash, that is, if the hashes of two chunks are the same, the
> content must be the
> same. So I think the SHA1 hash could replace byte-to-byte comparison in the
> VM memory scenery.
> 
> Then I do the same migration experiments using the SHA1 hash. For the 4
> workloads which have
> big proportions of false dirty pages, the improvement is remarkable. Without
> optimization,
> they either can not converge to stop-and-copy, or take a very long time to
> complete. With the
> SHA1 hash method, all of them now complete in a relatively short time.
> For the reason I have talked above, the other workloads don't get notable
> improvements from the
> optimization. So below, I only show the exact number after optimization for
> the 4 workloads with
> remarkable improvements.
> 
> Any comments or suggestions?
> 

It seems the existing XBZRLE feature could already be used to solve the false dirty issue, no?

Liang



* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03  8:25           ` Chunguang Li
  2016-11-03  9:59             ` Li, Liang Z
@ 2016-11-03 10:13             ` Li, Liang Z
  2016-11-04  3:07               ` Chunguang Li
  2016-11-08 11:05             ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03 10:13 UTC (permalink / raw)
  To: Chunguang Li, Dr. David Alan Gilbert
  Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> > > > > I think this is "very" wasteful. Assume the workload writes the pages
> dirty randomly within the guest address space, and the transfer speed is
> constant. Intuitively, I think nearly half of the dirty pages produced in
> Iteration 1 is not really dirty. This means the time of Iteration 2 is double of
> that to send only really dirty pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most?  That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > >
> > > > 		Amit
> > >
> > > I have picked up 6 workloads and got the following statistics numbers
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without
> > > the capabilities like xbzrle or compression, etc. The network for the
> > > migration is exclusive, with a separate network for the workloads.
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three (booting, idle, web server) of them converged to the stop-copy
> phase,
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty", if it is written first and is sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen so often during the other iterations as during the 1st
> > > iteration. Because all the pages of the VM are sent to the dest node
> during
> > > the 1st iteration, while during the others, only part of the pages are sent.
> > > So I think the "not-really-dirty" pages should be produced mainly during
> > > the 1st iteration , and maybe very little during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain
> reaction,
> > > because the dirty pages produced during Iteration 2 is halved, which
> incurs
> > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied, if the page is
> dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> >
> > However, if the page is being written a few times, and so it would have
> been written
> > after the network write then it isn't a false-dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the
> dirtying
> > happens, but it might be easier to write the fix!
> >
> > Dave
> 
> Hi, I have made some new progress now.
> 
> To tell how many false dirty pages there are exactly in each iteration, I malloc
> a
> buffer in memory as big as the size of the whole VM memory. When a page
> is
> transferred to the dest node, it is copied to the buffer; During the next
> iteration,
> if one page is transferred, it is compared to the old one in the buffer, and the
> old one will be replaced for next comparison if it is really dirty. Thus, we are
> now
> able to get the exact number of false dirty pages.
> 
> This time, I use 15 workloads to get the statistic number. They are:
> 
>   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all
> scientific
>      computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc.
> I pick
>      up these 11 benchmarks because compared to others, they have bigger
> memory
>      occupation and higher memory dirty rate. Thus most of them could not
> converge
>      to stop-and-copy using the default migration speed (32MB/s).
>   2. kernel compilation
>   3. idle VM
>   4. Apache web server which serves static content
> 
>   (the above workloads are all running in VM with 1 vcpu and 1GB memory,
> and the
>    migration speed is the default 32MB/s)
> 
>   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used
> as the cache.
>      After filling up the 4GB cache, a client writes the cache at a constant speed
>      during migration. This time, migration speed has no limit, and is up to the
>      capability of 1Gbps Ethernet.
> 
> Summarize the results first: (and you can read the precise number below)
> 
>   1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> some iterations)
>      of false dirty pages out of all the dirty pages since iteration 2 (and the big
>      proportion lasts during the following iterations). They are cpu2006.zeusmp,
>      cpu2006.bzip2, cpu2006.mcf, and memcached.
>   2. 2 workloads (idle, webserver) spend most of the migration time on
> iteration 1, even
>      though the proportion of false dirty pages is big since iteration 2, the space
> to
>      optimize is small.
>   3. 1 workload (kernel compilation) only have a big proportion during
> iteration 2, not
>      in the other iterations.
>   4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of
> false
>      dirty pages since iteration 2. So the spaces to optimize for them are small.
> 
> Now I want to talk a little more about the reasons why false dirty pages are
> produced.
> The first reason is what we have discussed before---the mechanism to track
> the dirty
> pages.
> And then I come up with another reason. Here is the situation: a write
> operation to one
> memory page happens, but it doesn't change any content of the page. So it's
> "write but
> not dirty", and kernel still marks it as dirty. One guy in our lab has done some
> experiments
> to figure out the proportion of "write but not dirty" operations, and he uses
> the cpu2006
> benchmark suit. According to his results, general workloads has a little
> proportion (<10%)
> of "write but not dirty" out of all the write operations, while few workloads
> have higher
> proportion (one even as high as 50%). Now we are not sure why "write but
> not dirty" would
> happen, it just happened.
> 
> So these two reasons contribute to the false dirty pages. To optimize, I
> compute and store
> the SHA1 hash before transferring each page. Next time, if one page needs
> retransmission, its
> SHA1 hash is computed again, and compared to the old hash. If the hash is
> the same, it's a
> false dirty page, and we just skip this page; Otherwise, the page is
> transferred, and the new
> hash replaces the old one for next comparison.
> The reason to use SHA1 hash but not byte-by-byte comparison is the
> memory overheads. One SHA1
> hash is 20 bytes. So we need extra 20/4096 (<1/200) memory space of the
> whole VM memory, which
> is relatively small.
> As far as I know, SHA1 hash is widely used in the scenes of deduplication for
> backup systems.
> They have proven that the probability of hash collision is far smaller than disk
> hardware fault,
> so it's secure hash, that is, if the hashes of two chunks are the same, the
> content must be the
> same. So I think the SHA1 hash could replace byte-to-byte comparison in the
> VM memory scenery.
> 
> Then I do the same migration experiments using the SHA1 hash. For the 4
> workloads which have
> big proportions of false dirty pages, the improvement is remarkable. Without
> optimization,
> they either can not converge to stop-and-copy, or take a very long time to
> complete. With the
> SHA1 hash method, all of them now complete in a relatively short time.
> For the reason I have talked above, the other workloads don't get notable
> improvements from the
> optimization. So below, I only show the exact number after optimization for
> the 4 workloads with
> remarkable improvements.
> 
> Any comments or suggestions?

Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better.
The merit of using SHA1 is that it avoids the data copy that XBZRLE requires, and it needs a smaller buffer.
What about the overhead of calculating the SHA1? Is it faster than copying a page?
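
One rough way to get a feel for that last question is a micro-benchmark along the lines of the sketch below. It is only an illustration in Python, not QEMU code: PAGE_SIZE, time_sha1 and time_copy are names made up for this example, a 4096-byte page is assumed, and the absolute numbers will depend heavily on the CPU and on the SHA1 implementation in use.

```python
import hashlib
import os
import time

PAGE_SIZE = 4096  # assumed guest page size


def time_sha1(page: bytes, rounds: int) -> float:
    """Return the average time in seconds to SHA1-hash one page."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.sha1(page).digest()
    return (time.perf_counter() - start) / rounds


def time_copy(page: bytes, rounds: int) -> float:
    """Return the average time in seconds to copy one page into a buffer."""
    buf = bytearray(PAGE_SIZE)
    start = time.perf_counter()
    for _ in range(rounds):
        buf[:] = page  # stands in for the copy into a page cache
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    page = os.urandom(PAGE_SIZE)
    print(f"sha1: {time_sha1(page, 100_000) * 1e9:.0f} ns/page")
    print(f"copy: {time_copy(page, 100_000) * 1e9:.0f} ns/page")
```

The interpreter overhead per loop iteration is included in both figures, so the result is better read as a ratio than as an absolute cost.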

Liang




* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03 10:13             ` Li, Liang Z
@ 2016-11-04  3:07               ` Chunguang Li
  2016-11-04  4:50                 ` Li, Liang Z
  0 siblings, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-11-04  3:07 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela




> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Thursday, November 3, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> > > > > > I think this is "very" wasteful. Assume the workload writes the pages
> > dirty randomly within the guest address space, and the transfer speed is
> > constant. Intuitively, I think nearly half of the dirty pages produced in
> > Iteration 1 is not really dirty. This means the time of Iteration 2 is double of
> > that to send only really dirty pages.
> > > > >
> > > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > > workloads get impacted the most?  That would also help us to figure
> > > > > out what kinds of speed improvements we can expect.
> > > > >
> > > > >
> > > > > 		Amit
> > > >
> > > > I have picked up 6 workloads and got the following statistics numbers
> > > > of every iteration (except the last stop-copy one) during precopy.
> > > > These numbers are obtained with the basic precopy migration, without
> > > > the capabilities like xbzrle or compression, etc. The network for the
> > > > migration is exclusive, with a separate network for the workloads.
> > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > >
> > > > Three (booting, idle, web server) of them converged to the stop-copy
> > phase,
> > > > with the given bandwidth and default downtime (300ms), while the other
> > > > three (kernel compilation, zeusmp, memcached) did not.
> > > >
> > > > One page is "not-really-dirty", if it is written first and is sent later
> > > > (and not written again after that) during one iteration. I guess this
> > > > would not happen so often during the other iterations as during the 1st
> > > > iteration. Because all the pages of the VM are sent to the dest node
> > during
> > > > the 1st iteration, while during the others, only part of the pages are sent.
> > > > So I think the "not-really-dirty" pages should be produced mainly during
> > > > the 1st iteration , and maybe very little during the other iterations.
> > > >
> > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > > think the time spent on Iteration 2 would be halved. This is a chain
> > reaction,
> > > > because the dirty pages produced during Iteration 2 is halved, which
> > incurs
> > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > >
> > > Yes; these numbers don't show how many of them are false dirty though.
> > >
> > > One problem is thinking about pages that have been redirtied, if the page is
> > dirtied
> > > after the sync but before the network write then it's the false-dirty that
> > > you're describing.
> > >
> > > However, if the page is being written a few times, and so it would have
> > been written
> > > after the network write then it isn't a false-dirty.
> > >
> > > You might be able to figure that out with some kernel tracing of when the
> > dirtying
> > > happens, but it might be easier to write the fix!
> > >
> > > Dave
> > 
> > Hi, I have made some new progress now.
> > 
> > To tell how many false dirty pages there are exactly in each iteration, I malloc
> > a
> > buffer in memory as big as the size of the whole VM memory. When a page
> > is
> > transferred to the dest node, it is copied to the buffer; During the next
> > iteration,
> > if one page is transferred, it is compared to the old one in the buffer, and the
> > old one will be replaced for next comparison if it is really dirty. Thus, we are
> > now
> > able to get the exact number of false dirty pages.
> > 
> > This time, I use 15 workloads to get the statistic number. They are:
> > 
> >   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all
> > scientific
> >      computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc.
> > I pick
> >      up these 11 benchmarks because compared to others, they have bigger
> > memory
> >      occupation and higher memory dirty rate. Thus most of them could not
> > converge
> >      to stop-and-copy using the default migration speed (32MB/s).
> >   2. kernel compilation
> >   3. idle VM
> >   4. Apache web server which serves static content
> > 
> >   (the above workloads are all running in VM with 1 vcpu and 1GB memory,
> > and the
> >    migration speed is the default 32MB/s)
> > 
> >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used
> > as the cache.
> >      After filling up the 4GB cache, a client writes the cache at a constant speed
> >      during migration. This time, migration speed has no limit, and is up to the
> >      capability of 1Gbps Ethernet.
> > 
> > Summarize the results first: (and you can read the precise number below)
> > 
> >   1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> > some iterations)
> >      of false dirty pages out of all the dirty pages since iteration 2 (and the big
> >      proportion lasts during the following iterations). They are cpu2006.zeusmp,
> >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> >   2. 2 workloads (idle, webserver) spend most of the migration time on
> > iteration 1, even
> >      though the proportion of false dirty pages is big since iteration 2, the space
> > to
> >      optimize is small.
> >   3. 1 workload (kernel compilation) only have a big proportion during
> > iteration 2, not
> >      in the other iterations.
> >   4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of
> > false
> >      dirty pages since iteration 2. So the spaces to optimize for them are small.
> > 
> > Now I want to talk a little more about the reasons why false dirty pages are
> > produced.
> > The first reason is what we have discussed before---the mechanism to track
> > the dirty
> > pages.
> > And then I come up with another reason. Here is the situation: a write
> > operation to one
> > memory page happens, but it doesn't change any content of the page. So it's
> > "write but
> > not dirty", and kernel still marks it as dirty. One guy in our lab has done some
> > experiments
> > to figure out the proportion of "write but not dirty" operations, and he uses
> > the cpu2006
> > benchmark suit. According to his results, general workloads has a little
> > proportion (<10%)
> > of "write but not dirty" out of all the write operations, while few workloads
> > have higher
> > proportion (one even as high as 50%). Now we are not sure why "write but
> > not dirty" would
> > happen, it just happened.
> > 
> > So these two reasons contribute to the false dirty pages. To optimize, I
> > compute and store
> > the SHA1 hash before transferring each page. Next time, if one page needs
> > retransmission, its
> > SHA1 hash is computed again, and compared to the old hash. If the hash is
> > the same, it's a
> > false dirty page, and we just skip this page; Otherwise, the page is
> > transferred, and the new
> > hash replaces the old one for next comparison.
> > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > memory overheads. One SHA1
> > hash is 20 bytes. So we need extra 20/4096 (<1/200) memory space of the
> > whole VM memory, which
> > is relatively small.
> > As far as I know, SHA1 hash is widely used in the scenes of deduplication for
> > backup systems.
> > They have proven that the probability of hash collision is far smaller than disk
> > hardware fault,
> > so it's secure hash, that is, if the hashes of two chunks are the same, the
> > content must be the
> > same. So I think the SHA1 hash could replace byte-to-byte comparison in the
> > VM memory scenery.
> > 
> > Then I do the same migration experiments using the SHA1 hash. For the 4
> > workloads which have
> > big proportions of false dirty pages, the improvement is remarkable. Without
> > optimization,
> > they either can not converge to stop-and-copy, or take a very long time to
> > complete. With the
> > SHA1 hash method, all of them now complete in a relatively short time.
> > For the reason I have talked above, the other workloads don't get notable
> > improvements from the
> > optimization. So below, I only show the exact number after optimization for
> > the 4 workloads with
> > remarkable improvements.
> > 
> > Any comments or suggestions?
> 
> Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better.
> The merit of using SHA1 is that it avoids the data copy that XBZRLE requires, and it needs a smaller buffer.
> What about the overhead of calculating the SHA1? Is it faster than copying a page?
> 
> Liang
> 
> 

Yes, XBZRLE is able to handle the false dirty pages. However, if we want to avoid
transferring any of the false dirty pages using XBZRLE, we need a buffer as big as
the whole VM memory, while SHA1 needs a much smaller buffer. Of course, with
an XBZRLE buffer as big as the whole VM memory, we could transfer less data
over the network than with SHA1, because XBZRLE can also compress pages that
have changed only slightly. In short, yes, the merit of using SHA1 is that it
needs a much smaller buffer, and it gives a nice improvement when there are
many false dirty pages.
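
To make the comparison concrete, here is a minimal sketch of the hash check and of the buffer arithmetic. It is written in Python for brevity and is not the actual patch: FalseDirtyFilter, should_send and hash_table_bytes are hypothetical names, and a 4096-byte page with a 20-byte SHA1 digest is assumed.

```python
import hashlib

PAGE_SIZE = 4096  # assumed guest page size
SHA1_SIZE = 20    # bytes per SHA1 digest


class FalseDirtyFilter:
    """Remembers the SHA1 of each page as it was last sent (hypothetical)."""

    def __init__(self):
        self.sent_hash = {}  # page index -> digest of the last sent content

    def should_send(self, page_index, content):
        """Return False when the page content matches what was last sent."""
        digest = hashlib.sha1(content).digest()
        if self.sent_hash.get(page_index) == digest:
            return False  # false dirty page: skip the retransmission
        self.sent_hash[page_index] = digest  # keep for the next comparison
        return True


def hash_table_bytes(vm_bytes):
    """Extra memory needed to keep one digest per guest page."""
    return (vm_bytes // PAGE_SIZE) * SHA1_SIZE


if __name__ == "__main__":
    for gib in (1, 6):
        extra = hash_table_bytes(gib << 30)
        print(f"{gib} GiB guest: {extra / (1 << 20):.0f} MiB of hashes")
```

With these assumptions a 1 GiB guest needs 5 MiB for the hashes and a 6 GiB guest needs 30 MiB, against 1 GiB and 6 GiB for a cache that keeps a full copy of every page.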

As for the overhead of calculating the SHA1 compared with transferring a page,
it depends on the CPU and network performance. In my test environment (Intel Xeon
E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe any obvious extra computing overhead
from calculating the SHA1, because the network throughput (reported by "info migrate")
remains almost the same.

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-04  3:07               ` Chunguang Li
@ 2016-11-04  4:50                 ` Li, Liang Z
  2016-11-04  7:03                   ` Chunguang Li
  2016-11-07 13:52                   ` Chunguang Li
  0 siblings, 2 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-04  4:50 UTC (permalink / raw)
  To: Chunguang Li
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela

> > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > the pages
> > > dirty randomly within the guest address space, and the transfer
> > > speed is constant. Intuitively, I think nearly half of the dirty
> > > pages produced in Iteration 1 is not really dirty. This means the
> > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > >
> > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > kinds of workloads get impacted the most?  That would also
> > > > > > help us to figure out what kinds of speed improvements we can
> expect.
> > > > > >
> > > > > >
> > > > > > 		Amit
> > > > >
> > > > > I have picked up 6 workloads and got the following statistics
> > > > > numbers of every iteration (except the last stop-copy one) during
> precopy.
> > > > > These numbers are obtained with the basic precopy migration,
> > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > network for the migration is exclusive, with a separate network for
> the workloads.
> > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > >
> > > > > Three (booting, idle, web server) of them converged to the
> > > > > stop-copy
> > > phase,
> > > > > with the given bandwidth and default downtime (300ms), while the
> > > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > > >
> > > > > One page is "not-really-dirty", if it is written first and is
> > > > > sent later (and not written again after that) during one
> > > > > iteration. I guess this would not happen so often during the
> > > > > other iterations as during the 1st iteration. Because all the
> > > > > pages of the VM are sent to the dest node
> > > during
> > > > > the 1st iteration, while during the others, only part of the pages are
> sent.
> > > > > So I think the "not-really-dirty" pages should be produced
> > > > > mainly during the 1st iteration , and maybe very little during the other
> iterations.
> > > > >
> > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > halved. This is a chain
> > > reaction,
> > > > > because the dirty pages produced during Iteration 2 is halved,
> > > > > which
> > > incurs
> > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > >
> > > > Yes; these numbers don't show how many of them are false dirty
> though.
> > > >
> > > > One problem is thinking about pages that have been redirtied, if
> > > > the page is
> > > dirtied
> > > > after the sync but before the network write then it's the
> > > > false-dirty that you're describing.
> > > >
> > > > However, if the page is being written a few times, and so it would
> > > > have
> > > been written
> > > > after the network write then it isn't a false-dirty.
> > > >
> > > > You might be able to figure that out with some kernel tracing of
> > > > when the
> > > dirtying
> > > > happens, but it might be easier to write the fix!
> > > >
> > > > Dave
> > >
> > > Hi, I have made some new progress now.
> > >
> > > To tell how many false dirty pages there are exactly in each
> > > iteration, I malloc a buffer in memory as big as the size of the
> > > whole VM memory. When a page is transferred to the dest node, it is
> > > copied to the buffer; During the next iteration, if one page is
> > > transferred, it is compared to the old one in the buffer, and the
> > > old one will be replaced for next comparison if it is really dirty.
> > > Thus, we are now able to get the exact number of false dirty pages.
> > >
> > > This time, I use 15 workloads to get the statistic number. They are:
> > >
> > >   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are
> > > all scientific
> > >      computing workloads like Quantum Chromodynamics, Fluid Dynamics,
> etc.
> > > I pick
> > >      up these 11 benchmarks because compared to others, they have
> > > bigger memory
> > >      occupation and higher memory dirty rate. Thus most of them
> > > could not converge
> > >      to stop-and-copy using the default migration speed (32MB/s).
> > >   2. kernel compilation
> > >   3. idle VM
> > >   4. Apache web server which serves static content
> > >
> > >   (the above workloads are all running in VM with 1 vcpu and 1GB
> > > memory, and the
> > >    migration speed is the default 32MB/s)
> > >
> > >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > used as the cache.
> > >      After filling up the 4GB cache, a client writes the cache at a constant
> speed
> > >      during migration. This time, migration speed has no limit, and is up to
> the
> > >      capability of 1Gbps Ethernet.
> > >
> > > Summarize the results first: (and you can read the precise number
> > > below)
> > >
> > >   1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > during some iterations)
> > >      of false dirty pages out of all the dirty pages since iteration 2 (and the
> big
> > >      proportion lasts during the following iterations). They are
> cpu2006.zeusmp,
> > >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> > >   2. 2 workloads (idle, webserver) spend most of the migration time
> > > on iteration 1, even
> > >      though the proportion of false dirty pages is big since
> > > iteration 2, the space to
> > >      optimize is small.
> > >   3. 1 workload (kernel compilation) only have a big proportion
> > > during iteration 2, not
> > >      in the other iterations.
> > >   4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > proportion of false
> > >      dirty pages since iteration 2. So the spaces to optimize for them are
> small.
> > >
> > > Now I want to talk a little more about the reasons why false dirty
> > > pages are produced.
> > > The first reason is what we have discussed before---the mechanism to
> > > track the dirty pages.
> > > And then I come up with another reason. Here is the situation: a
> > > write operation to one memory page happens, but it doesn't change
> > > any content of the page. So it's "write but not dirty", and kernel
> > > still marks it as dirty. One guy in our lab has done some
> > > experiments to figure out the proportion of "write but not dirty"
> > > operations, and he uses the cpu2006 benchmark suit. According to his
> > > results, general workloads has a little proportion (<10%) of "write
> > > but not dirty" out of all the write operations, while few workloads
> > > have higher proportion (one even as high as 50%). Now we are not
> > > sure why "write but not dirty" would happen, it just happened.
> > >
> > > So these two reasons contribute to the false dirty pages. To
> > > optimize, I compute and store the SHA1 hash before transferring each
> > > page. Next time, if one page needs retransmission, its
> > > SHA1 hash is computed again, and compared to the old hash. If the
> > > hash is the same, it's a false dirty page, and we just skip this
> > > page; Otherwise, the page is transferred, and the new hash replaces
> > > the old one for next comparison.
> > > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > > memory overheads. One SHA1 hash is 20 bytes. So we need extra
> > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > relatively small.
> > > As far as I know, SHA1 hash is widely used in the scenes of
> > > deduplication for backup systems.
> > > They have proven that the probability of hash collision is far
> > > smaller than disk hardware fault, so it's secure hash, that is, if
> > > the hashes of two chunks are the same, the content must be the same.
> > > So I think the SHA1 hash could replace byte-to-byte comparison in
> > > the VM memory scenery.
> > >
> > > Then I do the same migration experiments using the SHA1 hash. For
> > > the 4 workloads which have big proportions of false dirty pages, the
> > > improvement is remarkable. Without optimization, they either can not
> > > converge to stop-and-copy, or take a very long time to complete.
> > > With the
> > > SHA1 hash method, all of them now complete in a relatively short time.
> > > For the reason I have talked above, the other workloads don't get
> > > notable improvements from the optimization. So below, I only show
> > > the exact number after optimization for the 4 workloads with
> > > remarkable improvements.
> > >
> > > Any comments or suggestions?
> >
> > Maybe you can compare the performance of your solution as that of
> XBZRLE to see which one is better.
> > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and
> need less buffer.
> > How about the overhead of calculating the SHA1? Is it faster than copying a
> page?
> >
> > Liang
> >
> >
> 
> Yes, XBZRLE is able to handle the false dirty pages. However, if we want to
> avoid transferring all of the false dirty pages using XBZRLE, we need a buffer
> as big as the whole VM memory, while SHA1 needs a much small buffer. Of
> course, if we have a buffer as big as the whole VM memory using XBZRLE, we
> could transfer less data on network than SHA1, because XBZRLE is able to
> compress similar pages. In a word, yes, the merit of using SHA1 is that it
> needs much less buffer, and leads to nice improvement if there are many
> false dirty pages.
> 

The current implementation of XBZRLE begins to buffer pages from the second iteration.
Maybe it's worth making it start from the first iteration, based on your finding.

> In terms of the overhead of calculating the SHA1 compared with transferring
> a page, it's related to the CPU and network performance. In my test
> environment(Intel Xeon
> E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing
> overhead caused by calculating the SHA1, because the throughput of
> network (got by "info migrate") remains almost the same.

You can check the CPU usage, or measure the time spent on a local live migration
that uses SHA1 or XBZRLE.
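
To get a first feel for the question of whether hashing a page is cheaper than
copying it, a small user-space measurement could look like the sketch below. It
is only an illustration in Python, not QEMU code: the 4096-byte page size, the
function name time_per_page and the iteration count are assumptions, and
absolute numbers from an interpreter will differ from what the migration thread
sees in C.

```python
import hashlib
import timeit

PAGE_SIZE = 4096  # assumed guest page size in bytes


def time_per_page(iterations=10000):
    """Return (seconds per SHA1 of one page, seconds per copy of one page)."""
    # A non-trivial, fixed page content so that both operations touch every byte.
    page = bytes(range(256)) * (PAGE_SIZE // 256)
    # Hash the page, as the SHA1 method would do before deciding to send it.
    sha1_total = timeit.timeit(lambda: hashlib.sha1(page).digest(),
                               number=iterations)
    # Copy the page, as a cache-based scheme would do to keep the old content.
    copy_total = timeit.timeit(lambda: bytearray(page), number=iterations)
    return sha1_total / iterations, copy_total / iterations


if __name__ == "__main__":
    sha1_s, copy_s = time_per_page()
    print("SHA1 per page: %.3f us" % (sha1_s * 1e6))
    print("copy per page: %.3f us" % (copy_s * 1e6))
```

This only compares the two primitive operations. CPU usage during a real local
migration is still the more meaningful measurement.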

Liang



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-04  4:50                 ` Li, Liang Z
@ 2016-11-04  7:03                   ` Chunguang Li
  2016-11-07 13:52                   ` Chunguang Li
  1 sibling, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-04  7:03 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela




> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Friday, November 4, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> > > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > > the pages
> > > > dirty randomly within the guest address space, and the transfer
> > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > pages produced in Iteration 1 is not really dirty. This means the
> > > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > >
> > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > kinds of workloads get impacted the most?  That would also
> > > > > > > help us to figure out what kinds of speed improvements we can
> > expect.
> > > > > > >
> > > > > > >
> > > > > > > 		Amit
> > > > > >
> > > > > > I have picked up 6 workloads and got the following statistics
> > > > > > numbers of every iteration (except the last stop-copy one) during
> > precopy.
> > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > > network for the migration is exclusive, with a separate network for
> > the workloads.
> > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > >
> > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > stop-copy
> > > > phase,
> > > > > > with the given bandwidth and default downtime (300ms), while the
> > > > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > > > >
> > > > > > One page is "not-really-dirty", if it is written first and is
> > > > > > sent later (and not written again after that) during one
> > > > > > iteration. I guess this would not happen so often during the
> > > > > > other iterations as during the 1st iteration. Because all the
> > > > > > pages of the VM are sent to the dest node
> > > > during
> > > > > > the 1st iteration, while during the others, only part of the pages are
> > sent.
> > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > mainly during the 1st iteration , and maybe very little during the other
> > iterations.
> > > > > >
> > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > halved. This is a chain
> > > > reaction,
> > > > > > because the dirty pages produced during Iteration 2 is halved,
> > > > > > which
> > > > incurs
> > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > >
> > > > > Yes; these numbers don't show how many of them are false dirty
> > though.
> > > > >
> > > > > One problem is thinking about pages that have been redirtied, if
> > > > > the page is
> > > > dirtied
> > > > > after the sync but before the network write then it's the
> > > > > false-dirty that you're describing.
> > > > >
> > > > > However, if the page is being written a few times, and so it would
> > > > > have
> > > > been written
> > > > > after the network write then it isn't a false-dirty.
> > > > >
> > > > > You might be able to figure that out with some kernel tracing of
> > > > > when the
> > > > dirtying
> > > > > happens, but it might be easier to write the fix!
> > > > >
> > > > > Dave
> > > >
> > > > Hi, I have made some new progress now.
> > > >
> > > > To tell how many false dirty pages there are exactly in each
> > > > iteration, I malloc a buffer in memory as big as the size of the
> > > > whole VM memory. When a page is transferred to the dest node, it is
> > > > copied to the buffer; During the next iteration, if one page is
> > > > transferred, it is compared to the old one in the buffer, and the
> > > > old one will be replaced for next comparison if it is really dirty.
> > > > Thus, we are now able to get the exact number of false dirty pages.
> > > >
> > > > This time, I use 15 workloads to get the statistic number. They are:
> > > >
> > > >   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are
> > > > all scientific
> > > >      computing workloads like Quantum Chromodynamics, Fluid Dynamics,
> > etc.
> > > > I pick
> > > >      up these 11 benchmarks because compared to others, they have
> > > > bigger memory
> > > >      occupation and higher memory dirty rate. Thus most of them
> > > > could not converge
> > > >      to stop-and-copy using the default migration speed (32MB/s).
> > > >   2. kernel compilation
> > > >   3. idle VM
> > > >   4. Apache web server which serves static content
> > > >
> > > >   (the above workloads are all running in VM with 1 vcpu and 1GB
> > > > memory, and the
> > > >    migration speed is the default 32MB/s)
> > > >
> > > >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > > used as the cache.
> > > >      After filling up the 4GB cache, a client writes the cache at a constant
> > speed
> > > >      during migration. This time, migration speed has no limit, and is up to
> > the
> > > >      capability of 1Gbps Ethernet.
> > > >
> > > > Summarize the results first: (and you can read the precise number
> > > > below)
> > > >
> > > >   1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > > during some iterations)
> > > >      of false dirty pages out of all the dirty pages since iteration 2 (and the
> > big
> > > >      proportion lasts during the following iterations). They are
> > cpu2006.zeusmp,
> > > >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > >   2. 2 workloads (idle, webserver) spend most of the migration time
> > > > on iteration 1, even
> > > >      though the proportion of false dirty pages is big since
> > > > iteration 2, the space to
> > > >      optimize is small.
> > > >   3. 1 workload (kernel compilation) only have a big proportion
> > > > during iteration 2, not
> > > >      in the other iterations.
> > > >   4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > > proportion of false
> > > >      dirty pages since iteration 2. So the spaces to optimize for them are
> > small.
> > > >
> > > > Now I want to talk a little more about the reasons why false dirty
> > > > pages are produced.
> > > > The first reason is what we have discussed before---the mechanism to
> > > > track the dirty pages.
> > > > And then I come up with another reason. Here is the situation: a
> > > > write operation to one memory page happens, but it doesn't change
> > > > any content of the page. So it's "write but not dirty", and kernel
> > > > still marks it as dirty. One guy in our lab has done some
> > > > experiments to figure out the proportion of "write but not dirty"
> > > > operations, and he uses the cpu2006 benchmark suit. According to his
> > > > results, general workloads has a little proportion (<10%) of "write
> > > > but not dirty" out of all the write operations, while few workloads
> > > > have higher proportion (one even as high as 50%). Now we are not
> > > > sure why "write but not dirty" would happen, it just happened.
> > > >
> > > > So these two reasons contribute to the false dirty pages. To
> > > > optimize, I compute and store the SHA1 hash before transferring each
> > > > page. Next time, if one page needs retransmission, its
> > > > SHA1 hash is computed again, and compared to the old hash. If the
> > > > hash is the same, it's a false dirty page, and we just skip this
> > > > page; Otherwise, the page is transferred, and the new hash replaces
> > > > the old one for next comparison.
> > > > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > > > memory overheads. One SHA1 hash is 20 bytes. So we need extra
> > > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > > relatively small.
> > > > As far as I know, SHA1 hash is widely used in the scenes of
> > > > deduplication for backup systems.
> > > > They have proven that the probability of hash collision is far
> > > > smaller than disk hardware fault, so it's secure hash, that is, if
> > > > the hashes of two chunks are the same, the content must be the same.
> > > > So I think the SHA1 hash could replace byte-to-byte comparison in
> > > > the VM memory scenery.
> > > >
> > > > Then I do the same migration experiments using the SHA1 hash. For
> > > > the 4 workloads which have big proportions of false dirty pages, the
> > > > improvement is remarkable. Without optimization, they either can not
> > > > converge to stop-and-copy, or take a very long time to complete.
> > > > With the
> > > > SHA1 hash method, all of them now complete in a relatively short time.
> > > > For the reason I have talked above, the other workloads don't get
> > > > notable improvements from the optimization. So below, I only show
> > > > the exact number after optimization for the 4 workloads with
> > > > remarkable improvements.
> > > >
> > > > Any comments or suggestions?
> > >
> > > Maybe you can compare the performance of your solution as that of
> > XBZRLE to see which one is better.
> > > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and
> > need less buffer.
> > > How about the overhead of calculating the SHA1? Is it faster than copying a
> > page?
> > >
> > > Liang
> > >
> > >
> > 
> > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to
> > avoid transferring all of the false dirty pages using XBZRLE, we need a buffer
> > as big as the whole VM memory, while SHA1 needs a much small buffer. Of
> > course, if we have a buffer as big as the whole VM memory using XBZRLE, we
> > could transfer less data on network than SHA1, because XBZRLE is able to
> > compress similar pages. In a word, yes, the merit of using SHA1 is that it
> > needs much less buffer, and leads to nice improvement if there are many
> > false dirty pages.
> > 
> 
> The current implementation of XBZRLE begins to buffer page from the second iteration,
> Maybe it's worth to make it start to work from the first iteration based on your finding.

Yes, I noticed that. If we make it start from the first iteration, I think the
buffer needs to be large enough to have an obvious effect.

> 
> > In terms of the overhead of calculating the SHA1 compared with transferring
> > a page, it's related to the CPU and network performance. In my test
> > environment(Intel Xeon
> > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing
> > overhead caused by calculating the SHA1, because the throughput of
> > network (got by "info migrate") remains almost the same.
> 
> You can check the CPU usage, or to measure the time spend on a local live migration
>  which use SHA1/ XBZRLE.

Yes, I can compare SHA1 with XBZRLE. Maybe I will post the results later.

Chunguang

> 
> Liang
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-04  4:50                 ` Li, Liang Z
  2016-11-04  7:03                   ` Chunguang Li
@ 2016-11-07 13:52                   ` Chunguang Li
  2016-11-07 14:17                     ` Li, Liang Z
  2016-11-07 14:44                     ` Li, Liang Z
  1 sibling, 2 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-07 13:52 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela




> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Friday, November 4, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> > > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > > the pages
> > > > dirty randomly within the guest address space, and the transfer
> > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > pages produced in Iteration 1 is not really dirty. This means the
> > > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > >
> > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > kinds of workloads get impacted the most?  That would also
> > > > > > > help us to figure out what kinds of speed improvements we can
> > expect.
> > > > > > >
> > > > > > >
> > > > > > > 		Amit
> > > > > >
> > > > > > I have picked up 6 workloads and got the following statistics
> > > > > > numbers of every iteration (except the last stop-copy one) during
> > precopy.
> > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > > network for the migration is exclusive, with a separate network for
> > the workloads.
> > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > >
> > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > stop-copy
> > > > phase,
> > > > > > with the given bandwidth and default downtime (300ms), while the
> > > > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > > > >
> > > > > > One page is "not-really-dirty", if it is written first and is
> > > > > > sent later (and not written again after that) during one
> > > > > > iteration. I guess this would not happen so often during the
> > > > > > other iterations as during the 1st iteration. Because all the
> > > > > > pages of the VM are sent to the dest node
> > > > during
> > > > > > the 1st iteration, while during the others, only part of the pages are
> > sent.
> > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > mainly during the 1st iteration , and maybe very little during the other
> > iterations.
> > > > > >
> > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > halved. This is a chain
> > > > reaction,
> > > > > > because the dirty pages produced during Iteration 2 is halved,
> > > > > > which
> > > > incurs
> > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > >
> > > > > Yes; these numbers don't show how many of them are false dirty
> > though.
> > > > >
> > > > > One problem is thinking about pages that have been redirtied, if
> > > > > the page is
> > > > dirtied
> > > > > after the sync but before the network write then it's the
> > > > > false-dirty that you're describing.
> > > > >
> > > > > However, if the page is being written a few times, and so it would
> > > > > have
> > > > been written
> > > > > after the network write then it isn't a false-dirty.
> > > > >
> > > > > You might be able to figure that out with some kernel tracing of
> > > > > when the
> > > > dirtying
> > > > > happens, but it might be easier to write the fix!
> > > > >
> > > > > Dave
> > > >
> > > > Hi, I have made some new progress now.
> > > >
> > > > To tell how many false dirty pages there are exactly in each
> > > > iteration, I malloc a buffer in memory as big as the size of the
> > > > whole VM memory. When a page is transferred to the dest node, it is
> > > > copied to the buffer; During the next iteration, if one page is
> > > > transferred, it is compared to the old one in the buffer, and the
> > > > old one will be replaced for next comparison if it is really dirty.
> > > > Thus, we are now able to get the exact number of false dirty pages.
> > > >
> > > > This time, I use 15 workloads to get the statistic number. They are:
> > > >
> > > >   1. 11 benchmarks picked up from cpu2006 benchmark suit. They are
> > > > all scientific
> > > >      computing workloads like Quantum Chromodynamics, Fluid Dynamics,
> > etc.
> > > > I pick
> > > >      up these 11 benchmarks because compared to others, they have
> > > > bigger memory
> > > >      occupation and higher memory dirty rate. Thus most of them
> > > > could not converge
> > > >      to stop-and-copy using the default migration speed (32MB/s).
> > > >   2. kernel compilation
> > > >   3. idle VM
> > > >   4. Apache web server which serves static content
> > > >
> > > >   (the above workloads are all running in VM with 1 vcpu and 1GB
> > > > memory, and the
> > > >    migration speed is the default 32MB/s)
> > > >
> > > >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > > used as the cache.
> > > >      After filling up the 4GB cache, a client writes the cache at a constant
> > speed
> > > >      during migration. This time, migration speed has no limit, and is up to
> > the
> > > >      capability of 1Gbps Ethernet.
> > > >
> > > > Summarize the results first: (and you can read the precise number
> > > > below)
> > > >
> > > >   1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > > during some iterations)
> > > >      of false dirty pages out of all the dirty pages since iteration 2 (and the
> > big
> > > >      proportion lasts during the following iterations). They are
> > cpu2006.zeusmp,
> > > >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > >   2. 2 workloads (idle, webserver) spend most of the migration time
> > > > on iteration 1, even
> > > >      though the proportion of false dirty pages is big since
> > > > iteration 2, the space to
> > > >      optimize is small.
> > > >   3. 1 workload (kernel compilation) only have a big proportion
> > > > during iteration 2, not
> > > >      in the other iterations.
> > > >   4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > > proportion of false
> > > >      dirty pages since iteration 2. So the spaces to optimize for them are
> > small.
> > > >
> > > > Now I want to talk a little more about the reasons why false dirty
> > > > pages are produced.
> > > > The first reason is what we have discussed before---the mechanism to
> > > > track the dirty pages.
> > > > And then I come up with another reason. Here is the situation: a
> > > > write operation to one memory page happens, but it doesn't change
> > > > any content of the page. So it's "write but not dirty", and kernel
> > > > still marks it as dirty. One guy in our lab has done some
> > > > experiments to figure out the proportion of "write but not dirty"
> > > > operations, and he uses the cpu2006 benchmark suit. According to his
> > > > results, general workloads has a little proportion (<10%) of "write
> > > > but not dirty" out of all the write operations, while few workloads
> > > > have higher proportion (one even as high as 50%). Now we are not
> > > > sure why "write but not dirty" would happen, it just happened.
> > > >
> > > > So these two reasons contribute to the false dirty pages. To
> > > > optimize, I compute and store the SHA1 hash before transferring each
> > > > page. Next time, if one page needs retransmission, its
> > > > SHA1 hash is computed again, and compared to the old hash. If the
> > > > hash is the same, it's a false dirty page, and we just skip this
> > > > page; Otherwise, the page is transferred, and the new hash replaces
> > > > the old one for next comparison.
> > > > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > > > memory overheads. One SHA1 hash is 20 bytes. So we need extra
> > > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > > relatively small.
> > > > As far as I know, SHA1 hash is widely used in the scenes of
> > > > deduplication for backup systems.
> > > > They have proven that the probability of hash collision is far
> > > > smaller than disk hardware fault, so it's secure hash, that is, if
> > > > the hashes of two chunks are the same, the content must be the same.
> > > > So I think the SHA1 hash could replace byte-to-byte comparison in
> > > > the VM memory scenery.
> > > >
> > > > Then I do the same migration experiments using the SHA1 hash. For
> > > > the 4 workloads which have big proportions of false dirty pages, the
> > > > improvement is remarkable. Without optimization, they either can not
> > > > converge to stop-and-copy, or take a very long time to complete.
> > > > With the
> > > > SHA1 hash method, all of them now complete in a relatively short time.
> > > > For the reason I have talked above, the other workloads don't get
> > > > notable improvements from the optimization. So below, I only show
> > > > the exact number after optimization for the 4 workloads with
> > > > remarkable improvements.
> > > >
> > > > Any comments or suggestions?
> > >
> > > Maybe you can compare the performance of your solution as that of
> > XBZRLE to see which one is better.
> > > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and
> > need less buffer.
> > > How about the overhead of calculating the SHA1? Is it faster than copying a
> > page?
> > >
> > > Liang
> > >
> > >
> > 
> > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to
> > avoid transferring all of the false dirty pages using XBZRLE, we need a buffer
> > as big as the whole VM memory, while SHA1 needs a much small buffer. Of
> > course, if we have a buffer as big as the whole VM memory using XBZRLE, we
> > could transfer less data on network than SHA1, because XBZRLE is able to
> > compress similar pages. In a word, yes, the merit of using SHA1 is that it
> > needs much less buffer, and leads to nice improvement if there are many
> > false dirty pages.
> > 
> 
> The current implementation of XBZRLE begins to buffer page from the second iteration,
> Maybe it's worth to make it start to work from the first iteration based on your finding.
> 
> > In terms of the overhead of calculating the SHA1 compared with transferring
> > a page, it depends on the CPU and network performance. In my test
> > environment (Intel Xeon E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe
> > obvious extra computing overhead caused by calculating the SHA1, because
> > the network throughput (obtained from "info migrate") remains almost the
> > same.
> 
> You can check the CPU usage, or measure the time spent on a local live migration
> that uses SHA1 or XBZRLE.
> 
> Liang
> 
> 

I compared SHA1 with XBZRLE. I used XBZRLE in two ways:
1. Buffering pages starting from iteration 1;
2. As in the current implementation, buffering pages starting from iteration 2.

I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf, memcached.
I set the cache size to 256MB for zeusmp and mcf (they run in a VM with 1GB RAM),
and to 1GB for memcached (it runs in a VM with 6GB RAM, and memcached uses 4GB
as its cache).

As you can see from the data below, buffering pages from iteration 1 is better
than the current implementation (buffering from iteration 2), because the total
migration time is shorter.
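
The difference between the two XBZRLE variants can be illustrated with a toy
model. The Python sketch below is not QEMU code: the function name
count_iteration2, the LRU eviction policy, and the page contents are all
assumptions made for illustration. For iteration 2 it counts how many
dirty-flagged pages would go out in full, as a delta against a cached copy, or
be skipped because the cached copy is identical.

```python
from collections import OrderedDict


def count_iteration2(iter1_pages, iter2_pages, cache_from_iteration, cache_capacity):
    """Toy model: return (normal, delta, skipped) page counts for iteration 2.

    iter1_pages / iter2_pages map a page number to its content when visited.
    cache_from_iteration is 1 or 2 (hypothetical knob, not a QEMU option).
    """
    cache = OrderedDict()  # page number -> content last sent

    def remember(pfn, data):
        cache[pfn] = data
        cache.move_to_end(pfn)
        if len(cache) > cache_capacity:
            cache.popitem(last=False)  # evict the least recently used entry

    # Iteration 1: every page is sent in full; it is cached only if
    # buffering starts at iteration 1.
    for pfn, data in iter1_pages.items():
        if cache_from_iteration == 1:
            remember(pfn, data)

    normal = delta = skipped = 0
    # Iteration 2: only pages flagged dirty are visited.
    for pfn, data in iter2_pages.items():
        old = cache.get(pfn)
        if old is None:
            normal += 1   # cache miss: the page must be sent in full
        elif old == data:
            skipped += 1  # false dirty page: content is unchanged
        else:
            delta += 1    # cache hit: only a delta would be sent
        remember(pfn, data)
    return normal, delta, skipped
```

In this model, a page flagged dirty in iteration 2 can only be recognised as
unchanged if it was cached during iteration 1, and the cache must be large
enough to still hold it when it is revisited.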

SHA1 is better than XBZRLE with the cache sizes I chose, because it leads to a shorter
migration time and has far less memory overhead (<1/200 of the total VM memory).
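
As a rough illustration of the SHA1 approach and its memory cost, here is a
minimal Python sketch. It is not the actual patch: the class name Sha1Filter,
the method should_send, and the 4 KiB page size are assumptions. It keeps one
digest per page and reports whether a page flagged dirty really differs from
what was last sent.

```python
import hashlib

PAGE_SIZE = 4096  # assumption: 4 KiB guest pages
DIGEST_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA1


class Sha1Filter:
    """Hypothetical helper: remembers the digest of each page last sent."""

    def __init__(self):
        self.digests = {}  # page number -> SHA1 digest of content last sent

    def should_send(self, pfn, data):
        """Return True if the page content differs from what was last sent."""
        digest = hashlib.sha1(data).digest()
        if self.digests.get(pfn) == digest:
            return False  # false dirty page: flagged dirty but unchanged
        self.digests[pfn] = digest
        return True


def digest_overhead_ratio(page_size=PAGE_SIZE):
    """Memory used for digests as a fraction of guest memory."""
    return DIGEST_SIZE / page_size
```

Under the 4 KiB assumption, one 20-byte digest per page is 20/4096 of guest
memory, which is about 1/205 and agrees with the "<1/200" figure above.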


1. zeusmp
(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration   1, duration:  21402 ms , transferred pages:   266450 (dup:    91456, n:   174994, x:        0) , new dirty pages:   129225 , remaining dirty pages:   129225
Iteration   2, duration:   2295 ms , transferred pages:   101471 (dup:    77921, n:    16665, x:     6885) , new dirty pages:    76125 , remaining dirty pages:    77424
Iteration   3, duration:   2000 ms , transferred pages:    56092 (dup:    36345, n:    11249, x:     8498) , new dirty pages:   111498 , remaining dirty pages:   112836
Iteration   4, duration:   1604 ms , transferred pages:    87335 (dup:    69441, n:    10018, x:     7876) , new dirty pages:    19982 , remaining dirty pages:    19982
Iteration   5, duration:    302 ms , transferred pages:    19850 (dup:    16718, n:     2547, x:      585) , new dirty pages:    14084 , remaining dirty pages:    14084
Iteration   6, duration:    194 ms , transferred pages:    13403 (dup:    12338, n:      846, x:      219) , new dirty pages:     3900 , remaining dirty pages:     4243
Iteration   7, duration:      8 ms , transferred pages:     3938 (dup:     3425, n:      239, x:      274) , new dirty pages:      372 , remaining dirty pages:      372
Iteration   8, duration:     71 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        0 , remaining dirty pages:      372
total time: 27891 milliseconds

(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2
    cannot converge
Iteration   1, duration:  21698 ms , transferred pages:   266331 (dup:    89009, n:   177322, x:        0) , new dirty pages:   125990 , remaining dirty pages:   126109
Iteration   2, duration:   5909 ms , transferred pages:   126109 (dup:    77248, n:    48861, x:        0) , new dirty pages:   124870 , remaining dirty pages:   124870
Iteration   3, duration:   3197 ms , transferred pages:   110583 (dup:    75471, n:    23129, x:    11983) , new dirty pages:   118035 , remaining dirty pages:   118035
Iteration   4, duration:   3195 ms , transferred pages:   102787 (dup:    72708, n:    22158, x:     7921) , new dirty pages:    86576 , remaining dirty pages:    86773
Iteration   5, duration:   3111 ms , transferred pages:    79563 (dup:    52073, n:    21289, x:     6201) , new dirty pages:    97402 , remaining dirty pages:    97402
Iteration   6, duration:   2407 ms , transferred pages:    79567 (dup:    56415, n:    16013, x:     7139) , new dirty pages:   101193 , remaining dirty pages:   101193
Iteration   7, duration:   2896 ms , transferred pages:    83278 (dup:    55778, n:    20652, x:     6848) , new dirty pages:    90683 , remaining dirty pages:    92977
Iteration   8, duration:   2701 ms , transferred pages:    89112 (dup:    62579, n:    18699, x:     7834) , new dirty pages:   109827 , remaining dirty pages:   110008
Iteration   9, duration:   3602 ms , transferred pages:    95866 (dup:    61631, n:    25632, x:     8603) , new dirty pages:    94551 , remaining dirty pages:    96227
Iteration  10, duration:   3802 ms , transferred pages:    83693 (dup:    50558, n:    26427, x:     6708) , new dirty pages:   123537 , remaining dirty pages:   124170
Iteration  11, duration:   3399 ms , transferred pages:   108770 (dup:    75144, n:    23952, x:     9674) , new dirty pages:   103934 , remaining dirty pages:   104981
Iteration  12, duration:   2700 ms , transferred pages:    91080 (dup:    62981, n:    16600, x:    11499) , new dirty pages:    88314 , remaining dirty pages:    88948
Iteration  13, duration:   3102 ms , transferred pages:    78406 (dup:    50165, n:    21409, x:     6832) , new dirty pages:    73586 , remaining dirty pages:    74025
Iteration  14, duration:    806 ms , transferred pages:    66530 (dup:    51013, n:     3973, x:    11544) , new dirty pages:    67941 , remaining dirty pages:    67941
Iteration  15, duration:   2398 ms , transferred pages:    53117 (dup:    33312, n:    18436, x:     1369) , new dirty pages:   116502 , remaining dirty pages:   118956
Iteration  16, duration:   3200 ms , transferred pages:   103009 (dup:    71642, n:    21378, x:     9989) , new dirty pages:    81777 , remaining dirty pages:    83724
Iteration  17, duration:   3005 ms , transferred pages:    73096 (dup:    45738, n:    19016, x:     8342) , new dirty pages:   116671 , remaining dirty pages:   118397
Iteration  18, duration:   3302 ms , transferred pages:   101507 (dup:    67290, n:    22721, x:    11496) , new dirty pages:   104163 , remaining dirty pages:   105921
Iteration  19, duration:   3705 ms , transferred pages:    90516 (dup:    56932, n:    26394, x:     7190) , new dirty pages:   118139 , remaining dirty pages:   120170
Iteration  20, duration:   3903 ms , transferred pages:   102710 (dup:    67623, n:    25811, x:     9276) , new dirty pages:   103608 , remaining dirty pages:   105496

(3) SHA1
Iteration   1, duration:  21601 ms , transferred pages:   266450 (dup:    89731, rd:   176719) , new dirty pages:   139843 , remaining dirty pages:   139843
Iteration   2, duration:   1747 ms , transferred pages:    92077 (dup:    78364, rd:    13713) , new dirty pages:    90945 , remaining dirty pages:    90945
Iteration   3, duration:   1592 ms , transferred pages:    62253 (dup:    49435, rd:    12818) , new dirty pages:    76929 , remaining dirty pages:    76929
Iteration   4, duration:    992 ms , transferred pages:    44837 (dup:    37886, rd:     6951) , new dirty pages:    71331 , remaining dirty pages:    72916
Iteration   5, duration:    998 ms , transferred pages:    55229 (dup:    47150, rd:     8079) , new dirty pages:    21703 , remaining dirty pages:    23302
Iteration   6, duration:    211 ms , transferred pages:    20337 (dup:    18516, rd:     1821) , new dirty pages:    14500 , remaining dirty pages:    14500
Iteration   7, duration:     31 ms , transferred pages:    12933 (dup:    12627, rd:      306) , new dirty pages:     1520 , remaining dirty pages:     1520
Iteration   8, duration:     30 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        4 , remaining dirty pages:     1524
total time: 27225 milliseconds

2. mcf
(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration   1, duration:  31706 ms , transferred pages:   266325 (dup:     7032, n:   259293, x:        0) , new dirty pages:   238215 , remaining dirty pages:   238340
Iteration   2, duration:  21807 ms , transferred pages:   186619 (dup:      335, n:   176826, x:     9458) , new dirty pages:   226886 , remaining dirty pages:   228857
Iteration   3, duration:  21300 ms , transferred pages:   181925 (dup:      201, n:   172974, x:     8750) , new dirty pages:   202288 , remaining dirty pages:   204100
Iteration   4, duration:  17300 ms , transferred pages:   148972 (dup:       38, n:   141113, x:     7821) , new dirty pages:   136220 , remaining dirty pages:   137992
Iteration   5, duration:  13699 ms , transferred pages:   118247 (dup:       38, n:   112030, x:     6179) , new dirty pages:    48397 , remaining dirty pages:    50466
Iteration   6, duration:   4499 ms , transferred pages:    41719 (dup:       24, n:    36790, x:     4905) , new dirty pages:     3753 , remaining dirty pages:     5690
Iteration   7, duration:    399 ms , transferred pages:     3826 (dup:        4, n:     3265, x:      557) , new dirty pages:     1261 , remaining dirty pages:     2437
Iteration   8, duration:     72 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        0 , remaining dirty pages:     2437
total time: 110812 milliseconds

(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2
Iteration   1, duration:  31606 ms , transferred pages:   266450 (dup:     7267, n:   259183, x:        0) , new dirty pages:   233582 , remaining dirty pages:   233582
Iteration   2, duration:  28413 ms , transferred pages:   231693 (dup:       89, n:   231604, x:        0) , new dirty pages:   216962 , remaining dirty pages:   218851
Iteration   3, duration:  18618 ms , transferred pages:   159936 (dup:        3, n:   151579, x:     8354) , new dirty pages:   216400 , remaining dirty pages:   218790
Iteration   4, duration:  18621 ms , transferred pages:   159665 (dup:        0, n:   152102, x:     7563) , new dirty pages:   209860 , remaining dirty pages:   211611
Iteration   5, duration:  17709 ms , transferred pages:   151672 (dup:        4, n:   144493, x:     7175) , new dirty pages:   146273 , remaining dirty pages:   148006
Iteration   6, duration:   9911 ms , transferred pages:    86971 (dup:        2, n:    80842, x:     6127) , new dirty pages:   118364 , remaining dirty pages:   120396
Iteration   7, duration:  14212 ms , transferred pages:   117460 (dup:        0, n:   116149, x:     1311) , new dirty pages:   213993 , remaining dirty pages:   216107
Iteration   8, duration:  22913 ms , transferred pages:   213698 (dup:        4, n:   161520, x:    52174) , new dirty pages:   217947 , remaining dirty pages:   219955
Iteration   9, duration:  23808 ms , transferred pages:   217375 (dup:        3, n:   152315, x:    65057) , new dirty pages:   172615 , remaining dirty pages:   174859
Iteration  10, duration:  15099 ms , transferred pages:   131265 (dup:        0, n:   123463, x:     7802) , new dirty pages:   113946 , remaining dirty pages:   116026
Iteration  11, duration:  10002 ms , transferred pages:    88477 (dup:        8, n:    81753, x:     6716) , new dirty pages:    97006 , remaining dirty pages:    99110
Iteration  12, duration:   6898 ms , transferred pages:    62861 (dup:        4, n:    56392, x:     6465) , new dirty pages:    45164 , remaining dirty pages:    47297
Iteration  13, duration:   3601 ms , transferred pages:    35360 (dup:        0, n:    29390, x:     5970) , new dirty pages:    24581 , remaining dirty pages:    26779
Iteration  14, duration:   1902 ms , transferred pages:    19794 (dup:        0, n:    15475, x:     4319) , new dirty pages:    66153 , remaining dirty pages:    67850
Iteration  15, duration:   5504 ms , transferred pages:    50369 (dup:        0, n:    44902, x:     5467) , new dirty pages:    49279 , remaining dirty pages:    51198
Iteration  16, duration:   3699 ms , transferred pages:    36519 (dup:        2, n:    30184, x:     6333) , new dirty pages:    23672 , remaining dirty pages:    25914
Iteration  17, duration:   1601 ms , transferred pages:    17628 (dup:        0, n:    12972, x:     4656) , new dirty pages:     8685 , remaining dirty pages:    10646
Iteration  18, duration:    599 ms , transferred pages:     7835 (dup:        0, n:     4825, x:     3010) , new dirty pages:     6167 , remaining dirty pages:     6266
Iteration  19, duration:    200 ms , transferred pages:     3590 (dup:        0, n:     1576, x:     2014) , new dirty pages:     3873 , remaining dirty pages:     5709
Iteration  20, duration:    200 ms , transferred pages:     4134 (dup:        0, n:     1574, x:     2560) , new dirty pages:     3609 , remaining dirty pages:     4099
Iteration  21, duration:    100 ms , transferred pages:     2785 (dup:        0, n:      787, x:     1998) , new dirty pages:     1900 , remaining dirty pages:     2585
Iteration  22, duration:      9 ms , transferred pages:     2191 (dup:        0, n:      539, x:     1652) , new dirty pages:      596 , remaining dirty pages:      596
Iteration  23, duration:     41 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        0 , remaining dirty pages:      596
total time: 235286 milliseconds

(3) SHA1
Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6831, rd:   259619) , new dirty pages:   240209 , remaining dirty pages:   240209
Iteration   2, duration:   6250 ms , transferred pages:    51244 (dup:      211, rd:    51033) , new dirty pages:   226651 , remaining dirty pages:   228571
Iteration   3, duration:   4395 ms , transferred pages:    36008 (dup:       80, rd:    35928) , new dirty pages:   110719 , remaining dirty pages:   111478
Iteration   4, duration:   3390 ms , transferred pages:    28068 (dup:       28, rd:    28040) , new dirty pages:   185172 , remaining dirty pages:   185172
Iteration   5, duration:   2986 ms , transferred pages:    23780 (dup:       45, rd:    23735) , new dirty pages:    64357 , remaining dirty pages:    66305
Iteration   6, duration:   2727 ms , transferred pages:    22800 (dup:       12, rd:    22788) , new dirty pages:    61675 , remaining dirty pages:    61675
Iteration   7, duration:   2372 ms , transferred pages:    18943 (dup:       13, rd:    18930) , new dirty pages:    55144 , remaining dirty pages:    55265
Iteration   8, duration:   2100 ms , transferred pages:    17189 (dup:       11, rd:    17178) , new dirty pages:    55244 , remaining dirty pages:    55668
Iteration   9, duration:   2003 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:   107058 , remaining dirty pages:   108014
Iteration  10, duration:   2132 ms , transferred pages:    17825 (dup:       24, rd:    17801) , new dirty pages:   126214 , remaining dirty pages:   126214
Iteration  11, duration:   2229 ms , transferred pages:    18156 (dup:       22, rd:    18134) , new dirty pages:    65725 , remaining dirty pages:    65725
Iteration  12, duration:   2315 ms , transferred pages:    18651 (dup:       21, rd:    18630) , new dirty pages:    52575 , remaining dirty pages:    53903
Iteration  13, duration:   2147 ms , transferred pages:    17435 (dup:       16, rd:    17419) , new dirty pages:    46652 , remaining dirty pages:    47260
Iteration  14, duration:   2000 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:    42721 , remaining dirty pages:    43266
Iteration  15, duration:   1901 ms , transferred pages:    15552 (dup:       10, rd:    15542) , new dirty pages:    38593 , remaining dirty pages:    40792
Iteration  16, duration:   1801 ms , transferred pages:    14735 (dup:       11, rd:    14724) , new dirty pages:    54252 , remaining dirty pages:    55639
Iteration  17, duration:   1708 ms , transferred pages:    13860 (dup:        2, rd:    13858) , new dirty pages:    72379 , remaining dirty pages:    74170
Iteration  18, duration:   1923 ms , transferred pages:    15442 (dup:       12, rd:    15430) , new dirty pages:   101911 , remaining dirty pages:   103547
Iteration  19, duration:   2311 ms , transferred pages:    18823 (dup:        9, rd:    18814) , new dirty pages:    80534 , remaining dirty pages:    82521
Iteration  20, duration:   2081 ms , transferred pages:    17156 (dup:       34, rd:    17122) , new dirty pages:    36054 , remaining dirty pages:    36054
Iteration  21, duration:   1665 ms , transferred pages:    13777 (dup:       10, rd:    13767) , new dirty pages:    29624 , remaining dirty pages:    29624
Iteration  22, duration:   1657 ms , transferred pages:    13290 (dup:        7, rd:    13283) , new dirty pages:    25949 , remaining dirty pages:    28265
Iteration  23, duration:   1599 ms , transferred pages:    13088 (dup:        0, rd:    13088) , new dirty pages:    22356 , remaining dirty pages:    24813
Iteration  24, duration:   1500 ms , transferred pages:    12280 (dup:       10, rd:    12270) , new dirty pages:    21181 , remaining dirty pages:    22608
Iteration  25, duration:   1400 ms , transferred pages:    11457 (dup:        5, rd:    11452) , new dirty pages:    18657 , remaining dirty pages:    20311
Iteration  26, duration:   1200 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    15690 , remaining dirty pages:    17294
Iteration  27, duration:   1201 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    14810 , remaining dirty pages:    15936
Iteration  28, duration:   1000 ms , transferred pages:     8183 (dup:        3, rd:     8180) , new dirty pages:    15387 , remaining dirty pages:    16423
Iteration  29, duration:    900 ms , transferred pages:     7372 (dup:       10, rd:     7362) , new dirty pages:    13303 , remaining dirty pages:    15292
Iteration  30, duration:   1000 ms , transferred pages:     8181 (dup:        1, rd:     8180) , new dirty pages:    17879 , remaining dirty pages:    18457
Iteration  31, duration:    951 ms , transferred pages:     8140 (dup:        9, rd:     8131) , new dirty pages:    21738 , remaining dirty pages:    23304
Iteration  32, duration:    946 ms , transferred pages:     6946 (dup:        1, rd:     6945) , new dirty pages:    15815 , remaining dirty pages:    15815
Iteration  33, duration:    747 ms , transferred pages:     6192 (dup:        0, rd:     6192) , new dirty pages:     6249 , remaining dirty pages:     7670
Iteration  34, duration:    501 ms , transferred pages:     4090 (dup:        0, rd:     4090) , new dirty pages:     6163 , remaining dirty pages:     8422
Iteration  35, duration:    600 ms , transferred pages:     4910 (dup:        2, rd:     4908) , new dirty pages:     3673 , remaining dirty pages:     5222
Iteration  36, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2132 , remaining dirty pages:     4337
Iteration  37, duration:    200 ms , transferred pages:     1637 (dup:        1, rd:     1636) , new dirty pages:      544 , remaining dirty pages:     2251
Iteration  38, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     2251
total time: 97919 milliseconds

3. memcached
(1) XBZRLE 1024MB cache, begins to buffer pages from iteration 1
Iteration   1, duration:  40763 ms , transferred pages:  1570149 (dup:   404139, n:  1166010, x:        0) , new dirty pages:   526462 , remaining dirty pages:   533483
Iteration   2, duration:  15741 ms , transferred pages:   461867 (dup:     4070, n:   437704, x:    20093) , new dirty pages:   256841 , remaining dirty pages:   265501
Iteration   3, duration:   7874 ms , transferred pages:   231950 (dup:      280, n:   207569, x:    24101) , new dirty pages:   153526 , remaining dirty pages:   160865
Iteration   4, duration:   4260 ms , transferred pages:   135181 (dup:      135, n:   116768, x:    18278) , new dirty pages:   100298 , remaining dirty pages:   107278
Iteration   5, duration:   2506 ms , transferred pages:    87596 (dup:      180, n:    67600, x:    19816) , new dirty pages:    63685 , remaining dirty pages:    71790
Iteration   6, duration:   1373 ms , transferred pages:    51800 (dup:      128, n:    37336, x:    14336) , new dirty pages:    38785 , remaining dirty pages:    46064
Iteration   7, duration:    872 ms , transferred pages:    32015 (dup:       56, n:    23414, x:     8545) , new dirty pages:    23580 , remaining dirty pages:    31629
Iteration   8, duration:    527 ms , transferred pages:    21833 (dup:       40, n:    14372, x:     7421) , new dirty pages:    16624 , remaining dirty pages:    23482
Iteration   9, duration:    291 ms , transferred pages:    14917 (dup:       16, n:     6572, x:     8329) , new dirty pages:    10039 , remaining dirty pages:    16753
Iteration  10, duration:    113 ms , transferred pages:     6082 (dup:      111, n:     3300, x:     2671) , new dirty pages:     4081 , remaining dirty pages:    12703
Iteration  11, duration:    119 ms , transferred pages:     3970 (dup:       16, n:     2953, x:     1001) , new dirty pages:     3824 , remaining dirty pages:    11936
Iteration  12, duration:     51 ms , transferred pages:     3585 (dup:        0, n:     1154, x:     2431) , new dirty pages:     1711 , remaining dirty pages:     9900
Iteration  13, duration:     62 ms , transferred pages:     2945 (dup:        0, n:     1589, x:     1356) , new dirty pages:     1909 , remaining dirty pages:     8503
Iteration  14, duration:      2 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        1 , remaining dirty pages:     8504
total time: 74738 milliseconds

(2) XBZRLE 1024MB cache, begins to buffer pages from iteration 2
Iteration   1, duration:  40375 ms , transferred pages:  1570347 (dup:   415923, n:  1154424, x:        0) , new dirty pages:   511859 , remaining dirty pages:   518682
Iteration   2, duration:  17580 ms , transferred pages:   510145 (dup:     5970, n:   504175, x:        0) , new dirty pages:   291686 , remaining dirty pages:   300223
Iteration   3, duration:   8259 ms , transferred pages:   253656 (dup:      929, n:   230020, x:    22707) , new dirty pages:   166721 , remaining dirty pages:   174231
Iteration   4, duration:   4733 ms , transferred pages:   147925 (dup:      257, n:   132454, x:    15214) , new dirty pages:   103965 , remaining dirty pages:   111436
Iteration   5, duration:   2587 ms , transferred pages:    90734 (dup:      251, n:    70008, x:    20475) , new dirty pages:    61266 , remaining dirty pages:    69202
Iteration   6, duration:   1377 ms , transferred pages:    51416 (dup:       55, n:    37776, x:    13585) , new dirty pages:    45236 , remaining dirty pages:    52106
Iteration   7, duration:   1126 ms , transferred pages:    40020 (dup:      259, n:    30064, x:     9697) , new dirty pages:    28433 , remaining dirty pages:    35358
Iteration   8, duration:    574 ms , transferred pages:    23754 (dup:       40, n:    16066, x:     7648) , new dirty pages:    18067 , remaining dirty pages:    26353
Iteration   9, duration:    395 ms , transferred pages:    17607 (dup:       16, n:     9463, x:     8128) , new dirty pages:    11507 , remaining dirty pages:    18488
Iteration  10, duration:    171 ms , transferred pages:     8195 (dup:       40, n:     4726, x:     3429) , new dirty pages:     5482 , remaining dirty pages:    13898
Iteration  11, duration:    116 ms , transferred pages:     6594 (dup:       16, n:     2679, x:     3899) , new dirty pages:     3884 , remaining dirty pages:     9581
Iteration  12, duration:     54 ms , transferred pages:     1793 (dup:        0, n:     1634, x:      159) , new dirty pages:     1515 , remaining dirty pages:     9189
Iteration  13, duration:     62 ms , transferred pages:     1793 (dup:        0, n:     1643, x:      150) , new dirty pages:     1657 , remaining dirty pages:     8871
Iteration  14, duration:      3 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        1 , remaining dirty pages:     8872
total time: 77578 milliseconds

(3) SHA1
Iteration   1, duration:  40664 ms , transferred pages:  1569037 (dup:   405940, rd:  1163097) , new dirty pages:   506846 , remaining dirty pages:   514979
Iteration   2, duration:   8032 ms , transferred pages:   161130 (dup:     4007, rd:   157123) , new dirty pages:   153479 , remaining dirty pages:   153479
Iteration   3, duration:   2620 ms , transferred pages:    65260 (dup:       20, rd:    65240) , new dirty pages:    64014 , remaining dirty pages:    67100
Iteration   4, duration:   1160 ms , transferred pages:    30227 (dup:       60, rd:    30167) , new dirty pages:    34031 , remaining dirty pages:    41414
Iteration   5, duration:    648 ms , transferred pages:    18700 (dup:       56, rd:    18644) , new dirty pages:    18375 , remaining dirty pages:    25536
Iteration   6, duration:    389 ms , transferred pages:    11399 (dup:       55, rd:    11344) , new dirty pages:    12536 , remaining dirty pages:    17516
Iteration   7, duration:    292 ms , transferred pages:     8197 (dup:        0, rd:     8197) , new dirty pages:     8387 , remaining dirty pages:    16802
Iteration   8, duration:    171 ms , transferred pages:     4931 (dup:       39, rd:     4892) , new dirty pages:     6182 , remaining dirty pages:    14060
Iteration   9, duration:    163 ms , transferred pages:     4355 (dup:       16, rd:     4339) , new dirty pages:     5530 , remaining dirty pages:    11973
Iteration  10, duration:    104 ms , transferred pages:     3266 (dup:        0, rd:     3266) , new dirty pages:     2893 , remaining dirty pages:    11014
Iteration  11, duration:     52 ms , transferred pages:     1153 (dup:        0, rd:     1153) , new dirty pages:     1586 , remaining dirty pages:    10516
Iteration  12, duration:     52 ms , transferred pages:     1921 (dup:       39, rd:     1882) , new dirty pages:     1619 , remaining dirty pages:     8842
Iteration  13, duration:     62 ms , transferred pages:     1537 (dup:        0, rd:     1537) , new dirty pages:     2052 , remaining dirty pages:     8871
Iteration  14, duration:     58 ms , transferred pages:     1665 (dup:        0, rd:     1665) , new dirty pages:     1947 , remaining dirty pages:     7989
Iteration  15, duration:      2 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     7989
total time: 54693 milliseconds
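
To compare runs without reading every line, the per-iteration logs above can
be summed with a small script. This is only a sketch: the regular expression
is written against the line format shown in this mail, and the name summarize
is made up for illustration.

```python
import re

# Matches the leading fields of one "Iteration ..." line as printed above.
LINE_RE = re.compile(
    r"Iteration\s+(\d+), duration:\s+(\d+) ms , transferred pages:\s+(\d+)"
)


def summarize(log_text):
    """Return (iterations, total_duration_ms, total_transferred_pages)."""
    rows = [tuple(map(int, m.groups())) for m in LINE_RE.finditer(log_text)]
    return len(rows), sum(r[1] for r in rows), sum(r[2] for r in rows)
```

Feeding it one run at a time gives the iteration count, the summed iteration
time, and the total number of pages put on the wire for that run.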


* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-07 13:52                   ` Chunguang Li
@ 2016-11-07 14:17                     ` Li, Liang Z
  2016-11-08  5:27                       ` Chunguang Li
  2016-11-07 14:44                     ` Li, Liang Z
  1 sibling, 1 reply; 21+ messages in thread
From: Li, Liang Z @ 2016-11-07 14:17 UTC (permalink / raw)
  To: Chunguang Li
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela

> > > > > > > > > I think this is "very" wasteful. Assume the workload
> > > > > > > > > dirties pages randomly within the guest address space, and
> > > > > > > > > the transfer speed is constant. Intuitively, I think nearly
> > > > > > > > > half of the dirty pages produced in Iteration 1 are not
> > > > > > > > > really dirty. This means Iteration 2 takes twice as long as
> > > > > > > > > it would if only the really dirty pages were sent.
> > > > > > > >
> > > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > > kinds of workloads get impacted the most?  That would also
> > > > > > > > help us to figure out what kinds of speed improvements we
> > > > > > > > can expect.
> > > > > > > >
> > > > > > > >
> > > > > > > > 		Amit
> > > > > > >
> > > > > > > I have picked 6 workloads and collected the following
> > > > > > > statistics for every iteration (except the last stop-copy
> > > > > > > one) during precopy.
> > > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > > without capabilities like xbzrle or compression.
> > > > > > > The network for the migration is exclusive, with a separate
> > > > > > > network for the workloads.
> > > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > > >
> > > > > > > Three of them (booting, idle, web server) converged to the
> > > > > > > stop-copy phase with the given bandwidth and default downtime
> > > > > > > (300ms), while the other three (kernel compilation, zeusmp,
> > > > > > > memcached) did not.
> > > > > > >
> > > > > > > A page is "not-really-dirty" if it is written first and
> > > > > > > sent later (and not written again after that) during one
> > > > > > > iteration. I guess this happens less often during the other
> > > > > > > iterations than during the 1st iteration, because all
> > > > > > > the pages of the VM are sent to the dest node during the 1st
> > > > > > > iteration, while during the others only part of the pages
> > > > > > > are sent.
> > > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > > mainly during the 1st iteration, and maybe very few during
> > > > > > > the other iterations.
> > > > > > >
> > > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > > halved. This is a chain reaction, because the number of dirty
> > > > > > > pages produced during Iteration 2 is then halved, which means
> > > > > > > the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > > >
> > > > > > Yes; these numbers don't show how many of them are false dirty
> > > > > > though.
> > > > > >
> > > > > > One problem is thinking about pages that have been redirtied,
> > > > > > if the page is dirtied after the sync but before the network
> > > > > > write then it's the false-dirty that you're describing.
> > > > > >
> > > > > > However, if the page is being written a few times, and so it
> > > > > > would have been written after the network write then it isn't
> > > > > > a false-dirty.
> > > > > >
> > > > > > You might be able to figure that out with some kernel tracing
> > > > > > of when the dirtying happens, but it might be easier to write
> > > > > > the fix!
> > > > > >
> > > > > > Dave
> > > > >
> > > > > Hi, I have made some new progress now.
> > > > >
> > > > > To tell exactly how many false dirty pages there are in each
> > > > > iteration, I malloc a buffer as big as the whole VM memory.
> > > > > When a page is transferred to the dest node, it is copied to the
> > > > > buffer. During the next iteration, when a page is about to be
> > > > > transferred, it is compared to the old copy in the buffer, and
> > > > > the old copy is replaced for the next comparison if the page is
> > > > > really dirty.
> > > > > Thus, we are now able to get the exact number of false dirty pages.
> > > > >
> > > > > This time, I use 15 workloads to get the statistics. They are:
> > > > >
> > > > >   1. 11 benchmarks picked from the cpu2006 benchmark suite. They
> > > > >      are all scientific computing workloads like Quantum
> > > > >      Chromodynamics, Fluid Dynamics, etc. I picked these 11
> > > > >      benchmarks because, compared to the others, they have bigger
> > > > >      memory occupation and higher memory dirty rates. Thus most of
> > > > >      them could not converge to stop-and-copy using the default
> > > > >      migration speed (32MB/s).
> > > > >   2. kernel compilation
> > > > >   3. idle VM
> > > > >   4. Apache web server which serves static content
> > > > >
> > > > >   (the above workloads all run in a VM with 1 vcpu and 1GB
> > > > >    memory, and the migration speed is the default 32MB/s)
> > > > >
> > > > >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB
> > > > >      are used as the cache. After filling up the 4GB cache, a
> > > > >      client writes to the cache at a constant speed during
> > > > >      migration. This time, the migration speed has no limit, and
> > > > >      is bounded only by the capability of 1Gbps Ethernet.
> > > > >
> > > > > To summarize the results first (you can read the precise numbers
> > > > > below):
> > > > >
> > > > >   1. 4 of these 15 workloads have a big proportion (>60%, even
> > > > >      >80% during some iterations) of false dirty pages out of all
> > > > >      the dirty pages from iteration 2 onward (and the big
> > > > >      proportion persists during the following iterations). They
> > > > >      are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > >   2. 2 workloads (idle, webserver) spend most of the migration
> > > > >      time on iteration 1. Even though the proportion of false
> > > > >      dirty pages is big from iteration 2 onward, the room for
> > > > >      optimization is small.
> > > > >   3. 1 workload (kernel compilation) only has a big proportion
> > > > >      during iteration 2, not in the other iterations.
> > > > >   4. 8 workloads (the other 8 benchmarks of cpu2006) have a small
> > > > >      proportion of false dirty pages from iteration 2 onward. So
> > > > >      the room for optimization for them is small.
> > > > >
> > > > > Now I want to talk a little more about the reasons why false
> > > > > dirty pages are produced.
> > > > > The first reason is what we have discussed before---the
> > > > > mechanism to track the dirty pages.
> > > > > And then I come up with another reason. Here is the situation: a
> > > > > write operation to one memory page happens, but it doesn't
> > > > > change any content of the page. So it's "write but not dirty",
> > > > > and kernel still marks it as dirty. One guy in our lab has done
> > > > > some experiments to figure out the proportion of "write but not
> dirty"
> > > > > operations, and he uses the cpu2006 benchmark suit. According to
> > > > > his results, general workloads has a little proportion (<10%) of
> > > > > "write but not dirty" out of all the write operations, while few
> > > > > workloads have higher proportion (one even as high as 50%). Now
> > > > > we are not sure why "write but not dirty" would happen, it just
> happened.
> > > > >
> > > > > So these two reasons contribute to the false dirty pages. To
> > > > > optimize, I compute and store the SHA1 hash before transferring
> > > > > each page. Next time, if one page needs retransmission, its
> > > > > SHA1 hash is computed again, and compared to the old hash. If
> > > > > the hash is the same, it's a false dirty page, and we just skip
> > > > > this page; Otherwise, the page is transferred, and the new hash
> > > > > replaces the old one for next comparison.
> > > > > The reason to use SHA1 hash but not byte-by-byte comparison is
> > > > > the memory overheads. One SHA1 hash is 20 bytes. So we need
> > > > > extra
> > > > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > > > relatively small.
> > > > > As far as I know, SHA1 hash is widely used in the scenes of
> > > > > deduplication for backup systems.
> > > > > They have proven that the probability of hash collision is far
> > > > > smaller than disk hardware fault, so it's secure hash, that is,
> > > > > if the hashes of two chunks are the same, the content must be the
> same.
> > > > > So I think the SHA1 hash could replace byte-to-byte comparison
> > > > > in the VM memory scenery.
> > > > >
> > > > > Then I do the same migration experiments using the SHA1 hash.
> > > > > For the 4 workloads which have big proportions of false dirty
> > > > > pages, the improvement is remarkable. Without optimization, they
> > > > > either can not converge to stop-and-copy, or take a very long time to
> complete.
> > > > > With the
> > > > > SHA1 hash method, all of them now complete in a relatively short
> time.
> > > > > For the reason I have talked above, the other workloads don't
> > > > > get notable improvements from the optimization. So below, I only
> > > > > show the exact number after optimization for the 4 workloads
> > > > > with remarkable improvements.
> > > > >
> > > > > Any comments or suggestions?
> > > >
> > > > Maybe you can compare the performance of your solution as that of
> > > XBZRLE to see which one is better.
> > > > The merit of using SHA1 is that it can avoid data copy as that in
> > > > XBZRLE, and
> > > need less buffer.
> > > > How about the overhead of calculating the SHA1? Is it faster than
> > > > copying a
> > > page?
> > > >
> > > > Liang
> > > >
> > > >
> > >
> > > Yes, XBZRLE is able to handle the false dirty pages. However, if we
> > > want to avoid transferring all of the false dirty pages using
> > > XBZRLE, we need a buffer as big as the whole VM memory, while SHA1
> > > needs a much small buffer. Of course, if we have a buffer as big as
> > > the whole VM memory using XBZRLE, we could transfer less data on
> > > network than SHA1, because XBZRLE is able to compress similar pages.
> > > In a word, yes, the merit of using SHA1 is that it needs much less
> > > buffer, and leads to nice improvement if there are many false dirty pages.
> > >
> >
> > The current implementation of XBZRLE begins to buffer page from the
> > second iteration, Maybe it's worth to make it start to work from the first
> iteration based on your finding.
> >
> > > In terms of the overhead of calculating the SHA1 compared with
> > > transferring a page, it's related to the CPU and network
> > > performance. In my test environment(Intel Xeon
> > > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra
> > > computing overhead caused by calculating the SHA1, because the
> > > throughput of network (got by "info migrate") remains almost the same.
> >
> > You can check the CPU usage, or to measure the time spend on a local
> > live migration  which use SHA1/ XBZRLE.
> >
> > Liang
> >
> >
> 
> I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> 1. Begins to buffer pages from iteration 1; 2. As current implementation,
> begins to buffer pages from iteration 2.
> 
> I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> memcached.
> I set the cache size as 256MB for zeusmp & mcf (they run in VM with 1GB
> ram), and set the cache size as 1GB for memcached (it run in VM with 6GB
> ram, and memcached takes 4GB as cache).
> 
> As you can read from the data below, beginning to buffer pages from
> iteration 1 is better than the current implementation(from iteration 2),
> because the total migration time is shorter.
> 
> SHA1 is better than the XBZRLE with the cache size I choose, because it leads
> to shorter migration time, and consumes far less memory overhead (<1/200
> of the total VM memory).
> 

Hi Chunguang,

Have you tried a large XBZRLE cache, equal in size to the guest's RAM?
Is SHA1 still faster in that case?

Thanks!
Liang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-07 13:52                   ` Chunguang Li
  2016-11-07 14:17                     ` Li, Liang Z
@ 2016-11-07 14:44                     ` Li, Liang Z
  1 sibling, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-07 14:44 UTC (permalink / raw)
  To: Chunguang Li
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela

> > I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> > 1. Begins to buffer pages from iteration 1; 2. As current
> > implementation, begins to buffer pages from iteration 2.
> >
> > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> > memcached.
> > I set the cache size as 256MB for zeusmp & mcf (they run in VM with
> > 1GB ram), and set the cache size as 1GB for memcached (it run in VM
> > with 6GB ram, and memcached takes 4GB as cache).
> >
> > As you can read from the data below, beginning to buffer pages from
> > iteration 1 is better than the current implementation(from iteration
> > 2), because the total migration time is shorter.
> >
> > SHA1 is better than the XBZRLE with the cache size I choose, because
> > it leads to shorter migration time, and consumes far less memory
> > overhead (<1/200 of the total VM memory).
> >
> 
> Hi Chunguang,
> 
> Have you tried to use a large XBZRLE cache size which equals to the guest's
> RAM size?
> Is SHA1 faster in that case?
> 
> Thanks!
> Liang

Intel's future chipsets will contain hardware engines that support SHA-x and MD5.
We can make use of these engines to offload the SHA/MD5 calculation from the CPU.

Liang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-07 14:17                     ` Li, Liang Z
@ 2016-11-08  5:27                       ` Chunguang Li
  0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08  5:27 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
	stefanha, quintela




> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Monday, November 7, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> > > > > > > > > > > I think this is "very" wasteful. Assume the workload
> > > > > > > > > > > writes the pages dirty randomly within the guest
> > > > > > > > > > > address space, and the transfer speed is constant.
> > > > > > > > > > > Intuitively, I think nearly half of the dirty pages
> > > > > > > > > > > produced in Iteration 1 is not really dirty. This
> > > > > > > > > > > means the time of Iteration 2 is double of that to
> > > > > > > > > > > send only really dirty pages.
> > > > > > > > >
> > > > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > > > kinds of workloads get impacted the most?  That would also
> > > > > > > > > help us to figure out what kinds of speed improvements we
> > > > > > > > > can
> > > > expect.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 		Amit
> > > > > > > >
> > > > > > > > I have picked up 6 workloads and got the following
> > > > > > > > statistics numbers of every iteration (except the last
> > > > > > > > stop-copy one) during
> > > > precopy.
> > > > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > > > without the capabilities like xbzrle or compression, etc.
> > > > > > > > The network for the migration is exclusive, with a separate
> > > > > > > > network for
> > > > the workloads.
> > > > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > > > >
> > > > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > > > stop-copy
> > > > > > phase,
> > > > > > > > with the given bandwidth and default downtime (300ms), while
> > > > > > > > the other three (kernel compilation, zeusmp, memcached) did not.
> > > > > > > >
> > > > > > > > One page is "not-really-dirty", if it is written first and
> > > > > > > > is sent later (and not written again after that) during one
> > > > > > > > iteration. I guess this would not happen so often during the
> > > > > > > > other iterations as during the 1st iteration. Because all
> > > > > > > > the pages of the VM are sent to the dest node
> > > > > > during
> > > > > > > > the 1st iteration, while during the others, only part of the
> > > > > > > > pages are
> > > > sent.
> > > > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > > > mainly during the 1st iteration , and maybe very little
> > > > > > > > during the other
> > > > iterations.
> > > > > > > >
> > > > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > > > halved. This is a chain
> > > > > > reaction,
> > > > > > > > because the dirty pages produced during Iteration 2 is
> > > > > > > > halved, which
> > > > > > incurs
> > > > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > > > >
> > > > > > > Yes; these numbers don't show how many of them are false dirty
> > > > though.
> > > > > > >
> > > > > > > One problem is thinking about pages that have been redirtied,
> > > > > > > if the page is
> > > > > > dirtied
> > > > > > > after the sync but before the network write then it's the
> > > > > > > false-dirty that you're describing.
> > > > > > >
> > > > > > > However, if the page is being written a few times, and so it
> > > > > > > would have
> > > > > > been written
> > > > > > > after the network write then it isn't a false-dirty.
> > > > > > >
> > > > > > > You might be able to figure that out with some kernel tracing
> > > > > > > of when the
> > > > > > dirtying
> > > > > > > happens, but it might be easier to write the fix!
> > > > > > >
> > > > > > > Dave
> > > > > >
> > > > > > Hi, I have made some new progress now.
> > > > > >
> > > > > > To tell how many false dirty pages there are exactly in each
> > > > > > iteration, I malloc a buffer in memory as big as the size of the
> > > > > > whole VM memory. When a page is transferred to the dest node, it
> > > > > > is copied to the buffer; During the next iteration, if one page
> > > > > > is transferred, it is compared to the old one in the buffer, and
> > > > > > the old one will be replaced for next comparison if it is really dirty.
> > > > > > Thus, we are now able to get the exact number of false dirty pages.
> > > > > >
> > > > > > This time, I use 15 workloads to get the statistic number. They are:
> > > > > >
> > > > > > >   1. 11 benchmarks picked from the cpu2006 benchmark suite. They
> > > > > > >      are all scientific computing workloads like Quantum
> > > > > > >      Chromodynamics, Fluid Dynamics, etc. I pick these 11
> > > > > > >      benchmarks because, compared to the others, they have a bigger
> > > > > > >      memory occupation and a higher memory dirty rate. Thus most of
> > > > > > >      them could not converge to stop-and-copy using the default
> > > > > > >      migration speed (32MB/s).
> > > > > > >   2. kernel compilation
> > > > > > >   3. idle VM
> > > > > > >   4. Apache web server which serves static content
> > > > > > >
> > > > > > >   (the above workloads are all running in a VM with 1 vcpu and 1GB
> > > > > > >    memory, and the migration speed is the default 32MB/s)
> > > > > > >
> > > > > > >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > > > > >      used as the cache. After filling up the 4GB cache, a client
> > > > > > >      writes the cache at a constant speed during migration. This
> > > > > > >      time, migration speed has no limit, and is up to the
> > > > > > >      capability of 1Gbps Ethernet.
> > > > > >
> > > > > > Summarize the results first: (and you can read the precise
> > > > > > number
> > > > > > below)
> > > > > >
> > > > > > >   1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > > > > >      during some iterations) of false dirty pages out of all the
> > > > > > >      dirty pages since iteration 2 (and the big proportion lasts
> > > > > > >      during the following iterations). They are cpu2006.zeusmp,
> > > > > > >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > > > >   2. 2 workloads (idle, webserver) spend most of the migration time
> > > > > > >      on iteration 1; even though the proportion of false dirty
> > > > > > >      pages is big since iteration 2, the space to optimize is small.
> > > > > > >   3. 1 workload (kernel compilation) only has a big proportion
> > > > > > >      during iteration 2, not in the other iterations.
> > > > > > >   4. 8 workloads (the other 8 benchmarks of cpu2006) have a small
> > > > > > >      proportion of false dirty pages since iteration 2. So the
> > > > > > >      space to optimize for them is small.
> > > > > >
> > > > > > Now I want to talk a little more about the reasons why false
> > > > > > dirty pages are produced.
> > > > > > The first reason is what we have discussed before---the
> > > > > > mechanism to track the dirty pages.
> > > > > > And then I come up with another reason. Here is the situation: a
> > > > > > write operation to one memory page happens, but it doesn't
> > > > > > change any content of the page. So it's "write but not dirty",
> > > > > > and kernel still marks it as dirty. One guy in our lab has done
> > > > > > some experiments to figure out the proportion of "write but not
> > dirty"
> > > > > > operations, and he uses the cpu2006 benchmark suit. According to
> > > > > > his results, general workloads has a little proportion (<10%) of
> > > > > > "write but not dirty" out of all the write operations, while few
> > > > > > workloads have higher proportion (one even as high as 50%). Now
> > > > > > we are not sure why "write but not dirty" would happen, it just
> > happened.
> > > > > >
> > > > > > So these two reasons contribute to the false dirty pages. To
> > > > > > optimize, I compute and store the SHA1 hash before transferring
> > > > > > each page. Next time, if one page needs retransmission, its
> > > > > > SHA1 hash is computed again, and compared to the old hash. If
> > > > > > the hash is the same, it's a false dirty page, and we just skip
> > > > > > this page; Otherwise, the page is transferred, and the new hash
> > > > > > replaces the old one for next comparison.
> > > > > > The reason to use SHA1 hash but not byte-by-byte comparison is
> > > > > > the memory overheads. One SHA1 hash is 20 bytes. So we need
> > > > > > extra
> > > > > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > > > > relatively small.
> > > > > > As far as I know, SHA1 hash is widely used in the scenes of
> > > > > > deduplication for backup systems.
> > > > > > They have proven that the probability of hash collision is far
> > > > > > smaller than disk hardware fault, so it's secure hash, that is,
> > > > > > if the hashes of two chunks are the same, the content must be the
> > same.
> > > > > > So I think the SHA1 hash could replace byte-to-byte comparison
> > > > > > in the VM memory scenery.
> > > > > >
> > > > > > Then I do the same migration experiments using the SHA1 hash.
> > > > > > For the 4 workloads which have big proportions of false dirty
> > > > > > pages, the improvement is remarkable. Without optimization, they
> > > > > > either can not converge to stop-and-copy, or take a very long time to
> > complete.
> > > > > > With the
> > > > > > SHA1 hash method, all of them now complete in a relatively short
> > time.
> > > > > > For the reason I have talked above, the other workloads don't
> > > > > > get notable improvements from the optimization. So below, I only
> > > > > > show the exact number after optimization for the 4 workloads
> > > > > > with remarkable improvements.
> > > > > >
> > > > > > Any comments or suggestions?
> > > > >
> > > > > Maybe you can compare the performance of your solution as that of
> > > > XBZRLE to see which one is better.
> > > > > The merit of using SHA1 is that it can avoid data copy as that in
> > > > > XBZRLE, and
> > > > need less buffer.
> > > > > How about the overhead of calculating the SHA1? Is it faster than
> > > > > copying a
> > > > page?
> > > > >
> > > > > Liang
> > > > >
> > > > >
> > > >
> > > > Yes, XBZRLE is able to handle the false dirty pages. However, if we
> > > > want to avoid transferring all of the false dirty pages using
> > > > XBZRLE, we need a buffer as big as the whole VM memory, while SHA1
> > > > needs a much small buffer. Of course, if we have a buffer as big as
> > > > the whole VM memory using XBZRLE, we could transfer less data on
> > > > network than SHA1, because XBZRLE is able to compress similar pages.
> > > > In a word, yes, the merit of using SHA1 is that it needs much less
> > > > buffer, and leads to nice improvement if there are many false dirty pages.
> > > >
> > >
> > > The current implementation of XBZRLE begins to buffer page from the
> > > second iteration, Maybe it's worth to make it start to work from the first
> > iteration based on your finding.
> > >
> > > > In terms of the overhead of calculating the SHA1 compared with
> > > > transferring a page, it's related to the CPU and network
> > > > performance. In my test environment(Intel Xeon
> > > > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra
> > > > computing overhead caused by calculating the SHA1, because the
> > > > throughput of network (got by "info migrate") remains almost the same.
> > >
> > > You can check the CPU usage, or to measure the time spend on a local
> > > live migration  which use SHA1/ XBZRLE.
> > >
> > > Liang
> > >
> > >
> > 
> > I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> > 1. Begins to buffer pages from iteration 1; 2. As current implementation,
> > begins to buffer pages from iteration 2.
> > 
> > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> > memcached.
> > I set the cache size as 256MB for zeusmp & mcf (they run in VM with 1GB
> > ram), and set the cache size as 1GB for memcached (it run in VM with 6GB
> > ram, and memcached takes 4GB as cache).
> > 
> > As you can read from the data below, beginning to buffer pages from
> > iteration 1 is better than the current implementation(from iteration 2),
> > because the total migration time is shorter.
> > 
> > SHA1 is better than the XBZRLE with the cache size I choose, because it leads
> > to shorter migration time, and consumes far less memory overhead (<1/200
> > of the total VM memory).
> > 
> 
> Hi Chunguang,
> 
> Have you tried to use a large XBZRLE cache size which equals to the guest's RAM size?
> Is SHA1 faster in that case?
> 
> Thanks!
> Liang

You can check the data below. For zeusmp and mcf, when the XBZRLE cache size equals
the guest's RAM size (in fact, the 1024MB cache is a little smaller than the RAM
size, because the guest has a little extra RAM besides the 1GB we set),
XBZRLE is faster than SHA1.
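
For reference when reading the comparison, the SHA1 scheme described earlier in this thread (store a hash when a page is sent, skip the resend when the hash is unchanged) could be sketched roughly as below. This is only an illustration in Python, not the code used for the measurements: the class name `FalseDirtyFilter`, the method `should_send`, and the dictionary keyed by page index are assumptions made for this sketch.

```python
import hashlib

PAGE_SIZE = 4096                          # bytes per guest page, as in the 20/4096 estimate
SHA1_SIZE = hashlib.sha1().digest_size    # 20 bytes per stored hash


class FalseDirtyFilter:
    """Hypothetical helper: remembers the SHA1 of each page as it was last sent."""

    def __init__(self):
        # page index -> SHA1 digest of the content most recently transferred
        self.sent_hash = {}

    def should_send(self, page_index, data):
        """Return False when a page marked dirty has the same content as last time."""
        digest = hashlib.sha1(data).digest()
        if self.sent_hash.get(page_index) == digest:
            return False                  # false dirty page: skip the resend
        self.sent_hash[page_index] = digest
        return True


def hash_overhead_fraction():
    """Extra memory for the hashes, as a fraction of guest memory."""
    return SHA1_SIZE / PAGE_SIZE
```

With 4096-byte pages, `hash_overhead_fraction()` evaluates 20/4096, which is the "<1/200" overhead quoted above. A page is only treated as unchanged if its hash matches, so the sketch inherits the same hash-collision assumption discussed earlier in the thread.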

For memcached, I am not able to set the cache size to the 6GB RAM size, because the
cache size has to be a power of 2, and I am not able to set it larger than the RAM size,
because the current implementation doesn't allow that. So I set the cache size to 4GB;
XBZRLE with this cache size takes almost the same migration time as SHA1.
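
The two constraints mentioned here (power of 2, not larger than RAM) pin down the largest usable cache size. A small Python sketch of that arithmetic follows; the function name `largest_pow2_cache_mb` is made up for this illustration and it does not call any QEMU interface.

```python
def largest_pow2_cache_mb(ram_mb):
    """Largest power-of-two size in MB that does not exceed ram_mb."""
    if ram_mb < 1:
        raise ValueError("ram_mb must be at least 1")
    # bit_length() - 1 is the exponent of the highest set bit
    return 1 << (ram_mb.bit_length() - 1)


# 6GB guest, as in the memcached run
print(largest_pow2_cache_mb(6 * 1024))
```

For the 6GB memcached guest this gives 4096MB, which matches the 4GB cache used in the run below.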

Note that in these runs XBZRLE begins to buffer pages from iteration 1.

zeusmp 1024MB cache

Iteration   1, duration:  21604 ms , transferred pages:   266450 (dup:    89509, n:   176941, x:        0) , new dirty pages:   129647 , remaining dirty pages:   129647
Iteration   2, duration:    652 ms , transferred pages:    89270 (dup:    78176, n:     1085, x:    10009) , new dirty pages:    46438 , remaining dirty pages:    46438
Iteration   3, duration:    400 ms , transferred pages:    35789 (dup:    30536, n:        0, x:     5253) , new dirty pages:    33569 , remaining dirty pages:    33569
Iteration   4, duration:    470 ms , transferred pages:    19106 (dup:    10317, n:       75, x:     8714) , new dirty pages:    39307 , remaining dirty pages:    39307
Iteration   5, duration:     72 ms , transferred pages:    17853 (dup:    15904, n:        0, x:     1949) , new dirty pages:     4078 , remaining dirty pages:     4078
Iteration   6, duration:     10 ms , transferred pages:     3280 (dup:     2910, n:        0, x:      370) , new dirty pages:      521 , remaining dirty pages:      521
Iteration   7, duration:    254 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        0 , remaining dirty pages:      521
total time: 23481 milliseconds  (vs. 27225 milliseconds for SHA1)

mcf 1024MB cache

Iteration   1, duration:  31704 ms , transferred pages:   266450 (dup:     6794, n:   259656, x:        0) , new dirty pages:   233250 , remaining dirty pages:   233250
Iteration   2, duration:    544 ms , transferred pages:    34186 (dup:      182, n:      423, x:    33581) , new dirty pages:    32757 , remaining dirty pages:    32757
Iteration   3, duration:     67 ms , transferred pages:     8536 (dup:        0, n:        0, x:     8536) , new dirty pages:     5305 , remaining dirty pages:     5305
Iteration   4, duration:     13 ms , transferred pages:     2125 (dup:        0, n:        0, x:     2125) , new dirty pages:     1632 , remaining dirty pages:     1632
Iteration   5, duration:      9 ms , transferred pages:     1038 (dup:        0, n:        0, x:     1038) , new dirty pages:     1095 , remaining dirty pages:     1095
Iteration   6, duration:      3 ms , transferred pages:      592 (dup:        0, n:        0, x:      592) , new dirty pages:     1148 , remaining dirty pages:     1148
Iteration   7, duration:      2 ms , transferred pages:      136 (dup:        0, n:        0, x:      136) , new dirty pages:     1123 , remaining dirty pages:     1123
Iteration   8, duration:      2 ms , transferred pages:        2 (dup:        0, n:        0, x:        2) , new dirty pages:      985 , remaining dirty pages:      985
Iteration   9, duration:      2 ms , transferred pages:       14 (dup:        0, n:        0, x:       14) , new dirty pages:      640 , remaining dirty pages:      640
Iteration  10, duration:      2 ms , transferred pages:       16 (dup:        0, n:        0, x:       16) , new dirty pages:      622 , remaining dirty pages:      622
Iteration  11, duration:      1 ms , transferred pages:        1 (dup:        0, n:        0, x:        1) , new dirty pages:      693 , remaining dirty pages:      693
Iteration  12, duration:      1 ms , transferred pages:      122 (dup:        0, n:        0, x:      122) , new dirty pages:      639 , remaining dirty pages:      639
Iteration  13, duration:      2 ms , transferred pages:      475 (dup:        0, n:        0, x:      475) , new dirty pages:      522 , remaining dirty pages:      522
Iteration  14, duration:     22 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:       27 , remaining dirty pages:      549
total time: 32393 milliseconds  (vs. 97919 milliseconds for SHA1)

memcached 4096MB cache

Iteration   1, duration:  41025 ms , transferred pages:  1569059 (dup:   395085, n:  1173974, x:        0) , new dirty pages:   560788 , remaining dirty pages:   568899
Iteration   2, duration:   8218 ms , transferred pages:   300889 (dup:     3963, n:   142928, x:   153998) , new dirty pages:   158832 , remaining dirty pages:   167022
Iteration   3, duration:   2408 ms , transferred pages:    98923 (dup:      285, n:    33854, x:    64784) , new dirty pages:    68647 , remaining dirty pages:    77338
Iteration   4, duration:    869 ms , transferred pages:    43408 (dup:       64, n:    17911, x:    25433) , new dirty pages:    26087 , remaining dirty pages:    33845
Iteration   5, duration:    455 ms , transferred pages:    23048 (dup:       55, n:    10156, x:    12837) , new dirty pages:    15275 , remaining dirty pages:    16636
Iteration   6, duration:    162 ms , transferred pages:     7939 (dup:       55, n:     2425, x:     5459) , new dirty pages:     6009 , remaining dirty pages:    10051
Iteration   7, duration:     52 ms , transferred pages:     5761 (dup:      212, n:      707, x:     4842) , new dirty pages:     2204 , remaining dirty pages:     4027
Iteration   8, duration:      1 ms , transferred pages:        0 (dup:        0, n:        0, x:        0) , new dirty pages:        0 , remaining dirty pages:     4027
total time: 53255 milliseconds  (vs. 54693 milliseconds for SHA1)
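
For anyone who wants to post-process these numbers, the per-iteration lines above can be parsed with a short Python sketch like the one below. The regular expression is written against the line format as printed in this message. The helper names `parse_iteration` and `is_consistent` are made up for illustration. The sketch assumes that `dup`, `n` and `x` are the three categories that add up to the transferred page count; the message does not spell out what the labels mean.

```python
import re

# Pattern for one "Iteration ..." line in the format printed above.
LINE_RE = re.compile(
    r"Iteration\s+(?P<iteration>\d+),\s*duration:\s*(?P<duration_ms>\d+)\s*ms\s*,"
    r"\s*transferred pages:\s*(?P<transferred>\d+)\s*"
    r"\(dup:\s*(?P<dup>\d+),\s*n:\s*(?P<n>\d+),\s*x:\s*(?P<x>\d+)\)\s*,"
    r"\s*new dirty pages:\s*(?P<new_dirty>\d+)\s*,"
    r"\s*remaining dirty pages:\s*(?P<remaining>\d+)"
)


def parse_iteration(line):
    """Return the numeric fields of one iteration line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def is_consistent(row):
    """Check that the dup/n/x breakdown adds up to the transferred page count."""
    return row["dup"] + row["n"] + row["x"] == row["transferred"]
```

Lines that are not iteration records, such as the "total time" lines, return `None` and can simply be skipped.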

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03  8:25           ` Chunguang Li
  2016-11-03  9:59             ` Li, Liang Z
  2016-11-03 10:13             ` Li, Liang Z
@ 2016-11-08 11:05             ` Dr. David Alan Gilbert
  2016-11-08 13:40               ` Chunguang Li
  2 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-11-08 11:05 UTC (permalink / raw)
  To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> 
> 
> 
> > -----Original Messages-----
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Sent Time: Friday, October 14, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > 
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > 
> > > 
> > > 
> > > > -----Original Messages-----
> > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > Sent Time: Friday, September 30, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > 
> > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > -----Original Messages-----
> > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > Sent Time: Monday, September 26, 2016
> > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > > 
> > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > Hi all!
> > > > > > > I have some confusion about the dirty bitmap during migration. I have digged into the code. I figure out that every now and then during migration, the dirty bitmap will be grabbed from the kernel space through ioctl(KVM_GET_DIRTY_LOG), and then be used to update qemu's dirty bitmap. However I think this mechanism leads to resendness of some NON-dirty pages.
> > > > > > > 
> > > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that during the migration setup, the ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When the pages "that haven't been sent" are written, the kernel space marks them as dirty. However I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark the pages which have already been sent during one iteration as dirty when they are written.
> > > > > > > 
> > > > > > > 
> > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > > 
> > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > case you're talking about is:
> > > > > > 
> > > > > >   Iteration 1
> > > > > >     sync bitmap
> > > > > >     start sending pages
> > > > > >     page 'n' is modified - but hasn't been sent yet
> > > > > >     page 'n' gets sent
> > > > > >   Iteration 2
> > > > > >     sync bitmap
> > > > > >        'page n is shown as modified'
> > > > > >     send page 'n' again
> > > > > >
> > > > > 
> > > > > Yes, this is exactly the case I am talking about.
> > > > >  
> > > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > is large.
> > > > > 
> > > > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 is not really dirty. This means the time of Iteration 2 is double of that to send only really dirty pages.
> > > > 
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most?  That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > > 
> > > > 
> > > > 		Amit
> > > 
> > > I have picked up 6 workloads and got the following statistics numbers 
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without 
> > > the capabilities like xbzrle or compression, etc. The network for the 
> > > migration is exclusive, with a separate network for the workloads. 
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > 
> > > Three (booting, idle, web server) of them converged to the stop-copy phase, 
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > > 
> > > A page is "not-really-dirty" if it is written first and sent later
> > > (and not written again after that) during one iteration. I guess this
> > > happens less often during the later iterations than during the 1st
> > > iteration, because all the pages of the VM are sent to the dest node during
> > > the 1st iteration, while during the others only part of the pages are sent.
> > > So I think the "not-really-dirty" pages are produced mainly during
> > > the 1st iteration, and maybe very few during the other iterations.
> > > 
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain reaction,
> > > because the number of dirty pages produced during Iteration 2 is halved, which
> > > in turn halves the time spent on Iteration 3, then Iteration 4, 5...
> > 
> > Yes; these numbers don't show how many of them are false dirty though.
> > 
> > One problem is thinking about pages that have been redirtied, if the page is dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> > 
> > However, if the page is being written a few times, and so it would have been written
> > after the network write then it isn't a false-dirty. 
> > 
> > You might be able to figure that out with some kernel tracing of when the dirtying
> > happens, but it might be easier to write the fix!
> > 
> > Dave
> 
> Hi, I have made some new progress now.
> 
> To tell exactly how many false dirty pages there are in each iteration, I malloc a
> buffer as big as the whole VM memory. When a page is transferred to the dest
> node, it is copied to the buffer. During the next iteration, if a page is
> transferred, it is compared to the old copy in the buffer, and the old copy is
> replaced for the next comparison if the page is really dirty. Thus, we are now
> able to get the exact number of false dirty pages.
>
> This time, I use 15 workloads to get the statistics. They are:
> 
>   1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific
>      computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked
>      these 11 benchmarks because, compared to the others, they have a bigger memory
>      footprint and a higher memory dirty rate. Thus most of them could not converge
>      to stop-and-copy using the default migration speed (32MB/s).
>   2. kernel compilation
>   3. idle VM 
>   4. Apache web server which serves static content
> 
>   (the above workloads are all running in VM with 1 vcpu and 1GB memory, and the 
>    migration speed is the default 32MB/s)
> 
>   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache.
>      After filling up the 4GB cache, a client writes the cache at a constant speed
>      during migration. This time, migration speed has no limit, and is up to the 
>      capability of 1Gbps Ethernet.
> 
> To summarize the results first (the precise numbers are below):
> 
>   1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations)
>      of false dirty pages out of all the dirty pages from iteration 2 onward (and the big
>      proportion persists during the following iterations). They are cpu2006.zeusmp,
>      cpu2006.bzip2, cpu2006.mcf, and memcached.
>   2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1, so even
>      though the proportion of false dirty pages is big from iteration 2 onward, the room for
>      optimization is small.
>   3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not
>      in the other iterations.
>   4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false
>      dirty pages from iteration 2 onward, so the room for optimization is small for them too.
> 
> Now I want to talk a little more about the reasons why false dirty pages are produced.
> The first reason is what we have discussed before: the mechanism used to track the dirty
> pages.
> Then I came up with another reason. Here is the situation: a write operation to a
> memory page happens, but it doesn't change any content of the page. So it's "write but
> not dirty", and the kernel still marks the page as dirty. Someone in our lab has done some experiments
> to figure out the proportion of "write but not dirty" operations, using the cpu2006
> benchmark suite. According to his results, most workloads have a small proportion (<10%)
> of "write but not dirty" out of all the write operations, while a few workloads have a higher
> proportion (one even as high as 50%). We are not yet sure why "write but not dirty"
> happens; we have only observed that it does.

There are a few different reasons I can think of:
  a) You have a flag or mutex that's set and cleared; so it gets set (marked
     dirty) and cleared around some operation. By the time we come to migrate
     it then it's back to cleared again.
     Similarly with other temporary data structures.
  b) Some system operation causes the page to be moved - e.g. swap or the kernel
     reorganising memory.

However, it's a shame that I don't think your experiment can tell which of the
two cases we're hitting. I'd like to know if it's worth working on
making the page sync mechanism better or if it's more important to deal
with the second reason you show.
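
As an illustration of the shadow-buffer measurement described in the quoted
text, here is a minimal sketch in Python (not QEMU code). The names
`classify_resent_pages`, `shadow` and `guest` are made up for this example.
It also shows why case a) above is counted as false dirty, and why a content
comparison alone cannot say which of the two causes produced a given false
dirty page:

```python
PAGE_SIZE = 4096  # assumed page size for the toy example


def classify_resent_pages(shadow, guest, dirty_pages):
    """Split pages flagged dirty into really-dirty and false-dirty lists.

    shadow maps page number -> bytes last sent to the destination;
    guest is the current guest memory, one bytearray per page.
    """
    really_dirty, false_dirty = [], []
    for n in dirty_pages:
        current = bytes(guest[n])
        if shadow.get(n) == current:
            # Content is identical to what the destination already has.
            false_dirty.append(n)
        else:
            really_dirty.append(n)
            shadow[n] = current  # keep the new copy for the next comparison
    return really_dirty, false_dirty


# Toy guest with 4 pages; shadow reflects the state after iteration 1.
guest = [bytearray(PAGE_SIZE) for _ in range(4)]
shadow = {n: bytes(page) for n, page in enumerate(guest)}

guest[1][0] = 1      # a flag is set ...
guest[1][0] = 0      # ... and cleared again: written, but content unchanged
guest[2][10] = 0xFF  # a genuine modification

rd, fd = classify_resent_pages(shadow, guest, dirty_pages=[1, 2])
print("really dirty:", rd, "false dirty:", fd)
```

The cost of this approach is a second copy of guest RAM, which is what the
hash-based variant discussed below tries to avoid.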

> So these two reasons contribute to the false dirty pages. To optimize, I compute and store
> the SHA1 hash of each page before transferring it. Next time, if a page needs retransmission, its
> SHA1 hash is computed again and compared to the old hash. If the hash is the same, it's a
> false dirty page, and we just skip it; otherwise, the page is transferred, and the new
> hash replaces the old one for the next comparison.
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1
> hash is 20 bytes, so we need an extra 20/4096 (<1/200) of the whole VM memory, which
> is relatively small.
> As far as I know, SHA1 hashes are widely used for deduplication in backup systems.
> It has been shown there that the probability of a hash collision is far smaller than that of a disk hardware fault,
> so the hash is treated as safe: if the hashes of two chunks are the same, the content is assumed to be the
> same. So I think the SHA1 hash could replace byte-by-byte comparison for VM memory as well.

There was a proposal ( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html )
to do a migration system where a copy of the migration RAM is stored on disc on the
destination for cases where similar VMs are migrated, and it used a checksum for each
page to find the matching page in the cache; that originally used a smaller hash, I
think in the end they used a SHA-256.
(Hash based checks still make me nervous for intentional collisions but that's probably
me being paranoid?)
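
For reference, the per-page hash check described in the quoted text could be
sketched roughly as follows. This is a Python illustration only, not the
actual patch; `should_send` and `page_hashes` are hypothetical names, and a
4096-byte page is assumed:

```python
import hashlib

PAGE_SIZE = 4096                         # assumed page size
SHA1_BYTES = hashlib.sha1().digest_size  # 20 bytes per stored hash


def should_send(page_hashes, n, page):
    """Return True if page n differs from what was last sent.

    page_hashes maps page number -> SHA1 digest of the content last sent.
    """
    digest = hashlib.sha1(page).digest()
    if page_hashes.get(n) == digest:
        return False             # same hash: treat as false dirty, skip it
    page_hashes[n] = digest      # remember the hash for the next comparison
    return True


hashes = {}
page = bytearray(PAGE_SIZE)

first = should_send(hashes, 0, page)   # never sent before -> send
page[0] = 1
page[0] = 0                            # written, but content is unchanged
second = should_send(hashes, 0, page)  # hash matches -> skip
page[0] = 7                            # real change
third = should_send(hashes, 0, page)   # hash differs -> send

# Memory overhead per page: 20 / 4096, which is below 1/200.
overhead = SHA1_BYTES / PAGE_SIZE
print(first, second, third, overhead)
```

Note that the skip decision rests entirely on the digest comparison, so the
collision concern above applies to exactly the `return False` branch.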

> Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads which have
> big proportions of false dirty pages, the improvement is remarkable. Without optimization,
> they either cannot converge to stop-and-copy, or take a very long time to complete. With the
> SHA1 hash method, all of them now complete in a relatively short time.
> For the reasons discussed above, the other workloads don't get notable improvements from the
> optimization. So below, I only show the exact numbers after optimization for the 4 workloads with
> remarkable improvements.
> 
> Any comments or suggestions? 

You might be able to save some of the CPU time; we've
got a test that checks if a page is all-zero; if you're doing
the SHA calculation you could avoid doing the all-zero check
and replace it by comparing the output of the SHA.
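
Roughly what I have in mind, as a Python sketch rather than QEMU code
(`ZERO_PAGE_SHA1` and `digest_and_is_zero` are names invented here, and a
4096-byte page is assumed): compute the digest of an all-zero page once, then
a page is treated as zero if its digest matches, so the page is only scanned
once.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size

# Digest of an all-zero page, computed once up front.
ZERO_PAGE_SHA1 = hashlib.sha1(bytes(PAGE_SIZE)).digest()


def digest_and_is_zero(page):
    """Hash the page once; reuse the digest for the zero-page test."""
    digest = hashlib.sha1(page).digest()
    return digest, digest == ZERO_PAGE_SHA1


zero_page = bytearray(PAGE_SIZE)
data_page = bytearray(PAGE_SIZE)
data_page[100] = 0x42

_, zero_is_zero = digest_and_is_zero(zero_page)
_, data_is_zero = digest_and_is_zero(data_page)
print(zero_is_zero, data_is_zero)
```

Whether this actually saves CPU time depends on how cheap the existing
all-zero check is compared with the hash, so it would need measuring.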

> 
> Below is the experiments data:
> (
> "dup" means zero page; pages of this kind take very little migration time and network
> resources, so they are never counted as dirty pages in my numbers;
> "rd" means really dirty pages;
> "fd" means false dirty pages;
> The numbers refer to the quantities of pages.
> )
> 
> ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
> 
> 1. memcached
> 
> -----  original pre-copy (can not converge):  -----
> Iteration   1, duration:  42111 ms , transferred pages:  1568788 (dup:   416239, rd:  1152549, fd:        0) , new dirty pages:   499015 , remaining dirty pages:   507397
> Iteration   2, duration:  17208 ms , transferred pages:   498946 (dup:     5456, rd:   160206, fd:   333284) , new dirty pages:   261237 , remaining dirty pages:   269688
> Iteration   3, duration:   9134 ms , transferred pages:   262377 (dup:      519, rd:   111900, fd:   149958) , new dirty pages:   170281 , remaining dirty pages:   177592
> Iteration   4, duration:   5920 ms , transferred pages:   169966 (dup:       87, rd:    82487, fd:    87392) , new dirty pages:   121154 , remaining dirty pages:   128780
> Iteration   5, duration:   4239 ms , transferred pages:   121551 (dup:       81, rd:    64120, fd:    57350) , new dirty pages:   100976 , remaining dirty pages:   108205
> Iteration   6, duration:   3495 ms , transferred pages:   100353 (dup:       90, rd:    56021, fd:    44242) , new dirty pages:    74547 , remaining dirty pages:    82399
> Iteration   7, duration:   2583 ms , transferred pages:    74160 (dup:       56, rd:    38016, fd:    36088) , new dirty pages:    58209 , remaining dirty pages:    66448
> Iteration   8, duration:   2039 ms , transferred pages:    58534 (dup:       81, rd:    26885, fd:    31568) , new dirty pages:    43511 , remaining dirty pages:    51425
> Iteration   9, duration:   1513 ms , transferred pages:    43484 (dup:       55, rd:    26641, fd:    16788) , new dirty pages:    43722 , remaining dirty pages:    51663
> Iteration  10, duration:   1521 ms , transferred pages:    43676 (dup:       62, rd:    26463, fd:    17151) , new dirty pages:    35347 , remaining dirty pages:    43334
> Iteration  11, duration:   1230 ms , transferred pages:    35287 (dup:        0, rd:    21293, fd:    13994) , new dirty pages:    28851 , remaining dirty pages:    36898
> Iteration  12, duration:   1031 ms , transferred pages:    29651 (dup:       82, rd:    18143, fd:    11426) , new dirty pages:    27062 , remaining dirty pages:    34309
> Iteration  13, duration:    917 ms , transferred pages:    26385 (dup:       56, rd:    14149, fd:    12180) , new dirty pages:    22723 , remaining dirty pages:    30647
> Iteration  14, duration:    762 ms , transferred pages:    21902 (dup:       55, rd:    16355, fd:     5492) , new dirty pages:    18208 , remaining dirty pages:    26953
> Iteration  15, duration:    650 ms , transferred pages:    18636 (dup:        0, rd:    11943, fd:     6693) , new dirty pages:    16085 , remaining dirty pages:    24402
> Iteration  16, duration:    554 ms , transferred pages:    15946 (dup:       56, rd:     9527, fd:     6363) , new dirty pages:    14766 , remaining dirty pages:    23222
> Iteration  17, duration:    538 ms , transferred pages:    15434 (dup:        0, rd:     9779, fd:     5655) , new dirty pages:    13381 , remaining dirty pages:    21169
> Iteration  18, duration:    487 ms , transferred pages:    14089 (dup:       81, rd:     7737, fd:     6271) , new dirty pages:    13325 , remaining dirty pages:    20405
> Iteration  19, duration:    428 ms , transferred pages:    12232 (dup:        0, rd:     8488, fd:     3744) , new dirty pages:    10274 , remaining dirty pages:    18447
> Iteration  20, duration:    377 ms , transferred pages:    10887 (dup:       56, rd:     6362, fd:     4469) , new dirty pages:     9708 , remaining dirty pages:    17268
> Iteration  21, duration:    320 ms , transferred pages:     9222 (dup:        0, rd:     5789, fd:     3433) , new dirty pages:     8015 , remaining dirty pages:    16061
> Iteration  22, duration:    268 ms , transferred pages:     7621 (dup:        0, rd:     6204, fd:     1417) , new dirty pages:     7227 , remaining dirty pages:    15667
> Iteration  23, duration:    269 ms , transferred pages:     7813 (dup:       56, rd:     4410, fd:     3347) , new dirty pages:     7591 , remaining dirty pages:    15445
> Iteration  24, duration:    271 ms , transferred pages:     7749 (dup:        0, rd:     4565, fd:     3184) , new dirty pages:    15126 , remaining dirty pages:    22822
> Iteration  25, duration:    549 ms , transferred pages:    15818 (dup:       60, rd:    10545, fd:     5213) , new dirty pages:    14559 , remaining dirty pages:    21563
> Iteration  26, duration:    499 ms , transferred pages:    14281 (dup:        3, rd:     8760, fd:     5518) , new dirty pages:    11947 , remaining dirty pages:    19229
> Iteration  27, duration:    376 ms , transferred pages:    10823 (dup:       25, rd:     6550, fd:     4248) , new dirty pages:     8561 , remaining dirty pages:    16967
> Iteration  28, duration:    324 ms , transferred pages:     9350 (dup:       31, rd:     5292, fd:     4027) , new dirty pages:     8655 , remaining dirty pages:    16272
> Iteration  29, duration:    274 ms , transferred pages:     7813 (dup:        0, rd:     6088, fd:     1725) , new dirty pages:     6300 , remaining dirty pages:    14759
> Iteration  30, duration:    218 ms , transferred pages:     6340 (dup:       45, rd:     3196, fd:     3099) , new dirty pages:     5143 , remaining dirty pages:    13562
> 
> -----  after optimization:  -----
> Iteration   1, duration:  40664 ms , transferred pages:  1569037 (dup:   405940, rd:  1163097) , new dirty pages:   506846 , remaining dirty pages:   514979
> Iteration   2, duration:   8032 ms , transferred pages:   161130 (dup:     4007, rd:   157123) , new dirty pages:   153479 , remaining dirty pages:   153479

Big difference.
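
To put a rough number on it, here is a small Python calculation that uses
only the iteration 2 figures quoted above for memcached (the variable names
are mine):

```python
# Iteration 2 of the memcached run, copied from the quoted numbers.
orig = {"duration_ms": 17208, "rd": 160206, "fd": 333284}
opt = {"duration_ms": 8032, "rd": 157123}

# Share of the non-zero pages sent in the original run that were false dirty.
fd_share = orig["fd"] / (orig["rd"] + orig["fd"])

# How much shorter iteration 2 became with the hash check.
duration_ratio = orig["duration_ms"] / opt["duration_ms"]

print(f"false dirty share in original iteration 2: {fd_share:.1%}")
print(f"iteration 2 duration, original / optimized: {duration_ratio:.2f}x")
```

So about two thirds of the non-zero pages sent in the original iteration 2
were false dirty, and that iteration takes less than half the time with the
optimization.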

> Iteration   3, duration:   2620 ms , transferred pages:    65260 (dup:       20, rd:    65240) , new dirty pages:    64014 , remaining dirty pages:    67100
> Iteration   4, duration:   1160 ms , transferred pages:    30227 (dup:       60, rd:    30167) , new dirty pages:    34031 , remaining dirty pages:    41414
> Iteration   5, duration:    648 ms , transferred pages:    18700 (dup:       56, rd:    18644) , new dirty pages:    18375 , remaining dirty pages:    25536
> Iteration   6, duration:    389 ms , transferred pages:    11399 (dup:       55, rd:    11344) , new dirty pages:    12536 , remaining dirty pages:    17516
> Iteration   7, duration:    292 ms , transferred pages:     8197 (dup:        0, rd:     8197) , new dirty pages:     8387 , remaining dirty pages:    16802
> Iteration   8, duration:    171 ms , transferred pages:     4931 (dup:       39, rd:     4892) , new dirty pages:     6182 , remaining dirty pages:    14060
> Iteration   9, duration:    163 ms , transferred pages:     4355 (dup:       16, rd:     4339) , new dirty pages:     5530 , remaining dirty pages:    11973
> Iteration  10, duration:    104 ms , transferred pages:     3266 (dup:        0, rd:     3266) , new dirty pages:     2893 , remaining dirty pages:    11014
> Iteration  11, duration:     52 ms , transferred pages:     1153 (dup:        0, rd:     1153) , new dirty pages:     1586 , remaining dirty pages:    10516
> Iteration  12, duration:     52 ms , transferred pages:     1921 (dup:       39, rd:     1882) , new dirty pages:     1619 , remaining dirty pages:     8842
> Iteration  13, duration:     62 ms , transferred pages:     1537 (dup:        0, rd:     1537) , new dirty pages:     2052 , remaining dirty pages:     8871
> Iteration  14, duration:     58 ms , transferred pages:     1665 (dup:        0, rd:     1665) , new dirty pages:     1947 , remaining dirty pages:     7989
> Iteration  15, duration:      2 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     7989
> total time: 54693 milliseconds

Very nice.

Dave

> 2. cpu2006.zeusmp
> 
> -----  original pre-copy (can not converge):  -----
> Iteration   1, duration:  21112 ms , transferred pages:   266450 (dup:    93385, rd:   173065, fd:        0) , new dirty pages:   127866 , remaining dirty pages:   127866
> Iteration   2, duration:   6192 ms , transferred pages:   125662 (dup:    75762, rd:    17389, fd:    32511) , new dirty pages:   131655 , remaining dirty pages:   133859
> Iteration   3, duration:   6699 ms , transferred pages:   131937 (dup:    77298, rd:    20320, fd:    34319) , new dirty pages:   121027 , remaining dirty pages:   122949
> Iteration   4, duration:   5999 ms , transferred pages:   122512 (dup:    73588, rd:    17236, fd:    31688) , new dirty pages:   122759 , remaining dirty pages:   123196
> Iteration   5, duration:   5804 ms , transferred pages:   122717 (dup:    75436, rd:    19016, fd:    28265) , new dirty pages:   123697 , remaining dirty pages:   124176
> Iteration   6, duration:   5698 ms , transferred pages:   123708 (dup:    77249, rd:    18022, fd:    28437) , new dirty pages:   121838 , remaining dirty pages:   122306
> Iteration   7, duration:   5515 ms , transferred pages:   122306 (dup:    76727, rd:    14819, fd:    30760) , new dirty pages:   122382 , remaining dirty pages:   122382
> Iteration   8, duration:   6086 ms , transferred pages:   120825 (dup:    71834, rd:    15987, fd:    33004) , new dirty pages:   121587 , remaining dirty pages:   123144
> Iteration   9, duration:   5899 ms , transferred pages:   120964 (dup:    72860, rd:    18191, fd:    29913) , new dirty pages:   120391 , remaining dirty pages:   122571
> Iteration  10, duration:   5801 ms , transferred pages:   121425 (dup:    74140, rd:    20722, fd:    26563) , new dirty pages:   122302 , remaining dirty pages:   123448
> Iteration  11, duration:   5909 ms , transferred pages:   123448 (dup:    74735, rd:    19678, fd:    29035) , new dirty pages:   123258 , remaining dirty pages:   123258
> Iteration  12, duration:   6293 ms , transferred pages:   121211 (dup:    70442, rd:    18128, fd:    32641) , new dirty pages:   123623 , remaining dirty pages:   125670
> Iteration  13, duration:   6398 ms , transferred pages:   124897 (dup:    72701, rd:    21134, fd:    31062) , new dirty pages:   122355 , remaining dirty pages:   123128
> Iteration  14, duration:   6301 ms , transferred pages:   121893 (dup:    70514, rd:    23470, fd:    27909) , new dirty pages:   120980 , remaining dirty pages:   122215
> Iteration  15, duration:   6304 ms , transferred pages:   121389 (dup:    70005, rd:    21731, fd:    29653) , new dirty pages:   121628 , remaining dirty pages:   122454
> Iteration  16, duration:   6398 ms , transferred pages:   122164 (dup:    69962, rd:    24376, fd:    27826) , new dirty pages:   122246 , remaining dirty pages:   122536
> Iteration  17, duration:   6201 ms , transferred pages:   121548 (dup:    70984, rd:    23915, fd:    26649) , new dirty pages:   121460 , remaining dirty pages:   122448
> Iteration  18, duration:   6401 ms , transferred pages:   122272 (dup:    70072, rd:    22261, fd:    29939) , new dirty pages:   123518 , remaining dirty pages:   123694
> Iteration  19, duration:   7003 ms , transferred pages:   121873 (dup:    64754, rd:    27325, fd:    29794) , new dirty pages:   120568 , remaining dirty pages:   122389
> Iteration  20, duration:   6400 ms , transferred pages:   121422 (dup:    69221, rd:    25300, fd:    26901) , new dirty pages:   121229 , remaining dirty pages:   122196
> Iteration  21, duration:   6703 ms , transferred pages:   119895 (dup:    65232, rd:    25877, fd:    28786) , new dirty pages:   123284 , remaining dirty pages:   125585
> Iteration  22, duration:   6902 ms , transferred pages:   123884 (dup:    67582, rd:    29020, fd:    27282) , new dirty pages:   122057 , remaining dirty pages:   123758
> Iteration  23, duration:   6800 ms , transferred pages:   122010 (dup:    66529, rd:    30644, fd:    24837) , new dirty pages:   120916 , remaining dirty pages:   122664
> Iteration  24, duration:   7202 ms , transferred pages:   121951 (dup:    63188, rd:    31105, fd:    27658) , new dirty pages:   122715 , remaining dirty pages:   123428
> Iteration  25, duration:   7202 ms , transferred pages:   122919 (dup:    64161, rd:    32063, fd:    26695) , new dirty pages:   123180 , remaining dirty pages:   123689
> Iteration  26, duration:   7404 ms , transferred pages:   123092 (dup:    62694, rd:    33459, fd:    26939) , new dirty pages:   122149 , remaining dirty pages:   122746
> Iteration  27, duration:   7205 ms , transferred pages:   120427 (dup:    61664, rd:    34344, fd:    24419) , new dirty pages:   120299 , remaining dirty pages:   122618
> Iteration  28, duration:   7100 ms , transferred pages:   121074 (dup:    63130, rd:    32403, fd:    25541) , new dirty pages:   122984 , remaining dirty pages:   124528
> Iteration  29, duration:   7904 ms , transferred pages:   124060 (dup:    59564, rd:    35631, fd:    28865) , new dirty pages:   127080 , remaining dirty pages:   127548
> Iteration  30, duration:   7906 ms , transferred pages:   127518 (dup:    63029, rd:    34416, fd:    30073) , new dirty pages:   125028 , remaining dirty pages:   125058
> 
> -----  after optimization:  -----
> Iteration   1, duration:  21601 ms , transferred pages:   266450 (dup:    89731, rd:   176719) , new dirty pages:   139843 , remaining dirty pages:   139843
> Iteration   2, duration:   1747 ms , transferred pages:    92077 (dup:    78364, rd:    13713) , new dirty pages:    90945 , remaining dirty pages:    90945
> Iteration   3, duration:   1592 ms , transferred pages:    62253 (dup:    49435, rd:    12818) , new dirty pages:    76929 , remaining dirty pages:    76929
> Iteration   4, duration:    992 ms , transferred pages:    44837 (dup:    37886, rd:     6951) , new dirty pages:    71331 , remaining dirty pages:    72916
> Iteration   5, duration:    998 ms , transferred pages:    55229 (dup:    47150, rd:     8079) , new dirty pages:    21703 , remaining dirty pages:    23302
> Iteration   6, duration:    211 ms , transferred pages:    20337 (dup:    18516, rd:     1821) , new dirty pages:    14500 , remaining dirty pages:    14500
> Iteration   7, duration:     31 ms , transferred pages:    12933 (dup:    12627, rd:      306) , new dirty pages:     1520 , remaining dirty pages:     1520
> Iteration   8, duration:     30 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        4 , remaining dirty pages:     1524
> total time: 27225 milliseconds
> 
> 3. cpu2006.bzip2
> 
> -----  original pre-copy:  -----
> Iteration   1, duration:  18306 ms , transferred pages:   266450 (dup:   116569, rd:   149881, fd:        0) , new dirty pages:   106299 , remaining dirty pages:   106299
> Iteration   2, duration:  10694 ms , transferred pages:   104611 (dup:    17550, rd:    10536, fd:    76525) , new dirty pages:    34394 , remaining dirty pages:    36082
> Iteration   3, duration:   2998 ms , transferred pages:    34442 (dup:     9924, rd:    12254, fd:    12264) , new dirty pages:     6419 , remaining dirty pages:     8059
> Iteration   4, duration:    699 ms , transferred pages:     5748 (dup:       22, rd:     2583, fd:     3143) , new dirty pages:     1226 , remaining dirty pages:     3537
> Iteration   5, duration:    200 ms , transferred pages:     1636 (dup:        0, rd:     1194, fd:      442) , new dirty pages:      478 , remaining dirty pages:     2379
> Iteration   6, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2379
> 
> -----  after optimization:  -----
> Iteration   1, duration:  13995 ms , transferred pages:   266314 (dup:   152118, rd:   114196) , new dirty pages:    97009 , remaining dirty pages:    97145
> Iteration   2, duration:   1215 ms , transferred pages:    33400 (dup:    26745, rd:     6655) , new dirty pages:    12866 , remaining dirty pages:    14017
> Iteration   3, duration:    701 ms , transferred pages:     5774 (dup:       48, rd:     5726) , new dirty pages:     6342 , remaining dirty pages:     8761
> Iteration   4, duration:    500 ms , transferred pages:     4111 (dup:       21, rd:     4090) , new dirty pages:     4311 , remaining dirty pages:     6485
> Iteration   5, duration:    400 ms , transferred pages:     3273 (dup:        1, rd:     3272) , new dirty pages:     3034 , remaining dirty pages:     5431
> Iteration   6, duration:    301 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2094 , remaining dirty pages:     4472
> Iteration   7, duration:    299 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2066 , remaining dirty pages:     4082
> Iteration   8, duration:    202 ms , transferred pages:     1636 (dup:        0, rd:     1636) , new dirty pages:     2881 , remaining dirty pages:     4648
> Iteration   9, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     4775 , remaining dirty pages:     6778
> Iteration  10, duration:    400 ms , transferred pages:     3281 (dup:        9, rd:     3272) , new dirty pages:     3757 , remaining dirty pages:     5576
> Iteration  11, duration:    401 ms , transferred pages:     3279 (dup:        7, rd:     3272) , new dirty pages:     6980 , remaining dirty pages:     8906
> Iteration  12, duration:    500 ms , transferred pages:     7118 (dup:     3035, rd:     4083) , new dirty pages:    10774 , remaining dirty pages:    11922
> Iteration  13, duration:    116 ms , transferred pages:    11706 (dup:    10152, rd:     1554) , new dirty pages:     1326 , remaining dirty pages:     1326
> Iteration  14, duration:    117 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     1326
> total time: 19479 milliseconds
> 
> 4. cpu2006.mcf
> 
> -----  original pre-copy:  -----
> Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6925, rd:   259525, fd:        0) , new dirty pages:   244403 , remaining dirty pages:   244403
> Iteration   2, duration:  29603 ms , transferred pages:   242275 (dup:      377, rd:   224001, fd:    17897) , new dirty pages:   227335 , remaining dirty pages:   229463
> Iteration   3, duration:  27806 ms , transferred pages:   227573 (dup:      169, rd:    65681, fd:   161723) , new dirty pages:   195593 , remaining dirty pages:   197483
> Iteration   4, duration:  23907 ms , transferred pages:   195543 (dup:       41, rd:    39838, fd:   155664) , new dirty pages:   215066 , remaining dirty pages:   217006
> Iteration   5, duration:  26305 ms , transferred pages:   215289 (dup:      155, rd:    33082, fd:   182052) , new dirty pages:   111098 , remaining dirty pages:   112815
> Iteration   6, duration:  13502 ms , transferred pages:   110452 (dup:       22, rd:    26793, fd:    83637) , new dirty pages:   161054 , remaining dirty pages:   163417
> Iteration   7, duration:  19705 ms , transferred pages:   161266 (dup:      120, rd:    33818, fd:   127328) , new dirty pages:   220562 , remaining dirty pages:   222713
> Iteration   8, duration:  27003 ms , transferred pages:   220881 (dup:       21, rd:   215721, fd:     5139) , new dirty pages:   219787 , remaining dirty pages:   221619
> Iteration   9, duration:  26802 ms , transferred pages:   219248 (dup:       24, rd:    84648, fd:   134576) , new dirty pages:   207959 , remaining dirty pages:   210330
> Iteration  10, duration:  25411 ms , transferred pages:   207916 (dup:      144, rd:    35842, fd:   171930) , new dirty pages:   144442 , remaining dirty pages:   146856
> Iteration  11, duration:  17714 ms , transferred pages:   144804 (dup:       18, rd:    25414, fd:   119372) , new dirty pages:   205127 , remaining dirty pages:   207179
> Iteration  12, duration:  25112 ms , transferred pages:   205446 (dup:      128, rd:    23197, fd:   182121) , new dirty pages:   167319 , remaining dirty pages:   169052
> Iteration  13, duration:  20411 ms , transferred pages:   166886 (dup:       14, rd:    21960, fd:   144912) , new dirty pages:   221592 , remaining dirty pages:   223758
> Iteration  14, duration:  27126 ms , transferred pages:   221800 (dup:      122, rd:    42368, fd:   179310) , new dirty pages:   233630 , remaining dirty pages:   235588
> Iteration  15, duration:  28517 ms , transferred pages:   233321 (dup:      191, rd:   222528, fd:    10602) , new dirty pages:   224282 , remaining dirty pages:   226549
> Iteration  16, duration:  27422 ms , transferred pages:   224187 (dup:       55, rd:    45773, fd:   178359) , new dirty pages:   209815 , remaining dirty pages:   212177
> Iteration  17, duration:  25723 ms , transferred pages:   210260 (dup:       34, rd:    79405, fd:   130821) , new dirty pages:   220297 , remaining dirty pages:   222214
> Iteration  18, duration:  26920 ms , transferred pages:   220056 (dup:       14, rd:   214128, fd:     5914) , new dirty pages:   192015 , remaining dirty pages:   194173
> Iteration  19, duration:  23520 ms , transferred pages:   192239 (dup:        9, rd:    25140, fd:   167090) , new dirty pages:    96450 , remaining dirty pages:    98384
> Iteration  20, duration:  11805 ms , transferred pages:    96538 (dup:       14, rd:     7424, fd:    89100) , new dirty pages:     6978 , remaining dirty pages:     8824
> Iteration  21, duration:    799 ms , transferred pages:     6545 (dup:        1, rd:     1802, fd:     4742) , new dirty pages:      138 , remaining dirty pages:     2417
> Iteration  22, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2417
> 
> -----  after optimization:  -----
> Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6831, rd:   259619) , new dirty pages:   240209 , remaining dirty pages:   240209
> Iteration   2, duration:   6250 ms , transferred pages:    51244 (dup:      211, rd:    51033) , new dirty pages:   226651 , remaining dirty pages:   228571
> Iteration   3, duration:   4395 ms , transferred pages:    36008 (dup:       80, rd:    35928) , new dirty pages:   110719 , remaining dirty pages:   111478
> Iteration   4, duration:   3390 ms , transferred pages:    28068 (dup:       28, rd:    28040) , new dirty pages:   185172 , remaining dirty pages:   185172
> Iteration   5, duration:   2986 ms , transferred pages:    23780 (dup:       45, rd:    23735) , new dirty pages:    64357 , remaining dirty pages:    66305
> Iteration   6, duration:   2727 ms , transferred pages:    22800 (dup:       12, rd:    22788) , new dirty pages:    61675 , remaining dirty pages:    61675
> Iteration   7, duration:   2372 ms , transferred pages:    18943 (dup:       13, rd:    18930) , new dirty pages:    55144 , remaining dirty pages:    55265
> Iteration   8, duration:   2100 ms , transferred pages:    17189 (dup:       11, rd:    17178) , new dirty pages:    55244 , remaining dirty pages:    55668
> Iteration   9, duration:   2003 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:   107058 , remaining dirty pages:   108014
> Iteration  10, duration:   2132 ms , transferred pages:    17825 (dup:       24, rd:    17801) , new dirty pages:   126214 , remaining dirty pages:   126214
> Iteration  11, duration:   2229 ms , transferred pages:    18156 (dup:       22, rd:    18134) , new dirty pages:    65725 , remaining dirty pages:    65725
> Iteration  12, duration:   2315 ms , transferred pages:    18651 (dup:       21, rd:    18630) , new dirty pages:    52575 , remaining dirty pages:    53903
> Iteration  13, duration:   2147 ms , transferred pages:    17435 (dup:       16, rd:    17419) , new dirty pages:    46652 , remaining dirty pages:    47260
> Iteration  14, duration:   2000 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:    42721 , remaining dirty pages:    43266
> Iteration  15, duration:   1901 ms , transferred pages:    15552 (dup:       10, rd:    15542) , new dirty pages:    38593 , remaining dirty pages:    40792
> Iteration  16, duration:   1801 ms , transferred pages:    14735 (dup:       11, rd:    14724) , new dirty pages:    54252 , remaining dirty pages:    55639
> Iteration  17, duration:   1708 ms , transferred pages:    13860 (dup:        2, rd:    13858) , new dirty pages:    72379 , remaining dirty pages:    74170
> Iteration  18, duration:   1923 ms , transferred pages:    15442 (dup:       12, rd:    15430) , new dirty pages:   101911 , remaining dirty pages:   103547
> Iteration  19, duration:   2311 ms , transferred pages:    18823 (dup:        9, rd:    18814) , new dirty pages:    80534 , remaining dirty pages:    82521
> Iteration  20, duration:   2081 ms , transferred pages:    17156 (dup:       34, rd:    17122) , new dirty pages:    36054 , remaining dirty pages:    36054
> Iteration  21, duration:   1665 ms , transferred pages:    13777 (dup:       10, rd:    13767) , new dirty pages:    29624 , remaining dirty pages:    29624
> Iteration  22, duration:   1657 ms , transferred pages:    13290 (dup:        7, rd:    13283) , new dirty pages:    25949 , remaining dirty pages:    28265
> Iteration  23, duration:   1599 ms , transferred pages:    13088 (dup:        0, rd:    13088) , new dirty pages:    22356 , remaining dirty pages:    24813
> Iteration  24, duration:   1500 ms , transferred pages:    12280 (dup:       10, rd:    12270) , new dirty pages:    21181 , remaining dirty pages:    22608
> Iteration  25, duration:   1400 ms , transferred pages:    11457 (dup:        5, rd:    11452) , new dirty pages:    18657 , remaining dirty pages:    20311
> Iteration  26, duration:   1200 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    15690 , remaining dirty pages:    17294
> Iteration  27, duration:   1201 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    14810 , remaining dirty pages:    15936
> Iteration  28, duration:   1000 ms , transferred pages:     8183 (dup:        3, rd:     8180) , new dirty pages:    15387 , remaining dirty pages:    16423
> Iteration  29, duration:    900 ms , transferred pages:     7372 (dup:       10, rd:     7362) , new dirty pages:    13303 , remaining dirty pages:    15292
> Iteration  30, duration:   1000 ms , transferred pages:     8181 (dup:        1, rd:     8180) , new dirty pages:    17879 , remaining dirty pages:    18457
> Iteration  31, duration:    951 ms , transferred pages:     8140 (dup:        9, rd:     8131) , new dirty pages:    21738 , remaining dirty pages:    23304
> Iteration  32, duration:    946 ms , transferred pages:     6946 (dup:        1, rd:     6945) , new dirty pages:    15815 , remaining dirty pages:    15815
> Iteration  33, duration:    747 ms , transferred pages:     6192 (dup:        0, rd:     6192) , new dirty pages:     6249 , remaining dirty pages:     7670
> Iteration  34, duration:    501 ms , transferred pages:     4090 (dup:        0, rd:     4090) , new dirty pages:     6163 , remaining dirty pages:     8422
> Iteration  35, duration:    600 ms , transferred pages:     4910 (dup:        2, rd:     4908) , new dirty pages:     3673 , remaining dirty pages:     5222
> Iteration  36, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2132 , remaining dirty pages:     4337
> Iteration  37, duration:    200 ms , transferred pages:     1637 (dup:        1, rd:     1636) , new dirty pages:      544 , remaining dirty pages:     2251
> Iteration  38, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     2251
> total time: 97919 milliseconds
> 
> ------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)-------------------
> 
> 5. idle
> 
> Iteration   1, duration:  14702 ms , transferred pages:   266450 (dup:   146393, rd:   120057, fd:        0) , new dirty pages:    14595 , remaining dirty pages:    14595
> Iteration   2, duration:   1592 ms , transferred pages:    12412 (dup:      103, rd:     3280, fd:     9029) , new dirty pages:      218 , remaining dirty pages:     2401
> Iteration   3, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2401
> 
> 6. kernel compilation (can not converge)
> 
> Iteration   1, duration:  20607 ms , transferred pages:   266450 (dup:    97552, rd:   168898, fd:        0) , new dirty pages:    19293 , remaining dirty pages:    19293
> Iteration   2, duration:   2092 ms , transferred pages:    17176 (dup:      597, rd:     8625, fd:     7954) , new dirty pages:     8318 , remaining dirty pages:    10435
> Iteration   3, duration:   1000 ms , transferred pages:     8484 (dup:      304, rd:     6256, fd:     1924) , new dirty pages:     8736 , remaining dirty pages:    10687
> Iteration   4, duration:   1000 ms , transferred pages:     8435 (dup:      255, rd:     7089, fd:     1091) , new dirty pages:     7627 , remaining dirty pages:     9879
> Iteration   5, duration:    900 ms , transferred pages:     7553 (dup:      191, rd:     5602, fd:     1760) , new dirty pages:     7287 , remaining dirty pages:     9613
> Iteration   6, duration:    900 ms , transferred pages:     7620 (dup:      258, rd:     5761, fd:     1601) , new dirty pages:     8958 , remaining dirty pages:    10951
> Iteration   7, duration:   1099 ms , transferred pages:     9309 (dup:      311, rd:     8051, fd:      947) , new dirty pages:     7189 , remaining dirty pages:     8831
> Iteration   8, duration:    800 ms , transferred pages:     6832 (dup:      288, rd:     5717, fd:      827) , new dirty pages:     5782 , remaining dirty pages:     7781
> Iteration   9, duration:    701 ms , transferred pages:     5875 (dup:      149, rd:     4005, fd:     1721) , new dirty pages:     4587 , remaining dirty pages:     6493
> Iteration  10, duration:    500 ms , transferred pages:     4234 (dup:      144, rd:     3057, fd:     1033) , new dirty pages:     7352 , remaining dirty pages:     9611
> Iteration  11, duration:    900 ms , transferred pages:     7759 (dup:      397, rd:     6563, fd:      799) , new dirty pages:     6686 , remaining dirty pages:     8538
> Iteration  12, duration:    800 ms , transferred pages:     6808 (dup:      264, rd:     6017, fd:      527) , new dirty pages:     6871 , remaining dirty pages:     8601
> Iteration  13, duration:    800 ms , transferred pages:     6775 (dup:      231, rd:     5722, fd:      822) , new dirty pages:     7540 , remaining dirty pages:     9366
> Iteration  14, duration:    900 ms , transferred pages:     7507 (dup:      145, rd:     5900, fd:     1462) , new dirty pages:     7581 , remaining dirty pages:     9440
> Iteration  15, duration:    900 ms , transferred pages:     7630 (dup:      268, rd:     6211, fd:     1151) , new dirty pages:     7268 , remaining dirty pages:     9078
> Iteration  16, duration:    800 ms , transferred pages:     6759 (dup:      215, rd:     5763, fd:      781) , new dirty pages:     6861 , remaining dirty pages:     9180
> Iteration  17, duration:    800 ms , transferred pages:     6838 (dup:      294, rd:     6037, fd:      507) , new dirty pages:     6196 , remaining dirty pages:     8538
> Iteration  18, duration:    800 ms , transferred pages:     6852 (dup:      308, rd:     4905, fd:     1639) , new dirty pages:     5947 , remaining dirty pages:     7633
> Iteration  19, duration:    700 ms , transferred pages:     5919 (dup:      193, rd:     4853, fd:      873) , new dirty pages:     5861 , remaining dirty pages:     7575
> Iteration  20, duration:    600 ms , transferred pages:     5284 (dup:      376, rd:     4408, fd:      500) , new dirty pages:     5206 , remaining dirty pages:     7497
> Iteration  21, duration:    600 ms , transferred pages:     5147 (dup:      239, rd:     4308, fd:      600) , new dirty pages:     5031 , remaining dirty pages:     7381
> Iteration  22, duration:    599 ms , transferred pages:     5064 (dup:      156, rd:     4026, fd:      882) , new dirty pages:     5601 , remaining dirty pages:     7918
> Iteration  23, duration:    702 ms , transferred pages:     5965 (dup:      239, rd:     5028, fd:      698) , new dirty pages:     6079 , remaining dirty pages:     8032
> Iteration  24, duration:    700 ms , transferred pages:     6175 (dup:      449, rd:     5146, fd:      580) , new dirty pages:    10932 , remaining dirty pages:    12789
> Iteration  25, duration:   1300 ms , transferred pages:    10936 (dup:      302, rd:     6205, fd:     4429) , new dirty pages:     8713 , remaining dirty pages:    10566
> Iteration  26, duration:   1000 ms , transferred pages:     8282 (dup:      102, rd:     5662, fd:     2518) , new dirty pages:     5119 , remaining dirty pages:     7403
> Iteration  27, duration:    600 ms , transferred pages:     5007 (dup:       99, rd:     4099, fd:      809) , new dirty pages:     2226 , remaining dirty pages:     4622
> Iteration  28, duration:    300 ms , transferred pages:     2491 (dup:       37, rd:     1794, fd:      660) , new dirty pages:     6746 , remaining dirty pages:     8877
> Iteration  29, duration:    800 ms , transferred pages:     6757 (dup:      213, rd:     5532, fd:     1012) , new dirty pages:     6070 , remaining dirty pages:     8190
> Iteration  30, duration:    700 ms , transferred pages:     6052 (dup:      326, rd:     5107, fd:      619) , new dirty pages:     5177 , remaining dirty pages:     7315
> 
> 7. web server
> 
> Iteration   1, duration:  20902 ms , transferred pages:   266450 (dup:    95497, rd:   170953, fd:        0) , new dirty pages:     8528 , remaining dirty pages:     8528
> Iteration   2, duration:    796 ms , transferred pages:     6472 (dup:      131, rd:     1885, fd:     4456) , new dirty pages:      650 , remaining dirty pages:     2706
> Iteration   3, duration:    100 ms , transferred pages:      818 (dup:        0, rd:      383, fd:      435) , new dirty pages:      328 , remaining dirty pages:     2216
> Iteration   4, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2216
> 
> 
> 8. cpu2006.bwaves (can not converge)
> 
> Iteration   1, duration:  31715 ms , transferred pages:   266450 (dup:     6766, rd:   259684, fd:        0) , new dirty pages:   242702 , remaining dirty pages:   242702
> Iteration   2, duration:  29397 ms , transferred pages:   240508 (dup:      405, rd:   225588, fd:    14515) , new dirty pages:   230889 , remaining dirty pages:   233083
> Iteration   3, duration:  28205 ms , transferred pages:   230858 (dup:      182, rd:   214596, fd:    16080) , new dirty pages:   226998 , remaining dirty pages:   229223
> Iteration   4, duration:  27805 ms , transferred pages:   227574 (dup:      170, rd:   217045, fd:    10359) , new dirty pages:   227360 , remaining dirty pages:   229009
> Iteration   5, duration:  27703 ms , transferred pages:   226786 (dup:      200, rd:   212130, fd:    14456) , new dirty pages:   225885 , remaining dirty pages:   228108
> Iteration   6, duration:  27600 ms , transferred pages:   225923 (dup:      155, rd:   215503, fd:    10265) , new dirty pages:   223555 , remaining dirty pages:   225740
> Iteration   7, duration:  27309 ms , transferred pages:   223574 (dup:      260, rd:   215641, fd:     7673) , new dirty pages:   231975 , remaining dirty pages:   234141
> Iteration   8, duration:  28403 ms , transferred pages:   232397 (dup:       85, rd:   214086, fd:    18226) , new dirty pages:   222170 , remaining dirty pages:   223914
> Iteration   9, duration:  27105 ms , transferred pages:   221809 (dup:      131, rd:   214988, fd:     6690) , new dirty pages:   230065 , remaining dirty pages:   232170
> Iteration  10, duration:  28104 ms , transferred pages:   230201 (dup:      343, rd:   213531, fd:    16327) , new dirty pages:   227590 , remaining dirty pages:   229559
> Iteration  11, duration:  27801 ms , transferred pages:   227717 (dup:      313, rd:   221408, fd:     5996) , new dirty pages:   228457 , remaining dirty pages:   230299
> Iteration  12, duration:  27916 ms , transferred pages:   228560 (dup:      338, rd:   219660, fd:     8562) , new dirty pages:   238326 , remaining dirty pages:   240065
> 
> 9. cpu2006.lbm (can not converge)
> Iteration   1, duration:  31012 ms , transferred pages:   266450 (dup:    12253, rd:   254197, fd:        0) , new dirty pages:   108960 , remaining dirty pages:   108960
> Iteration   2, duration:  13095 ms , transferred pages:   106522 (dup:        3, rd:   102045, fd:     4474) , new dirty pages:   129292 , remaining dirty pages:   131730
> Iteration   3, duration:  15802 ms , transferred pages:   129688 (dup:      444, rd:   110860, fd:    18384) , new dirty pages:   116682 , remaining dirty pages:   118724
> Iteration   4, duration:  14204 ms , transferred pages:   116316 (dup:      160, rd:   104951, fd:    11205) , new dirty pages:   107246 , remaining dirty pages:   109654
> Iteration   5, duration:  13208 ms , transferred pages:   107977 (dup:        1, rd:   101834, fd:     6142) , new dirty pages:   105371 , remaining dirty pages:   107048
> Iteration   6, duration:  12804 ms , transferred pages:   104705 (dup:        1, rd:    99629, fd:     5075) , new dirty pages:   103841 , remaining dirty pages:   106184
> Iteration   7, duration:  12709 ms , transferred pages:   103891 (dup:        5, rd:    99212, fd:     4674) , new dirty pages:   106692 , remaining dirty pages:   108985
> Iteration   8, duration:  13105 ms , transferred pages:   107169 (dup:       11, rd:   100125, fd:     7033) , new dirty pages:   103132 , remaining dirty pages:   104948
> Iteration   9, duration:  12607 ms , transferred pages:   103068 (dup:        0, rd:    99460, fd:     3608) , new dirty pages:   102511 , remaining dirty pages:   104391
> Iteration  10, duration:  12514 ms , transferred pages:   102250 (dup:        0, rd:    99094, fd:     3156) , new dirty pages:   102888 , remaining dirty pages:   105029
> 
> 10. cpu2006.astar (can not converge)
> 
> Iteration   1, duration:  28402 ms , transferred pages:   266450 (dup:    33770, rd:   232680, fd:        0) , new dirty pages:    62078 , remaining dirty pages:    62078
> Iteration   2, duration:   7393 ms , transferred pages:    60107 (dup:       10, rd:    51722, fd:     8375) , new dirty pages:    48854 , remaining dirty pages:    50825
> Iteration   3, duration:   6001 ms , transferred pages:    49094 (dup:       14, rd:    46540, fd:     2540) , new dirty pages:    48137 , remaining dirty pages:    49868
> Iteration   4, duration:   5800 ms , transferred pages:    47444 (dup:        0, rd:    45389, fd:     2055) , new dirty pages:    49147 , remaining dirty pages:    51571
> Iteration   5, duration:   6102 ms , transferred pages:    49912 (dup:       14, rd:    46216, fd:     3682) , new dirty pages:    55606 , remaining dirty pages:    57265
> Iteration   6, duration:   6699 ms , transferred pages:    54949 (dup:      143, rd:    20745, fd:    34061) , new dirty pages:     9166 , remaining dirty pages:    11482
> Iteration   7, duration:   1200 ms , transferred pages:     9830 (dup:       14, rd:     7011, fd:     2805) , new dirty pages:     8294 , remaining dirty pages:     9946
> Iteration   8, duration:   1000 ms , transferred pages:     8194 (dup:       14, rd:     7178, fd:     1002) , new dirty pages:     5475 , remaining dirty pages:     7227
> Iteration   9, duration:    600 ms , transferred pages:     4908 (dup:        0, rd:     3470, fd:     1438) , new dirty pages:     4175 , remaining dirty pages:     6494
> Iteration  10, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3856, fd:      234) , new dirty pages:     4095 , remaining dirty pages:     6499
> Iteration  11, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3313, fd:      777) , new dirty pages:     3371 , remaining dirty pages:     5780
> Iteration  12, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3823, fd:      267) , new dirty pages:     7518 , remaining dirty pages:     9208
> Iteration  13, duration:    899 ms , transferred pages:     7376 (dup:       14, rd:     6028, fd:     1334) , new dirty pages:     3931 , remaining dirty pages:     5763
> Iteration  14, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     4346 , remaining dirty pages:     6019
> Iteration  15, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3817, fd:      273) , new dirty pages:     3054 , remaining dirty pages:     4983
> Iteration  16, duration:    400 ms , transferred pages:     3272 (dup:        0, rd:     3138, fd:      134) , new dirty pages:     3874 , remaining dirty pages:     5585
> Iteration  17, duration:    399 ms , transferred pages:     3272 (dup:        0, rd:     3248, fd:       24) , new dirty pages:     5285 , remaining dirty pages:     7598
> Iteration  18, duration:    701 ms , transferred pages:     5726 (dup:        0, rd:     4385, fd:     1341) , new dirty pages:     8903 , remaining dirty pages:    10775
> Iteration  19, duration:   1101 ms , transferred pages:     9010 (dup:       12, rd:     5597, fd:     3401) , new dirty pages:     4199 , remaining dirty pages:     5964
> Iteration  20, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     3829 , remaining dirty pages:     5703
> 
> 11. cpu2006.xalancbmk (can not converge)
> 
> Iteration   1, duration:  30407 ms , transferred pages:   266450 (dup:    17700, rd:   248750, fd:        0) , new dirty pages:    96169 , remaining dirty pages:    96169
> Iteration   2, duration:  11495 ms , transferred pages:    94164 (dup:      205, rd:    67068, fd:    26891) , new dirty pages:    61766 , remaining dirty pages:    63771
> Iteration   3, duration:   7501 ms , transferred pages:    61471 (dup:      121, rd:    53587, fd:     7763) , new dirty pages:    56569 , remaining dirty pages:    58869
> Iteration   4, duration:   6902 ms , transferred pages:    56461 (dup:       19, rd:    50553, fd:     5889) , new dirty pages:    52181 , remaining dirty pages:    54589
> Iteration   5, duration:   6402 ms , transferred pages:    52459 (dup:      107, rd:    46986, fd:     5366) , new dirty pages:    54051 , remaining dirty pages:    56181
> Iteration   6, duration:   6601 ms , transferred pages:    54003 (dup:       15, rd:    47566, fd:     6422) , new dirty pages:    50844 , remaining dirty pages:    53022
> Iteration   7, duration:   6202 ms , transferred pages:    50723 (dup:        7, rd:    47143, fd:     3573) , new dirty pages:    64880 , remaining dirty pages:    67179
> Iteration   8, duration:   8001 ms , transferred pages:    65447 (dup:        7, rd:    61159, fd:     4281) , new dirty pages:    67854 , remaining dirty pages:    69586
> Iteration   9, duration:   8202 ms , transferred pages:    67444 (dup:      368, rd:    56357, fd:    10719) , new dirty pages:    65178 , remaining dirty pages:    67320
> Iteration  10, duration:   8000 ms , transferred pages:    65455 (dup:       15, rd:    60581, fd:     4859) , new dirty pages:    52421 , remaining dirty pages:    54286
> 
> 12. cpu2006.milc (can not converge)
> 
> Iteration   1, duration:  31410 ms , transferred pages:   266450 (dup:     9454, rd:   256996, fd:        0) , new dirty pages:   158860 , remaining dirty pages:   158860
> Iteration   2, duration:  19193 ms , transferred pages:   157048 (dup:      150, rd:    96807, fd:    60091) , new dirty pages:   102238 , remaining dirty pages:   104050
> Iteration   3, duration:  12504 ms , transferred pages:   102271 (dup:       21, rd:    95107, fd:     7143) , new dirty pages:    97944 , remaining dirty pages:    99723
> Iteration   4, duration:  11905 ms , transferred pages:    97360 (dup:       18, rd:    93610, fd:     3732) , new dirty pages:    99150 , remaining dirty pages:   101513
> Iteration   5, duration:  12105 ms , transferred pages:    99094 (dup:      116, rd:    94125, fd:     4853) , new dirty pages:    98589 , remaining dirty pages:   101008
> Iteration   6, duration:  12101 ms , transferred pages:    98995 (dup:       17, rd:    94069, fd:     4909) , new dirty pages:   147403 , remaining dirty pages:   149416
> Iteration   7, duration:  18001 ms , transferred pages:   147284 (dup:       44, rd:   135691, fd:    11549) , new dirty pages:   136445 , remaining dirty pages:   138577
> Iteration   8, duration:  16702 ms , transferred pages:   136636 (dup:       30, rd:   130805, fd:     5801) , new dirty pages:   145481 , remaining dirty pages:   147422
> Iteration   9, duration:  17800 ms , transferred pages:   145734 (dup:      130, rd:   133239, fd:    12365) , new dirty pages:    98032 , remaining dirty pages:    99720
> Iteration  10, duration:  11902 ms , transferred pages:    97364 (dup:       22, rd:    93096, fd:     4246) , new dirty pages:    95391 , remaining dirty pages:    97747
> 
> 13. cpu2006.cactusADM (can not converge)
> 
> Iteration   1, duration:  23508 ms , transferred pages:   266450 (dup:    73568, rd:   192882, fd:        0) , new dirty pages:   123869 , remaining dirty pages:   123869
> Iteration   2, duration:  13989 ms , transferred pages:   121594 (dup:     7874, rd:    81653, fd:    32067) , new dirty pages:   112960 , remaining dirty pages:   115235
> Iteration   3, duration:  13605 ms , transferred pages:   113276 (dup:     2028, rd:    83783, fd:    27465) , new dirty pages:   112314 , remaining dirty pages:   114273
> Iteration   4, duration:  13509 ms , transferred pages:   111935 (dup:     1505, rd:    83535, fd:    26895) , new dirty pages:   114078 , remaining dirty pages:   116416
> Iteration   5, duration:  13810 ms , transferred pages:   114262 (dup:     1378, rd:    84039, fd:    28845) , new dirty pages:   112271 , remaining dirty pages:   114425
> Iteration   6, duration:  13604 ms , transferred pages:   112664 (dup:     1416, rd:    84300, fd:    26948) , new dirty pages:   112903 , remaining dirty pages:   114664
> Iteration   7, duration:  13604 ms , transferred pages:   112655 (dup:     1407, rd:    84027, fd:    27221) , new dirty pages:   110943 , remaining dirty pages:   112952
> Iteration   8, duration:  13406 ms , transferred pages:   110720 (dup:     1108, rd:    84075, fd:    25537) , new dirty pages:   109321 , remaining dirty pages:   111553
> Iteration   9, duration:  13306 ms , transferred pages:   109726 (dup:      932, rd:    83652, fd:    25142) , new dirty pages:   113446 , remaining dirty pages:   115273
> Iteration  10, duration:  13705 ms , transferred pages:   113121 (dup:     1055, rd:    84671, fd:    27395) , new dirty pages:   108776 , remaining dirty pages:   110928
> 
> 14. cpu2006.GmesFDTD (can not converge)
> 
> Iteration   1, duration:  13303 ms , transferred pages:   266450 (dup:   157809, rd:   108641, fd:        0) , new dirty pages:   226802 , remaining dirty pages:   226802
> Iteration   2, duration:  10797 ms , transferred pages:   226507 (dup:   138637, rd:    61818, fd:    26052) , new dirty pages:   200769 , remaining dirty pages:   201064
> Iteration   3, duration:   8900 ms , transferred pages:   199717 (dup:   127187, rd:    69340, fd:     3190) , new dirty pages:   203436 , remaining dirty pages:   204783
> Iteration   4, duration:  10904 ms , transferred pages:   204127 (dup:   115211, rd:    85767, fd:     3149) , new dirty pages:   198407 , remaining dirty pages:   199063
> Iteration   5, duration:  12109 ms , transferred pages:   198206 (dup:    99435, rd:    96956, fd:     1815) , new dirty pages:   213719 , remaining dirty pages:   214576
> Iteration   6, duration:  16307 ms , transferred pages:   213595 (dup:    80422, rd:   116885, fd:    16288) , new dirty pages:   199637 , remaining dirty pages:   200618
> Iteration   7, duration:  16915 ms , transferred pages:   198289 (dup:    60169, rd:   134208, fd:     3912) , new dirty pages:   199343 , remaining dirty pages:   201672
> Iteration   8, duration:  19518 ms , transferred pages:   200452 (dup:    41014, rd:   156083, fd:     3355) , new dirty pages:   222927 , remaining dirty pages:   224147
> 
> 15. cpu2006.wrf (can not converge)
> 
> Iteration   1, duration:  18499 ms , transferred pages:   266380 (dup:   115285, rd:   151095, fd:        0) , new dirty pages:   112322 , remaining dirty pages:   112392
> Iteration   2, duration:   9802 ms , transferred pages:   110025 (dup:    29917, rd:    65782, fd:    14326) , new dirty pages:    88855 , remaining dirty pages:    91222
> Iteration   3, duration:   8199 ms , transferred pages:    89761 (dup:    22728, rd:    57262, fd:     9771) , new dirty pages:    58431 , remaining dirty pages:    59892
> Iteration   4, duration:   5603 ms , transferred pages:    58502 (dup:    12716, rd:    41809, fd:     3977) , new dirty pages:    80556 , remaining dirty pages:    81946
> Iteration   5, duration:   7101 ms , transferred pages:    79778 (dup:    21738, rd:    50896, fd:     7144) , new dirty pages:    62592 , remaining dirty pages:    64760
> Iteration   6, duration:   5702 ms , transferred pages:    63388 (dup:    16793, rd:    42726, fd:     3869) , new dirty pages:    80747 , remaining dirty pages:    82119
> Iteration   7, duration:   7000 ms , transferred pages:    80868 (dup:    23652, rd:    52194, fd:     5022) , new dirty pages:    84593 , remaining dirty pages:    85844
> Iteration   8, duration:   7099 ms , transferred pages:    83799 (dup:    25769, rd:    51772, fd:     6258) , new dirty pages:    67951 , remaining dirty pages:    69996
> Iteration   9, duration:   6303 ms , transferred pages:    68478 (dup:    16979, rd:    36490, fd:    15009) , new dirty pages:    81181 , remaining dirty pages:    82699
> Iteration  10, duration:   7000 ms , transferred pages:    80724 (dup:    23503, rd:    52826, fd:     4395) , new dirty pages:    47930 , remaining dirty pages:    49905
> 
> 
> 
> > 
> > > So I think "booting" and "kernel compilation" should benefit a lot from this
> > > improvement. The reason "kernel compilation" would benefit is that some
> > > iterations take around 600ms, and if they are halved to 300ms, then the precopy
> > > may have the chance to step into the stop-and-copy phase.
> > > 
> > > On the other hand, "idle" and "web server" would not benefit a lot, because
> > > most of the time is spent on the 1st iteration and little on the others.
> > > 
> > > As to "zeusmp" and "memcached", although the time spent on the iterations
> > > other than the 1st one may be halved, they still could not converge to
> > > stop-and-copy with the 300ms downtime.
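
The convergence argument above can be checked with a little arithmetic. The following is a minimal sketch, not QEMU's actual code: it assumes 4 KiB pages, reads "32MB/s" as 32 MiB/s, and uses hypothetical helper names. Under those assumptions, precopy can hand over to stop-and-copy once the remaining dirty pages can be sent within the allowed downtime.

```python
# Sketch only: the page size and the MiB reading of "32MB/s" are assumptions.
PAGE_SIZE = 4096                 # bytes per page (assumed 4 KiB)
BANDWIDTH = 32 * 1024 * 1024     # bytes per second (assumed 32 MiB/s)
DOWNTIME_MS = 300                # allowed downtime quoted in the thread


def max_pages_for_downtime(bandwidth=BANDWIDTH, downtime_ms=DOWNTIME_MS,
                           page_size=PAGE_SIZE):
    """Number of whole pages that fit into the downtime budget."""
    return bandwidth * downtime_ms // (1000 * page_size)


def can_enter_stop_and_copy(remaining_dirty_pages):
    """True if the remaining pages could be sent within the downtime."""
    return remaining_dirty_pages <= max_pages_for_downtime()


threshold = max_pages_for_downtime()
print(threshold)                       # pages that fit into 300 ms
print(can_enter_stop_and_copy(2430))   # "booting", last iteration
print(can_enter_stop_and_copy(7200))   # "kernel compilation", iteration 5
```

With these assumptions the budget is about 2457 pages, which matches the logs: the runs that converge stop with roughly 2200 to 2430 remaining dirty pages, while "kernel compilation" stays above 7000.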
> > > 
> > > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> > > 
> > > 1. booting : begin to migrate when the VM is booting
> > > 
> > > Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
> > > Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
> > > Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
> > > Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
> > > Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
> > > (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy has now stepped into the stop-and-copy phase.
> > >        "n" means "normal pages" and "d" means "duplicate (zero) pages".)
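
For anyone who wants to post-process these logs, here is a small sketch of a parser for one iteration line. The regular expression and the function name are illustrative assumptions based only on the lines shown in this mail, not part of QEMU. It also makes it easy to check that the per-type counts in parentheses add up to the transferred pages.

```python
import re

# Matches one "Iteration ..." line as printed in this thread (assumed layout).
LINE_RE = re.compile(
    r"Iteration\s+(?P<iteration>\d+),\s*duration:\s*(?P<duration_ms>\d+)\s*ms\s*,"
    r"\s*transferred pages:\s*(?P<transferred>\d+)\s*\((?P<breakdown>[^)]*)\)\s*,"
    r"\s*new dirty pages:\s*(?P<new_dirty>\d+)\s*,"
    r"\s*remaining dirty pages:\s*(?P<remaining>\d+)"
)


def parse_iteration(line):
    """Return a dict of the numeric fields, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = {k: int(m.group(k))
           for k in ("iteration", "duration_ms", "transferred",
                     "new_dirty", "remaining")}
    # Per-type counts inside the parentheses, e.g. {"n": ..., "d": ...}.
    rec["breakdown"] = {k: int(v) for k, v in
                        re.findall(r"(\w+):\s*(\d+)", m.group("breakdown"))}
    return rec


sample = ("> > > Iteration   1, duration:   6997 ms , transferred pages:   266450 "
          "(n:    57269, d:   209181 ) , new dirty pages:    56414 , "
          "remaining dirty pages:    56414")
rec = parse_iteration(sample)
print(rec["breakdown"], sum(rec["breakdown"].values()) == rec["transferred"])
```

The same function handles the three-field "dup/rd/fd" lines as well, because the breakdown is read as generic "name: number" pairs. A "fake" final iteration shows up as a record with zero transferred pages.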
> > > 
> > > 2. idle
> > > 
> > > Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
> > > Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
> > > Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
> > > Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849
> > > 
> > > 3. kernel compilation (can not converge)
> > > 
> > > Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
> > > Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
> > > Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
> > > Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
> > > Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
> > > Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
> > > Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
> > > Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
> > > Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
> > > Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
> > > Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
> > > Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
> > > Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
> > > Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
> > > Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382
> > > 
> > > 4. cpu2006.zeusmp (can not converge)
> > > 
> > > Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
> > > Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
> > > Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
> > > Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
> > > Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
> > > Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
> > > Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
> > > Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
> > > Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
> > > Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
> > > Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
> > > Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
> > > Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
> > > Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
> > > Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338
> > > 
> > > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> > > 
> > > Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
> > > Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
> > > Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
> > > Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
> > > Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
> > > Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167
> > > 
> > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > > 
> > > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5  (can not converge)
> > > 
> > > Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
> > > Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
> > > Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
> > > Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
> > > Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
> > > Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
> > > Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
> > > Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
> > > Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
> > > Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
> > > Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
> > > Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
> > > Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
> > > Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
> > > Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
> > > Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
> > > Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
> > > Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
> > > Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
> > > Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
> > > 
> > > 
> > > --
> > > Chunguang Li, Ph.D. Candidate
> > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > Huazhong University of Science & Technology (HUST)
> > > Wuhan, Hubei Prov., China
> > > 
> > > 
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-08 11:05             ` Dr. David Alan Gilbert
@ 2016-11-08 13:40               ` Chunguang Li
  0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08 13:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela




> -----Original Messages-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent Time: Tuesday, November 8, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > 
> > 
> > 
> > > -----Original Messages-----
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Sent Time: Friday, October 14, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > 
> > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > 
> > > > 
> > > > 
> > > > > -----Original Messages-----
> > > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > > Sent Time: Friday, September 30, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > 
> > > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > -----Original Messages-----
> > > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > Sent Time: Monday, September 26, 2016
> > > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > > > 
> > > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > > Hi all!
> > > > > > > > I am somewhat confused about the dirty bitmap during migration. I have dug into the code and found that, every now and then during migration, the dirty bitmap is fetched from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update qemu's dirty bitmap. However, I think this mechanism leads to some NON-dirty pages being resent.
> > > > > > > > 
> > > > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that during the migration setup, the ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When the pages "that haven't been sent" are written, the kernel space marks them as dirty. However I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark the pages which have already been sent during one iteration as dirty when they are written.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > > > 
> > > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > > case you're talking about is:
> > > > > > > 
> > > > > > >   Iteration 1
> > > > > > >     sync bitmap
> > > > > > >     start sending pages
> > > > > > >     page 'n' is modified - but hasn't been sent yet
> > > > > > >     page 'n' gets sent
> > > > > > >   Iteration 2
> > > > > > >     sync bitmap
> > > > > > >        'page n is shown as modified'
> > > > > > >     send page 'n' again
> > > > > > >
> > > > > > 
> > > > > > Yes, this is exactly the case I am talking about.
> > > > > >  
> > > > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > > is large.
> > > > > > 
> > > > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
> > > > > 
> > > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > > workloads get impacted the most?  That would also help us to figure
> > > > > out what kinds of speed improvements we can expect.
> > > > > 
> > > > > 
> > > > > 		Amit
> > > > 
> > > > I have picked 6 workloads and collected the following statistics
> > > > for every iteration (except the last stop-copy one) during precopy.
> > > > These numbers were obtained with basic precopy migration, without
> > > > capabilities such as xbzrle or compression. The migration has a
> > > > dedicated network, separate from the network used by the workloads.
> > > > Both are gigabit ethernet. I use qemu-2.5.1.
> > > > 
> > > > Three (booting, idle, web server) of them converged to the stop-copy phase, 
> > > > with the given bandwidth and default downtime (300ms), while the other
> > > > three (kernel compilation, zeusmp, memcached) did not.
> > > > 
> > > > A page is "not-really-dirty" if, during one iteration, it is written first
> > > > and sent later (and not written again after that). I guess this happens
> > > > less often in the later iterations than in the 1st iteration, because all
> > > > the pages of the VM are sent to the dest node during the 1st iteration,
> > > > while only part of the pages are sent during the others. So I think the
> > > > "not-really-dirty" pages are produced mainly during the 1st iteration,
> > > > and maybe very few during the other iterations.
> > > > 
> > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > > think the time spent on Iteration 2 would be halved. This is a chain reaction,
> > > > because the dirty pages produced during Iteration 2 are then halved, which in
> > > > turn halves the time spent on Iteration 3, then Iteration 4, 5...
> > > 
> > > Yes; these numbers don't show how many of them are false dirty though.
> > > 
> > > One problem is thinking about pages that have been redirtied, if the page is dirtied
> > > after the sync but before the network write then it's the false-dirty that
> > > you're describing.
> > > 
> > > However, if the page is being written a few times, and so it would have been written
> > > after the network write then it isn't a false-dirty. 
> > > 
> > > You might be able to figure that out with some kernel tracing of when the dirtying
> > > happens, but it might be easier to write the fix!
> > > 
> > > Dave
> > 
> > Hi, I have made some new progress now.
> > 
> > To tell how many false dirty pages there are exactly in each iteration, I malloc a
> > buffer in memory as big as the size of the whole VM memory. When a page is 
> > transferred to the dest node, it is copied to the buffer; During the next iteration, 
> > if one page is transferred, it is compared to the old one in the buffer, and the
> > old one will be replaced for next comparison if it is really dirty. Thus, we are now 
> > able to get the exact number of false dirty pages.
> >
> > This time, I use 15 workloads to collect the statistics. They are:
> > 
> >   1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific
> >      computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked
> >      these 11 benchmarks because, compared to the others, they have a bigger memory
> >      footprint and a higher memory dirty rate. Thus most of them could not converge
> >      to stop-and-copy using the default migration speed (32MB/s).
> >   2. kernel compilation
> >   3. idle VM 
> >   4. Apache web server which serves static content
> > 
> >   (the above workloads are all running in VM with 1 vcpu and 1GB memory, and the 
> >    migration speed is the default 32MB/s)
> > 
> >   5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache.
> >      After filling up the 4GB cache, a client writes the cache at a constant speed
> >      during migration. This time, migration speed has no limit, and is up to the 
> >      capability of 1Gbps Ethernet.
> > 
> > Summarize the results first: (and you can read the precise number below) 
> > 
> >   1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations)
> >      of false dirty pages among all the dirty pages from iteration 2 onwards (and the big
> >      proportion persists during the following iterations). They are cpu2006.zeusmp,
> >      cpu2006.bzip2, cpu2006.mcf, and memcached.
> >   2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1, so even
> >      though the proportion of false dirty pages is big from iteration 2 onwards, the room
> >      for optimization is small.
> >   3. 1 workload (kernel compilation) has a big proportion only during iteration 2, not
> >      in the other iterations.
> >   4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false
> >      dirty pages from iteration 2 onwards. So the room for optimization is small for them.
> > 
> > Now I want to talk a little more about the reasons why false dirty pages are produced.
> > The first reason is what we have discussed before: the mechanism used to track the dirty
> > pages.
> > I have also come up with another reason. Here is the situation: a write operation to a
> > memory page happens, but it doesn't change any content of the page. So it is "write but
> > not dirty", and the kernel still marks the page as dirty. A colleague in our lab has done some
> > experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006
> > benchmark suite. According to his results, most workloads have a small proportion (<10%)
> > of "write but not dirty" operations among all write operations, while a few workloads have a
> > higher proportion (one even as high as 50%). We are not yet sure why "write but not dirty"
> > happens; we only observed that it does.
> 
> I think there are a few different reasons I can think of:
>   a) You have a flag or mutex that's set and cleared; so it gets set (marked
>      dirty) and cleared around some operation. By the time we come to migrate
>      it then it's back to cleared again.
>      Similarly with other temporary data structures.
>   b) Some system operation causes the page to be moved - e.g. swap or the kernel
>      reorganising memory.

Sorry, I don't quite understand reason (b). Take swap as an example: do you mean a page
is swapped out and then swapped back in at the same address, so its content remains unchanged?

> 
> However, it's a shame I don't think you can tell in your experiment which of the
> two cases we're hitting? I'd like to know if it's worth working on
> making the page sync mechanism better or if it's more important to deal
> with the second reason you show.

Yes, you are right, it's hard to tell which case we're hitting (including the cases you
thought of). However, since I use the SHA1 method, I no longer have to tell them apart,
because it handles all the cases we have thought of.
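
A minimal sketch of why the hash check is indifferent to the cause of a false
dirty page. This is Python for illustration only, not the QEMU patch itself;
PageHashFilter, should_send and PAGE_SIZE are hypothetical names, and a 4096-byte
page is assumed. A page whose content has returned to what was last sent (for
example a flag that was set and cleared) hashes to the same value and is
skipped, whatever caused it to be reported dirty.

```python
import hashlib

PAGE_SIZE = 4096  # assumed guest page size


class PageHashFilter:
    """Hypothetical helper: remembers the SHA1 of each page as last sent."""

    def __init__(self):
        self.sent_hash = {}  # page index -> 20-byte SHA1 digest

    def should_send(self, page_index, content):
        """Return True if the page differs from what was last sent."""
        digest = hashlib.sha1(content).digest()
        if self.sent_hash.get(page_index) == digest:
            return False  # false dirty: same content as last transfer
        self.sent_hash[page_index] = digest  # keep for the next comparison
        return True


filt = PageHashFilter()
page = bytearray(PAGE_SIZE)

first = filt.should_send(7, bytes(page))   # never sent before

page[0] = 1                                # flag set ...
page[0] = 0                                # ... and cleared again
second = filt.should_send(7, bytes(page))  # reported dirty, content unchanged

page[0] = 1                                # a real change this time
third = filt.should_send(7, bytes(page))

print(first, second, third)
```

The cost is one SHA1 computation per candidate page, spent even when the page
turns out to be really dirty.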

> 
> > So these two reasons both contribute to the false dirty pages. To optimize, I compute and store
> > the SHA1 hash of each page before transferring it. Next time, if a page needs retransmission, its
> > SHA1 hash is computed again and compared to the old hash. If the hashes are the same, it is a
> > false dirty page, and we just skip it; otherwise, the page is transferred, and the new
> > hash replaces the old one for the next comparison.
> > The reason to use a SHA1 hash instead of a byte-by-byte comparison is the memory overhead. One SHA1
> > hash is 20 bytes, so we need extra space of 20/4096 (<1/200) of the whole VM memory, which
> > is relatively small.
> > As far as I know, SHA1 hashes are widely used for deduplication in backup systems.
> > It has been shown there that the probability of a hash collision is far smaller than that of a
> > disk hardware fault, so the hash is considered safe: if the hashes of two chunks are the same,
> > the content is taken to be the same. So I think the SHA1 hash could replace byte-by-byte
> > comparison for VM memory as well.
> 
> There was a proposal ( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html )
>  to do a migration system where
> a copy of the migration RAM is stored on disc on the destination for cases where similar VMs
> are migrated, and it used a checksum for each page to find the matching page
> in the cache; that originally used a smaller hash, I think in the end they used a SHA-256.
> (Hash based checks still make me nervous for intentional collisions but that's probably
> me being paranoid?)

Em... I don't know whether most people would accept hash-based checks.
Maybe it needs a more rigorous mathematical argument, like the ones made in the
field of deduplication for backup systems.
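
As a rough illustration of the kind of estimate used in the deduplication
literature, here is a back-of-the-envelope sketch in Python. It assumes SHA1
behaves like a uniformly random 160-bit function and that nobody is crafting
page contents on purpose, so it says nothing about the intentional-collision
concern raised above. The 6 GB VM and the 20-byte digest come from this
thread; birthday_bound and the other names are hypothetical.

```python
from fractions import Fraction

PAGE_SIZE = 4096   # bytes per page (assumed)
DIGEST_SIZE = 20   # bytes per SHA1 digest
HASH_BITS = 160    # SHA1 output width


def birthday_bound(num_items, hash_bits):
    """Union bound on P(some pair of distinct items shares a hash value),
    assuming an ideal random hash: (number of pairs) / 2**hash_bits."""
    pairs = num_items * (num_items - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


pages_6gb = 6 * 2**30 // PAGE_SIZE           # pages in the 6 GB memcached VM
overhead_bytes = pages_6gb * DIGEST_SIZE     # space for one digest per page
bound = birthday_bound(pages_6gb, HASH_BITS)

print("pages:", pages_6gb)
print("hash table size (MiB):", overhead_bytes / 2**20)
print("collision bound below 2**-119:", bound < Fraction(1, 2**119))
```

The bound treats every pair of pages as a potential collision, which is
pessimistic here, since the scheme only ever compares a page against its own
previous hash.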

> 
> > Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads which have
> > big proportions of false dirty pages, the improvement is remarkable. Without optimization,
> > they either cannot converge to stop-and-copy, or take a very long time to complete. With the
> > SHA1 hash method, all of them now complete in a relatively short time.
> > For the reason discussed above, the other workloads don't get notable improvements from the
> > optimization. So below, I only show the exact numbers after optimization for the 4 workloads with
> > remarkable improvements.
> > 
> > Any comments or suggestions? 
> 
> You might be able to save some of the CPU time; we've
> got a test that checks if a page is all-zero; if you're doing
> the SHA calculation you could avoid doing the all-zero check
> and replace it by comparing the output of the SHA.

Yes, this is one way. However, I am currently doing the opposite. I first
calculate the SHA1 of the all-zero page and remember it. Then, whenever
the check recognizes an all-zero page, I just store the SHA1 computed
earlier, which avoids calculating the SHA1 of the all-zero page again.
I think this is better, because I believe the current implementation of
the all-zero check is faster than calculating SHA1.
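
The ordering described above can be sketched as follows. This is an
illustrative Python sketch, not the QEMU code: page_digest, ZERO_PAGE and
ZERO_PAGE_SHA1 are hypothetical names, a 4096-byte page is assumed, and the
plain equality test merely stands in for the existing all-zero check.

```python
import hashlib

PAGE_SIZE = 4096                       # assumed guest page size
ZERO_PAGE = bytes(PAGE_SIZE)           # reference all-zero page
ZERO_PAGE_SHA1 = hashlib.sha1(ZERO_PAGE).digest()  # computed once, up front


def page_digest(content):
    """Return the SHA1 to store for a page, reusing the cached digest
    whenever the (cheaper) all-zero check succeeds."""
    if content == ZERO_PAGE:           # stand-in for the all-zero check
        return ZERO_PAGE_SHA1          # no SHA1 computation needed
    return hashlib.sha1(content).digest()


zero_digest = page_digest(bytes(PAGE_SIZE))
data_digest = page_digest(b"\x01" + bytes(PAGE_SIZE - 1))
print(zero_digest is ZERO_PAGE_SHA1, data_digest != ZERO_PAGE_SHA1)
```

Storing the cached digest for zero pages keeps the hash table consistent, so a
page that later becomes non-zero, or returns to all zeros, is still compared
correctly.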

Thanks,
Chunguang

> 
> > 
> > Below is the experiments data:
> > (
> > "dup" means zero page; this kind of page takes very little migration time and network
> > resources, so such pages are never counted as dirty pages in my numbers;
> > "rd" means really dirty pages;
> > "fd" means false dirty pages;
> > The numbers refer to the quantities of pages.
> > )
> > 
> > ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
> > 
> > 1. memcached
> > 
> > -----  original pre-copy (can not converge):  -----
> > Iteration   1, duration:  42111 ms , transferred pages:  1568788 (dup:   416239, rd:  1152549, fd:        0) , new dirty pages:   499015 , remaining dirty pages:   507397
> > Iteration   2, duration:  17208 ms , transferred pages:   498946 (dup:     5456, rd:   160206, fd:   333284) , new dirty pages:   261237 , remaining dirty pages:   269688
> > Iteration   3, duration:   9134 ms , transferred pages:   262377 (dup:      519, rd:   111900, fd:   149958) , new dirty pages:   170281 , remaining dirty pages:   177592
> > Iteration   4, duration:   5920 ms , transferred pages:   169966 (dup:       87, rd:    82487, fd:    87392) , new dirty pages:   121154 , remaining dirty pages:   128780
> > Iteration   5, duration:   4239 ms , transferred pages:   121551 (dup:       81, rd:    64120, fd:    57350) , new dirty pages:   100976 , remaining dirty pages:   108205
> > Iteration   6, duration:   3495 ms , transferred pages:   100353 (dup:       90, rd:    56021, fd:    44242) , new dirty pages:    74547 , remaining dirty pages:    82399
> > Iteration   7, duration:   2583 ms , transferred pages:    74160 (dup:       56, rd:    38016, fd:    36088) , new dirty pages:    58209 , remaining dirty pages:    66448
> > Iteration   8, duration:   2039 ms , transferred pages:    58534 (dup:       81, rd:    26885, fd:    31568) , new dirty pages:    43511 , remaining dirty pages:    51425
> > Iteration   9, duration:   1513 ms , transferred pages:    43484 (dup:       55, rd:    26641, fd:    16788) , new dirty pages:    43722 , remaining dirty pages:    51663
> > Iteration  10, duration:   1521 ms , transferred pages:    43676 (dup:       62, rd:    26463, fd:    17151) , new dirty pages:    35347 , remaining dirty pages:    43334
> > Iteration  11, duration:   1230 ms , transferred pages:    35287 (dup:        0, rd:    21293, fd:    13994) , new dirty pages:    28851 , remaining dirty pages:    36898
> > Iteration  12, duration:   1031 ms , transferred pages:    29651 (dup:       82, rd:    18143, fd:    11426) , new dirty pages:    27062 , remaining dirty pages:    34309
> > Iteration  13, duration:    917 ms , transferred pages:    26385 (dup:       56, rd:    14149, fd:    12180) , new dirty pages:    22723 , remaining dirty pages:    30647
> > Iteration  14, duration:    762 ms , transferred pages:    21902 (dup:       55, rd:    16355, fd:     5492) , new dirty pages:    18208 , remaining dirty pages:    26953
> > Iteration  15, duration:    650 ms , transferred pages:    18636 (dup:        0, rd:    11943, fd:     6693) , new dirty pages:    16085 , remaining dirty pages:    24402
> > Iteration  16, duration:    554 ms , transferred pages:    15946 (dup:       56, rd:     9527, fd:     6363) , new dirty pages:    14766 , remaining dirty pages:    23222
> > Iteration  17, duration:    538 ms , transferred pages:    15434 (dup:        0, rd:     9779, fd:     5655) , new dirty pages:    13381 , remaining dirty pages:    21169
> > Iteration  18, duration:    487 ms , transferred pages:    14089 (dup:       81, rd:     7737, fd:     6271) , new dirty pages:    13325 , remaining dirty pages:    20405
> > Iteration  19, duration:    428 ms , transferred pages:    12232 (dup:        0, rd:     8488, fd:     3744) , new dirty pages:    10274 , remaining dirty pages:    18447
> > Iteration  20, duration:    377 ms , transferred pages:    10887 (dup:       56, rd:     6362, fd:     4469) , new dirty pages:     9708 , remaining dirty pages:    17268
> > Iteration  21, duration:    320 ms , transferred pages:     9222 (dup:        0, rd:     5789, fd:     3433) , new dirty pages:     8015 , remaining dirty pages:    16061
> > Iteration  22, duration:    268 ms , transferred pages:     7621 (dup:        0, rd:     6204, fd:     1417) , new dirty pages:     7227 , remaining dirty pages:    15667
> > Iteration  23, duration:    269 ms , transferred pages:     7813 (dup:       56, rd:     4410, fd:     3347) , new dirty pages:     7591 , remaining dirty pages:    15445
> > Iteration  24, duration:    271 ms , transferred pages:     7749 (dup:        0, rd:     4565, fd:     3184) , new dirty pages:    15126 , remaining dirty pages:    22822
> > Iteration  25, duration:    549 ms , transferred pages:    15818 (dup:       60, rd:    10545, fd:     5213) , new dirty pages:    14559 , remaining dirty pages:    21563
> > Iteration  26, duration:    499 ms , transferred pages:    14281 (dup:        3, rd:     8760, fd:     5518) , new dirty pages:    11947 , remaining dirty pages:    19229
> > Iteration  27, duration:    376 ms , transferred pages:    10823 (dup:       25, rd:     6550, fd:     4248) , new dirty pages:     8561 , remaining dirty pages:    16967
> > Iteration  28, duration:    324 ms , transferred pages:     9350 (dup:       31, rd:     5292, fd:     4027) , new dirty pages:     8655 , remaining dirty pages:    16272
> > Iteration  29, duration:    274 ms , transferred pages:     7813 (dup:        0, rd:     6088, fd:     1725) , new dirty pages:     6300 , remaining dirty pages:    14759
> > Iteration  30, duration:    218 ms , transferred pages:     6340 (dup:       45, rd:     3196, fd:     3099) , new dirty pages:     5143 , remaining dirty pages:    13562
> > 
> > -----  after optimization:  -----
> > Iteration   1, duration:  40664 ms , transferred pages:  1569037 (dup:   405940, rd:  1163097) , new dirty pages:   506846 , remaining dirty pages:   514979
> > Iteration   2, duration:   8032 ms , transferred pages:   161130 (dup:     4007, rd:   157123) , new dirty pages:   153479 , remaining dirty pages:   153479
> 
> Big difference.
> 
> > Iteration   3, duration:   2620 ms , transferred pages:    65260 (dup:       20, rd:    65240) , new dirty pages:    64014 , remaining dirty pages:    67100
> > Iteration   4, duration:   1160 ms , transferred pages:    30227 (dup:       60, rd:    30167) , new dirty pages:    34031 , remaining dirty pages:    41414
> > Iteration   5, duration:    648 ms , transferred pages:    18700 (dup:       56, rd:    18644) , new dirty pages:    18375 , remaining dirty pages:    25536
> > Iteration   6, duration:    389 ms , transferred pages:    11399 (dup:       55, rd:    11344) , new dirty pages:    12536 , remaining dirty pages:    17516
> > Iteration   7, duration:    292 ms , transferred pages:     8197 (dup:        0, rd:     8197) , new dirty pages:     8387 , remaining dirty pages:    16802
> > Iteration   8, duration:    171 ms , transferred pages:     4931 (dup:       39, rd:     4892) , new dirty pages:     6182 , remaining dirty pages:    14060
> > Iteration   9, duration:    163 ms , transferred pages:     4355 (dup:       16, rd:     4339) , new dirty pages:     5530 , remaining dirty pages:    11973
> > Iteration  10, duration:    104 ms , transferred pages:     3266 (dup:        0, rd:     3266) , new dirty pages:     2893 , remaining dirty pages:    11014
> > Iteration  11, duration:     52 ms , transferred pages:     1153 (dup:        0, rd:     1153) , new dirty pages:     1586 , remaining dirty pages:    10516
> > Iteration  12, duration:     52 ms , transferred pages:     1921 (dup:       39, rd:     1882) , new dirty pages:     1619 , remaining dirty pages:     8842
> > Iteration  13, duration:     62 ms , transferred pages:     1537 (dup:        0, rd:     1537) , new dirty pages:     2052 , remaining dirty pages:     8871
> > Iteration  14, duration:     58 ms , transferred pages:     1665 (dup:        0, rd:     1665) , new dirty pages:     1947 , remaining dirty pages:     7989
> > Iteration  15, duration:      2 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     7989
> > total time: 54693 milliseconds
> 
> Very nice.
> 
> Dave
> 
> > 2. cpu2006.zeusmp
> > 
> > -----  original pre-copy (can not converge):  -----
> > Iteration   1, duration:  21112 ms , transferred pages:   266450 (dup:    93385, rd:   173065, fd:        0) , new dirty pages:   127866 , remaining dirty pages:   127866
> > Iteration   2, duration:   6192 ms , transferred pages:   125662 (dup:    75762, rd:    17389, fd:    32511) , new dirty pages:   131655 , remaining dirty pages:   133859
> > Iteration   3, duration:   6699 ms , transferred pages:   131937 (dup:    77298, rd:    20320, fd:    34319) , new dirty pages:   121027 , remaining dirty pages:   122949
> > Iteration   4, duration:   5999 ms , transferred pages:   122512 (dup:    73588, rd:    17236, fd:    31688) , new dirty pages:   122759 , remaining dirty pages:   123196
> > Iteration   5, duration:   5804 ms , transferred pages:   122717 (dup:    75436, rd:    19016, fd:    28265) , new dirty pages:   123697 , remaining dirty pages:   124176
> > Iteration   6, duration:   5698 ms , transferred pages:   123708 (dup:    77249, rd:    18022, fd:    28437) , new dirty pages:   121838 , remaining dirty pages:   122306
> > Iteration   7, duration:   5515 ms , transferred pages:   122306 (dup:    76727, rd:    14819, fd:    30760) , new dirty pages:   122382 , remaining dirty pages:   122382
> > Iteration   8, duration:   6086 ms , transferred pages:   120825 (dup:    71834, rd:    15987, fd:    33004) , new dirty pages:   121587 , remaining dirty pages:   123144
> > Iteration   9, duration:   5899 ms , transferred pages:   120964 (dup:    72860, rd:    18191, fd:    29913) , new dirty pages:   120391 , remaining dirty pages:   122571
> > Iteration  10, duration:   5801 ms , transferred pages:   121425 (dup:    74140, rd:    20722, fd:    26563) , new dirty pages:   122302 , remaining dirty pages:   123448
> > Iteration  11, duration:   5909 ms , transferred pages:   123448 (dup:    74735, rd:    19678, fd:    29035) , new dirty pages:   123258 , remaining dirty pages:   123258
> > Iteration  12, duration:   6293 ms , transferred pages:   121211 (dup:    70442, rd:    18128, fd:    32641) , new dirty pages:   123623 , remaining dirty pages:   125670
> > Iteration  13, duration:   6398 ms , transferred pages:   124897 (dup:    72701, rd:    21134, fd:    31062) , new dirty pages:   122355 , remaining dirty pages:   123128
> > Iteration  14, duration:   6301 ms , transferred pages:   121893 (dup:    70514, rd:    23470, fd:    27909) , new dirty pages:   120980 , remaining dirty pages:   122215
> > Iteration  15, duration:   6304 ms , transferred pages:   121389 (dup:    70005, rd:    21731, fd:    29653) , new dirty pages:   121628 , remaining dirty pages:   122454
> > Iteration  16, duration:   6398 ms , transferred pages:   122164 (dup:    69962, rd:    24376, fd:    27826) , new dirty pages:   122246 , remaining dirty pages:   122536
> > Iteration  17, duration:   6201 ms , transferred pages:   121548 (dup:    70984, rd:    23915, fd:    26649) , new dirty pages:   121460 , remaining dirty pages:   122448
> > Iteration  18, duration:   6401 ms , transferred pages:   122272 (dup:    70072, rd:    22261, fd:    29939) , new dirty pages:   123518 , remaining dirty pages:   123694
> > Iteration  19, duration:   7003 ms , transferred pages:   121873 (dup:    64754, rd:    27325, fd:    29794) , new dirty pages:   120568 , remaining dirty pages:   122389
> > Iteration  20, duration:   6400 ms , transferred pages:   121422 (dup:    69221, rd:    25300, fd:    26901) , new dirty pages:   121229 , remaining dirty pages:   122196
> > Iteration  21, duration:   6703 ms , transferred pages:   119895 (dup:    65232, rd:    25877, fd:    28786) , new dirty pages:   123284 , remaining dirty pages:   125585
> > Iteration  22, duration:   6902 ms , transferred pages:   123884 (dup:    67582, rd:    29020, fd:    27282) , new dirty pages:   122057 , remaining dirty pages:   123758
> > Iteration  23, duration:   6800 ms , transferred pages:   122010 (dup:    66529, rd:    30644, fd:    24837) , new dirty pages:   120916 , remaining dirty pages:   122664
> > Iteration  24, duration:   7202 ms , transferred pages:   121951 (dup:    63188, rd:    31105, fd:    27658) , new dirty pages:   122715 , remaining dirty pages:   123428
> > Iteration  25, duration:   7202 ms , transferred pages:   122919 (dup:    64161, rd:    32063, fd:    26695) , new dirty pages:   123180 , remaining dirty pages:   123689
> > Iteration  26, duration:   7404 ms , transferred pages:   123092 (dup:    62694, rd:    33459, fd:    26939) , new dirty pages:   122149 , remaining dirty pages:   122746
> > Iteration  27, duration:   7205 ms , transferred pages:   120427 (dup:    61664, rd:    34344, fd:    24419) , new dirty pages:   120299 , remaining dirty pages:   122618
> > Iteration  28, duration:   7100 ms , transferred pages:   121074 (dup:    63130, rd:    32403, fd:    25541) , new dirty pages:   122984 , remaining dirty pages:   124528
> > Iteration  29, duration:   7904 ms , transferred pages:   124060 (dup:    59564, rd:    35631, fd:    28865) , new dirty pages:   127080 , remaining dirty pages:   127548
> > Iteration  30, duration:   7906 ms , transferred pages:   127518 (dup:    63029, rd:    34416, fd:    30073) , new dirty pages:   125028 , remaining dirty pages:   125058
> > 
> > -----  after optimization:  -----
> > Iteration   1, duration:  21601 ms , transferred pages:   266450 (dup:    89731, rd:   176719) , new dirty pages:   139843 , remaining dirty pages:   139843
> > Iteration   2, duration:   1747 ms , transferred pages:    92077 (dup:    78364, rd:    13713) , new dirty pages:    90945 , remaining dirty pages:    90945
> > Iteration   3, duration:   1592 ms , transferred pages:    62253 (dup:    49435, rd:    12818) , new dirty pages:    76929 , remaining dirty pages:    76929
> > Iteration   4, duration:    992 ms , transferred pages:    44837 (dup:    37886, rd:     6951) , new dirty pages:    71331 , remaining dirty pages:    72916
> > Iteration   5, duration:    998 ms , transferred pages:    55229 (dup:    47150, rd:     8079) , new dirty pages:    21703 , remaining dirty pages:    23302
> > Iteration   6, duration:    211 ms , transferred pages:    20337 (dup:    18516, rd:     1821) , new dirty pages:    14500 , remaining dirty pages:    14500
> > Iteration   7, duration:     31 ms , transferred pages:    12933 (dup:    12627, rd:      306) , new dirty pages:     1520 , remaining dirty pages:     1520
> > Iteration   8, duration:     30 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        4 , remaining dirty pages:     1524
> > total time: 27225 milliseconds
> > 
> > 3. cpu2006.bzip2
> > 
> > -----  original pre-copy:  -----
> > Iteration   1, duration:  18306 ms , transferred pages:   266450 (dup:   116569, rd:   149881, fd:        0) , new dirty pages:   106299 , remaining dirty pages:   106299
> > Iteration   2, duration:  10694 ms , transferred pages:   104611 (dup:    17550, rd:    10536, fd:    76525) , new dirty pages:    34394 , remaining dirty pages:    36082
> > Iteration   3, duration:   2998 ms , transferred pages:    34442 (dup:     9924, rd:    12254, fd:    12264) , new dirty pages:     6419 , remaining dirty pages:     8059
> > Iteration   4, duration:    699 ms , transferred pages:     5748 (dup:       22, rd:     2583, fd:     3143) , new dirty pages:     1226 , remaining dirty pages:     3537
> > Iteration   5, duration:    200 ms , transferred pages:     1636 (dup:        0, rd:     1194, fd:      442) , new dirty pages:      478 , remaining dirty pages:     2379
> > Iteration   6, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2379
> > 
> > -----  after optimization:  -----
> > Iteration   1, duration:  13995 ms , transferred pages:   266314 (dup:   152118, rd:   114196) , new dirty pages:    97009 , remaining dirty pages:    97145
> > Iteration   2, duration:   1215 ms , transferred pages:    33400 (dup:    26745, rd:     6655) , new dirty pages:    12866 , remaining dirty pages:    14017
> > Iteration   3, duration:    701 ms , transferred pages:     5774 (dup:       48, rd:     5726) , new dirty pages:     6342 , remaining dirty pages:     8761
> > Iteration   4, duration:    500 ms , transferred pages:     4111 (dup:       21, rd:     4090) , new dirty pages:     4311 , remaining dirty pages:     6485
> > Iteration   5, duration:    400 ms , transferred pages:     3273 (dup:        1, rd:     3272) , new dirty pages:     3034 , remaining dirty pages:     5431
> > Iteration   6, duration:    301 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2094 , remaining dirty pages:     4472
> > Iteration   7, duration:    299 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2066 , remaining dirty pages:     4082
> > Iteration   8, duration:    202 ms , transferred pages:     1636 (dup:        0, rd:     1636) , new dirty pages:     2881 , remaining dirty pages:     4648
> > Iteration   9, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     4775 , remaining dirty pages:     6778
> > Iteration  10, duration:    400 ms , transferred pages:     3281 (dup:        9, rd:     3272) , new dirty pages:     3757 , remaining dirty pages:     5576
> > Iteration  11, duration:    401 ms , transferred pages:     3279 (dup:        7, rd:     3272) , new dirty pages:     6980 , remaining dirty pages:     8906
> > Iteration  12, duration:    500 ms , transferred pages:     7118 (dup:     3035, rd:     4083) , new dirty pages:    10774 , remaining dirty pages:    11922
> > Iteration  13, duration:    116 ms , transferred pages:    11706 (dup:    10152, rd:     1554) , new dirty pages:     1326 , remaining dirty pages:     1326
> > Iteration  14, duration:    117 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     1326
> > total time: 19479 milliseconds
> > 
> > 4. cpu2006.mcf
> > 
> > -----  original pre-copy:  -----
> > Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6925, rd:   259525, fd:        0) , new dirty pages:   244403 , remaining dirty pages:   244403
> > Iteration   2, duration:  29603 ms , transferred pages:   242275 (dup:      377, rd:   224001, fd:    17897) , new dirty pages:   227335 , remaining dirty pages:   229463
> > Iteration   3, duration:  27806 ms , transferred pages:   227573 (dup:      169, rd:    65681, fd:   161723) , new dirty pages:   195593 , remaining dirty pages:   197483
> > Iteration   4, duration:  23907 ms , transferred pages:   195543 (dup:       41, rd:    39838, fd:   155664) , new dirty pages:   215066 , remaining dirty pages:   217006
> > Iteration   5, duration:  26305 ms , transferred pages:   215289 (dup:      155, rd:    33082, fd:   182052) , new dirty pages:   111098 , remaining dirty pages:   112815
> > Iteration   6, duration:  13502 ms , transferred pages:   110452 (dup:       22, rd:    26793, fd:    83637) , new dirty pages:   161054 , remaining dirty pages:   163417
> > Iteration   7, duration:  19705 ms , transferred pages:   161266 (dup:      120, rd:    33818, fd:   127328) , new dirty pages:   220562 , remaining dirty pages:   222713
> > Iteration   8, duration:  27003 ms , transferred pages:   220881 (dup:       21, rd:   215721, fd:     5139) , new dirty pages:   219787 , remaining dirty pages:   221619
> > Iteration   9, duration:  26802 ms , transferred pages:   219248 (dup:       24, rd:    84648, fd:   134576) , new dirty pages:   207959 , remaining dirty pages:   210330
> > Iteration  10, duration:  25411 ms , transferred pages:   207916 (dup:      144, rd:    35842, fd:   171930) , new dirty pages:   144442 , remaining dirty pages:   146856
> > Iteration  11, duration:  17714 ms , transferred pages:   144804 (dup:       18, rd:    25414, fd:   119372) , new dirty pages:   205127 , remaining dirty pages:   207179
> > Iteration  12, duration:  25112 ms , transferred pages:   205446 (dup:      128, rd:    23197, fd:   182121) , new dirty pages:   167319 , remaining dirty pages:   169052
> > Iteration  13, duration:  20411 ms , transferred pages:   166886 (dup:       14, rd:    21960, fd:   144912) , new dirty pages:   221592 , remaining dirty pages:   223758
> > Iteration  14, duration:  27126 ms , transferred pages:   221800 (dup:      122, rd:    42368, fd:   179310) , new dirty pages:   233630 , remaining dirty pages:   235588
> > Iteration  15, duration:  28517 ms , transferred pages:   233321 (dup:      191, rd:   222528, fd:    10602) , new dirty pages:   224282 , remaining dirty pages:   226549
> > Iteration  16, duration:  27422 ms , transferred pages:   224187 (dup:       55, rd:    45773, fd:   178359) , new dirty pages:   209815 , remaining dirty pages:   212177
> > Iteration  17, duration:  25723 ms , transferred pages:   210260 (dup:       34, rd:    79405, fd:   130821) , new dirty pages:   220297 , remaining dirty pages:   222214
> > Iteration  18, duration:  26920 ms , transferred pages:   220056 (dup:       14, rd:   214128, fd:     5914) , new dirty pages:   192015 , remaining dirty pages:   194173
> > Iteration  19, duration:  23520 ms , transferred pages:   192239 (dup:        9, rd:    25140, fd:   167090) , new dirty pages:    96450 , remaining dirty pages:    98384
> > Iteration  20, duration:  11805 ms , transferred pages:    96538 (dup:       14, rd:     7424, fd:    89100) , new dirty pages:     6978 , remaining dirty pages:     8824
> > Iteration  21, duration:    799 ms , transferred pages:     6545 (dup:        1, rd:     1802, fd:     4742) , new dirty pages:      138 , remaining dirty pages:     2417
> > Iteration  22, duration:      1 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2417
> > 
> > -----  after optimization:  -----
> > Iteration   1, duration:  31711 ms , transferred pages:   266450 (dup:     6831, rd:   259619) , new dirty pages:   240209 , remaining dirty pages:   240209
> > Iteration   2, duration:   6250 ms , transferred pages:    51244 (dup:      211, rd:    51033) , new dirty pages:   226651 , remaining dirty pages:   228571
> > Iteration   3, duration:   4395 ms , transferred pages:    36008 (dup:       80, rd:    35928) , new dirty pages:   110719 , remaining dirty pages:   111478
> > Iteration   4, duration:   3390 ms , transferred pages:    28068 (dup:       28, rd:    28040) , new dirty pages:   185172 , remaining dirty pages:   185172
> > Iteration   5, duration:   2986 ms , transferred pages:    23780 (dup:       45, rd:    23735) , new dirty pages:    64357 , remaining dirty pages:    66305
> > Iteration   6, duration:   2727 ms , transferred pages:    22800 (dup:       12, rd:    22788) , new dirty pages:    61675 , remaining dirty pages:    61675
> > Iteration   7, duration:   2372 ms , transferred pages:    18943 (dup:       13, rd:    18930) , new dirty pages:    55144 , remaining dirty pages:    55265
> > Iteration   8, duration:   2100 ms , transferred pages:    17189 (dup:       11, rd:    17178) , new dirty pages:    55244 , remaining dirty pages:    55668
> > Iteration   9, duration:   2003 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:   107058 , remaining dirty pages:   108014
> > Iteration  10, duration:   2132 ms , transferred pages:    17825 (dup:       24, rd:    17801) , new dirty pages:   126214 , remaining dirty pages:   126214
> > Iteration  11, duration:   2229 ms , transferred pages:    18156 (dup:       22, rd:    18134) , new dirty pages:    65725 , remaining dirty pages:    65725
> > Iteration  12, duration:   2315 ms , transferred pages:    18651 (dup:       21, rd:    18630) , new dirty pages:    52575 , remaining dirty pages:    53903
> > Iteration  13, duration:   2147 ms , transferred pages:    17435 (dup:       16, rd:    17419) , new dirty pages:    46652 , remaining dirty pages:    47260
> > Iteration  14, duration:   2000 ms , transferred pages:    16371 (dup:       11, rd:    16360) , new dirty pages:    42721 , remaining dirty pages:    43266
> > Iteration  15, duration:   1901 ms , transferred pages:    15552 (dup:       10, rd:    15542) , new dirty pages:    38593 , remaining dirty pages:    40792
> > Iteration  16, duration:   1801 ms , transferred pages:    14735 (dup:       11, rd:    14724) , new dirty pages:    54252 , remaining dirty pages:    55639
> > Iteration  17, duration:   1708 ms , transferred pages:    13860 (dup:        2, rd:    13858) , new dirty pages:    72379 , remaining dirty pages:    74170
> > Iteration  18, duration:   1923 ms , transferred pages:    15442 (dup:       12, rd:    15430) , new dirty pages:   101911 , remaining dirty pages:   103547
> > Iteration  19, duration:   2311 ms , transferred pages:    18823 (dup:        9, rd:    18814) , new dirty pages:    80534 , remaining dirty pages:    82521
> > Iteration  20, duration:   2081 ms , transferred pages:    17156 (dup:       34, rd:    17122) , new dirty pages:    36054 , remaining dirty pages:    36054
> > Iteration  21, duration:   1665 ms , transferred pages:    13777 (dup:       10, rd:    13767) , new dirty pages:    29624 , remaining dirty pages:    29624
> > Iteration  22, duration:   1657 ms , transferred pages:    13290 (dup:        7, rd:    13283) , new dirty pages:    25949 , remaining dirty pages:    28265
> > Iteration  23, duration:   1599 ms , transferred pages:    13088 (dup:        0, rd:    13088) , new dirty pages:    22356 , remaining dirty pages:    24813
> > Iteration  24, duration:   1500 ms , transferred pages:    12280 (dup:       10, rd:    12270) , new dirty pages:    21181 , remaining dirty pages:    22608
> > Iteration  25, duration:   1400 ms , transferred pages:    11457 (dup:        5, rd:    11452) , new dirty pages:    18657 , remaining dirty pages:    20311
> > Iteration  26, duration:   1200 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    15690 , remaining dirty pages:    17294
> > Iteration  27, duration:   1201 ms , transferred pages:     9822 (dup:        6, rd:     9816) , new dirty pages:    14810 , remaining dirty pages:    15936
> > Iteration  28, duration:   1000 ms , transferred pages:     8183 (dup:        3, rd:     8180) , new dirty pages:    15387 , remaining dirty pages:    16423
> > Iteration  29, duration:    900 ms , transferred pages:     7372 (dup:       10, rd:     7362) , new dirty pages:    13303 , remaining dirty pages:    15292
> > Iteration  30, duration:   1000 ms , transferred pages:     8181 (dup:        1, rd:     8180) , new dirty pages:    17879 , remaining dirty pages:    18457
> > Iteration  31, duration:    951 ms , transferred pages:     8140 (dup:        9, rd:     8131) , new dirty pages:    21738 , remaining dirty pages:    23304
> > Iteration  32, duration:    946 ms , transferred pages:     6946 (dup:        1, rd:     6945) , new dirty pages:    15815 , remaining dirty pages:    15815
> > Iteration  33, duration:    747 ms , transferred pages:     6192 (dup:        0, rd:     6192) , new dirty pages:     6249 , remaining dirty pages:     7670
> > Iteration  34, duration:    501 ms , transferred pages:     4090 (dup:        0, rd:     4090) , new dirty pages:     6163 , remaining dirty pages:     8422
> > Iteration  35, duration:    600 ms , transferred pages:     4910 (dup:        2, rd:     4908) , new dirty pages:     3673 , remaining dirty pages:     5222
> > Iteration  36, duration:    300 ms , transferred pages:     2454 (dup:        0, rd:     2454) , new dirty pages:     2132 , remaining dirty pages:     4337
> > Iteration  37, duration:    200 ms , transferred pages:     1637 (dup:        1, rd:     1636) , new dirty pages:      544 , remaining dirty pages:     2251
> > Iteration  38, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0) , new dirty pages:        0 , remaining dirty pages:     2251
> > total time: 97919 milliseconds
> > 
> > ------------------ The other 11 workloads, without notable improvement (only the results of the original pre-copy are shown) -------------------
> > 
> > 5. idle
> > 
> > Iteration   1, duration:  14702 ms , transferred pages:   266450 (dup:   146393, rd:   120057, fd:        0) , new dirty pages:    14595 , remaining dirty pages:    14595
> > Iteration   2, duration:   1592 ms , transferred pages:    12412 (dup:      103, rd:     3280, fd:     9029) , new dirty pages:      218 , remaining dirty pages:     2401
> > Iteration   3, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2401
> > 
> > 6. kernel compilation (cannot converge)
> > 
> > Iteration   1, duration:  20607 ms , transferred pages:   266450 (dup:    97552, rd:   168898, fd:        0) , new dirty pages:    19293 , remaining dirty pages:    19293
> > Iteration   2, duration:   2092 ms , transferred pages:    17176 (dup:      597, rd:     8625, fd:     7954) , new dirty pages:     8318 , remaining dirty pages:    10435
> > Iteration   3, duration:   1000 ms , transferred pages:     8484 (dup:      304, rd:     6256, fd:     1924) , new dirty pages:     8736 , remaining dirty pages:    10687
> > Iteration   4, duration:   1000 ms , transferred pages:     8435 (dup:      255, rd:     7089, fd:     1091) , new dirty pages:     7627 , remaining dirty pages:     9879
> > Iteration   5, duration:    900 ms , transferred pages:     7553 (dup:      191, rd:     5602, fd:     1760) , new dirty pages:     7287 , remaining dirty pages:     9613
> > Iteration   6, duration:    900 ms , transferred pages:     7620 (dup:      258, rd:     5761, fd:     1601) , new dirty pages:     8958 , remaining dirty pages:    10951
> > Iteration   7, duration:   1099 ms , transferred pages:     9309 (dup:      311, rd:     8051, fd:      947) , new dirty pages:     7189 , remaining dirty pages:     8831
> > Iteration   8, duration:    800 ms , transferred pages:     6832 (dup:      288, rd:     5717, fd:      827) , new dirty pages:     5782 , remaining dirty pages:     7781
> > Iteration   9, duration:    701 ms , transferred pages:     5875 (dup:      149, rd:     4005, fd:     1721) , new dirty pages:     4587 , remaining dirty pages:     6493
> > Iteration  10, duration:    500 ms , transferred pages:     4234 (dup:      144, rd:     3057, fd:     1033) , new dirty pages:     7352 , remaining dirty pages:     9611
> > Iteration  11, duration:    900 ms , transferred pages:     7759 (dup:      397, rd:     6563, fd:      799) , new dirty pages:     6686 , remaining dirty pages:     8538
> > Iteration  12, duration:    800 ms , transferred pages:     6808 (dup:      264, rd:     6017, fd:      527) , new dirty pages:     6871 , remaining dirty pages:     8601
> > Iteration  13, duration:    800 ms , transferred pages:     6775 (dup:      231, rd:     5722, fd:      822) , new dirty pages:     7540 , remaining dirty pages:     9366
> > Iteration  14, duration:    900 ms , transferred pages:     7507 (dup:      145, rd:     5900, fd:     1462) , new dirty pages:     7581 , remaining dirty pages:     9440
> > Iteration  15, duration:    900 ms , transferred pages:     7630 (dup:      268, rd:     6211, fd:     1151) , new dirty pages:     7268 , remaining dirty pages:     9078
> > Iteration  16, duration:    800 ms , transferred pages:     6759 (dup:      215, rd:     5763, fd:      781) , new dirty pages:     6861 , remaining dirty pages:     9180
> > Iteration  17, duration:    800 ms , transferred pages:     6838 (dup:      294, rd:     6037, fd:      507) , new dirty pages:     6196 , remaining dirty pages:     8538
> > Iteration  18, duration:    800 ms , transferred pages:     6852 (dup:      308, rd:     4905, fd:     1639) , new dirty pages:     5947 , remaining dirty pages:     7633
> > Iteration  19, duration:    700 ms , transferred pages:     5919 (dup:      193, rd:     4853, fd:      873) , new dirty pages:     5861 , remaining dirty pages:     7575
> > Iteration  20, duration:    600 ms , transferred pages:     5284 (dup:      376, rd:     4408, fd:      500) , new dirty pages:     5206 , remaining dirty pages:     7497
> > Iteration  21, duration:    600 ms , transferred pages:     5147 (dup:      239, rd:     4308, fd:      600) , new dirty pages:     5031 , remaining dirty pages:     7381
> > Iteration  22, duration:    599 ms , transferred pages:     5064 (dup:      156, rd:     4026, fd:      882) , new dirty pages:     5601 , remaining dirty pages:     7918
> > Iteration  23, duration:    702 ms , transferred pages:     5965 (dup:      239, rd:     5028, fd:      698) , new dirty pages:     6079 , remaining dirty pages:     8032
> > Iteration  24, duration:    700 ms , transferred pages:     6175 (dup:      449, rd:     5146, fd:      580) , new dirty pages:    10932 , remaining dirty pages:    12789
> > Iteration  25, duration:   1300 ms , transferred pages:    10936 (dup:      302, rd:     6205, fd:     4429) , new dirty pages:     8713 , remaining dirty pages:    10566
> > Iteration  26, duration:   1000 ms , transferred pages:     8282 (dup:      102, rd:     5662, fd:     2518) , new dirty pages:     5119 , remaining dirty pages:     7403
> > Iteration  27, duration:    600 ms , transferred pages:     5007 (dup:       99, rd:     4099, fd:      809) , new dirty pages:     2226 , remaining dirty pages:     4622
> > Iteration  28, duration:    300 ms , transferred pages:     2491 (dup:       37, rd:     1794, fd:      660) , new dirty pages:     6746 , remaining dirty pages:     8877
> > Iteration  29, duration:    800 ms , transferred pages:     6757 (dup:      213, rd:     5532, fd:     1012) , new dirty pages:     6070 , remaining dirty pages:     8190
> > Iteration  30, duration:    700 ms , transferred pages:     6052 (dup:      326, rd:     5107, fd:      619) , new dirty pages:     5177 , remaining dirty pages:     7315
> > 
> > 7. web server
> > 
> > Iteration   1, duration:  20902 ms , transferred pages:   266450 (dup:    95497, rd:   170953, fd:        0) , new dirty pages:     8528 , remaining dirty pages:     8528
> > Iteration   2, duration:    796 ms , transferred pages:     6472 (dup:      131, rd:     1885, fd:     4456) , new dirty pages:      650 , remaining dirty pages:     2706
> > Iteration   3, duration:    100 ms , transferred pages:      818 (dup:        0, rd:      383, fd:      435) , new dirty pages:      328 , remaining dirty pages:     2216
> > Iteration   4, duration:      0 ms , transferred pages:        0 (dup:        0, rd:        0, fd:        0) , new dirty pages:        0 , remaining dirty pages:     2216
> > 
> > 8. cpu2006.bwaves (cannot converge)
> > 
> > Iteration   1, duration:  31715 ms , transferred pages:   266450 (dup:     6766, rd:   259684, fd:        0) , new dirty pages:   242702 , remaining dirty pages:   242702
> > Iteration   2, duration:  29397 ms , transferred pages:   240508 (dup:      405, rd:   225588, fd:    14515) , new dirty pages:   230889 , remaining dirty pages:   233083
> > Iteration   3, duration:  28205 ms , transferred pages:   230858 (dup:      182, rd:   214596, fd:    16080) , new dirty pages:   226998 , remaining dirty pages:   229223
> > Iteration   4, duration:  27805 ms , transferred pages:   227574 (dup:      170, rd:   217045, fd:    10359) , new dirty pages:   227360 , remaining dirty pages:   229009
> > Iteration   5, duration:  27703 ms , transferred pages:   226786 (dup:      200, rd:   212130, fd:    14456) , new dirty pages:   225885 , remaining dirty pages:   228108
> > Iteration   6, duration:  27600 ms , transferred pages:   225923 (dup:      155, rd:   215503, fd:    10265) , new dirty pages:   223555 , remaining dirty pages:   225740
> > Iteration   7, duration:  27309 ms , transferred pages:   223574 (dup:      260, rd:   215641, fd:     7673) , new dirty pages:   231975 , remaining dirty pages:   234141
> > Iteration   8, duration:  28403 ms , transferred pages:   232397 (dup:       85, rd:   214086, fd:    18226) , new dirty pages:   222170 , remaining dirty pages:   223914
> > Iteration   9, duration:  27105 ms , transferred pages:   221809 (dup:      131, rd:   214988, fd:     6690) , new dirty pages:   230065 , remaining dirty pages:   232170
> > Iteration  10, duration:  28104 ms , transferred pages:   230201 (dup:      343, rd:   213531, fd:    16327) , new dirty pages:   227590 , remaining dirty pages:   229559
> > Iteration  11, duration:  27801 ms , transferred pages:   227717 (dup:      313, rd:   221408, fd:     5996) , new dirty pages:   228457 , remaining dirty pages:   230299
> > Iteration  12, duration:  27916 ms , transferred pages:   228560 (dup:      338, rd:   219660, fd:     8562) , new dirty pages:   238326 , remaining dirty pages:   240065
> > 
> > 9. cpu2006.lbm (cannot converge)
> > 
> > Iteration   1, duration:  31012 ms , transferred pages:   266450 (dup:    12253, rd:   254197, fd:        0) , new dirty pages:   108960 , remaining dirty pages:   108960
> > Iteration   2, duration:  13095 ms , transferred pages:   106522 (dup:        3, rd:   102045, fd:     4474) , new dirty pages:   129292 , remaining dirty pages:   131730
> > Iteration   3, duration:  15802 ms , transferred pages:   129688 (dup:      444, rd:   110860, fd:    18384) , new dirty pages:   116682 , remaining dirty pages:   118724
> > Iteration   4, duration:  14204 ms , transferred pages:   116316 (dup:      160, rd:   104951, fd:    11205) , new dirty pages:   107246 , remaining dirty pages:   109654
> > Iteration   5, duration:  13208 ms , transferred pages:   107977 (dup:        1, rd:   101834, fd:     6142) , new dirty pages:   105371 , remaining dirty pages:   107048
> > Iteration   6, duration:  12804 ms , transferred pages:   104705 (dup:        1, rd:    99629, fd:     5075) , new dirty pages:   103841 , remaining dirty pages:   106184
> > Iteration   7, duration:  12709 ms , transferred pages:   103891 (dup:        5, rd:    99212, fd:     4674) , new dirty pages:   106692 , remaining dirty pages:   108985
> > Iteration   8, duration:  13105 ms , transferred pages:   107169 (dup:       11, rd:   100125, fd:     7033) , new dirty pages:   103132 , remaining dirty pages:   104948
> > Iteration   9, duration:  12607 ms , transferred pages:   103068 (dup:        0, rd:    99460, fd:     3608) , new dirty pages:   102511 , remaining dirty pages:   104391
> > Iteration  10, duration:  12514 ms , transferred pages:   102250 (dup:        0, rd:    99094, fd:     3156) , new dirty pages:   102888 , remaining dirty pages:   105029
> > 
> > 10. cpu2006.astar (cannot converge)
> > 
> > Iteration   1, duration:  28402 ms , transferred pages:   266450 (dup:    33770, rd:   232680, fd:        0) , new dirty pages:    62078 , remaining dirty pages:    62078
> > Iteration   2, duration:   7393 ms , transferred pages:    60107 (dup:       10, rd:    51722, fd:     8375) , new dirty pages:    48854 , remaining dirty pages:    50825
> > Iteration   3, duration:   6001 ms , transferred pages:    49094 (dup:       14, rd:    46540, fd:     2540) , new dirty pages:    48137 , remaining dirty pages:    49868
> > Iteration   4, duration:   5800 ms , transferred pages:    47444 (dup:        0, rd:    45389, fd:     2055) , new dirty pages:    49147 , remaining dirty pages:    51571
> > Iteration   5, duration:   6102 ms , transferred pages:    49912 (dup:       14, rd:    46216, fd:     3682) , new dirty pages:    55606 , remaining dirty pages:    57265
> > Iteration   6, duration:   6699 ms , transferred pages:    54949 (dup:      143, rd:    20745, fd:    34061) , new dirty pages:     9166 , remaining dirty pages:    11482
> > Iteration   7, duration:   1200 ms , transferred pages:     9830 (dup:       14, rd:     7011, fd:     2805) , new dirty pages:     8294 , remaining dirty pages:     9946
> > Iteration   8, duration:   1000 ms , transferred pages:     8194 (dup:       14, rd:     7178, fd:     1002) , new dirty pages:     5475 , remaining dirty pages:     7227
> > Iteration   9, duration:    600 ms , transferred pages:     4908 (dup:        0, rd:     3470, fd:     1438) , new dirty pages:     4175 , remaining dirty pages:     6494
> > Iteration  10, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3856, fd:      234) , new dirty pages:     4095 , remaining dirty pages:     6499
> > Iteration  11, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     3313, fd:      777) , new dirty pages:     3371 , remaining dirty pages:     5780
> > Iteration  12, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3823, fd:      267) , new dirty pages:     7518 , remaining dirty pages:     9208
> > Iteration  13, duration:    899 ms , transferred pages:     7376 (dup:       14, rd:     6028, fd:     1334) , new dirty pages:     3931 , remaining dirty pages:     5763
> > Iteration  14, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     4346 , remaining dirty pages:     6019
> > Iteration  15, duration:    502 ms , transferred pages:     4090 (dup:        0, rd:     3817, fd:      273) , new dirty pages:     3054 , remaining dirty pages:     4983
> > Iteration  16, duration:    400 ms , transferred pages:     3272 (dup:        0, rd:     3138, fd:      134) , new dirty pages:     3874 , remaining dirty pages:     5585
> > Iteration  17, duration:    399 ms , transferred pages:     3272 (dup:        0, rd:     3248, fd:       24) , new dirty pages:     5285 , remaining dirty pages:     7598
> > Iteration  18, duration:    701 ms , transferred pages:     5726 (dup:        0, rd:     4385, fd:     1341) , new dirty pages:     8903 , remaining dirty pages:    10775
> > Iteration  19, duration:   1101 ms , transferred pages:     9010 (dup:       12, rd:     5597, fd:     3401) , new dirty pages:     4199 , remaining dirty pages:     5964
> > Iteration  20, duration:    500 ms , transferred pages:     4090 (dup:        0, rd:     4078, fd:       12) , new dirty pages:     3829 , remaining dirty pages:     5703
> > 
> > 11. cpu2006.xalancbmk (can not converge)
> > 
> > Iteration   1, duration:  30407 ms , transferred pages:   266450 (dup:    17700, rd:   248750, fd:        0) , new dirty pages:    96169 , remaining dirty pages:    96169
> > Iteration   2, duration:  11495 ms , transferred pages:    94164 (dup:      205, rd:    67068, fd:    26891) , new dirty pages:    61766 , remaining dirty pages:    63771
> > Iteration   3, duration:   7501 ms , transferred pages:    61471 (dup:      121, rd:    53587, fd:     7763) , new dirty pages:    56569 , remaining dirty pages:    58869
> > Iteration   4, duration:   6902 ms , transferred pages:    56461 (dup:       19, rd:    50553, fd:     5889) , new dirty pages:    52181 , remaining dirty pages:    54589
> > Iteration   5, duration:   6402 ms , transferred pages:    52459 (dup:      107, rd:    46986, fd:     5366) , new dirty pages:    54051 , remaining dirty pages:    56181
> > Iteration   6, duration:   6601 ms , transferred pages:    54003 (dup:       15, rd:    47566, fd:     6422) , new dirty pages:    50844 , remaining dirty pages:    53022
> > Iteration   7, duration:   6202 ms , transferred pages:    50723 (dup:        7, rd:    47143, fd:     3573) , new dirty pages:    64880 , remaining dirty pages:    67179
> > Iteration   8, duration:   8001 ms , transferred pages:    65447 (dup:        7, rd:    61159, fd:     4281) , new dirty pages:    67854 , remaining dirty pages:    69586
> > Iteration   9, duration:   8202 ms , transferred pages:    67444 (dup:      368, rd:    56357, fd:    10719) , new dirty pages:    65178 , remaining dirty pages:    67320
> > Iteration  10, duration:   8000 ms , transferred pages:    65455 (dup:       15, rd:    60581, fd:     4859) , new dirty pages:    52421 , remaining dirty pages:    54286
> > 
> > 12. cpu2006.milc (can not converge)
> > 
> > Iteration   1, duration:  31410 ms , transferred pages:   266450 (dup:     9454, rd:   256996, fd:        0) , new dirty pages:   158860 , remaining dirty pages:   158860
> > Iteration   2, duration:  19193 ms , transferred pages:   157048 (dup:      150, rd:    96807, fd:    60091) , new dirty pages:   102238 , remaining dirty pages:   104050
> > Iteration   3, duration:  12504 ms , transferred pages:   102271 (dup:       21, rd:    95107, fd:     7143) , new dirty pages:    97944 , remaining dirty pages:    99723
> > Iteration   4, duration:  11905 ms , transferred pages:    97360 (dup:       18, rd:    93610, fd:     3732) , new dirty pages:    99150 , remaining dirty pages:   101513
> > Iteration   5, duration:  12105 ms , transferred pages:    99094 (dup:      116, rd:    94125, fd:     4853) , new dirty pages:    98589 , remaining dirty pages:   101008
> > Iteration   6, duration:  12101 ms , transferred pages:    98995 (dup:       17, rd:    94069, fd:     4909) , new dirty pages:   147403 , remaining dirty pages:   149416
> > Iteration   7, duration:  18001 ms , transferred pages:   147284 (dup:       44, rd:   135691, fd:    11549) , new dirty pages:   136445 , remaining dirty pages:   138577
> > Iteration   8, duration:  16702 ms , transferred pages:   136636 (dup:       30, rd:   130805, fd:     5801) , new dirty pages:   145481 , remaining dirty pages:   147422
> > Iteration   9, duration:  17800 ms , transferred pages:   145734 (dup:      130, rd:   133239, fd:    12365) , new dirty pages:    98032 , remaining dirty pages:    99720
> > Iteration  10, duration:  11902 ms , transferred pages:    97364 (dup:       22, rd:    93096, fd:     4246) , new dirty pages:    95391 , remaining dirty pages:    97747
> > 
> > 13. cpu2006.cactusADM (can not converge)
> > 
> > Iteration   1, duration:  23508 ms , transferred pages:   266450 (dup:    73568, rd:   192882, fd:        0) , new dirty pages:   123869 , remaining dirty pages:   123869
> > Iteration   2, duration:  13989 ms , transferred pages:   121594 (dup:     7874, rd:    81653, fd:    32067) , new dirty pages:   112960 , remaining dirty pages:   115235
> > Iteration   3, duration:  13605 ms , transferred pages:   113276 (dup:     2028, rd:    83783, fd:    27465) , new dirty pages:   112314 , remaining dirty pages:   114273
> > Iteration   4, duration:  13509 ms , transferred pages:   111935 (dup:     1505, rd:    83535, fd:    26895) , new dirty pages:   114078 , remaining dirty pages:   116416
> > Iteration   5, duration:  13810 ms , transferred pages:   114262 (dup:     1378, rd:    84039, fd:    28845) , new dirty pages:   112271 , remaining dirty pages:   114425
> > Iteration   6, duration:  13604 ms , transferred pages:   112664 (dup:     1416, rd:    84300, fd:    26948) , new dirty pages:   112903 , remaining dirty pages:   114664
> > Iteration   7, duration:  13604 ms , transferred pages:   112655 (dup:     1407, rd:    84027, fd:    27221) , new dirty pages:   110943 , remaining dirty pages:   112952
> > Iteration   8, duration:  13406 ms , transferred pages:   110720 (dup:     1108, rd:    84075, fd:    25537) , new dirty pages:   109321 , remaining dirty pages:   111553
> > Iteration   9, duration:  13306 ms , transferred pages:   109726 (dup:      932, rd:    83652, fd:    25142) , new dirty pages:   113446 , remaining dirty pages:   115273
> > Iteration  10, duration:  13705 ms , transferred pages:   113121 (dup:     1055, rd:    84671, fd:    27395) , new dirty pages:   108776 , remaining dirty pages:   110928
> > 
> > 14. cpu2006.GemsFDTD (can not converge)
> > 
> > Iteration   1, duration:  13303 ms , transferred pages:   266450 (dup:   157809, rd:   108641, fd:        0) , new dirty pages:   226802 , remaining dirty pages:   226802
> > Iteration   2, duration:  10797 ms , transferred pages:   226507 (dup:   138637, rd:    61818, fd:    26052) , new dirty pages:   200769 , remaining dirty pages:   201064
> > Iteration   3, duration:   8900 ms , transferred pages:   199717 (dup:   127187, rd:    69340, fd:     3190) , new dirty pages:   203436 , remaining dirty pages:   204783
> > Iteration   4, duration:  10904 ms , transferred pages:   204127 (dup:   115211, rd:    85767, fd:     3149) , new dirty pages:   198407 , remaining dirty pages:   199063
> > Iteration   5, duration:  12109 ms , transferred pages:   198206 (dup:    99435, rd:    96956, fd:     1815) , new dirty pages:   213719 , remaining dirty pages:   214576
> > Iteration   6, duration:  16307 ms , transferred pages:   213595 (dup:    80422, rd:   116885, fd:    16288) , new dirty pages:   199637 , remaining dirty pages:   200618
> > Iteration   7, duration:  16915 ms , transferred pages:   198289 (dup:    60169, rd:   134208, fd:     3912) , new dirty pages:   199343 , remaining dirty pages:   201672
> > Iteration   8, duration:  19518 ms , transferred pages:   200452 (dup:    41014, rd:   156083, fd:     3355) , new dirty pages:   222927 , remaining dirty pages:   224147
> > 
> > 15. cpu2006.wrf (can not converge)
> > 
> > Iteration   1, duration:  18499 ms , transferred pages:   266380 (dup:   115285, rd:   151095, fd:        0) , new dirty pages:   112322 , remaining dirty pages:   112392
> > Iteration   2, duration:   9802 ms , transferred pages:   110025 (dup:    29917, rd:    65782, fd:    14326) , new dirty pages:    88855 , remaining dirty pages:    91222
> > Iteration   3, duration:   8199 ms , transferred pages:    89761 (dup:    22728, rd:    57262, fd:     9771) , new dirty pages:    58431 , remaining dirty pages:    59892
> > Iteration   4, duration:   5603 ms , transferred pages:    58502 (dup:    12716, rd:    41809, fd:     3977) , new dirty pages:    80556 , remaining dirty pages:    81946
> > Iteration   5, duration:   7101 ms , transferred pages:    79778 (dup:    21738, rd:    50896, fd:     7144) , new dirty pages:    62592 , remaining dirty pages:    64760
> > Iteration   6, duration:   5702 ms , transferred pages:    63388 (dup:    16793, rd:    42726, fd:     3869) , new dirty pages:    80747 , remaining dirty pages:    82119
> > Iteration   7, duration:   7000 ms , transferred pages:    80868 (dup:    23652, rd:    52194, fd:     5022) , new dirty pages:    84593 , remaining dirty pages:    85844
> > Iteration   8, duration:   7099 ms , transferred pages:    83799 (dup:    25769, rd:    51772, fd:     6258) , new dirty pages:    67951 , remaining dirty pages:    69996
> > Iteration   9, duration:   6303 ms , transferred pages:    68478 (dup:    16979, rd:    36490, fd:    15009) , new dirty pages:    81181 , remaining dirty pages:    82699
> > Iteration  10, duration:   7000 ms , transferred pages:    80724 (dup:    23503, rd:    52826, fd:     4395) , new dirty pages:    47930 , remaining dirty pages:    49905
> > 
> > 
> > 
> > > 
> > > > So I think "booting" and "kernel compilation" should benefit a lot from this
> > > > improvement. "Kernel compilation" would benefit because some
> > > > iterations take around 600 ms; if they are halved to 300 ms, the precopy
> > > > may have the chance to step into the stop-and-copy phase.
> > > > 
> > > > On the other hand, "idle" and "web server" would not benefit much, because
> > > > most of the time is spent on the 1st iteration and little on the others.
> > > > 
> > > > As for "zeusmp" and "memcached", although the time spent on the iterations
> > > > after the 1st one may be halved, they still could not converge to stop-and-copy
> > > > with the 300 ms downtime.
> > > > 
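The stop-and-copy argument above can be checked with a little arithmetic. The following is a minimal sketch, not QEMU's actual convergence code: it assumes 4 KiB pages and reads the "32MB/s" default as 32 MiB/s (which agrees with the logs in this message, e.g. 4908 normal pages sent in 600 ms). The function names are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumption: 4 KiB guest pages


def pages_per_ms(bandwidth_mib_s: float) -> float:
    """Pages that can be sent per millisecond at the given bandwidth."""
    return bandwidth_mib_s * 1024 * 1024 / PAGE_SIZE / 1000


def can_stop_and_copy(remaining_pages: int,
                      bandwidth_mib_s: float,
                      max_downtime_ms: float) -> bool:
    """True if the remaining dirty pages fit into the downtime budget."""
    expected_ms = remaining_pages / pages_per_ms(bandwidth_mib_s)
    return expected_ms <= max_downtime_ms


# Page budget for a 300 ms downtime at 32 MiB/s.
budget = int(pages_per_ms(32) * 300)
print("page budget:", budget)

# "booting" ended with 2430 remaining dirty pages: fits in the budget.
print("booting:", can_stop_and_copy(2430, 32, 300))

# "kernel compilation" never drops below 7200 remaining pages in the
# iterations logged in this message: does not fit.
print("kernel compilation:", can_stop_and_copy(7200, 32, 300))
```

Under these assumptions the three runs that converge (2430, 1849 and 2167 remaining pages) all sit just under the budget, while "kernel compilation" stays several times above it.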
> > > > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> > > > 
> > > > 1. booting : begin to migrate when the VM is booting
> > > > 
> > > > Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
> > > > Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
> > > > Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
> > > > Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
> > > > Iteration   5, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2430
> > > > (Note: when the workload does converge, the output of the last iteration is "fake". It only indicates that the precopy has now stepped into the stop-and-copy phase.
> > > >        "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> > > > 
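The per-iteration fields in these logs appear to be related by simple bookkeeping: the transferred count is n + d, and each iteration's remaining count equals the previous remaining count minus the pages transferred plus the newly dirtied pages. This is an observation from the numbers, not something stated in the thread. A minimal sketch that checks it for the "booting" run above (the parsing approach, pulling the integers out of each line in order, is an assumption about the log format):

```python
import re

# First four iterations of the "booting" run, copied from the log above.
LOG = """\
Iteration   1, duration:   6997 ms , transferred pages:   266450 (n:    57269, d:   209181 ) , new dirty pages:    56414 , remaining dirty pages:    56414
Iteration   2, duration:   6497 ms , transferred pages:    54008 (n:    52701, d:     1307 ) , new dirty pages:    48053 , remaining dirty pages:    50459
Iteration   3, duration:   5800 ms , transferred pages:    48232 (n:    47444, d:      788 ) , new dirty pages:     9129 , remaining dirty pages:    11356
Iteration   4, duration:   1100 ms , transferred pages:     9091 (n:     8998, d:       93 ) , new dirty pages:      165 , remaining dirty pages:     2430
"""

# Field names are hypothetical labels for the seven integers on each line.
FIELDS = ("iteration", "duration_ms", "transferred",
          "normal", "dup", "new_dirty", "remaining")


def parse_line(line: str) -> dict:
    """Map the integers found on one log line onto FIELDS, in order."""
    numbers = [int(x) for x in re.findall(r"\d+", line)]
    return dict(zip(FIELDS, numbers))


rows = [parse_line(line) for line in LOG.splitlines()]

# transferred == normal + duplicate on every line
split_ok = all(r["transferred"] == r["normal"] + r["dup"] for r in rows)

# remaining_k == remaining_(k-1) - transferred_k + new_dirty_k
carry_ok = all(
    cur["remaining"] == prev["remaining"] - cur["transferred"] + cur["new_dirty"]
    for prev, cur in zip(rows, rows[1:])
)

print("n + d matches transferred:", split_ok)
print("remaining count carries over:", carry_ok)
```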
> > > > 2. idle
> > > > 
> > > > Iteration   1, duration:  14496 ms , transferred pages:   266450 (n:   118980, d:   147470 ) , new dirty pages:    17398 , remaining dirty pages:    17398
> > > > Iteration   2, duration:   1896 ms , transferred pages:    14953 (n:    14854, d:       99 ) , new dirty pages:     1849 , remaining dirty pages:     4294
> > > > Iteration   3, duration:    300 ms , transferred pages:     2454 (n:     2454, d:        0 ) , new dirty pages:        9 , remaining dirty pages:     1849
> > > > Iteration   4, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     1849
> > > > 
> > > > 3. kernel compilation (can not converge)
> > > > 
> > > > Iteration   1, duration:  20700 ms , transferred pages:   266450 (n:   169778, d:    96672 ) , new dirty pages:    40067 , remaining dirty pages:    40067
> > > > Iteration   2, duration:   4696 ms , transferred pages:    38401 (n:    37787, d:      614 ) , new dirty pages:     8852 , remaining dirty pages:    10518
> > > > Iteration   3, duration:   1000 ms , transferred pages:     8642 (n:     8180, d:      462 ) , new dirty pages:     6331 , remaining dirty pages:     8207
> > > > Iteration   4, duration:    700 ms , transferred pages:     6110 (n:     5726, d:      384 ) , new dirty pages:     5242 , remaining dirty pages:     7339
> > > > Iteration   5, duration:    600 ms , transferred pages:     5007 (n:     4908, d:       99 ) , new dirty pages:     4868 , remaining dirty pages:     7200
> > > > Iteration   6, duration:    600 ms , transferred pages:     5226 (n:     4908, d:      318 ) , new dirty pages:     6142 , remaining dirty pages:     8116
> > > > Iteration   7, duration:    700 ms , transferred pages:     5985 (n:     5726, d:      259 ) , new dirty pages:     5902 , remaining dirty pages:     8033
> > > > Iteration   8, duration:    701 ms , transferred pages:     5893 (n:     5726, d:      167 ) , new dirty pages:     7502 , remaining dirty pages:     9642
> > > > Iteration   9, duration:    900 ms , transferred pages:     7623 (n:     7362, d:      261 ) , new dirty pages:     6408 , remaining dirty pages:     8427
> > > > Iteration  10, duration:    700 ms , transferred pages:     6008 (n:     5726, d:      282 ) , new dirty pages:     8312 , remaining dirty pages:    10731
> > > > Iteration  11, duration:   1000 ms , transferred pages:     8353 (n:     8180, d:      173 ) , new dirty pages:     6874 , remaining dirty pages:     9252
> > > > Iteration  12, duration:    899 ms , transferred pages:     7477 (n:     7362, d:      115 ) , new dirty pages:     5573 , remaining dirty pages:     7348
> > > > Iteration  13, duration:    601 ms , transferred pages:     5099 (n:     4908, d:      191 ) , new dirty pages:     7671 , remaining dirty pages:     9920
> > > > Iteration  14, duration:    900 ms , transferred pages:     7586 (n:     7362, d:      224 ) , new dirty pages:     7359 , remaining dirty pages:     9693
> > > > Iteration  15, duration:    900 ms , transferred pages:     7682 (n:     7362, d:      320 ) , new dirty pages:     7371 , remaining dirty pages:     9382
> > > > 
> > > > 4. cpu2006.zeusmp (can not converge)
> > > > 
> > > > Iteration   1, duration:  21603 ms , transferred pages:   266450 (n:   176660, d:    89790 ) , new dirty pages:   145625 , remaining dirty pages:   145625
> > > > Iteration   2, duration:   8696 ms , transferred pages:   144389 (n:    70862, d:    73527 ) , new dirty pages:   125124 , remaining dirty pages:   126360
> > > > Iteration   3, duration:   6301 ms , transferred pages:   124057 (n:    51379, d:    72678 ) , new dirty pages:   122528 , remaining dirty pages:   124831
> > > > Iteration   4, duration:   6400 ms , transferred pages:   124330 (n:    52196, d:    72134 ) , new dirty pages:   124267 , remaining dirty pages:   124768
> > > > Iteration   5, duration:   6703 ms , transferred pages:   124034 (n:    54656, d:    69378 ) , new dirty pages:   124151 , remaining dirty pages:   124885
> > > > Iteration   6, duration:   6703 ms , transferred pages:   124357 (n:    54658, d:    69699 ) , new dirty pages:   124106 , remaining dirty pages:   124634
> > > > Iteration   7, duration:   6602 ms , transferred pages:   124568 (n:    53838, d:    70730 ) , new dirty pages:   133828 , remaining dirty pages:   133894
> > > > Iteration   8, duration:   7600 ms , transferred pages:   133030 (n:    62021, d:    71009 ) , new dirty pages:   126612 , remaining dirty pages:   127476
> > > > Iteration   9, duration:   7299 ms , transferred pages:   126511 (n:    59569, d:    66942 ) , new dirty pages:   122727 , remaining dirty pages:   123692
> > > > Iteration  10, duration:   6609 ms , transferred pages:   123692 (n:    54539, d:    69153 ) , new dirty pages:   122727 , remaining dirty pages:   122727
> > > > Iteration  11, duration:   6995 ms , transferred pages:   120347 (n:    56423, d:    63924 ) , new dirty pages:   121430 , remaining dirty pages:   123810
> > > > Iteration  12, duration:   6703 ms , transferred pages:   123040 (n:    54657, d:    68383 ) , new dirty pages:   122043 , remaining dirty pages:   122813
> > > > Iteration  13, duration:   7006 ms , transferred pages:   122353 (n:    57121, d:    65232 ) , new dirty pages:   133869 , remaining dirty pages:   134329
> > > > Iteration  14, duration:   8209 ms , transferred pages:   132325 (n:    66932, d:    65393 ) , new dirty pages:   126914 , remaining dirty pages:   128918
> > > > Iteration  15, duration:   7802 ms , transferred pages:   126931 (n:    63671, d:    63260 ) , new dirty pages:   122351 , remaining dirty pages:   124338
> > > > 
> > > > 5. web server : an Apache web server. The client is configured with 50 concurrent connections.
> > > > 
> > > > Iteration   1, duration:  30697 ms , transferred pages:   266450 (n:   251215, d:    15235 ) , new dirty pages:    30628 , remaining dirty pages:    30628
> > > > Iteration   2, duration:   3496 ms , transferred pages:    28859 (n:    28513, d:      346 ) , new dirty pages:     5805 , remaining dirty pages:     7574
> > > > Iteration   3, duration:    701 ms , transferred pages:     5746 (n:     5726, d:       20 ) , new dirty pages:     3433 , remaining dirty pages:     5261
> > > > Iteration   4, duration:    400 ms , transferred pages:     3281 (n:     3272, d:        9 ) , new dirty pages:     1539 , remaining dirty pages:     3519
> > > > Iteration   5, duration:    199 ms , transferred pages:     1653 (n:     1636, d:       17 ) , new dirty pages:      301 , remaining dirty pages:     2167
> > > > Iteration   6, duration:      1 ms , transferred pages:        0 (n:        0, d:        0 ) , new dirty pages:        0 , remaining dirty pages:     2167
> > > > 
> > > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 Mbps):------------------
> > > > 
> > > > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5  (can not converge)
> > > > 
> > > > Iteration   1, duration:  42486 ms , transferred pages:  1568087 (n:  1216079, d:   352008 ) , new dirty pages:   571940 , remaining dirty pages:   581023
> > > > Iteration   2, duration:  19774 ms , transferred pages:   571700 (n:   567416, d:     4284 ) , new dirty pages:   331690 , remaining dirty pages:   341013
> > > > Iteration   3, duration:  11589 ms , transferred pages:   332187 (n:   332095, d:       92 ) , new dirty pages:   222725 , remaining dirty pages:   231551
> > > > Iteration   4, duration:   7790 ms , transferred pages:   223571 (n:   223499, d:       72 ) , new dirty pages:   157658 , remaining dirty pages:   165638
> > > > Iteration   5, duration:   5518 ms , transferred pages:   158056 (n:   157998, d:       58 ) , new dirty pages:   128130 , remaining dirty pages:   135712
> > > > Iteration   6, duration:   4442 ms , transferred pages:   127764 (n:   127701, d:       63 ) , new dirty pages:   104839 , remaining dirty pages:   112787
> > > > Iteration   7, duration:   3649 ms , transferred pages:   104581 (n:   104523, d:       58 ) , new dirty pages:   100736 , remaining dirty pages:   108942
> > > > Iteration   8, duration:   3532 ms , transferred pages:   101379 (n:   101315, d:       64 ) , new dirty pages:    87869 , remaining dirty pages:    95432
> > > > Iteration   9, duration:   3030 ms , transferred pages:    86841 (n:    86786, d:       55 ) , new dirty pages:    77505 , remaining dirty pages:    86096
> > > > Iteration  10, duration:   2709 ms , transferred pages:    77875 (n:    77814, d:       61 ) , new dirty pages:    77197 , remaining dirty pages:    85418
> > > > Iteration  11, duration:   2696 ms , transferred pages:    77107 (n:    77044, d:       63 ) , new dirty pages:    65010 , remaining dirty pages:    73321
> > > > Iteration  12, duration:   2308 ms , transferred pages:    66540 (n:    66484, d:       56 ) , new dirty pages:    64388 , remaining dirty pages:    71169
> > > > Iteration  13, duration:   2198 ms , transferred pages:    62953 (n:    62897, d:       56 ) , new dirty pages:    62773 , remaining dirty pages:    70989
> > > > Iteration  14, duration:   2214 ms , transferred pages:    63466 (n:    63411, d:       55 ) , new dirty pages:    67538 , remaining dirty pages:    75061
> > > > Iteration  15, duration:   2329 ms , transferred pages:    66924 (n:    66875, d:       49 ) , new dirty pages:    63580 , remaining dirty pages:    71717
> > > > Iteration  16, duration:   2252 ms , transferred pages:    64554 (n:    64539, d:       15 ) , new dirty pages:    63094 , remaining dirty pages:    70257
> > > > Iteration  17, duration:   2188 ms , transferred pages:    62697 (n:    62641, d:       56 ) , new dirty pages:    63016 , remaining dirty pages:    70576
> > > > Iteration  18, duration:   2171 ms , transferred pages:    62377 (n:    62322, d:       55 ) , new dirty pages:    56764 , remaining dirty pages:    64963
> > > > Iteration  19, duration:   2003 ms , transferred pages:    57382 (n:    57324, d:       58 ) , new dirty pages:    65307 , remaining dirty pages:    72888
> > > > Iteration  20, duration:   2240 ms , transferred pages:    64426 (n:    64364, d:       62 ) , new dirty pages:    61585 , remaining dirty pages:    70047
> > > > 
> > > > 
> > > > --
> > > > Chunguang Li, Ph.D. Candidate
> > > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > > Huazhong University of Science & Technology (HUST)
> > > > Wuhan, Hubei Prov., China
> > > > 
> > > > 
> > > > 
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> > 
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
> > 
> > 
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China

end of thread, other threads:[~2016-11-08 13:32 UTC | newest]

Thread overview: 21+ messages
2016-09-25  8:22 [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent Chunguang Li
2016-09-26 11:23 ` Dr. David Alan Gilbert
2016-09-26 14:55   ` Chunguang Li
2016-09-26 18:52     ` Dr. David Alan Gilbert
2016-09-27 12:28       ` Chunguang Li
2016-09-30  5:46     ` Amit Shah
2016-09-30  8:18       ` Chunguang Li
2016-10-08  7:55       ` Chunguang Li
2016-10-14 11:15         ` Dr. David Alan Gilbert
2016-11-03  8:25           ` Chunguang Li
2016-11-03  9:59             ` Li, Liang Z
2016-11-03 10:13             ` Li, Liang Z
2016-11-04  3:07               ` Chunguang Li
2016-11-04  4:50                 ` Li, Liang Z
2016-11-04  7:03                   ` Chunguang Li
2016-11-07 13:52                   ` Chunguang Li
2016-11-07 14:17                     ` Li, Liang Z
2016-11-08  5:27                       ` Chunguang Li
2016-11-07 14:44                     ` Li, Liang Z
2016-11-08 11:05             ` Dr. David Alan Gilbert
2016-11-08 13:40               ` Chunguang Li