* [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
@ 2016-09-25 8:22 Chunguang Li
2016-09-26 11:23 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-09-25 8:22 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, amit.shah, pbonzini, stefanha
Hi all!
I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
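The timing issue can be made concrete with a toy model of one precopy pass (my own sketch in Python, not QEMU code; the page count and write schedule are made up): the kernel-side log records every write since the last KVM_GET_DIRTY_LOG sync, even when the written page has not been sent yet in this pass.

```python
# One precopy pass over NUM_PAGES pages, sent in order, one per time unit.
# The kernel dirty log records every write since the last sync, even when
# the written page had not yet been sent (so its fresh content goes out
# anyway in this very pass).

NUM_PAGES = 8

def run_iteration(writes):
    """writes: list of (time, page) guest writes during the pass."""
    kernel_dirty = set()        # what the next KVM_GET_DIRTY_LOG reports
    stale_on_dest = set()       # pages whose already-sent copy is outdated
    for t in range(NUM_PAGES):  # page t is sent at time t
        for when, page in writes:
            if when == t:
                kernel_dirty.add(page)        # kernel always logs the write
                if page < t:                  # only writes to already-sent
                    stale_on_dest.add(page)   # pages actually need a resend
        # page t is transmitted here with its current content
    return kernel_dirty, stale_on_dest

# Page 5 is written at time 2, i.e. before it is sent at time 5:
dirty, stale = run_iteration([(2, 5)])
print(sorted(dirty))   # [5] -> flagged for resending next iteration
print(sorted(stale))   # []  -> yet its transmitted copy is already current
```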
Is my understanding correct? If so, is there any advice on how to improve this?
Thanks,
Chunguang Li
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-25 8:22 [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent Chunguang Li
@ 2016-09-26 11:23 ` Dr. David Alan Gilbert
2016-09-26 14:55 ` Chunguang Li
0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-09-26 11:23 UTC (permalink / raw)
To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela
* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> Hi all!
> I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
>
> Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
>
>
> Is my understanding correct? If so, is there any advice on how to improve this?
I think you're right that this can happen; to clarify I think the
case you're talking about is:
Iteration 1
sync bitmap
start sending pages
page 'n' is modified - but hasn't been sent yet
page 'n' gets sent
Iteration 2
sync bitmap
'page n is shown as modified'
send page 'n' again
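The sequence above can be sketched as a tiny two-iteration model (a hypothetical Python trace with made-up page numbers and contents, not QEMU's actual code):

```python
# Toy two-iteration trace of the case above: page n is written before it
# is sent in Iteration 1, so Iteration 2 re-sends identical content.

n = 3                                        # the page in question
memory = {p: f"v0-{p}" for p in range(6)}    # guest RAM, 6 pages
received = {}                                # destination's view of each page
kernel_dirty = set()                         # writes since the last sync

def sync_bitmap():
    """Model of KVM_GET_DIRTY_LOG: return and clear the kernel log."""
    global kernel_dirty
    log, kernel_dirty = kernel_dirty, set()
    return log

# Iteration 1: sync bitmap, then send every page.
sync_bitmap()
memory[n] = "v1"                   # page n is modified BEFORE being sent
kernel_dirty.add(n)
for p in sorted(memory):
    received[p] = memory[p]        # page n goes out already holding "v1"

# Iteration 2: sync bitmap; page n shows as modified and is re-sent.
resend = sync_bitmap()
wasted = [p for p in resend if received[p] == memory[p]]
print(resend)   # {3}
print(wasted)   # [3] -> the destination already has this exact content
```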
So you're right, that is wasteful; I guess it's more wasteful
on big VMs with slow networks where the length of each iteration
is large.
Fixing it is not easy, because you have to be really careful
never to miss a page modification, even if the page is sent
about the same time it's dirtied.
One way would be to sync the dirty log from the kernel
in smaller chunks.
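As a rough illustration of that idea (my own toy model, an assumption about how a chunked sync could behave, not an actual QEMU patch; it also deliberately ignores the race above between sending a page and logging a concurrent write):

```python
# Sketch of per-chunk dirty-log syncing: instead of one global sync per
# iteration, sync the log for a chunk of pages just before sending that
# chunk, so a write to a not-yet-sent chunk is absorbed by this pass
# instead of forcing a re-send next iteration.

PAGES, CHUNK = 8, 4
WRITE = (1, 6)   # page 6 is written at time 1, before chunk [4, 8) is sent

def pages_to_resend(chunk_sync):
    dirty = set()    # kernel-side dirty log
    resend = set()   # what the next iteration will have to send
    t = 0
    for start in range(0, PAGES, CHUNK):
        if chunk_sync:
            # Sync only this chunk's log right before sending the chunk:
            # a logged write to a page we are about to send anyway can be
            # dropped, since its fresh content goes out in this pass.
            resend |= {p for p in dirty if not start <= p < start + CHUNK}
            dirty -= set(range(start, start + CHUNK))
        for p in range(start, start + CHUNK):
            if t == WRITE[0]:
                dirty.add(WRITE[1])
            t += 1   # page p is sent here
    return resend | dirty            # final sync at end of the iteration

print(pages_to_resend(chunk_sync=False))  # {6}: re-sent despite fresh content
print(pages_to_resend(chunk_sync=True))   # set(): absorbed by this pass
```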
Dave
>
> Thanks,
> Chunguang Li
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-26 11:23 ` Dr. David Alan Gilbert
@ 2016-09-26 14:55 ` Chunguang Li
2016-09-26 18:52 ` Dr. David Alan Gilbert
2016-09-30 5:46 ` Amit Shah
0 siblings, 2 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-26 14:55 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela
> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Date: Monday, September 26, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > Hi all!
> > I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> >
> > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
> >
> >
> > Is my understanding correct? If so, is there any advice on how to improve this?
>
> I think you're right that this can happen; to clarify I think the
> case you're talking about is:
>
> Iteration 1
> sync bitmap
> start sending pages
> page 'n' is modified - but hasn't been sent yet
> page 'n' gets sent
> Iteration 2
> sync bitmap
> 'page n is shown as modified'
> send page 'n' again
>
Yes, this is exactly the case I am talking about.
> So you're right, that is wasteful; I guess it's more wasteful
> on big VMs with slow networks where the length of each iteration
> is large.
I think this is "very" wasteful. Assume the workload writes pages at random locations within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
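The "nearly half" intuition can be checked with a quick Monte Carlo sketch (my own toy model, assuming writes uniform in both time and address and a constant send rate): a write sets a "not-really-dirty" bit exactly when it lands on a page that has not been sent yet.

```python
import random

random.seed(0)
PAGES = 10_000          # pages sent in order at a constant rate
WRITES = 10_000         # writes spread uniformly over the iteration

not_really_dirty = 0
for _ in range(WRITES):
    t = random.random()             # normalized time of the write
    p = random.randrange(PAGES)     # page it lands on
    if p >= t * PAGES:              # page not sent yet at time t: the bit
        not_really_dirty += 1       # set for it is "not really dirty"

# P(page unsent at a uniform write time) = E[1 - t] = 1/2, so roughly half:
print(round(not_really_dirty / WRITES, 2))
```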
Thanks,
Chunguang
>
> Fixing it is not easy, because you have to be really careful
> never to miss a page modification, even if the page is sent
> about the same time it's dirtied.
>
> One way would be to sync the dirty log from the kernel
> in smaller chunks.
>
> Dave
>
>
> >
> > Thanks,
> > Chunguang Li
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-26 14:55 ` Chunguang Li
@ 2016-09-26 18:52 ` Dr. David Alan Gilbert
2016-09-27 12:28 ` Chunguang Li
2016-09-30 5:46 ` Amit Shah
1 sibling, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-09-26 18:52 UTC (permalink / raw)
To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela
* Chunguang Li (lichunguang@hust.edu.cn) wrote:
>
>
>
> > -----Original Message-----
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Date: Monday, September 26, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> >
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > Hi all!
> > > I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> > >
> > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
> > >
> > >
> > > Is my understanding correct? If so, is there any advice on how to improve this?
> >
> > I think you're right that this can happen; to clarify I think the
> > case you're talking about is:
> >
> > Iteration 1
> > sync bitmap
> > start sending pages
> > page 'n' is modified - but hasn't been sent yet
> > page 'n' gets sent
> > Iteration 2
> > sync bitmap
> > 'page n is shown as modified'
> > send page 'n' again
> >
>
> Yes, this is exactly the case I am talking about.
>
> > So you're right, that is wasteful; I guess it's more wasteful
> > on big VMs with slow networks where the length of each iteration
> > is large.
>
> I think this is "very" wasteful. Assume the workload writes pages at random locations within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
Yes, it's probably pretty bad, and we really need to do something like
splitting the sync into smaller chunks; there are other suggestions
for how to improve it (e.g. the page-modification-logging changes).
However, I don't think you usually get truly random writes; if you
do, precopy rarely converges at all, because even without the effect
you describe, such a workload changes lots and lots of pages.
Dave
> Thanks,
>
> Chunguang
>
> >
> > Fixing it is not easy, because you have to be really careful
> > never to miss a page modification, even if the page is sent
> > about the same time it's dirtied.
> >
> > One way would be to sync the dirty log from the kernel
> > in smaller chunks.
> >
> > Dave
> >
> >
> > >
> > > Thanks,
> > > Chunguang Li
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-26 18:52 ` Dr. David Alan Gilbert
@ 2016-09-27 12:28 ` Chunguang Li
0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-27 12:28 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela
> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Date: Tuesday, September 27, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> Yes, it's probably pretty bad, and we really need to do something like
> splitting the sync into smaller chunks; there are other suggestions
> for how to improve it (e.g. the page-modification-logging changes).
>
> However, I don't think you usually get truly random writes; if you
> do, precopy rarely converges at all, because even without the effect
> you describe, such a workload changes lots and lots of pages.
>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
I have read a little about page-modification logging. I think it
is only a more efficient dirty-logging mechanism, with better performance
than write protection, and it will not solve the problem we are
talking about.
The only idea I have come up with so far to handle this is to split
the sync into smaller chunks, as you mentioned.
Maybe I can start from this idea and try to fix it.
If you come up with any other idea or suggestion, please let me know.
Thank you!
Chunguang
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-26 14:55 ` Chunguang Li
2016-09-26 18:52 ` Dr. David Alan Gilbert
@ 2016-09-30 5:46 ` Amit Shah
2016-09-30 8:18 ` Chunguang Li
2016-10-08 7:55 ` Chunguang Li
1 sibling, 2 replies; 21+ messages in thread
From: Amit Shah @ 2016-09-30 5:46 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela
On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
>
>
>
> > -----Original Message-----
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Date: Monday, September 26, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> >
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > Hi all!
> > > I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> > >
> > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
> > >
> > >
> > > Is my understanding correct? If so, is there any advice on how to improve this?
> >
> > I think you're right that this can happen; to clarify I think the
> > case you're talking about is:
> >
> > Iteration 1
> > sync bitmap
> > start sending pages
> > page 'n' is modified - but hasn't been sent yet
> > page 'n' gets sent
> > Iteration 2
> > sync bitmap
> > 'page n is shown as modified'
> > send page 'n' again
> >
>
> Yes, this is exactly the case I am talking about.
>
> > So you're right, that is wasteful; I guess it's more wasteful
> > on big VMs with slow networks where the length of each iteration
> > is large.
>
> I think this is "very" wasteful. Assume the workload writes pages at random locations within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
It makes sense. Can you get some perf numbers to show what kinds of
workloads get impacted the most? That would also help us figure
out what kinds of speed improvements we can expect.
Amit
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-30 5:46 ` Amit Shah
@ 2016-09-30 8:18 ` Chunguang Li
2016-10-08 7:55 ` Chunguang Li
1 sibling, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-09-30 8:18 UTC (permalink / raw)
To: Amit Shah
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela
> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Date: Friday, September 30, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Date: Monday, September 26, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > >
> > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > Hi all!
> > > > I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> > > >
> > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
> > > >
> > > >
> > > > Is my understanding correct? If so, is there any advice on how to improve this?
> > >
> > > I think you're right that this can happen; to clarify I think the
> > > case you're talking about is:
> > >
> > > Iteration 1
> > > sync bitmap
> > > start sending pages
> > > page 'n' is modified - but hasn't been sent yet
> > > page 'n' gets sent
> > > Iteration 2
> > > sync bitmap
> > > 'page n is shown as modified'
> > > send page 'n' again
> > >
> >
> > Yes, this is exactly the case I am talking about.
> >
> > > So you're right, that is wasteful; I guess it's more wasteful
> > > on big VMs with slow networks where the length of each iteration
> > > is large.
> >
> > I think this is "very" wasteful. Assume the workload writes pages at random locations within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
>
> It makes sense. Can you get some perf numbers to show what kinds of
> workloads get impacted the most? That would also help us figure
> out what kinds of speed improvements we can expect.
>
>
> Amit
Yes, I can pick some workloads and get some perf numbers.
However, I don't know how to measure the quantity of non-dirty pages we
resend in each iteration. Instead, I can get the numbers below:
1. The duration of each iteration;
2. The number of pages transferred during each iteration;
3. The number of dirty pages (including not-really-dirty pages) produced
during each iteration.
With these numbers, we can only estimate the quantity of not-really-dirty
pages to some extent. What do you think of this test plan? Any suggestions?
Chunguang
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-09-30 5:46 ` Amit Shah
2016-09-30 8:18 ` Chunguang Li
@ 2016-10-08 7:55 ` Chunguang Li
2016-10-14 11:15 ` Dr. David Alan Gilbert
1 sibling, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-10-08 7:55 UTC (permalink / raw)
To: Amit Shah
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela
> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Date: Friday, September 30, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Date: Monday, September 26, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > >
> > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > Hi all!
> > > > I am confused about the dirty bitmap during migration, so I have dug into the code. I found that, every now and then during migration, the dirty bitmap is fetched from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> > > >
> > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel starts producing the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I do not think this is correct, because these pages will be sent during this or the following iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within one iteration.
> > > >
> > > >
> > > > Is my understanding correct? If so, is there any advice on how to improve this?
> > >
> > > I think you're right that this can happen; to clarify I think the
> > > case you're talking about is:
> > >
> > > Iteration 1
> > > sync bitmap
> > > start sending pages
> > > page 'n' is modified - but hasn't been sent yet
> > > page 'n' gets sent
> > > Iteration 2
> > > sync bitmap
> > > 'page n is shown as modified'
> > > send page 'n' again
> > >
> >
> > Yes, this is exactly the case I am talking about.
> >
> > > So you're right, that is wasteful; I guess it's more wasteful
> > > on big VMs with slow networks where the length of each iteration
> > > is large.
> >
> > I think this is "very" wasteful. Assume the workload writes pages at random locations within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes twice as long as it would if only the really dirty pages were sent.
>
> It makes sense. Can you get some perf numbers to show what kinds of
> workloads get impacted the most? That would also help us figure
> out what kinds of speed improvements we can expect.
>
>
> Amit
I have picked 6 workloads and collected the following statistics for
every iteration (except the last stop-copy one) during precopy.
These numbers were obtained with basic precopy migration, without
capabilities like xbzrle or compression. The network for the
migration is exclusive, with a separate network for the workloads.
Both are gigabit Ethernet. I use qemu-2.5.1.
Three of them (booting, idle, web server) converged to the stop-copy phase
with the given bandwidth and the default downtime (300 ms), while the other
three (kernel compilation, zeusmp, memcached) did not.
A page is "not-really-dirty" if it is written first and sent later
(and not written again after that) during one iteration. I guess this
happens less often during the later iterations than during the 1st
iteration, because all of the VM's pages are sent to the destination node
during the 1st iteration, while only part of the pages are sent during the
others. So I think the "not-really-dirty" pages are produced mainly during
the 1st iteration, and perhaps very few during the other iterations.
If we could avoid resending the "not-really-dirty" pages, intuitively, I
think the time spent on Iteration 2 would be halved. This is a chain reaction,
because the dirty pages produced during Iteration 2 are then halved, which
means the time spent on Iteration 3 is halved too, then Iteration 4, 5...
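The chain reaction can be put into a toy arithmetic model (my own simplification, assuming iteration time proportional to the pages sent and dirty pages produced proportional to iteration duration; the rate `r` and time `t1` are hypothetical numbers):

```python
# Toy model of the chain reaction: halving Iteration 1's dirty set halves
# Iteration 2's duration, hence Iteration 2's dirty set, and so on.

r = 0.4          # hypothetical dirty pages produced per unit of send time
t1 = 100.0       # duration of Iteration 1, which sends all pages

def iteration_times(dirty_factor, n=6):
    """dirty_factor scales Iteration 1's dirty set (0.5 models dropping
    the not-really-dirty half)."""
    times = [t1]
    dirty = r * t1 * dirty_factor       # dirty pages left after Iteration 1
    for _ in range(n - 1):
        t = dirty                        # time to send them (1 page/unit)
        times.append(t)
        dirty = r * t                    # dirty pages produced meanwhile
    return times

baseline = iteration_times(1.0)   # with the not-really-dirty pages
halved = iteration_times(0.5)     # Iteration 1's dirty set halved

print([round(x, 2) for x in baseline])  # [100.0, 40.0, 16.0, 6.4, 2.56, 1.02]
print([round(x, 2) for x in halved])    # [100.0, 20.0, 8.0, 3.2, 1.28, 0.51]
```

Every iteration after the first shrinks by the same factor, so the saving from dropping the not-really-dirty pages compounds across the whole precopy phase.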
So I think "booting" and "kernel compilation" should benefit a lot from this
improvement. The reason "kernel compilation" would benefit is that some of its
iterations take around 600 ms, and if they were halved to 300 ms, the precopy
would have a chance to step into the stop-and-copy phase.
On the other hand, "idle" and "web server" would not benefit much, because
most of the time is spent on the 1st iteration and little on the others.
As for "zeusmp" and "memcached", although the time spent on iterations other
than the 1st may be halved, they still could not converge to stop-and-copy
with the 300 ms downtime.
--------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
1. booting : begin to migrate when the VM is booting
Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
(note: When the workload does converge, the output of the last iteration is "fake"; it just indicates that precopy is stepping into the stop-copy phase.
"n" means "normal pages" and "d" means "duplicate (zero) pages".)
2. idle
Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
3. kernel compilation (can not converge)
Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
4. cpu2006.zeusmp (can not converge)
Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
5. web server : An apache web server. The client is configured with 50 concurrent connections.
Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
--------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-10-08 7:55 ` Chunguang Li
@ 2016-10-14 11:15 ` Dr. David Alan Gilbert
2016-11-03 8:25 ` Chunguang Li
0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-14 11:15 UTC (permalink / raw)
To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela
* Chunguang Li (lichunguang@hust.edu.cn) wrote:
>
>
>
> > -----Original Message-----
> > From: "Amit Shah" <amit.shah@redhat.com>
> > Sent: Friday, September 30, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> >
> > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > Sent: Monday, September 26, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > >
> > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > Hi all!
> > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code and figured out that, every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to resending some non-dirty pages.
> > > > >
> > > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that have not yet been sent are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within an iteration.
> > > > >
> > > > >
> > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > >
> > > > I think you're right that this can happen; to clarify I think the
> > > > case you're talking about is:
> > > >
> > > > Iteration 1
> > > > sync bitmap
> > > > start sending pages
> > > > page 'n' is modified - but hasn't been sent yet
> > > > page 'n' gets sent
> > > > Iteration 2
> > > > sync bitmap
> > > > 'page n is shown as modified'
> > > > send page 'n' again
> > > >
> > >
> > > Yes, this is exactly the case I am talking about.
> > >
> > > > So you're right that is wasteful; I guess it's more wasteful
> > > > on big VMs with slow networks where the length of each iteration
> > > > is large.
> > >
> > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes double the time needed to send only the really dirty pages.
> >
> > It makes sense, can you get some perf numbers to show what kinds of
> > workloads get impacted the most? That would also help us to figure
> > out what kinds of speed improvements we can expect.
> >
> >
> > Amit
>
> I have picked 6 workloads and collected the following statistics for
> every iteration (except the last stop-copy one) during precopy.
> These numbers are obtained with basic precopy migration, without
> capabilities such as xbzrle or compression. The migration network is
> exclusive, with a separate network for the workloads; both are
> gigabit Ethernet. I use qemu-2.5.1.
>
> Three (booting, idle, web server) of them converged to the stop-copy phase,
> with the given bandwidth and default downtime (300ms), while the other
> three (kernel compilation, zeusmp, memcached) did not.
>
> A page is "not really dirty" if it is written first and sent later
> (and not written again after that) within one iteration. I guess this
> happens less often during later iterations than during the 1st one,
> because all the VM's pages are sent to the dest node during the 1st
> iteration, while during the others only a subset of the pages is sent.
> So I think the "not-really-dirty" pages are produced mainly during the
> 1st iteration, and perhaps very few during the other iterations.
>
> If we could avoid resending the "not-really-dirty" pages, intuitively, I
> think the time spent on Iteration 2 would be halved. This sets off a chain
> reaction: the dirty pages produced during Iteration 2 are halved, which in
> turn halves the time spent on Iteration 3, then Iteration 4, 5...
Yes; these numbers don't show how many of them are false dirties, though.
One subtlety is pages that have been redirtied: if a page is dirtied
after the sync but before the network write, then it is the false dirty
you're describing.
However, if the page is written several times, so that it would also have been
written after the network write, then it is not a false dirty.
You might be able to figure that out with some kernel tracing of when the
dirtying happens, but it might be easier to write the fix!
Dave
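The distinction above can be sketched as a toy model (illustrative only; the page numbers and timestamps are made up, and none of this is QEMU code): a page's dirty bit is a "false dirty" if its last write in the iteration happened before the page went on the wire, and a real dirty if it was written again afterwards.

```python
# Toy model of one precopy iteration: classify each dirtied page as a
# "false dirty" (last write before the page was sent, so the sent copy
# is already current) or a "real dirty" (written after the send).

def classify_dirty_pages(sync_time, send_times, write_times):
    """send_times: {page: time the page was written to the network}
    write_times: {page: [times the guest wrote the page]}"""
    result = {}
    for page, writes in write_times.items():
        writes = [t for t in writes if t >= sync_time]
        if not writes:
            continue                      # not dirtied this iteration
        if max(writes) < send_times[page]:
            result[page] = "false dirty"  # sent copy is up to date
        else:
            result[page] = "real dirty"   # modified after send; must resend
    return result

# page 7: written once, before it was sent -> false dirty
# page 9: written before *and* after it was sent -> real dirty
print(classify_dirty_pages(0, {7: 50, 9: 60}, {7: [10], 9: [10, 70]}))
# -> {7: 'false dirty', 9: 'real dirty'}
```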
> So I think "booting" and "kernel compilation" should benefit a lot from this
> improvement. The reason of "kernel compilation" would benefit is that some
> iterations take around 600ms, and if they are halved into 300ms, then the precopy
> may have the chance to step into stop and copy phase.
>
> On the other hand, "idle" and "web server" would not benefit a lot, because
> most of the time are spent on the 1st iteration and little on the others.
>
> As to the "zeusmp" and "memcached", although the time spent on the other iterations
> but the 1st one may be halved, they still could not converge to stop and copy
> with the 300ms downtime.
>
> --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
>
> 1. booting : begin to migrate when the VM is booting
>
> Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
> Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
> Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
> Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
> Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
> (note: when a workload converges, the output of the last iteration is "fake"; it just indicates that precopy now steps into the stop-copy phase.
> "n" means "normal pages" and "d" means "duplicate (zero) pages".)
>
> 2. idle
>
> Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
> Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
> Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
> Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
>
> 3. kernel compilation (can not converge)
>
> Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
> Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
> Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
> Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
> Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
> Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
> Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
> Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
> Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
> Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
> Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
> Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
> Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
> Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
> Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
>
> 4. cpu2006.zeusmp (can not converge)
>
> Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
> Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
> Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
> Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
> Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
> Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
> Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
> Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
> Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
> Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
> Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
> Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
> Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
> Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
> Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
>
> 5. web server : An apache web server. The client is configured with 50 concurrent connections.
>
> Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
> Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
> Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
> Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
> Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
> Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
>
> --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
>
> 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
>
> Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
> Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
> Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
> Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
> Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
> Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
> Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
> Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
> Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
> Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
> Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
> Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
> Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
> Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
> Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
> Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
> Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
> Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
> Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
> Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
>
>
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-10-14 11:15 ` Dr. David Alan Gilbert
@ 2016-11-03 8:25 ` Chunguang Li
2016-11-03 9:59 ` Li, Liang Z
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-03 8:25 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela
> -----Original Messages-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent Time: Friday, October 14, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: "Amit Shah" <amit.shah@redhat.com>
> > > Sent: Friday, September 30, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > >
> > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > Sent: Monday, September 26, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > >
> > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > Hi all!
> > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code and figured out that, every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to resending some non-dirty pages.
> > > > > >
> > > > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that have not yet been sent are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within an iteration.
> > > > > >
> > > > > >
> > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > >
> > > > > I think you're right that this can happen; to clarify I think the
> > > > > case you're talking about is:
> > > > >
> > > > > Iteration 1
> > > > > sync bitmap
> > > > > start sending pages
> > > > > page 'n' is modified - but hasn't been sent yet
> > > > > page 'n' gets sent
> > > > > Iteration 2
> > > > > sync bitmap
> > > > > 'page n is shown as modified'
> > > > > send page 'n' again
> > > > >
> > > >
> > > > Yes, this is exactly the case I am talking about.
> > > >
> > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > on big VMs with slow networks where the length of each iteration
> > > > > is large.
> > > >
> > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes double the time needed to send only the really dirty pages.
> > >
> > > It makes sense, can you get some perf numbers to show what kinds of
> > > workloads get impacted the most? That would also help us to figure
> > > out what kinds of speed improvements we can expect.
> > >
> > >
> > > Amit
> >
> > I have picked 6 workloads and collected the following statistics for
> > every iteration (except the last stop-copy one) during precopy.
> > These numbers are obtained with basic precopy migration, without
> > capabilities such as xbzrle or compression. The migration network is
> > exclusive, with a separate network for the workloads; both are
> > gigabit Ethernet. I use qemu-2.5.1.
> >
> > Three (booting, idle, web server) of them converged to the stop-copy phase,
> > with the given bandwidth and default downtime (300ms), while the other
> > three (kernel compilation, zeusmp, memcached) did not.
> >
> > A page is "not really dirty" if it is written first and sent later
> > (and not written again after that) within one iteration. I guess this
> > happens less often during later iterations than during the 1st one,
> > because all the VM's pages are sent to the dest node during the 1st
> > iteration, while during the others only a subset of the pages is sent.
> > So I think the "not-really-dirty" pages are produced mainly during the
> > 1st iteration, and perhaps very few during the other iterations.
> >
> > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > think the time spent on Iteration 2 would be halved. This sets off a chain
> > reaction: the dirty pages produced during Iteration 2 are halved, which in
> > turn halves the time spent on Iteration 3, then Iteration 4, 5...
>
> Yes; these numbers don't show how many of them are false dirties, though.
>
> One subtlety is pages that have been redirtied: if a page is dirtied
> after the sync but before the network write, then it is the false dirty
> you're describing.
>
> However, if the page is written several times, so that it would also have been
> written after the network write, then it is not a false dirty.
>
> You might be able to figure that out with some kernel tracing of when the
> dirtying happens, but it might be easier to write the fix!
>
> Dave
Hi, I have made some new progress.
To tell exactly how many false dirty pages there are in each iteration, I malloc a
buffer as big as the whole VM memory. When a page is transferred to the dest
node, it is copied into the buffer; during the next iteration, when a page is
transferred again, it is compared with the old copy in the buffer, and the old
copy is replaced for the next comparison if the page is really dirty. Thus we
can get the exact number of false dirty pages.
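The measurement can be sketched like this (a minimal illustration of the shadow-buffer idea, not QEMU code; the class and method names are made up):

```python
# Keep a shadow copy of every page that has been sent. When a page
# marked dirty comes up for retransmission, compare it byte-for-byte
# against the shadow copy: identical content means a false dirty.

PAGE_SIZE = 4096

class ShadowTracker:
    def __init__(self):
        self.shadow = {}          # page frame number -> bytes last sent

    def on_page_sent(self, pfn, content):
        """Called for every page as it goes on the wire.
        Returns True if this transmission was a false dirty."""
        false_dirty = (self.shadow.get(pfn) == content)
        self.shadow[pfn] = content    # keep the copy for the next iteration
        return false_dirty

tracker = ShadowTracker()
page = bytes(PAGE_SIZE)

assert tracker.on_page_sent(0, page) is False   # first send: nothing to compare
assert tracker.on_page_sent(0, page) is True    # resent unchanged: false dirty
assert tracker.on_page_sent(0, b"\x01" * PAGE_SIZE) is False  # really changed
```

The obvious cost is that the shadow buffer doubles the host memory needed per guest page, which is why this is only a measurement tool, not the fix itself.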
This time, I use 15 workloads to collect the statistics. They are:
1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific
computing workloads such as quantum chromodynamics, fluid dynamics, etc. I picked
these 11 benchmarks because, compared to the others, they have larger memory
footprints and higher memory dirty rates; thus most of them could not converge
to stop-and-copy at the default migration speed (32MB/s).
2. kernel compilation
3. idle VM
4. Apache web server serving static content
(the above workloads all run in a VM with 1 vcpu and 1GB memory, and the
migration speed is the default 32MB/s)
5. Memcached. The VM has 6 cpu cores and 6GB memory, of which 4GB are used as the
cache. After the 4GB cache is filled, a client writes the cache at a constant
speed during migration. In this case the migration speed is not limited, up to
the capability of 1Gbps Ethernet.
To summarize the results first (the precise numbers are below):
1. 4 of the 15 workloads have a large proportion (>60%, even >80% during some
iterations) of false dirty pages out of all dirty pages from iteration 2 onward
(and the large proportion persists through the following iterations). They are
cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
2. 2 workloads (idle, web server) spend most of the migration time on iteration 1;
even though the proportion of false dirty pages is large from iteration 2 onward,
the room for optimization is small.
3. 1 workload (kernel compilation) only has a large proportion during iteration 2,
not in the other iterations.
4. 8 workloads (the other 8 cpu2006 benchmarks) have a small proportion of false
dirty pages from iteration 2 onward, so the room for optimization is also small.
Now I want to say a little more about why false dirty pages are produced.
The first reason is what we discussed before: the mechanism used to track dirty
pages.
There is also a second reason. Consider this situation: a write to a memory page
happens but does not change any content of the page. The page is "written but not
dirty", yet the kernel still marks it as dirty. A colleague in our lab has run
experiments to measure the proportion of "write but not dirty" operations, using
the cpu2006 benchmark suite. According to his results, most workloads have a small
proportion (<10%) of "write but not dirty" out of all write operations, while a few
workloads have a higher proportion (one even as high as 50%). We are not yet sure
why "write but not dirty" happens; it just does.
So these two causes contribute to the false dirty pages. To optimize, I compute and
store the SHA1 hash of each page before transferring it. The next time a page is up
for retransmission, its SHA1 hash is computed again and compared with the stored
hash. If the hashes match, it is a false dirty page and we simply skip it; otherwise
the page is transferred, and the new hash replaces the old one for the next
comparison.
The reason to use a SHA1 hash rather than a byte-by-byte comparison is memory
overhead: one SHA1 hash is 20 bytes, so we need extra memory of only 20/4096
(<0.5%) of the whole VM memory, which is relatively small.
As far as I know, SHA1 hashing is widely used for deduplication in backup systems.
It has been shown there that the probability of a hash collision is far smaller
than that of a disk hardware fault, so it is treated as a secure hash: if the
hashes of two chunks are equal, the contents are taken to be equal. So I think the
SHA1 hash can replace byte-by-byte comparison in the VM memory scenario as well.
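The optimization can be sketched like this (an illustrative model of the method described above, not a patch against QEMU; the class and method names are made up):

```python
# Store a 20-byte SHA1 digest per transferred page instead of a full
# 4 KB shadow copy, and skip retransmission when the digest of a page
# marked dirty matches the stored one (i.e. it is a false dirty).

import hashlib

PAGE_SIZE = 4096

class HashFilter:
    def __init__(self):
        self.digests = {}        # page frame number -> sha1 of last sent content

    def should_send(self, pfn, content):
        h = hashlib.sha1(content).digest()   # 20 bytes/page: < 0.5% overhead
        if self.digests.get(pfn) == h:
            return False         # false dirty: content unchanged, skip it
        self.digests[pfn] = h    # really dirty (or first send): record and send
        return True

f = HashFilter()
zero_page = bytes(PAGE_SIZE)
assert f.should_send(3, zero_page) is True    # first transfer
assert f.should_send(3, zero_page) is False   # redirtied but unchanged: skipped
assert f.should_send(3, b"\x02" * PAGE_SIZE) is True  # content actually changed
```

Note the trade-off versus the shadow buffer: the hash costs a SHA1 computation per candidate page instead of a 4 KB memcmp, but cuts the extra memory from 100% to under 0.5% of guest RAM.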
Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads
with large proportions of false dirty pages, the improvement is remarkable. Without
the optimization, they either cannot converge to stop-and-copy or take a very long
time to complete; with the SHA1 hash method, all of them now complete in a
relatively short time.
For the reasons given above, the other workloads do not get notable improvements
from the optimization. So below I only show the post-optimization numbers for the
4 workloads with remarkable improvements.
Any comments or suggestions?
Below are the experiment data:
(
"dup" means zero pages; these take very little migration time and network
resources, so they are never regarded as dirty pages in my numbers;
"rd" means really dirty pages;
"fd" means false dirty pages;
the numbers are page counts.
)
------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
1. memcached
----- original pre-copy (can not converge): -----
Iteration 1, duration: 42111 ms , transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0) , new dirty pages: 499015 , remaining dirty pages: 507397
Iteration 2, duration: 17208 ms , transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284) , new dirty pages: 261237 , remaining dirty pages: 269688
Iteration 3, duration: 9134 ms , transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958) , new dirty pages: 170281 , remaining dirty pages: 177592
Iteration 4, duration: 5920 ms , transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392) , new dirty pages: 121154 , remaining dirty pages: 128780
Iteration 5, duration: 4239 ms , transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350) , new dirty pages: 100976 , remaining dirty pages: 108205
Iteration 6, duration: 3495 ms , transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242) , new dirty pages: 74547 , remaining dirty pages: 82399
Iteration 7, duration: 2583 ms , transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088) , new dirty pages: 58209 , remaining dirty pages: 66448
Iteration 8, duration: 2039 ms , transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568) , new dirty pages: 43511 , remaining dirty pages: 51425
Iteration 9, duration: 1513 ms , transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788) , new dirty pages: 43722 , remaining dirty pages: 51663
Iteration 10, duration: 1521 ms , transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151) , new dirty pages: 35347 , remaining dirty pages: 43334
Iteration 11, duration: 1230 ms , transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994) , new dirty pages: 28851 , remaining dirty pages: 36898
Iteration 12, duration: 1031 ms , transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426) , new dirty pages: 27062 , remaining dirty pages: 34309
Iteration 13, duration: 917 ms , transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180) , new dirty pages: 22723 , remaining dirty pages: 30647
Iteration 14, duration: 762 ms , transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492) , new dirty pages: 18208 , remaining dirty pages: 26953
Iteration 15, duration: 650 ms , transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693) , new dirty pages: 16085 , remaining dirty pages: 24402
Iteration 16, duration: 554 ms , transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363) , new dirty pages: 14766 , remaining dirty pages: 23222
Iteration 17, duration: 538 ms , transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655) , new dirty pages: 13381 , remaining dirty pages: 21169
Iteration 18, duration: 487 ms , transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271) , new dirty pages: 13325 , remaining dirty pages: 20405
Iteration 19, duration: 428 ms , transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744) , new dirty pages: 10274 , remaining dirty pages: 18447
Iteration 20, duration: 377 ms , transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469) , new dirty pages: 9708 , remaining dirty pages: 17268
Iteration 21, duration: 320 ms , transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433) , new dirty pages: 8015 , remaining dirty pages: 16061
Iteration 22, duration: 268 ms , transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417) , new dirty pages: 7227 , remaining dirty pages: 15667
Iteration 23, duration: 269 ms , transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347) , new dirty pages: 7591 , remaining dirty pages: 15445
Iteration 24, duration: 271 ms , transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184) , new dirty pages: 15126 , remaining dirty pages: 22822
Iteration 25, duration: 549 ms , transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213) , new dirty pages: 14559 , remaining dirty pages: 21563
Iteration 26, duration: 499 ms , transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518) , new dirty pages: 11947 , remaining dirty pages: 19229
Iteration 27, duration: 376 ms , transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248) , new dirty pages: 8561 , remaining dirty pages: 16967
Iteration 28, duration: 324 ms , transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027) , new dirty pages: 8655 , remaining dirty pages: 16272
Iteration 29, duration: 274 ms , transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725) , new dirty pages: 6300 , remaining dirty pages: 14759
Iteration 30, duration: 218 ms , transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099) , new dirty pages: 5143 , remaining dirty pages: 13562
----- after optimization: -----
Iteration 1, duration: 40664 ms , transferred pages: 1569037 (dup: 405940, rd: 1163097) , new dirty pages: 506846 , remaining dirty pages: 514979
Iteration 2, duration: 8032 ms , transferred pages: 161130 (dup: 4007, rd: 157123) , new dirty pages: 153479 , remaining dirty pages: 153479
Iteration 3, duration: 2620 ms , transferred pages: 65260 (dup: 20, rd: 65240) , new dirty pages: 64014 , remaining dirty pages: 67100
Iteration 4, duration: 1160 ms , transferred pages: 30227 (dup: 60, rd: 30167) , new dirty pages: 34031 , remaining dirty pages: 41414
Iteration 5, duration: 648 ms , transferred pages: 18700 (dup: 56, rd: 18644) , new dirty pages: 18375 , remaining dirty pages: 25536
Iteration 6, duration: 389 ms , transferred pages: 11399 (dup: 55, rd: 11344) , new dirty pages: 12536 , remaining dirty pages: 17516
Iteration 7, duration: 292 ms , transferred pages: 8197 (dup: 0, rd: 8197) , new dirty pages: 8387 , remaining dirty pages: 16802
Iteration 8, duration: 171 ms , transferred pages: 4931 (dup: 39, rd: 4892) , new dirty pages: 6182 , remaining dirty pages: 14060
Iteration 9, duration: 163 ms , transferred pages: 4355 (dup: 16, rd: 4339) , new dirty pages: 5530 , remaining dirty pages: 11973
Iteration 10, duration: 104 ms , transferred pages: 3266 (dup: 0, rd: 3266) , new dirty pages: 2893 , remaining dirty pages: 11014
Iteration 11, duration: 52 ms , transferred pages: 1153 (dup: 0, rd: 1153) , new dirty pages: 1586 , remaining dirty pages: 10516
Iteration 12, duration: 52 ms , transferred pages: 1921 (dup: 39, rd: 1882) , new dirty pages: 1619 , remaining dirty pages: 8842
Iteration 13, duration: 62 ms , transferred pages: 1537 (dup: 0, rd: 1537) , new dirty pages: 2052 , remaining dirty pages: 8871
Iteration 14, duration: 58 ms , transferred pages: 1665 (dup: 0, rd: 1665) , new dirty pages: 1947 , remaining dirty pages: 7989
Iteration 15, duration: 2 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 7989
total time: 54693 milliseconds
2. cpu2006.zeusmp
----- original pre-copy (can not converge): -----
Iteration 1, duration: 21112 ms , transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0) , new dirty pages: 127866 , remaining dirty pages: 127866
Iteration 2, duration: 6192 ms , transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511) , new dirty pages: 131655 , remaining dirty pages: 133859
Iteration 3, duration: 6699 ms , transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319) , new dirty pages: 121027 , remaining dirty pages: 122949
Iteration 4, duration: 5999 ms , transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688) , new dirty pages: 122759 , remaining dirty pages: 123196
Iteration 5, duration: 5804 ms , transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265) , new dirty pages: 123697 , remaining dirty pages: 124176
Iteration 6, duration: 5698 ms , transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437) , new dirty pages: 121838 , remaining dirty pages: 122306
Iteration 7, duration: 5515 ms , transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760) , new dirty pages: 122382 , remaining dirty pages: 122382
Iteration 8, duration: 6086 ms , transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004) , new dirty pages: 121587 , remaining dirty pages: 123144
Iteration 9, duration: 5899 ms , transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913) , new dirty pages: 120391 , remaining dirty pages: 122571
Iteration 10, duration: 5801 ms , transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563) , new dirty pages: 122302 , remaining dirty pages: 123448
Iteration 11, duration: 5909 ms , transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035) , new dirty pages: 123258 , remaining dirty pages: 123258
Iteration 12, duration: 6293 ms , transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641) , new dirty pages: 123623 , remaining dirty pages: 125670
Iteration 13, duration: 6398 ms , transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062) , new dirty pages: 122355 , remaining dirty pages: 123128
Iteration 14, duration: 6301 ms , transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909) , new dirty pages: 120980 , remaining dirty pages: 122215
Iteration 15, duration: 6304 ms , transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653) , new dirty pages: 121628 , remaining dirty pages: 122454
Iteration 16, duration: 6398 ms , transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826) , new dirty pages: 122246 , remaining dirty pages: 122536
Iteration 17, duration: 6201 ms , transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649) , new dirty pages: 121460 , remaining dirty pages: 122448
Iteration 18, duration: 6401 ms , transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939) , new dirty pages: 123518 , remaining dirty pages: 123694
Iteration 19, duration: 7003 ms , transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794) , new dirty pages: 120568 , remaining dirty pages: 122389
Iteration 20, duration: 6400 ms , transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901) , new dirty pages: 121229 , remaining dirty pages: 122196
Iteration 21, duration: 6703 ms , transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786) , new dirty pages: 123284 , remaining dirty pages: 125585
Iteration 22, duration: 6902 ms , transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282) , new dirty pages: 122057 , remaining dirty pages: 123758
Iteration 23, duration: 6800 ms , transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837) , new dirty pages: 120916 , remaining dirty pages: 122664
Iteration 24, duration: 7202 ms , transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658) , new dirty pages: 122715 , remaining dirty pages: 123428
Iteration 25, duration: 7202 ms , transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695) , new dirty pages: 123180 , remaining dirty pages: 123689
Iteration 26, duration: 7404 ms , transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939) , new dirty pages: 122149 , remaining dirty pages: 122746
Iteration 27, duration: 7205 ms , transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419) , new dirty pages: 120299 , remaining dirty pages: 122618
Iteration 28, duration: 7100 ms , transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541) , new dirty pages: 122984 , remaining dirty pages: 124528
Iteration 29, duration: 7904 ms , transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865) , new dirty pages: 127080 , remaining dirty pages: 127548
Iteration 30, duration: 7906 ms , transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073) , new dirty pages: 125028 , remaining dirty pages: 125058
----- after optimization: -----
Iteration 1, duration: 21601 ms , transferred pages: 266450 (dup: 89731, rd: 176719) , new dirty pages: 139843 , remaining dirty pages: 139843
Iteration 2, duration: 1747 ms , transferred pages: 92077 (dup: 78364, rd: 13713) , new dirty pages: 90945 , remaining dirty pages: 90945
Iteration 3, duration: 1592 ms , transferred pages: 62253 (dup: 49435, rd: 12818) , new dirty pages: 76929 , remaining dirty pages: 76929
Iteration 4, duration: 992 ms , transferred pages: 44837 (dup: 37886, rd: 6951) , new dirty pages: 71331 , remaining dirty pages: 72916
Iteration 5, duration: 998 ms , transferred pages: 55229 (dup: 47150, rd: 8079) , new dirty pages: 21703 , remaining dirty pages: 23302
Iteration 6, duration: 211 ms , transferred pages: 20337 (dup: 18516, rd: 1821) , new dirty pages: 14500 , remaining dirty pages: 14500
Iteration 7, duration: 31 ms , transferred pages: 12933 (dup: 12627, rd: 306) , new dirty pages: 1520 , remaining dirty pages: 1520
Iteration 8, duration: 30 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 4 , remaining dirty pages: 1524
total time: 27225 milliseconds
3. cpu2006.bzip2
----- original pre-copy: -----
Iteration 1, duration: 18306 ms , transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0) , new dirty pages: 106299 , remaining dirty pages: 106299
Iteration 2, duration: 10694 ms , transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525) , new dirty pages: 34394 , remaining dirty pages: 36082
Iteration 3, duration: 2998 ms , transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264) , new dirty pages: 6419 , remaining dirty pages: 8059
Iteration 4, duration: 699 ms , transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143) , new dirty pages: 1226 , remaining dirty pages: 3537
Iteration 5, duration: 200 ms , transferred pages: 1636 (dup: 0, rd: 1194, fd: 442) , new dirty pages: 478 , remaining dirty pages: 2379
Iteration 6, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2379
----- after optimization: -----
Iteration 1, duration: 13995 ms , transferred pages: 266314 (dup: 152118, rd: 114196) , new dirty pages: 97009 , remaining dirty pages: 97145
Iteration 2, duration: 1215 ms , transferred pages: 33400 (dup: 26745, rd: 6655) , new dirty pages: 12866 , remaining dirty pages: 14017
Iteration 3, duration: 701 ms , transferred pages: 5774 (dup: 48, rd: 5726) , new dirty pages: 6342 , remaining dirty pages: 8761
Iteration 4, duration: 500 ms , transferred pages: 4111 (dup: 21, rd: 4090) , new dirty pages: 4311 , remaining dirty pages: 6485
Iteration 5, duration: 400 ms , transferred pages: 3273 (dup: 1, rd: 3272) , new dirty pages: 3034 , remaining dirty pages: 5431
Iteration 6, duration: 301 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2094 , remaining dirty pages: 4472
Iteration 7, duration: 299 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2066 , remaining dirty pages: 4082
Iteration 8, duration: 202 ms , transferred pages: 1636 (dup: 0, rd: 1636) , new dirty pages: 2881 , remaining dirty pages: 4648
Iteration 9, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 4775 , remaining dirty pages: 6778
Iteration 10, duration: 400 ms , transferred pages: 3281 (dup: 9, rd: 3272) , new dirty pages: 3757 , remaining dirty pages: 5576
Iteration 11, duration: 401 ms , transferred pages: 3279 (dup: 7, rd: 3272) , new dirty pages: 6980 , remaining dirty pages: 8906
Iteration 12, duration: 500 ms , transferred pages: 7118 (dup: 3035, rd: 4083) , new dirty pages: 10774 , remaining dirty pages: 11922
Iteration 13, duration: 116 ms , transferred pages: 11706 (dup: 10152, rd: 1554) , new dirty pages: 1326 , remaining dirty pages: 1326
Iteration 14, duration: 117 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 1326
total time: 19479 milliseconds
4. cpu2006.mcf
----- original pre-copy: -----
Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0) , new dirty pages: 244403 , remaining dirty pages: 244403
Iteration 2, duration: 29603 ms , transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897) , new dirty pages: 227335 , remaining dirty pages: 229463
Iteration 3, duration: 27806 ms , transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723) , new dirty pages: 195593 , remaining dirty pages: 197483
Iteration 4, duration: 23907 ms , transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664) , new dirty pages: 215066 , remaining dirty pages: 217006
Iteration 5, duration: 26305 ms , transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052) , new dirty pages: 111098 , remaining dirty pages: 112815
Iteration 6, duration: 13502 ms , transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637) , new dirty pages: 161054 , remaining dirty pages: 163417
Iteration 7, duration: 19705 ms , transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328) , new dirty pages: 220562 , remaining dirty pages: 222713
Iteration 8, duration: 27003 ms , transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139) , new dirty pages: 219787 , remaining dirty pages: 221619
Iteration 9, duration: 26802 ms , transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576) , new dirty pages: 207959 , remaining dirty pages: 210330
Iteration 10, duration: 25411 ms , transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930) , new dirty pages: 144442 , remaining dirty pages: 146856
Iteration 11, duration: 17714 ms , transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372) , new dirty pages: 205127 , remaining dirty pages: 207179
Iteration 12, duration: 25112 ms , transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121) , new dirty pages: 167319 , remaining dirty pages: 169052
Iteration 13, duration: 20411 ms , transferred pages: 166886 (dup: 14, rd: 21960, fd: 144912) , new dirty pages: 221592 , remaining dirty pages: 223758
Iteration 14, duration: 27126 ms , transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310) , new dirty pages: 233630 , remaining dirty pages: 235588
Iteration 15, duration: 28517 ms , transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602) , new dirty pages: 224282 , remaining dirty pages: 226549
Iteration 16, duration: 27422 ms , transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359) , new dirty pages: 209815 , remaining dirty pages: 212177
Iteration 17, duration: 25723 ms , transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821) , new dirty pages: 220297 , remaining dirty pages: 222214
Iteration 18, duration: 26920 ms , transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914) , new dirty pages: 192015 , remaining dirty pages: 194173
Iteration 19, duration: 23520 ms , transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090) , new dirty pages: 96450 , remaining dirty pages: 98384
Iteration 20, duration: 11805 ms , transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100) , new dirty pages: 6978 , remaining dirty pages: 8824
Iteration 21, duration: 799 ms , transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742) , new dirty pages: 138 , remaining dirty pages: 2417
Iteration 22, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2417
----- after optimization: -----
Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6831, rd: 259619) , new dirty pages: 240209 , remaining dirty pages: 240209
Iteration 2, duration: 6250 ms , transferred pages: 51244 (dup: 211, rd: 51033) , new dirty pages: 226651 , remaining dirty pages: 228571
Iteration 3, duration: 4395 ms , transferred pages: 36008 (dup: 80, rd: 35928) , new dirty pages: 110719 , remaining dirty pages: 111478
Iteration 4, duration: 3390 ms , transferred pages: 28068 (dup: 28, rd: 28040) , new dirty pages: 185172 , remaining dirty pages: 185172
Iteration 5, duration: 2986 ms , transferred pages: 23780 (dup: 45, rd: 23735) , new dirty pages: 64357 , remaining dirty pages: 66305
Iteration 6, duration: 2727 ms , transferred pages: 22800 (dup: 12, rd: 22788) , new dirty pages: 61675 , remaining dirty pages: 61675
Iteration 7, duration: 2372 ms , transferred pages: 18943 (dup: 13, rd: 18930) , new dirty pages: 55144 , remaining dirty pages: 55265
Iteration 8, duration: 2100 ms , transferred pages: 17189 (dup: 11, rd: 17178) , new dirty pages: 55244 , remaining dirty pages: 55668
Iteration 9, duration: 2003 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 107058 , remaining dirty pages: 108014
Iteration 10, duration: 2132 ms , transferred pages: 17825 (dup: 24, rd: 17801) , new dirty pages: 126214 , remaining dirty pages: 126214
Iteration 11, duration: 2229 ms , transferred pages: 18156 (dup: 22, rd: 18134) , new dirty pages: 65725 , remaining dirty pages: 65725
Iteration 12, duration: 2315 ms , transferred pages: 18651 (dup: 21, rd: 18630) , new dirty pages: 52575 , remaining dirty pages: 53903
Iteration 13, duration: 2147 ms , transferred pages: 17435 (dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260
Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266
Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792
Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639
Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170
Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 103547
Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521
Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054
Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624
Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265
Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813
Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608
Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311
Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294
Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936
Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423
Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292
Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457
Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304
Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815
Iteration 33, duration: 747 ms , transferred pages: 6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670
Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422
Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222
Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337
Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251
Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251
total time: 97919 milliseconds
------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)-------------------
5. idle
Iteration 1, duration: 14702 ms , transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0) , new dirty pages: 14595 , remaining dirty pages: 14595
Iteration 2, duration: 1592 ms , transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029) , new dirty pages: 218 , remaining dirty pages: 2401
Iteration 3, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2401
6. kernel compilation (can not converge)
Iteration 1, duration: 20607 ms , transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0) , new dirty pages: 19293 , remaining dirty pages: 19293
Iteration 2, duration: 2092 ms , transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954) , new dirty pages: 8318 , remaining dirty pages: 10435
Iteration 3, duration: 1000 ms , transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924) , new dirty pages: 8736 , remaining dirty pages: 10687
Iteration 4, duration: 1000 ms , transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091) , new dirty pages: 7627 , remaining dirty pages: 9879
Iteration 5, duration: 900 ms , transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760) , new dirty pages: 7287 , remaining dirty pages: 9613
Iteration 6, duration: 900 ms , transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601) , new dirty pages: 8958 , remaining dirty pages: 10951
Iteration 7, duration: 1099 ms , transferred pages: 9309 (dup: 311, rd: 8051, fd: 947) , new dirty pages: 7189 , remaining dirty pages: 8831
Iteration 8, duration: 800 ms , transferred pages: 6832 (dup: 288, rd: 5717, fd: 827) , new dirty pages: 5782 , remaining dirty pages: 7781
Iteration 9, duration: 701 ms , transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721) , new dirty pages: 4587 , remaining dirty pages: 6493
Iteration 10, duration: 500 ms , transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033) , new dirty pages: 7352 , remaining dirty pages: 9611
Iteration 11, duration: 900 ms , transferred pages: 7759 (dup: 397, rd: 6563, fd: 799) , new dirty pages: 6686 , remaining dirty pages: 8538
Iteration 12, duration: 800 ms , transferred pages: 6808 (dup: 264, rd: 6017, fd: 527) , new dirty pages: 6871 , remaining dirty pages: 8601
Iteration 13, duration: 800 ms , transferred pages: 6775 (dup: 231, rd: 5722, fd: 822) , new dirty pages: 7540 , remaining dirty pages: 9366
Iteration 14, duration: 900 ms , transferred pages: 7507 (dup: 145, rd: 5900, fd: 1462) , new dirty pages: 7581 , remaining dirty pages: 9440
Iteration 15, duration: 900 ms , transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151) , new dirty pages: 7268 , remaining dirty pages: 9078
Iteration 16, duration: 800 ms , transferred pages: 6759 (dup: 215, rd: 5763, fd: 781) , new dirty pages: 6861 , remaining dirty pages: 9180
Iteration 17, duration: 800 ms , transferred pages: 6838 (dup: 294, rd: 6037, fd: 507) , new dirty pages: 6196 , remaining dirty pages: 8538
Iteration 18, duration: 800 ms , transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639) , new dirty pages: 5947 , remaining dirty pages: 7633
Iteration 19, duration: 700 ms , transferred pages: 5919 (dup: 193, rd: 4853, fd: 873) , new dirty pages: 5861 , remaining dirty pages: 7575
Iteration 20, duration: 600 ms , transferred pages: 5284 (dup: 376, rd: 4408, fd: 500) , new dirty pages: 5206 , remaining dirty pages: 7497
Iteration 21, duration: 600 ms , transferred pages: 5147 (dup: 239, rd: 4308, fd: 600) , new dirty pages: 5031 , remaining dirty pages: 7381
Iteration 22, duration: 599 ms , transferred pages: 5064 (dup: 156, rd: 4026, fd: 882) , new dirty pages: 5601 , remaining dirty pages: 7918
Iteration 23, duration: 702 ms , transferred pages: 5965 (dup: 239, rd: 5028, fd: 698) , new dirty pages: 6079 , remaining dirty pages: 8032
Iteration 24, duration: 700 ms , transferred pages: 6175 (dup: 449, rd: 5146, fd: 580) , new dirty pages: 10932 , remaining dirty pages: 12789
Iteration 25, duration: 1300 ms , transferred pages: 10936 (dup: 302, rd: 6205, fd: 4429) , new dirty pages: 8713 , remaining dirty pages: 10566
Iteration 26, duration: 1000 ms , transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518) , new dirty pages: 5119 , remaining dirty pages: 7403
Iteration 27, duration: 600 ms , transferred pages: 5007 (dup: 99, rd: 4099, fd: 809) , new dirty pages: 2226 , remaining dirty pages: 4622
Iteration 28, duration: 300 ms , transferred pages: 2491 (dup: 37, rd: 1794, fd: 660) , new dirty pages: 6746 , remaining dirty pages: 8877
Iteration 29, duration: 800 ms , transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012) , new dirty pages: 6070 , remaining dirty pages: 8190
Iteration 30, duration: 700 ms , transferred pages: 6052 (dup: 326, rd: 5107, fd: 619) , new dirty pages: 5177 , remaining dirty pages: 7315
7. web server
Iteration 1, duration: 20902 ms , transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0) , new dirty pages: 8528 , remaining dirty pages: 8528
Iteration 2, duration: 796 ms , transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456) , new dirty pages: 650 , remaining dirty pages: 2706
Iteration 3, duration: 100 ms , transferred pages: 818 (dup: 0, rd: 383, fd: 435) , new dirty pages: 328 , remaining dirty pages: 2216
Iteration 4, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2216
8. cpu2006.bwaves (can not converge)
Iteration 1, duration: 31715 ms , transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0) , new dirty pages: 242702 , remaining dirty pages: 242702
Iteration 2, duration: 29397 ms , transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515) , new dirty pages: 230889 , remaining dirty pages: 233083
Iteration 3, duration: 28205 ms , transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080) , new dirty pages: 226998 , remaining dirty pages: 229223
Iteration 4, duration: 27805 ms , transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359) , new dirty pages: 227360 , remaining dirty pages: 229009
Iteration 5, duration: 27703 ms , transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456) , new dirty pages: 225885 , remaining dirty pages: 228108
Iteration 6, duration: 27600 ms , transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265) , new dirty pages: 223555 , remaining dirty pages: 225740
Iteration 7, duration: 27309 ms , transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673) , new dirty pages: 231975 , remaining dirty pages: 234141
Iteration 8, duration: 28403 ms , transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226) , new dirty pages: 222170 , remaining dirty pages: 223914
Iteration 9, duration: 27105 ms , transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690) , new dirty pages: 230065 , remaining dirty pages: 232170
Iteration 10, duration: 28104 ms , transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327) , new dirty pages: 227590 , remaining dirty pages: 229559
Iteration 11, duration: 27801 ms , transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996) , new dirty pages: 228457 , remaining dirty pages: 230299
Iteration 12, duration: 27916 ms , transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562) , new dirty pages: 238326 , remaining dirty pages: 240065
9. cpu2006.lbm (can not converge)
Iteration 1, duration: 31012 ms , transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0) , new dirty pages: 108960 , remaining dirty pages: 108960
Iteration 2, duration: 13095 ms , transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474) , new dirty pages: 129292 , remaining dirty pages: 131730
Iteration 3, duration: 15802 ms , transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384) , new dirty pages: 116682 , remaining dirty pages: 118724
Iteration 4, duration: 14204 ms , transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205) , new dirty pages: 107246 , remaining dirty pages: 109654
Iteration 5, duration: 13208 ms , transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142) , new dirty pages: 105371 , remaining dirty pages: 107048
Iteration 6, duration: 12804 ms , transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075) , new dirty pages: 103841 , remaining dirty pages: 106184
Iteration 7, duration: 12709 ms , transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674) , new dirty pages: 106692 , remaining dirty pages: 108985
Iteration 8, duration: 13105 ms , transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033) , new dirty pages: 103132 , remaining dirty pages: 104948
Iteration 9, duration: 12607 ms , transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608) , new dirty pages: 102511 , remaining dirty pages: 104391
Iteration 10, duration: 12514 ms , transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156) , new dirty pages: 102888 , remaining dirty pages: 105029
10. cpu2006.astar (can not converge)
Iteration 1, duration: 28402 ms , transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0) , new dirty pages: 62078 , remaining dirty pages: 62078
Iteration 2, duration: 7393 ms , transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375) , new dirty pages: 48854 , remaining dirty pages: 50825
Iteration 3, duration: 6001 ms , transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540) , new dirty pages: 48137 , remaining dirty pages: 49868
Iteration 4, duration: 5800 ms , transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055) , new dirty pages: 49147 , remaining dirty pages: 51571
Iteration 5, duration: 6102 ms , transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682) , new dirty pages: 55606 , remaining dirty pages: 57265
Iteration 6, duration: 6699 ms , transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061) , new dirty pages: 9166 , remaining dirty pages: 11482
Iteration 7, duration: 1200 ms , transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805) , new dirty pages: 8294 , remaining dirty pages: 9946
Iteration 8, duration: 1000 ms , transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002) , new dirty pages: 5475 , remaining dirty pages: 7227
Iteration 9, duration: 600 ms , transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438) , new dirty pages: 4175 , remaining dirty pages: 6494
Iteration 10, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3856, fd: 234) , new dirty pages: 4095 , remaining dirty pages: 6499
Iteration 11, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3313, fd: 777) , new dirty pages: 3371 , remaining dirty pages: 5780
Iteration 12, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3823, fd: 267) , new dirty pages: 7518 , remaining dirty pages: 9208
Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763
Iteration 14, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 4346 , remaining dirty pages: 6019
Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983
Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585
Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598
Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775
Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964
Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703
11. cpu2006.xalancbmk (can not converge)
Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169
Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771
Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869
Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589
Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181
Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022
Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new dirty pages: 64880 , remaining dirty pages: 67179
Iteration 8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586
Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320
Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286
12. cpu2006.milc (can not converge)
Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860
Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050
Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723
Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513
Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008
Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416
Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577
Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422
Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720
Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747
13. cpu2006.cactusADM (can not converge)
Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869
Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235
Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273
Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416
Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425
Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664
Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952
Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553
Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273
Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928
14. cpu2006.GemsFDTD (can not converge)
Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802
Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064
Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783
Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063
Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576
Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618
Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672
Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147
15. cpu2006.wrf (can not converge)
Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392
Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222
Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892
Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946
Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760
Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119
Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844
Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996
Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699
Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905
>
> > So I think "booting" and "kernel compilation" should benefit a lot from this
> > improvement. The reason "kernel compilation" would benefit is that some
> > iterations take around 600ms, and if they are halved to 300ms, then the
> > precopy may have the chance to step into the stop-and-copy phase.
> >
> > On the other hand, "idle" and "web server" would not benefit a lot, because
> > most of the time is spent on the 1st iteration and little on the others.
> >
> > As to "zeusmp" and "memcached", although the time spent on the iterations
> > other than the 1st one may be halved, they still could not converge to
> > stop-and-copy with the 300ms downtime.
> >
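The convergence behavior described above follows from a simple downtime bound. Here is a toy check (Python; the formula is a simplification of QEMU's real admission test, which also tracks the dirty rate, and the page counts are taken from the logs in this thread):

```python
# Toy convergence check: precopy can enter the stop-and-copy phase once the
# remaining dirty data can be sent within the downtime budget.
PAGE_SIZE = 4096

def can_stop(remaining_dirty_pages, bandwidth_bytes_per_s, downtime_s):
    return remaining_dirty_pages * PAGE_SIZE / bandwidth_bytes_per_s <= downtime_s

BW = 32 * 1024 * 1024   # the default 32MB/s bandwidth used in these runs

# "booting" ends with 2430 remaining pages: ~0.297s to send, so it converges.
assert can_stop(2430, BW, 0.3)
# zeusmp hovers around ~120k remaining pages: ~15s, far over a 300ms downtime.
assert not can_stop(122727, BW, 0.3)
```

Halving the per-iteration time does not change this bound directly, but it shrinks the window in which new pages can be dirtied, which is the chain reaction described in the thread.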
> > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> >
> > 1. booting : begin to migrate when the VM is booting
> >
> > Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
> > Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
> > Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
> > Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
> > Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
> > (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into stop-copy phase now.
> > "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> >
> > 2. idle
> >
> > Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
> > Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
> > Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
> > Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
> >
> > 3. kernel compilation (can not converge)
> >
> > Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
> > Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
> > Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
> > Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
> > Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
> > Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
> > Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
> > Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
> > Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
> > Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
> > Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
> > Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
> > Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
> > Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
> > Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
> >
> > 4. cpu2006.zeusmp (can not converge)
> >
> > Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
> > Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
> > Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
> > Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
> > Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
> > Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
> > Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
> > Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
> > Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
> > Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
> > Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
> > Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
> > Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
> > Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
> > Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
> >
> > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> >
> > Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
> > Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
> > Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
> > Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
> > Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
> > Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
> >
> > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> >
> > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> >
> > Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
> > Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
> > Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
> > Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
> > Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
> > Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
> > Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
> > Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
> > Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
> > Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
> > Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
> > Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
> > Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
> > Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
> > Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
> > Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
> > Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
> > Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
> > Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
> > Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
> >
> >
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
> >
> >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-03 8:25 ` Chunguang Li
@ 2016-11-03 9:59 ` Li, Liang Z
2016-11-03 10:13 ` Li, Liang Z
2016-11-08 11:05 ` Dr. David Alan Gilbert
2 siblings, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03 9:59 UTC (permalink / raw)
To: Chunguang Li, Dr. David Alan Gilbert
Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela
> pages will be sent. Before that during the migration setup, the
> ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce
> the dirty bitmap from this moment. When the pages "that haven't been
> sent" are written, the kernel space marks them as dirty. However I don't
> think this is correct, because these pages will be sent during this and the next
> iterations with the same content (if they are not written again after they are
> sent). It only makes sense to mark the pages which have already been sent
> during one iteration as dirty when they are written.
> > > > > > >
> > > > > > >
> > > > > > > Am I right about this consideration? If I am right, is there some
> advice to improve this?
> > > > > >
> > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > case you're talking about is:
> > > > > >
> > > > > > Iteration 1
> > > > > > sync bitmap
> > > > > > start sending pages
> > > > > > page 'n' is modified - but hasn't been sent yet
> > > > > > page 'n' gets sent
> > > > > > Iteration 2
> > > > > > sync bitmap
> > > > > > 'page n is shown as modified'
> > > > > > send page 'n' again
> > > > > >
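The quoted sequence can be put into a tiny runnable model (Python; the page store and bitmap API here are invented for illustration, this is not QEMU code):

```python
# Model of the sequence above: page 1 is written after the iteration-1 bitmap
# sync but before it is sent, so iteration 2 sees it dirty even though the
# content that went on the wire is already up to date.
memory = {0: b"A", 1: b"B", 2: b"C"}   # guest pages (toy contents)
kernel_dirty = set()                    # pages written since the last sync

def guest_write(page, data):
    memory[page] = data
    kernel_dirty.add(page)

def sync_bitmap():
    dirty = set(kernel_dirty)
    kernel_dirty.clear()
    return dirty

# Iteration 1: sync, then send every page; page 1 is modified BEFORE sending.
sync_bitmap()
guest_write(1, b"B2")                   # modified, but not sent yet
sent = dict(memory)                     # the send loop picks up b"B2"

# Iteration 2: page 1 is reported dirty although identical content was sent.
dirty2 = sync_bitmap()
false_dirty = {p for p in dirty2 if sent[p] == memory[p]}
print(dirty2, false_dirty)              # prints: {1} {1}
```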
> > > > >
> > > > > Yes, this is exactly the case I am talking about.
> > > > >
> > > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > is large.
> > > > >
> > > > > I think this is "very" wasteful. Assume the workload writes the pages
> dirty randomly within the guest address space, and the transfer speed is
> constant. Intuitively, I think nearly half of the dirty pages produced in
> Iteration 1 are not really dirty. This means the time of Iteration 2 is double
> what it would take to send only the really dirty pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most? That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > >
> > > > Amit
> > >
> > > I have picked up 6 workloads and got the following statistics numbers
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without
> > > the capabilities like xbzrle or compression, etc. The network for the
> > > migration is exclusive, with a separate network for the workloads.
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three (booting, idle, web server) of them converged to the stop-copy
> phase,
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty" if it is written first and is sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen so often during the other iterations as during the 1st
> > > iteration. Because all the pages of the VM are sent to the dest node
> during
> > > the 1st iteration, while during the others, only part of the pages are sent.
> > > So I think the "not-really-dirty" pages should be produced mainly during
> > > the 1st iteration , and maybe very little during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain
> reaction,
> > > because the dirty pages produced during Iteration 2 is halved, which
> incurs
> > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied, if the page is
> dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> >
> > However, if the page is being written a few times, and so it would have
> been written
> > after the network write then it isn't a false-dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the
> dirtying
> > happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell exactly how many false dirty pages there are in each iteration, I
> malloc a buffer in memory as big as the whole VM memory. When a page is
> transferred to the dest node, it is copied into the buffer. During the next
> iteration, if a page is transferred, it is compared to the old copy in the
> buffer, and the old copy is replaced for the next comparison if the page is
> really dirty. Thus, we are now able to get the exact number of false dirty
> pages.
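A minimal sketch of that measurement (Python; the class and method names are my own, not the instrumented QEMU code):

```python
class FalseDirtyCounter:
    # Keep a shadow copy of every transferred page; a retransmitted page
    # whose content matches the shadow copy is counted as "false dirty".
    def __init__(self):
        self.shadow = {}        # page number -> content last transferred
        self.false_dirty = 0
        self.really_dirty = 0

    def on_transfer(self, page, content):
        old = self.shadow.get(page)
        if old is None:
            self.shadow[page] = content     # first transfer: just record
        elif old == content:
            self.false_dirty += 1           # same bytes resent
        else:
            self.really_dirty += 1
            self.shadow[page] = content     # replace for the next comparison

c = FalseDirtyCounter()
c.on_transfer(7, b"x" * 4096)   # iteration 1
c.on_transfer(7, b"x" * 4096)   # iteration 2: unchanged -> false dirty
c.on_transfer(7, b"y" * 4096)   # iteration 3: changed -> really dirty
print(c.false_dirty, c.really_dirty)   # prints: 1 1
```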
>
> This time, I use 15 workloads to get the statistics. They are:
>
> 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all
> scientific computing workloads, like Quantum Chromodynamics, Fluid
> Dynamics, etc. I pick these 11 benchmarks because, compared to the others,
> they have bigger memory footprints and higher memory dirty rates. Thus,
> most of them cannot converge to stop-and-copy at the default migration
> speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server which serves static content
>
> (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the
> migration speed is the default 32MB/s)
>
> 5. Memcached. The VM has 6 cpu cores and 6GB memory, of which 4GB are used
> as the cache. After filling up the 4GB cache, a client writes the cache at
> a constant speed during migration. This time, the migration speed has no
> limit, and is up to the capability of 1Gbps Ethernet.
>
> To summarize the results first (the precise numbers can be read below):
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> some iterations) of false dirty pages out of all the dirty pages from
> iteration 2 onward (and the big proportion lasts through the following
> iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and
> memcached.
> 2. 2 workloads (idle, web server) spend most of the migration time on
> iteration 1; even though the proportion of false dirty pages is big from
> iteration 2 onward, the room for optimization is small.
> 3. 1 workload (kernel compilation) only has a big proportion during
> iteration 2, not in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion
> of false dirty pages from iteration 2 onward, so the room to optimize them
> is small.
>
> Now I want to talk a little more about the reasons why false dirty pages
> are produced.
> The first reason is what we have discussed before: the mechanism used to
> track dirty pages.
> Then I come up with another reason. Here is the situation: a write
> operation to a memory page happens, but it does not change any content of
> the page. So it is "write but not dirty", and the kernel still marks the
> page as dirty. One guy in our lab has done some experiments to figure out
> the proportion of "write but not dirty" operations, using the cpu2006
> benchmark suite. According to his results, general workloads have a small
> proportion (<10%) of "write but not dirty" operations out of all write
> operations, while a few workloads have a higher proportion (one even as
> high as 50%). We are not yet sure why "write but not dirty" happens; it
> just does.
>
> So these two reasons contribute to the false dirty pages. To optimize, I
> compute and store the SHA1 hash before transferring each page. Next time,
> if a page needs retransmission, its SHA1 hash is computed again and
> compared to the old hash. If the hashes are the same, it is a false dirty
> page and we just skip it; otherwise, the page is transferred, and the new
> hash replaces the old one for the next comparison.
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the
> memory overhead. One SHA1 hash is 20 bytes, so we need extra memory equal
> to 20/4096 (<1/200) of the whole VM memory, which is relatively small.
> As far as I know, SHA1 hashes are widely used in deduplication for backup
> systems. It has been shown there that the probability of a hash collision
> is far smaller than that of a disk hardware fault, so it is a secure hash;
> that is, if the hashes of two chunks are the same, the contents must be
> the same. So I think the SHA1 hash can replace byte-by-byte comparison in
> the VM memory scenario.
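The skip test described above can be sketched as follows (Python; `should_send` and the per-page hash table are illustrative names, not the actual patch):

```python
import hashlib

PAGE_SIZE = 4096
hashes = {}   # page number -> 20-byte SHA1 digest of the content last sent

def should_send(page, content):
    # Skip a page flagged dirty whose SHA1 matches the hash of what was
    # already sent, i.e. a false dirty page.
    h = hashlib.sha1(content).digest()
    if hashes.get(page) == h:
        return False              # false dirty: identical content, skip
    hashes[page] = h              # record for the next comparison
    return True

assert should_send(0, b"a" * PAGE_SIZE) is True    # first send
assert should_send(0, b"a" * PAGE_SIZE) is False   # rewritten but unchanged
assert should_send(0, b"b" * PAGE_SIZE) is True    # really dirty

# Memory overhead: 20 hash bytes per 4096-byte page, under 0.5% of guest RAM.
overhead = 20 / PAGE_SIZE
print(f"{overhead:.4%}")   # prints: 0.4883%
```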
>
> Then I run the same migration experiments using the SHA1 hash. For the 4
> workloads which have big proportions of false dirty pages, the improvement
> is remarkable. Without the optimization, they either cannot converge to
> stop-and-copy, or take a very long time to complete. With the SHA1 hash
> method, all of them now complete in a relatively short time.
> For the reason discussed above, the other workloads do not get notable
> improvements from the optimization. So below, I only show the exact
> numbers after optimization for the 4 workloads with remarkable
> improvements.
>
> Any comments or suggestions?
>
It seems the current XBZRLE feature can be used to solve the false-dirty issue, no?
Liang
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-03 8:25 ` Chunguang Li
2016-11-03 9:59 ` Li, Liang Z
@ 2016-11-03 10:13 ` Li, Liang Z
2016-11-04 3:07 ` Chunguang Li
2016-11-08 11:05 ` Dr. David Alan Gilbert
2 siblings, 1 reply; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03 10:13 UTC (permalink / raw)
To: Chunguang Li, Dr. David Alan Gilbert
Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela
> > > > > I think this is "very" wasteful. Assume the workload writes the pages
> dirty randomly within the guest address space, and the transfer speed is
> constant. Intuitively, I think nearly half of the dirty pages produced in
> Iteration 1 are not really dirty. This means the time of Iteration 2 is double
> what it would take to send only the really dirty pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most? That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > >
> > > > Amit
> > >
> > > I have picked up 6 workloads and got the following statistics numbers
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without
> > > the capabilities like xbzrle or compression, etc. The network for the
> > > migration is exclusive, with a separate network for the workloads.
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three (booting, idle, web server) of them converged to the stop-copy
> phase,
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty", if it is written first and is sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen so often during the other iterations as during the 1st
> > > iteration. Because all the pages of the VM are sent to the dest node
> during
> > > the 1st iteration, while during the others, only part of the pages are sent.
> > > So I think the "not-really-dirty" pages should be produced mainly during
> > > the 1st iteration , and maybe very little during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain
> reaction,
> > > because the dirty pages produced during Iteration 2 is halved, which
> incurs
> > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied, if the page is
> dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> >
> > However, if the page is being written a few times, and so it would have
> been written
> > after the network write then it isn't a false-dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the
> dirtying
> > happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell how many false dirty pages there are exactly in each iteration, I malloc
> a
> buffer in memory as big as the size of the whole VM memory. When a page
> is
> transferred to the dest node, it is copied to the buffer; During the next
> iteration,
> if one page is transferred, it is compared to the old one in the buffer, and the
> old one will be replaced for next comparison if it is really dirty. Thus, we are
> now
> able to get the exact number of false dirty pages.
>
> This time, I use 15 workloads to get the statistic number. They are:
>
> 1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all
> scientific
> computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc.
> I pick
> up these 11 benchmarks because compared to others, they have bigger
> memory
> occupation and higher memory dirty rate. Thus most of them could not
> converge
> to stop-and-copy using the default migration speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server which serves static content
>
> (the above workloads are all running in VM with 1 vcpu and 1GB memory,
> and the
> migration speed is the default 32MB/s)
>
> 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used
> as the cache.
> After filling up the 4GB cache, a client writes the cache at a constant speed
> during migration. This time, migration speed has no limit, and is up to the
> capability of 1Gbps Ethernet.
>
> Summarize the results first: (and you can read the precise number below)
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> some iterations)
> of false dirty pages out of all the dirty pages since iteration 2 (and the big
> proportion lasts during the following iterations). They are cpu2006.zeusmp,
> cpu2006.bzip2, cpu2006.mcf, and memcached.
> 2. 2 workloads (idle, webserver) spend most of the migration time on
> iteration 1, even
> though the proportion of false dirty pages is big since iteration 2, the space
> to
> optimize is small.
> 3. 1 workload (kernel compilation) only have a big proportion during
> iteration 2, not
> in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of
> false
> dirty pages since iteration 2. So the spaces to optimize for them are small.
>
> Now I want to talk a little more about the reasons why false dirty pages are
> produced.
> The first reason is what we have discussed before---the mechanism to track
> the dirty
> pages.
> And then I come up with another reason. Here is the situation: a write
> operation to one
> memory page happens, but it doesn't change any content of the page. So it's
> "write but
> not dirty", and kernel still marks it as dirty. One guy in our lab has done some
> experiments
> to figure out the proportion of "write but not dirty" operations, and he uses
> the cpu2006
> benchmark suit. According to his results, general workloads has a little
> proportion (<10%)
> of "write but not dirty" out of all the write operations, while few workloads
> have higher
> proportion (one even as high as 50%). Now we are not sure why "write but
> not dirty" would
> happen, it just happened.
>
> So these two reasons contribute to the false dirty pages. To optimize, I
> compute and store
> the SHA1 hash before transferring each page. Next time, if one page needs
> retransmission, its
> SHA1 hash is computed again, and compared to the old hash. If the hash is
> the same, it's a
> false dirty page, and we just skip this page; Otherwise, the page is
> transferred, and the new
> hash replaces the old one for next comparison.
> The reason to use SHA1 hash but not byte-by-byte comparison is the
> memory overheads. One SHA1
> hash is 20 bytes. So we need extra 20/4096 (<1/200) memory space of the
> whole VM memory, which
> is relatively small.
> As far as I know, SHA1 hash is widely used in the scenes of deduplication for
> backup systems.
> They have proven that the probability of hash collision is far smaller than disk
> hardware fault,
> so it's secure hash, that is, if the hashes of two chunks are the same, the
> content must be the
> same. So I think the SHA1 hash could replace byte-to-byte comparison in the
> VM memory scenery.
>
> Then I do the same migration experiments using the SHA1 hash. For the 4
> workloads which have
> big proportions of false dirty pages, the improvement is remarkable. Without
> optimization,
> they either can not converge to stop-and-copy, or take a very long time to
> complete. With the
> SHA1 hash method, all of them now complete in a relatively short time.
> For the reason I have talked above, the other workloads don't get notable
> improvements from the
> optimization. So below, I only show the exact number after optimization for
> the 4 workloads with
> remarkable improvements.
>
> Any comments or suggestions?
Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better.
The merit of using SHA1 is that it avoids the data copy that XBZRLE does, and needs less buffer.
How about the overhead of calculating the SHA1? Is it faster than copying a page?
Liang
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-03 10:13 ` Li, Liang Z
@ 2016-11-04 3:07 ` Chunguang Li
2016-11-04 4:50 ` Li, Liang Z
0 siblings, 1 reply; 21+ messages in thread
From: Chunguang Li @ 2016-11-04 3:07 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Thursday, November 3, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> > > > > > I think this is "very" wasteful. Assume the workload writes the pages
> > dirty randomly within the guest address space, and the transfer speed is
> > constant. Intuitively, I think nearly half of the dirty pages produced in
> > Iteration 1 is not really dirty. This means the time of Iteration 2 is double of
> > that to send only really dirty pages.
> > > > >
> > > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > > workloads get impacted the most? That would also help us to figure
> > > > > out what kinds of speed improvements we can expect.
> > > > >
> > > > >
> > > > > Amit
> > > >
> > > > I have picked up 6 workloads and got the following statistics numbers
> > > > of every iteration (except the last stop-copy one) during precopy.
> > > > These numbers are obtained with the basic precopy migration, without
> > > > the capabilities like xbzrle or compression, etc. The network for the
> > > > migration is exclusive, with a separate network for the workloads.
> > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > >
> > > > Three (booting, idle, web server) of them converged to the stop-copy
> > phase,
> > > > with the given bandwidth and default downtime (300ms), while the other
> > > > three (kernel compilation, zeusmp, memcached) did not.
> > > >
> > > > One page is "not-really-dirty", if it is written first and is sent later
> > > > (and not written again after that) during one iteration. I guess this
> > > > would not happen so often during the other iterations as during the 1st
> > > > iteration. Because all the pages of the VM are sent to the dest node
> > during
> > > > the 1st iteration, while during the others, only part of the pages are sent.
> > > > So I think the "not-really-dirty" pages should be produced mainly during
> > > > the 1st iteration , and maybe very little during the other iterations.
> > > >
> > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > > think the time spent on Iteration 2 would be halved. This is a chain
> > reaction,
> > > > because the dirty pages produced during Iteration 2 is halved, which
> > incurs
> > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > >
> > > Yes; these numbers don't show how many of them are false dirty though.
> > >
> > > One problem is thinking about pages that have been redirtied, if the page is
> > dirtied
> > > after the sync but before the network write then it's the false-dirty that
> > > you're describing.
> > >
> > > However, if the page is being written a few times, and so it would have
> > been written
> > > after the network write then it isn't a false-dirty.
> > >
> > > You might be able to figure that out with some kernel tracing of when the
> > dirtying
> > > happens, but it might be easier to write the fix!
> > >
> > > Dave
> >
> > Hi, I have made some new progress now.
> >
> > To tell how many false dirty pages there are exactly in each iteration, I malloc
> > a
> > buffer in memory as big as the size of the whole VM memory. When a page
> > is
> > transferred to the dest node, it is copied to the buffer; During the next
> > iteration,
> > if one page is transferred, it is compared to the old one in the buffer, and the
> > old one will be replaced for next comparison if it is really dirty. Thus, we are
> > now
> > able to get the exact number of false dirty pages.
> >
> > This time, I use 15 workloads to get the statistic number. They are:
> >
> > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all
> > scientific
> > computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc.
> > I pick
> > up these 11 benchmarks because compared to others, they have bigger
> > memory
> > occupation and higher memory dirty rate. Thus most of them could not
> > converge
> > to stop-and-copy using the default migration speed (32MB/s).
> > 2. kernel compilation
> > 3. idle VM
> > 4. Apache web server which serves static content
> >
> > (the above workloads are all running in VM with 1 vcpu and 1GB memory,
> > and the
> > migration speed is the default 32MB/s)
> >
> > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used
> > as the cache.
> > After filling up the 4GB cache, a client writes the cache at a constant speed
> > during migration. This time, migration speed has no limit, and is up to the
> > capability of 1Gbps Ethernet.
> >
> > Summarize the results first: (and you can read the precise number below)
> >
> > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> > some iterations)
> > of false dirty pages out of all the dirty pages since iteration 2 (and the big
> > proportion lasts during the following iterations). They are cpu2006.zeusmp,
> > cpu2006.bzip2, cpu2006.mcf, and memcached.
> > 2. 2 workloads (idle, webserver) spend most of the migration time on
> > iteration 1, even
> > though the proportion of false dirty pages is big since iteration 2, the space
> > to
> > optimize is small.
> > 3. 1 workload (kernel compilation) only have a big proportion during
> > iteration 2, not
> > in the other iterations.
> > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of
> > false
> > dirty pages since iteration 2. So the spaces to optimize for them are small.
> >
> > Now I want to talk a little more about the reasons why false dirty pages are
> > produced.
> > The first reason is what we have discussed before---the mechanism to track
> > the dirty
> > pages.
> > And then I come up with another reason. Here is the situation: a write
> > operation to one
> > memory page happens, but it doesn't change any content of the page. So it's
> > "write but
> > not dirty", and kernel still marks it as dirty. One guy in our lab has done some
> > experiments
> > to figure out the proportion of "write but not dirty" operations, and he uses
> > the cpu2006
> > benchmark suit. According to his results, general workloads has a little
> > proportion (<10%)
> > of "write but not dirty" out of all the write operations, while few workloads
> > have higher
> > proportion (one even as high as 50%). Now we are not sure why "write but
> > not dirty" would
> > happen, it just happened.
> >
> > So these two reasons contribute to the false dirty pages. To optimize, I
> > compute and store
> > the SHA1 hash before transferring each page. Next time, if one page needs
> > retransmission, its
> > SHA1 hash is computed again, and compared to the old hash. If the hash is
> > the same, it's a
> > false dirty page, and we just skip this page; Otherwise, the page is
> > transferred, and the new
> > hash replaces the old one for next comparison.
> > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > memory overheads. One SHA1
> > hash is 20 bytes. So we need extra 20/4096 (<1/200) memory space of the
> > whole VM memory, which
> > is relatively small.
> > As far as I know, SHA1 hash is widely used in the scenes of deduplication for
> > backup systems.
> > They have proven that the probability of hash collision is far smaller than disk
> > hardware fault,
> > so it's secure hash, that is, if the hashes of two chunks are the same, the
> > content must be the
> > same. So I think the SHA1 hash could replace byte-to-byte comparison in the
> > VM memory scenery.
> >
> > Then I do the same migration experiments using the SHA1 hash. For the 4
> > workloads which have
> > big proportions of false dirty pages, the improvement is remarkable. Without
> > optimization,
> > they either can not converge to stop-and-copy, or take a very long time to
> > complete. With the
> > SHA1 hash method, all of them now complete in a relatively short time.
> > For the reason I have talked above, the other workloads don't get notable
> > improvements from the
> > optimization. So below, I only show the exact number after optimization for
> > the 4 workloads with
> > remarkable improvements.
> >
> > Any comments or suggestions?
>
> Maybe you can compare the performance of your solution as that of XBZRLE to see which one is better.
> The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and need less buffer.
> How about the overhead of calculating the SHA1? Is it faster than copying a page?
>
> Liang
>
>
Yes, XBZRLE is able to handle the false dirty pages. However, if we want to avoid
transferring all of the false dirty pages using XBZRLE, we need a buffer as big as
the whole VM memory, while SHA1 needs a much smaller buffer. Of course, if we
had a buffer as big as the whole VM memory for XBZRLE, we could transfer less data
over the network than with SHA1, because XBZRLE is able to compress similar pages.
In a word, yes, the merit of using SHA1 is that it needs much less buffer, and it
leads to a nice improvement if there are many false dirty pages.
In terms of the overhead of calculating the SHA1 compared with transferring a page,
it's related to the CPU and network performance. In my test environment (Intel Xeon
E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing overhead
caused by calculating the SHA1, because the throughput of the network (obtained via
"info migrate") remains almost the same.
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-04 3:07 ` Chunguang Li
@ 2016-11-04 4:50 ` Li, Liang Z
2016-11-04 7:03 ` Chunguang Li
2016-11-07 13:52 ` Chunguang Li
0 siblings, 2 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-04 4:50 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > the pages
> > > dirty randomly within the guest address space, and the transfer
> > > speed is constant. Intuitively, I think nearly half of the dirty
> > > pages produced in Iteration 1 is not really dirty. This means the
> > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > >
> > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > kinds of workloads get impacted the most? That would also
> > > > > > help us to figure out what kinds of speed improvements we can
> expect.
> > > > > >
> > > > > >
> > > > > > Amit
> > > > >
> > > > > I have picked up 6 workloads and got the following statistics
> > > > > numbers of every iteration (except the last stop-copy one) during
> precopy.
> > > > > These numbers are obtained with the basic precopy migration,
> > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > network for the migration is exclusive, with a separate network for
> the workloads.
> > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > >
> > > > > Three (booting, idle, web server) of them converged to the
> > > > > stop-copy
> > > phase,
> > > > > with the given bandwidth and default downtime (300ms), while the
> > > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > > >
> > > > > One page is "not-really-dirty", if it is written first and is
> > > > > sent later (and not written again after that) during one
> > > > > iteration. I guess this would not happen so often during the
> > > > > other iterations as during the 1st iteration. Because all the
> > > > > pages of the VM are sent to the dest node
> > > during
> > > > > the 1st iteration, while during the others, only part of the pages are
> sent.
> > > > > So I think the "not-really-dirty" pages should be produced
> > > > > mainly during the 1st iteration , and maybe very little during the other
> iterations.
> > > > >
> > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > halved. This is a chain
> > > reaction,
> > > > > because the dirty pages produced during Iteration 2 is halved,
> > > > > which
> > > incurs
> > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > >
> > > > Yes; these numbers don't show how many of them are false dirty
> though.
> > > >
> > > > One problem is thinking about pages that have been redirtied, if
> > > > the page is
> > > dirtied
> > > > after the sync but before the network write then it's the
> > > > false-dirty that you're describing.
> > > >
> > > > However, if the page is being written a few times, and so it would
> > > > have
> > > been written
> > > > after the network write then it isn't a false-dirty.
> > > >
> > > > You might be able to figure that out with some kernel tracing of
> > > > when the
> > > dirtying
> > > > happens, but it might be easier to write the fix!
> > > >
> > > > Dave
> > >
> > > Hi, I have made some new progress now.
> > >
> > > To tell how many false dirty pages there are exactly in each
> > > iteration, I malloc a buffer in memory as big as the size of the
> > > whole VM memory. When a page is transferred to the dest node, it is
> > > copied to the buffer; During the next iteration, if one page is
> > > transferred, it is compared to the old one in the buffer, and the
> > > old one will be replaced for next comparison if it is really dirty.
> > > Thus, we are now able to get the exact number of false dirty pages.
> > >
> > > This time, I use 15 workloads to get the statistic number. They are:
> > >
> > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They are
> > > all scientific
> > > computing workloads like Quantum Chromodynamics, Fluid Dynamics,
> etc.
> > > I pick
> > > up these 11 benchmarks because compared to others, they have
> > > bigger memory
> > > occupation and higher memory dirty rate. Thus most of them
> > > could not converge
> > > to stop-and-copy using the default migration speed (32MB/s).
> > > 2. kernel compilation
> > > 3. idle VM
> > > 4. Apache web server which serves static content
> > >
> > > (the above workloads are all running in VM with 1 vcpu and 1GB
> > > memory, and the
> > > migration speed is the default 32MB/s)
> > >
> > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > used as the cache.
> > > After filling up the 4GB cache, a client writes the cache at a constant
> speed
> > > during migration. This time, migration speed has no limit, and is up to
> the
> > > capability of 1Gbps Ethernet.
> > >
> > > Summarize the results first: (and you can read the precise number
> > > below)
> > >
> > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > during some iterations)
> > > of false dirty pages out of all the dirty pages since iteration 2 (and the
> big
> > > proportion lasts during the following iterations). They are
> cpu2006.zeusmp,
> > > cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > 2. 2 workloads (idle, webserver) spend most of the migration time
> > > on iteration 1, even
> > > though the proportion of false dirty pages is big since
> > > iteration 2, the space to
> > > optimize is small.
> > > 3. 1 workload (kernel compilation) only have a big proportion
> > > during iteration 2, not
> > > in the other iterations.
> > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > proportion of false
> > > dirty pages since iteration 2. So the spaces to optimize for them are
> small.
> > >
> > > Now I want to talk a little more about the reasons why false dirty
> > > pages are produced.
> > > The first reason is what we have discussed before---the mechanism to
> > > track the dirty pages.
> > > And then I come up with another reason. Here is the situation: a
> > > write operation to one memory page happens, but it doesn't change
> > > any content of the page. So it's "write but not dirty", and kernel
> > > still marks it as dirty. One guy in our lab has done some
> > > experiments to figure out the proportion of "write but not dirty"
> > > operations, and he uses the cpu2006 benchmark suit. According to his
> > > results, general workloads has a little proportion (<10%) of "write
> > > but not dirty" out of all the write operations, while few workloads
> > > have higher proportion (one even as high as 50%). Now we are not
> > > sure why "write but not dirty" would happen, it just happened.
> > >
> > > So these two reasons contribute to the false dirty pages. To
> > > optimize, I compute and store the SHA1 hash before transferring each
> > > page. Next time, if one page needs retransmission, its
> > > SHA1 hash is computed again, and compared to the old hash. If the
> > > hash is the same, it's a false dirty page, and we just skip this
> > > page; Otherwise, the page is transferred, and the new hash replaces
> > > the old one for next comparison.
> > > The reason to use SHA1 hash but not byte-by-byte comparison is the
> > > memory overheads. One SHA1 hash is 20 bytes. So we need extra
> > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > relatively small.
> > > As far as I know, SHA1 hash is widely used in the scenes of
> > > deduplication for backup systems.
> > > They have proven that the probability of hash collision is far
> > > smaller than disk hardware fault, so it's secure hash, that is, if
> > > the hashes of two chunks are the same, the content must be the same.
> > > So I think the SHA1 hash could replace byte-to-byte comparison in
> > > the VM memory scenery.
> > >
> > > Then I do the same migration experiments using the SHA1 hash. For
> > > the 4 workloads which have big proportions of false dirty pages, the
> > > improvement is remarkable. Without optimization, they either can not
> > > converge to stop-and-copy, or take a very long time to complete.
> > > With the
> > > SHA1 hash method, all of them now complete in a relatively short time.
> > > For the reason I have talked above, the other workloads don't get
> > > notable improvements from the optimization. So below, I only show
> > > the exact number after optimization for the 4 workloads with
> > > remarkable improvements.
> > >
> > > Any comments or suggestions?
> >
> > Maybe you can compare the performance of your solution as that of
> XBZRLE to see which one is better.
> > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and
> need less buffer.
> > How about the overhead of calculating the SHA1? Is it faster than copying a
> page?
> >
> > Liang
> >
> >
>
> Yes, XBZRLE is able to handle the false dirty pages. However, if we want to
> avoid transferring all of the false dirty pages using XBZRLE, we need a buffer
> as big as the whole VM memory, while SHA1 needs a much small buffer. Of
> course, if we have a buffer as big as the whole VM memory using XBZRLE, we
> could transfer less data on network than SHA1, because XBZRLE is able to
> compress similar pages. In a word, yes, the merit of using SHA1 is that it
> needs much less buffer, and leads to nice improvement if there are many
> false dirty pages.
>
The current implementation of XBZRLE begins to buffer pages from the second iteration.
Maybe it's worth making it start to work from the first iteration, based on your findings.
> In terms of the overhead of calculating the SHA1 compared with transferring
> a page, it's related to the CPU and network performance. In my test
> environment(Intel Xeon
> E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing
> overhead caused by calculating the SHA1, because the throughput of
> network (got by "info migrate") remains almost the same.
You can check the CPU usage, or measure the time spent on a local live migration
which uses SHA1/XBZRLE.
Liang
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-04 4:50 ` Li, Liang Z
@ 2016-11-04 7:03 ` Chunguang Li
2016-11-07 13:52 ` Chunguang Li
1 sibling, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-04 7:03 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Friday, November 4, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> > > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > > the pages
> > > > dirty randomly within the guest address space, and the transfer
> > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > pages produced in Iteration 1 is not really dirty. This means the
> > > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > >
> > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > kinds of workloads get impacted the most? That would also
> > > > > > > help us to figure out what kinds of speed improvements we can
> > expect.
> > > > > > >
> > > > > > >
> > > > > > > Amit
> > > > > >
> > > > > > I have picked up 6 workloads and got the following statistics
> > > > > > numbers of every iteration (except the last stop-copy one) during
> > precopy.
> > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > > network for the migration is exclusive, with a separate network for
> > the workloads.
> > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > >
> > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > stop-copy
> > > > phase,
> > > > > > with the given bandwidth and default downtime (300ms), while the
> > > > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > > > >
> > > > > > One page is "not-really-dirty", if it is written first and is
> > > > > > sent later (and not written again after that) during one
> > > > > > iteration. I guess this would not happen so often during the
> > > > > > other iterations as during the 1st iteration. Because all the
> > > > > > pages of the VM are sent to the dest node
> > > > during
> > > > > > the 1st iteration, while during the others, only part of the pages are
> > sent.
> > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > mainly during the 1st iteration , and maybe very little during the other
> > iterations.
> > > > > >
> > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > halved. This is a chain
> > > > reaction,
> > > > > > because the dirty pages produced during Iteration 2 is halved,
> > > > > > which
> > > > incurs
> > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > >
> > > > > Yes; these numbers don't show how many of them are false-dirty,
> > > > > though.
> > > > >
> > > > > One problem is thinking about pages that have been redirtied: if
> > > > > the page is dirtied after the sync but before the network write,
> > > > > then it's the false-dirty that you're describing.
> > > > >
> > > > > However, if the page is being written a few times, and so would
> > > > > have been written after the network write anyway, then it isn't a
> > > > > false-dirty.
> > > > >
> > > > > You might be able to figure that out with some kernel tracing of
> > > > > when the dirtying happens, but it might be easier to write the fix!
> > > > >
> > > > > Dave
> > > >
> > > > Hi, I have made some new progress now.
> > > >
> > > > To tell exactly how many false-dirty pages there are in each
> > > > iteration, I malloc a buffer in memory as big as the whole VM
> > > > memory. When a page is transferred to the dest node, it is copied
> > > > into the buffer; during the next iteration, before a page is
> > > > retransferred, it is compared to the old copy in the buffer, and
> > > > the old copy is replaced (for the next comparison) if the page is
> > > > really dirty. Thus, we are now able to get the exact number of
> > > > false-dirty pages.
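The bookkeeping described above can be sketched roughly like this (plain Python for illustration, not the actual QEMU instrumentation; pages are modeled as bytes objects keyed by page index):

```python
# Sketch of the measurement: keep a shadow copy of each page as it was
# last sent, and count a page as "false dirty" when it is marked dirty
# but its content is unchanged since the last transfer.
PAGE_SIZE = 4096

def count_false_dirty(dirty_pages, shadow):
    """dirty_pages: {page_index: content}; shadow: last-sent contents."""
    false_dirty = 0
    for idx, content in dirty_pages.items():
        if shadow.get(idx) == content:
            false_dirty += 1       # marked dirty, but identical to what was sent
        else:
            shadow[idx] = content  # really dirty: refresh the shadow copy
    return false_dirty

shadow = {0: b"a" * PAGE_SIZE}                      # page 0 already sent once
dirty = {0: b"a" * PAGE_SIZE, 1: b"b" * PAGE_SIZE}  # both marked dirty
print(count_false_dirty(dirty, shadow))             # prints 1: page 0 is false dirty
```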
> > > >
> > > > This time, I use 15 workloads to get the statistics. They are:
> > > >
> > > > 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are
> > > >    all scientific computing workloads, like Quantum Chromodynamics,
> > > >    Fluid Dynamics, etc. I picked these 11 because, compared to the
> > > >    others, they have bigger memory occupation and higher memory
> > > >    dirty rates; thus most of them could not converge to
> > > >    stop-and-copy at the default migration speed (32MB/s).
> > > > 2. kernel compilation
> > > > 3. idle VM
> > > > 4. Apache web server which serves static content
> > > >
> > > > (the above workloads all run in a VM with 1 vcpu and 1GB memory,
> > > > and the migration speed is the default 32MB/s)
> > > >
> > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are
> > > >    used as the cache. After filling up the 4GB cache, a client
> > > >    writes the cache at a constant speed during migration. This
> > > >    time, the migration speed has no limit, and is up to the
> > > >    capability of 1Gbps Ethernet.
> > > >
> > > > Summarize the results first (you can read the precise numbers
> > > > below):
> > > >
> > > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80%
> > > >    during some iterations) of false-dirty pages out of all the
> > > >    dirty pages since iteration 2 (and the big proportion lasts
> > > >    through the following iterations). They are cpu2006.zeusmp,
> > > >    cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > 2. 2 workloads (idle, webserver) spend most of the migration time
> > > >    on iteration 1; even though the proportion of false-dirty pages
> > > >    is big since iteration 2, the room for optimization is small.
> > > > 3. 1 workload (kernel compilation) only has a big proportion
> > > >    during iteration 2, not in the other iterations.
> > > > 4. 8 workloads (the other 8 cpu2006 benchmarks) have a small
> > > >    proportion of false-dirty pages since iteration 2, so the room
> > > >    for optimization is also small.
> > > >
> > > > Now I want to talk a little more about why false-dirty pages are
> > > > produced. The first reason is what we discussed before: the
> > > > mechanism used to track dirty pages.
> > > > Then I came up with another reason. Here is the situation: a write
> > > > operation to a memory page happens, but it doesn't change any
> > > > content of the page. So it is "written but not dirty", yet the
> > > > kernel still marks the page as dirty. One guy in our lab has done
> > > > some experiments with the cpu2006 benchmark suite to figure out
> > > > the proportion of "written but not dirty" operations. According to
> > > > his results, general workloads have a small proportion (<10%) of
> > > > "written but not dirty" out of all write operations, while a few
> > > > workloads have a higher proportion (one even as high as 50%). We
> > > > are not sure yet why "written but not dirty" happens; it just does.
> > > >
> > > > So these two reasons contribute to the false-dirty pages. To
> > > > optimize, I compute and store the SHA1 hash of each page before
> > > > transferring it. Next time, if a page needs retransmission, its
> > > > SHA1 hash is computed again and compared to the old hash. If the
> > > > hashes are the same, it's a false-dirty page and we just skip it;
> > > > otherwise, the page is transferred, and the new hash replaces the
> > > > old one for the next comparison.
> > > > The reason to use a SHA1 hash rather than byte-by-byte comparison
> > > > is the memory overhead. One SHA1 hash is 20 bytes, so we only need
> > > > extra memory of 20/4096 (<1/200) of the whole VM memory, which is
> > > > relatively small.
> > > > As far as I know, SHA1 hashes are widely used in deduplication for
> > > > backup systems. It has been shown there that the probability of a
> > > > hash collision is far smaller than that of a disk hardware fault,
> > > > so it is treated as a secure hash: if the hashes of two chunks are
> > > > the same, the content is assumed to be the same. So I think the
> > > > SHA1 hash can replace byte-by-byte comparison in the VM memory
> > > > scenario.
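A minimal sketch of this hash-and-skip check (Python's hashlib stands in for whatever SHA1 implementation the real patch uses; `should_send` and the hash table are illustrative names, not QEMU code):

```python
import hashlib

def should_send(page_index, content, sent_hashes):
    """Decide whether a page marked dirty really needs retransmission.
    sent_hashes maps page_index -> 20-byte SHA1 of the last version sent."""
    h = hashlib.sha1(content).digest()  # 20 bytes per 4096-byte page: <1/200 overhead
    if sent_hashes.get(page_index) == h:
        return False                    # false dirty: same content, skip the page
    sent_hashes[page_index] = h         # really dirty: remember the new hash
    return True

sent = {}
page = b"x" * 4096
print(should_send(7, page, sent))   # True: first transfer of this page
print(should_send(7, page, sent))   # False: marked dirty again, content unchanged
```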
> > > >
> > > > Then I ran the same migration experiments using the SHA1 hash.
> > > > For the 4 workloads which have big proportions of false-dirty
> > > > pages, the improvement is remarkable. Without the optimization,
> > > > they either cannot converge to stop-and-copy, or take a very long
> > > > time to complete. With the SHA1 hash method, all of them now
> > > > complete in a relatively short time. For the reason discussed
> > > > above, the other workloads don't get notable improvements from the
> > > > optimization. So below, I only show the exact numbers after
> > > > optimization for the 4 workloads with remarkable improvements.
> > > >
> > > > Any comments or suggestions?
> > >
> > > Maybe you can compare the performance of your solution with that of
> > > XBZRLE to see which one is better.
> > > The merit of using SHA1 is that it can avoid the data copy done in
> > > XBZRLE, and it needs less buffer.
> > > How about the overhead of calculating the SHA1? Is it faster than
> > > copying a page?
> > >
> > > Liang
> > >
> > >
> >
> > Yes, XBZRLE is able to handle the false-dirty pages. However, if we
> > want to avoid transferring all of the false-dirty pages using XBZRLE,
> > we need a buffer as big as the whole VM memory, while SHA1 needs a
> > much smaller buffer. Of course, if we had a buffer as big as the
> > whole VM memory, XBZRLE could transfer less data over the network
> > than SHA1, because XBZRLE is able to compress similar pages. In a
> > word, yes, the merit of using SHA1 is that it needs a much smaller
> > buffer, and it leads to a nice improvement if there are many
> > false-dirty pages.
> >
>
> The current implementation of XBZRLE begins to buffer pages from the
> second iteration. Maybe it's worth making it start from the first
> iteration, based on your finding.
Yes, I noticed that. If we make it start from the first iteration, I
think the buffer should be large enough to obtain an obvious effect.
>
> > In terms of the overhead of calculating the SHA1 compared with
> > transferring a page, it's related to the CPU and network performance.
> > In my test environment (Intel Xeon E5620 @2.4GHz, 1Gbps Ethernet), I
> > didn't observe obvious extra computing overhead caused by calculating
> > the SHA1, because the network throughput (from "info migrate")
> > remains almost the same.
>
> You can check the CPU usage, or measure the time spent on a local live
> migration which uses SHA1/XBZRLE.
Yes, I can compare SHA1 with XBZRLE. Maybe I will post the results later.
Chunguang
>
> Liang
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-04 4:50 ` Li, Liang Z
2016-11-04 7:03 ` Chunguang Li
@ 2016-11-07 13:52 ` Chunguang Li
2016-11-07 14:17 ` Li, Liang Z
2016-11-07 14:44 ` Li, Liang Z
1 sibling, 2 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-07 13:52 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Friday, November 4, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> > > > > > > > I think this is "very" wasteful. Assume the workload writes
> > > > > > > > the pages
> > > > dirty randomly within the guest address space, and the transfer
> > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > pages produced in Iteration 1 is not really dirty. This means the
> > > > time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > >
> > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > kinds of workloads get impacted the most? That would also
> > > > > > > help us to figure out what kinds of speed improvements we can
> > expect.
> > > > > > >
> > > > > > >
> > > > > > > Amit
> > > > > >
> > > > > > I have picked up 6 workloads and got the following statistics
> > > > > > numbers of every iteration (except the last stop-copy one) during
> > precopy.
> > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > without the capabilities like xbzrle or compression, etc. The
> > > > > > network for the migration is exclusive, with a separate network for
> > the workloads.
> > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > >
>
> The current implementation of XBZRLE begins to buffer page from the second iteration,
> Maybe it's worth to make it start to work from the first iteration based on your finding.
>
> > In terms of the overhead of calculating the SHA1 compared with transferring
> > a page, it's related to the CPU and network performance. In my test
> > environment(Intel Xeon
> > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing
> > overhead caused by calculating the SHA1, because the throughput of
> > network (got by "info migrate") remains almost the same.
>
> You can check the CPU usage, or to measure the time spend on a local live migration
> which use SHA1/ XBZRLE.
>
> Liang
>
>
I compared SHA1 with XBZRLE. I used XBZRLE in two ways:
1. Beginning to buffer pages from iteration 1;
2. As in the current implementation, beginning to buffer pages from iteration 2.
I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf, and memcached.
I set the cache size to 256MB for zeusmp & mcf (they run in a VM with 1GB ram),
and to 1GB for memcached (it runs in a VM with 6GB ram, of which memcached
takes 4GB as cache).
As you can read from the data below, beginning to buffer pages from iteration 1
is better than the current implementation (from iteration 2), because the total
migration time is shorter.
SHA1 is better than XBZRLE with the cache sizes I chose, because it leads to
shorter migration time and consumes far less memory (<1/200 of the total VM memory).
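For comparison, the memory-overhead arithmetic behind that last claim (guest and cache sizes taken from the experiments in this thread):

```python
# One 20-byte SHA1 per 4 KiB page, versus a fixed XBZRLE page cache.
PAGE_SIZE = 4096
SHA1_BYTES = 20

def sha1_table_bytes(vm_ram_bytes):
    """Size of the per-page SHA1 table for a guest of the given RAM size."""
    return (vm_ram_bytes // PAGE_SIZE) * SHA1_BYTES

# 1GB guest (zeusmp/mcf case): a 5MB hash table vs the 256MB XBZRLE cache.
print(sha1_table_bytes(1 << 30) // (1 << 20))   # prints 5
```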
1. zeusmp
(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 21402 ms , transferred pages: 266450 (dup: 91456, n: 174994, x: 0) , new dirty pages: 129225 , remaining dirty pages: 129225
Iteration 2, duration: 2295 ms , transferred pages: 101471 (dup: 77921, n: 16665, x: 6885) , new dirty pages: 76125 , remaining dirty pages: 77424
Iteration 3, duration: 2000 ms , transferred pages: 56092 (dup: 36345, n: 11249, x: 8498) , new dirty pages: 111498 , remaining dirty pages: 112836
Iteration 4, duration: 1604 ms , transferred pages: 87335 (dup: 69441, n: 10018, x: 7876) , new dirty pages: 19982 , remaining dirty pages: 19982
Iteration 5, duration: 302 ms , transferred pages: 19850 (dup: 16718, n: 2547, x: 585) , new dirty pages: 14084 , remaining dirty pages: 14084
Iteration 6, duration: 194 ms , transferred pages: 13403 (dup: 12338, n: 846, x: 219) , new dirty pages: 3900 , remaining dirty pages: 4243
Iteration 7, duration: 8 ms , transferred pages: 3938 (dup: 3425, n: 239, x: 274) , new dirty pages: 372 , remaining dirty pages: 372
Iteration 8, duration: 71 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 0 , remaining dirty pages: 372
total time: 27891 milliseconds
(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2 (does not converge)
Iteration 1, duration: 21698 ms , transferred pages: 266331 (dup: 89009, n: 177322, x: 0) , new dirty pages: 125990 , remaining dirty pages: 126109
Iteration 2, duration: 5909 ms , transferred pages: 126109 (dup: 77248, n: 48861, x: 0) , new dirty pages: 124870 , remaining dirty pages: 124870
Iteration 3, duration: 3197 ms , transferred pages: 110583 (dup: 75471, n: 23129, x: 11983) , new dirty pages: 118035 , remaining dirty pages: 118035
Iteration 4, duration: 3195 ms , transferred pages: 102787 (dup: 72708, n: 22158, x: 7921) , new dirty pages: 86576 , remaining dirty pages: 86773
Iteration 5, duration: 3111 ms , transferred pages: 79563 (dup: 52073, n: 21289, x: 6201) , new dirty pages: 97402 , remaining dirty pages: 97402
Iteration 6, duration: 2407 ms , transferred pages: 79567 (dup: 56415, n: 16013, x: 7139) , new dirty pages: 101193 , remaining dirty pages: 101193
Iteration 7, duration: 2896 ms , transferred pages: 83278 (dup: 55778, n: 20652, x: 6848) , new dirty pages: 90683 , remaining dirty pages: 92977
Iteration 8, duration: 2701 ms , transferred pages: 89112 (dup: 62579, n: 18699, x: 7834) , new dirty pages: 109827 , remaining dirty pages: 110008
Iteration 9, duration: 3602 ms , transferred pages: 95866 (dup: 61631, n: 25632, x: 8603) , new dirty pages: 94551 , remaining dirty pages: 96227
Iteration 10, duration: 3802 ms , transferred pages: 83693 (dup: 50558, n: 26427, x: 6708) , new dirty pages: 123537 , remaining dirty pages: 124170
Iteration 11, duration: 3399 ms , transferred pages: 108770 (dup: 75144, n: 23952, x: 9674) , new dirty pages: 103934 , remaining dirty pages: 104981
Iteration 12, duration: 2700 ms , transferred pages: 91080 (dup: 62981, n: 16600, x: 11499) , new dirty pages: 88314 , remaining dirty pages: 88948
Iteration 13, duration: 3102 ms , transferred pages: 78406 (dup: 50165, n: 21409, x: 6832) , new dirty pages: 73586 , remaining dirty pages: 74025
Iteration 14, duration: 806 ms , transferred pages: 66530 (dup: 51013, n: 3973, x: 11544) , new dirty pages: 67941 , remaining dirty pages: 67941
Iteration 15, duration: 2398 ms , transferred pages: 53117 (dup: 33312, n: 18436, x: 1369) , new dirty pages: 116502 , remaining dirty pages: 118956
Iteration 16, duration: 3200 ms , transferred pages: 103009 (dup: 71642, n: 21378, x: 9989) , new dirty pages: 81777 , remaining dirty pages: 83724
Iteration 17, duration: 3005 ms , transferred pages: 73096 (dup: 45738, n: 19016, x: 8342) , new dirty pages: 116671 , remaining dirty pages: 118397
Iteration 18, duration: 3302 ms , transferred pages: 101507 (dup: 67290, n: 22721, x: 11496) , new dirty pages: 104163 , remaining dirty pages: 105921
Iteration 19, duration: 3705 ms , transferred pages: 90516 (dup: 56932, n: 26394, x: 7190) , new dirty pages: 118139 , remaining dirty pages: 120170
Iteration 20, duration: 3903 ms , transferred pages: 102710 (dup: 67623, n: 25811, x: 9276) , new dirty pages: 103608 , remaining dirty pages: 105496
(3) SHA1
Iteration 1, duration: 21601 ms , transferred pages: 266450 (dup: 89731, rd: 176719) , new dirty pages: 139843 , remaining dirty pages: 139843
Iteration 2, duration: 1747 ms , transferred pages: 92077 (dup: 78364, rd: 13713) , new dirty pages: 90945 , remaining dirty pages: 90945
Iteration 3, duration: 1592 ms , transferred pages: 62253 (dup: 49435, rd: 12818) , new dirty pages: 76929 , remaining dirty pages: 76929
Iteration 4, duration: 992 ms , transferred pages: 44837 (dup: 37886, rd: 6951) , new dirty pages: 71331 , remaining dirty pages: 72916
Iteration 5, duration: 998 ms , transferred pages: 55229 (dup: 47150, rd: 8079) , new dirty pages: 21703 , remaining dirty pages: 23302
Iteration 6, duration: 211 ms , transferred pages: 20337 (dup: 18516, rd: 1821) , new dirty pages: 14500 , remaining dirty pages: 14500
Iteration 7, duration: 31 ms , transferred pages: 12933 (dup: 12627, rd: 306) , new dirty pages: 1520 , remaining dirty pages: 1520
Iteration 8, duration: 30 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 4 , remaining dirty pages: 1524
total time: 27225 milliseconds
2. mcf
(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 31706 ms , transferred pages: 266325 (dup: 7032, n: 259293, x: 0) , new dirty pages: 238215 , remaining dirty pages: 238340
Iteration 2, duration: 21807 ms , transferred pages: 186619 (dup: 335, n: 176826, x: 9458) , new dirty pages: 226886 , remaining dirty pages: 228857
Iteration 3, duration: 21300 ms , transferred pages: 181925 (dup: 201, n: 172974, x: 8750) , new dirty pages: 202288 , remaining dirty pages: 204100
Iteration 4, duration: 17300 ms , transferred pages: 148972 (dup: 38, n: 141113, x: 7821) , new dirty pages: 136220 , remaining dirty pages: 137992
Iteration 5, duration: 13699 ms , transferred pages: 118247 (dup: 38, n: 112030, x: 6179) , new dirty pages: 48397 , remaining dirty pages: 50466
Iteration 6, duration: 4499 ms , transferred pages: 41719 (dup: 24, n: 36790, x: 4905) , new dirty pages: 3753 , remaining dirty pages: 5690
Iteration 7, duration: 399 ms , transferred pages: 3826 (dup: 4, n: 3265, x: 557) , new dirty pages: 1261 , remaining dirty pages: 2437
Iteration 8, duration: 72 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 0 , remaining dirty pages: 2437
total time: 110812 milliseconds
(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2
Iteration 1, duration: 31606 ms , transferred pages: 266450 (dup: 7267, n: 259183, x: 0) , new dirty pages: 233582 , remaining dirty pages: 233582
Iteration 2, duration: 28413 ms , transferred pages: 231693 (dup: 89, n: 231604, x: 0) , new dirty pages: 216962 , remaining dirty pages: 218851
Iteration 3, duration: 18618 ms , transferred pages: 159936 (dup: 3, n: 151579, x: 8354) , new dirty pages: 216400 , remaining dirty pages: 218790
Iteration 4, duration: 18621 ms , transferred pages: 159665 (dup: 0, n: 152102, x: 7563) , new dirty pages: 209860 , remaining dirty pages: 211611
Iteration 5, duration: 17709 ms , transferred pages: 151672 (dup: 4, n: 144493, x: 7175) , new dirty pages: 146273 , remaining dirty pages: 148006
Iteration 6, duration: 9911 ms , transferred pages: 86971 (dup: 2, n: 80842, x: 6127) , new dirty pages: 118364 , remaining dirty pages: 120396
Iteration 7, duration: 14212 ms , transferred pages: 117460 (dup: 0, n: 116149, x: 1311) , new dirty pages: 213993 , remaining dirty pages: 216107
Iteration 8, duration: 22913 ms , transferred pages: 213698 (dup: 4, n: 161520, x: 52174) , new dirty pages: 217947 , remaining dirty pages: 219955
Iteration 9, duration: 23808 ms , transferred pages: 217375 (dup: 3, n: 152315, x: 65057) , new dirty pages: 172615 , remaining dirty pages: 174859
Iteration 10, duration: 15099 ms , transferred pages: 131265 (dup: 0, n: 123463, x: 7802) , new dirty pages: 113946 , remaining dirty pages: 116026
Iteration 11, duration: 10002 ms , transferred pages: 88477 (dup: 8, n: 81753, x: 6716) , new dirty pages: 97006 , remaining dirty pages: 99110
Iteration 12, duration: 6898 ms , transferred pages: 62861 (dup: 4, n: 56392, x: 6465) , new dirty pages: 45164 , remaining dirty pages: 47297
Iteration 13, duration: 3601 ms , transferred pages: 35360 (dup: 0, n: 29390, x: 5970) , new dirty pages: 24581 , remaining dirty pages: 26779
Iteration 14, duration: 1902 ms , transferred pages: 19794 (dup: 0, n: 15475, x: 4319) , new dirty pages: 66153 , remaining dirty pages: 67850
Iteration 15, duration: 5504 ms , transferred pages: 50369 (dup: 0, n: 44902, x: 5467) , new dirty pages: 49279 , remaining dirty pages: 51198
Iteration 16, duration: 3699 ms , transferred pages: 36519 (dup: 2, n: 30184, x: 6333) , new dirty pages: 23672 , remaining dirty pages: 25914
Iteration 17, duration: 1601 ms , transferred pages: 17628 (dup: 0, n: 12972, x: 4656) , new dirty pages: 8685 , remaining dirty pages: 10646
Iteration 18, duration: 599 ms , transferred pages: 7835 (dup: 0, n: 4825, x: 3010) , new dirty pages: 6167 , remaining dirty pages: 6266
Iteration 19, duration: 200 ms , transferred pages: 3590 (dup: 0, n: 1576, x: 2014) , new dirty pages: 3873 , remaining dirty pages: 5709
Iteration 20, duration: 200 ms , transferred pages: 4134 (dup: 0, n: 1574, x: 2560) , new dirty pages: 3609 , remaining dirty pages: 4099
Iteration 21, duration: 100 ms , transferred pages: 2785 (dup: 0, n: 787, x: 1998) , new dirty pages: 1900 , remaining dirty pages: 2585
Iteration 22, duration: 9 ms , transferred pages: 2191 (dup: 0, n: 539, x: 1652) , new dirty pages: 596 , remaining dirty pages: 596
Iteration 23, duration: 41 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 0 , remaining dirty pages: 596
total time: 235286 milliseconds
(3) SHA1
Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6831, rd: 259619) , new dirty pages: 240209 , remaining dirty pages: 240209
Iteration 2, duration: 6250 ms , transferred pages: 51244 (dup: 211, rd: 51033) , new dirty pages: 226651 , remaining dirty pages: 228571
Iteration 3, duration: 4395 ms , transferred pages: 36008 (dup: 80, rd: 35928) , new dirty pages: 110719 , remaining dirty pages: 111478
Iteration 4, duration: 3390 ms , transferred pages: 28068 (dup: 28, rd: 28040) , new dirty pages: 185172 , remaining dirty pages: 185172
Iteration 5, duration: 2986 ms , transferred pages: 23780 (dup: 45, rd: 23735) , new dirty pages: 64357 , remaining dirty pages: 66305
Iteration 6, duration: 2727 ms , transferred pages: 22800 (dup: 12, rd: 22788) , new dirty pages: 61675 , remaining dirty pages: 61675
Iteration 7, duration: 2372 ms , transferred pages: 18943 (dup: 13, rd: 18930) , new dirty pages: 55144 , remaining dirty pages: 55265
Iteration 8, duration: 2100 ms , transferred pages: 17189 (dup: 11, rd: 17178) , new dirty pages: 55244 , remaining dirty pages: 55668
Iteration 9, duration: 2003 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 107058 , remaining dirty pages: 108014
Iteration 10, duration: 2132 ms , transferred pages: 17825 (dup: 24, rd: 17801) , new dirty pages: 126214 , remaining dirty pages: 126214
Iteration 11, duration: 2229 ms , transferred pages: 18156 (dup: 22, rd: 18134) , new dirty pages: 65725 , remaining dirty pages: 65725
Iteration 12, duration: 2315 ms , transferred pages: 18651 (dup: 21, rd: 18630) , new dirty pages: 52575 , remaining dirty pages: 53903
Iteration 13, duration: 2147 ms , transferred pages: 17435 (dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260
Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266
Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792
Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639
Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170
Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 103547
Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521
Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054
Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624
Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265
Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813
Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608
Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311
Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294
Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936
Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423
Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292
Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457
Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304
Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815
Iteration 33, duration: 747 ms , transferred pages: 6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670
Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422
Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222
Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337
Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251
Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251
total time: 97919 milliseconds
3. memcached
(1) XBZRLE 1024MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 40763 ms , transferred pages: 1570149 (dup: 404139, n: 1166010, x: 0) , new dirty pages: 526462 , remaining dirty pages: 533483
Iteration 2, duration: 15741 ms , transferred pages: 461867 (dup: 4070, n: 437704, x: 20093) , new dirty pages: 256841 , remaining dirty pages: 265501
Iteration 3, duration: 7874 ms , transferred pages: 231950 (dup: 280, n: 207569, x: 24101) , new dirty pages: 153526 , remaining dirty pages: 160865
Iteration 4, duration: 4260 ms , transferred pages: 135181 (dup: 135, n: 116768, x: 18278) , new dirty pages: 100298 , remaining dirty pages: 107278
Iteration 5, duration: 2506 ms , transferred pages: 87596 (dup: 180, n: 67600, x: 19816) , new dirty pages: 63685 , remaining dirty pages: 71790
Iteration 6, duration: 1373 ms , transferred pages: 51800 (dup: 128, n: 37336, x: 14336) , new dirty pages: 38785 , remaining dirty pages: 46064
Iteration 7, duration: 872 ms , transferred pages: 32015 (dup: 56, n: 23414, x: 8545) , new dirty pages: 23580 , remaining dirty pages: 31629
Iteration 8, duration: 527 ms , transferred pages: 21833 (dup: 40, n: 14372, x: 7421) , new dirty pages: 16624 , remaining dirty pages: 23482
Iteration 9, duration: 291 ms , transferred pages: 14917 (dup: 16, n: 6572, x: 8329) , new dirty pages: 10039 , remaining dirty pages: 16753
Iteration 10, duration: 113 ms , transferred pages: 6082 (dup: 111, n: 3300, x: 2671) , new dirty pages: 4081 , remaining dirty pages: 12703
Iteration 11, duration: 119 ms , transferred pages: 3970 (dup: 16, n: 2953, x: 1001) , new dirty pages: 3824 , remaining dirty pages: 11936
Iteration 12, duration: 51 ms , transferred pages: 3585 (dup: 0, n: 1154, x: 2431) , new dirty pages: 1711 , remaining dirty pages: 9900
Iteration 13, duration: 62 ms , transferred pages: 2945 (dup: 0, n: 1589, x: 1356) , new dirty pages: 1909 , remaining dirty pages: 8503
Iteration 14, duration: 2 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 1 , remaining dirty pages: 8504
total time: 74738 milliseconds
(2) XBZRLE 1024MB cache, begins to buffer pages from iteration 2
Iteration 1, duration: 40375 ms , transferred pages: 1570347 (dup: 415923, n: 1154424, x: 0) , new dirty pages: 511859 , remaining dirty pages: 518682
Iteration 2, duration: 17580 ms , transferred pages: 510145 (dup: 5970, n: 504175, x: 0) , new dirty pages: 291686 , remaining dirty pages: 300223
Iteration 3, duration: 8259 ms , transferred pages: 253656 (dup: 929, n: 230020, x: 22707) , new dirty pages: 166721 , remaining dirty pages: 174231
Iteration 4, duration: 4733 ms , transferred pages: 147925 (dup: 257, n: 132454, x: 15214) , new dirty pages: 103965 , remaining dirty pages: 111436
Iteration 5, duration: 2587 ms , transferred pages: 90734 (dup: 251, n: 70008, x: 20475) , new dirty pages: 61266 , remaining dirty pages: 69202
Iteration 6, duration: 1377 ms , transferred pages: 51416 (dup: 55, n: 37776, x: 13585) , new dirty pages: 45236 , remaining dirty pages: 52106
Iteration 7, duration: 1126 ms , transferred pages: 40020 (dup: 259, n: 30064, x: 9697) , new dirty pages: 28433 , remaining dirty pages: 35358
Iteration 8, duration: 574 ms , transferred pages: 23754 (dup: 40, n: 16066, x: 7648) , new dirty pages: 18067 , remaining dirty pages: 26353
Iteration 9, duration: 395 ms , transferred pages: 17607 (dup: 16, n: 9463, x: 8128) , new dirty pages: 11507 , remaining dirty pages: 18488
Iteration 10, duration: 171 ms , transferred pages: 8195 (dup: 40, n: 4726, x: 3429) , new dirty pages: 5482 , remaining dirty pages: 13898
Iteration 11, duration: 116 ms , transferred pages: 6594 (dup: 16, n: 2679, x: 3899) , new dirty pages: 3884 , remaining dirty pages: 9581
Iteration 12, duration: 54 ms , transferred pages: 1793 (dup: 0, n: 1634, x: 159) , new dirty pages: 1515 , remaining dirty pages: 9189
Iteration 13, duration: 62 ms , transferred pages: 1793 (dup: 0, n: 1643, x: 150) , new dirty pages: 1657 , remaining dirty pages: 8871
Iteration 14, duration: 3 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 1 , remaining dirty pages: 8872
total time: 77578 milliseconds
(3) SHA1
Iteration 1, duration: 40664 ms , transferred pages: 1569037 (dup: 405940, rd: 1163097) , new dirty pages: 506846 , remaining dirty pages: 514979
Iteration 2, duration: 8032 ms , transferred pages: 161130 (dup: 4007, rd: 157123) , new dirty pages: 153479 , remaining dirty pages: 153479
Iteration 3, duration: 2620 ms , transferred pages: 65260 (dup: 20, rd: 65240) , new dirty pages: 64014 , remaining dirty pages: 67100
Iteration 4, duration: 1160 ms , transferred pages: 30227 (dup: 60, rd: 30167) , new dirty pages: 34031 , remaining dirty pages: 41414
Iteration 5, duration: 648 ms , transferred pages: 18700 (dup: 56, rd: 18644) , new dirty pages: 18375 , remaining dirty pages: 25536
Iteration 6, duration: 389 ms , transferred pages: 11399 (dup: 55, rd: 11344) , new dirty pages: 12536 , remaining dirty pages: 17516
Iteration 7, duration: 292 ms , transferred pages: 8197 (dup: 0, rd: 8197) , new dirty pages: 8387 , remaining dirty pages: 16802
Iteration 8, duration: 171 ms , transferred pages: 4931 (dup: 39, rd: 4892) , new dirty pages: 6182 , remaining dirty pages: 14060
Iteration 9, duration: 163 ms , transferred pages: 4355 (dup: 16, rd: 4339) , new dirty pages: 5530 , remaining dirty pages: 11973
Iteration 10, duration: 104 ms , transferred pages: 3266 (dup: 0, rd: 3266) , new dirty pages: 2893 , remaining dirty pages: 11014
Iteration 11, duration: 52 ms , transferred pages: 1153 (dup: 0, rd: 1153) , new dirty pages: 1586 , remaining dirty pages: 10516
Iteration 12, duration: 52 ms , transferred pages: 1921 (dup: 39, rd: 1882) , new dirty pages: 1619 , remaining dirty pages: 8842
Iteration 13, duration: 62 ms , transferred pages: 1537 (dup: 0, rd: 1537) , new dirty pages: 2052 , remaining dirty pages: 8871
Iteration 14, duration: 58 ms , transferred pages: 1665 (dup: 0, rd: 1665) , new dirty pages: 1947 , remaining dirty pages: 7989
Iteration 15, duration: 2 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 7989
total time: 54693 milliseconds
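The SHA1 method measured above keeps one 20-byte digest per 4 KiB page and retransmits a page only when its digest has changed since the last send. A minimal sketch of that bookkeeping (a hypothetical illustration of the idea discussed in this thread, not the actual QEMU patch):

```python
import hashlib

PAGE_SIZE = 4096  # guest page size assumed throughout the thread


class FalseDirtyFilter:
    """Track one 20-byte SHA-1 digest per transferred page."""

    def __init__(self):
        self.digests = {}  # page index -> digest of the last sent content

    def should_send(self, page_index, page_bytes):
        digest = hashlib.sha1(page_bytes).digest()
        if self.digests.get(page_index) == digest:
            # The dirty bitmap marked this page, but its content is
            # unchanged since the last transfer: a "false dirty" page.
            return False
        self.digests[page_index] = digest
        return True
```

The digest table costs 20 bytes per 4096-byte page, which is the <1/200 of guest RAM overhead cited in the discussion.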
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-07 13:52 ` Chunguang Li
@ 2016-11-07 14:17 ` Li, Liang Z
2016-11-08 5:27 ` Chunguang Li
2016-11-07 14:44 ` Li, Liang Z
1 sibling, 1 reply; 21+ messages in thread
From: Li, Liang Z @ 2016-11-07 14:17 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> > > > > > > > > I think this is "very" wasteful. Assume the workload
> > > > > > > > > writes the pages
> > > > > dirty randomly within the guest address space, and the transfer
> > > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > > pages produced in Iteration 1 is not really dirty. This means
> > > > > the time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > > >
> > > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > > kinds of workloads get impacted the most? That would also
> > > > > > > > help us to figure out what kinds of speed improvements we
> > > > > > > > can
> > > expect.
> > > > > > > >
> > > > > > > >
> > > > > > > > Amit
> > > > > > >
> > > > > > > I have picked up 6 workloads and got the following
> > > > > > > statistics numbers of every iteration (except the last
> > > > > > > stop-copy one) during
> > > precopy.
> > > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > > without the capabilities like xbzrle or compression, etc.
> > > > > > > The network for the migration is exclusive, with a separate
> > > > > > > network for
> > > the workloads.
> > > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > > >
> > > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > > stop-copy
> > > > > phase,
> > > > > > > with the given bandwidth and default downtime (300ms), while
> > > > > > > the other three (kernel compilation, zeusmp, memcached) did not.
> > > > > > >
> > > > > > > One page is "not-really-dirty", if it is written first and
> > > > > > > is sent later (and not written again after that) during one
> > > > > > > iteration. I guess this would not happen so often during the
> > > > > > > other iterations as during the 1st iteration. Because all
> > > > > > > the pages of the VM are sent to the dest node
> > > > > during
> > > > > > > the 1st iteration, while during the others, only part of the
> > > > > > > pages are
> > > sent.
> > > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > > mainly during the 1st iteration , and maybe very little
> > > > > > > during the other
> > > iterations.
> > > > > > >
> > > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > > halved. This is a chain
> > > > > reaction,
> > > > > > > because the dirty pages produced during Iteration 2 is
> > > > > > > halved, which
> > > > > incurs
> > > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > > >
> > > > > > Yes; these numbers don't show how many of them are false dirty
> > > though.
> > > > > >
> > > > > > One problem is thinking about pages that have been redirtied,
> > > > > > if the page is
> > > > > dirtied
> > > > > > after the sync but before the network write then it's the
> > > > > > false-dirty that you're describing.
> > > > > >
> > > > > > However, if the page is being written a few times, and so it
> > > > > > would have
> > > > > been written
> > > > > > after the network write then it isn't a false-dirty.
> > > > > >
> > > > > > You might be able to figure that out with some kernel tracing
> > > > > > of when the
> > > > > dirtying
> > > > > > happens, but it might be easier to write the fix!
> > > > > >
> > > > > > Dave
> > > > >
> > > > > Hi, I have made some new progress now.
> > > > >
> > > > > To tell how many false dirty pages there are exactly in each
> > > > > iteration, I malloc a buffer in memory as big as the size of the
> > > > > whole VM memory. When a page is transferred to the dest node, it
> > > > > is copied to the buffer; During the next iteration, if one page
> > > > > is transferred, it is compared to the old one in the buffer, and
> > > > > the old one will be replaced for next comparison if it is really dirty.
> > > > > Thus, we are now able to get the exact number of false dirty pages.
> > > > >
> > > > > This time, I use 15 workloads to get the statistic number. They are:
> > > > >
> > > > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They
> > > > > are all scientific
> > > > > computing workloads like Quantum Chromodynamics, Fluid
> > > > > Dynamics,
> > > etc.
> > > > > I pick
> > > > > up these 11 benchmarks because compared to others, they
> > > > > have bigger memory
> > > > > occupation and higher memory dirty rate. Thus most of them
> > > > > could not converge
> > > > > to stop-and-copy using the default migration speed (32MB/s).
> > > > > 2. kernel compilation
> > > > > 3. idle VM
> > > > > 4. Apache web server which serves static content
> > > > >
> > > > > (the above workloads are all running in VM with 1 vcpu and 1GB
> > > > > memory, and the
> > > > > migration speed is the default 32MB/s)
> > > > >
> > > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB
> > > > > are used as the cache.
> > > > > After filling up the 4GB cache, a client writes the cache
> > > > > at a constant
> > > speed
> > > > > during migration. This time, migration speed has no limit,
> > > > > and is up to
> > > the
> > > > > capability of 1Gbps Ethernet.
> > > > >
> > > > > Summarize the results first: (and you can read the precise
> > > > > number
> > > > > below)
> > > > >
> > > > > 1. 4 of these 15 workloads have a big proportion (>60%, even
> > > > > >80% during some iterations)
> > > > > of false dirty pages out of all the dirty pages since
> > > > > iteration 2 (and the
> > > big
> > > > > proportion lasts during the following iterations). They are
> > > cpu2006.zeusmp,
> > > > > cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > > 2. 2 workloads (idle, webserver) spend most of the migration
> > > > > time on iteration 1, even
> > > > > though the proportion of false dirty pages is big since
> > > > > iteration 2, the space to
> > > > > optimize is small.
> > > > > 3. 1 workload (kernel compilation) only have a big proportion
> > > > > during iteration 2, not
> > > > > in the other iterations.
> > > > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > > > proportion of false
> > > > > dirty pages since iteration 2. So the spaces to optimize
> > > > > for them are
> > > small.
> > > > >
> > > > > Now I want to talk a little more about the reasons why false
> > > > > dirty pages are produced.
> > > > > The first reason is what we have discussed before---the
> > > > > mechanism to track the dirty pages.
> > > > > And then I come up with another reason. Here is the situation: a
> > > > > write operation to one memory page happens, but it doesn't
> > > > > change any content of the page. So it's "write but not dirty",
> > > > > and kernel still marks it as dirty. One guy in our lab has done
> > > > > some experiments to figure out the proportion of "write but not
> dirty"
> > > > > operations, and he uses the cpu2006 benchmark suit. According to
> > > > > his results, general workloads has a little proportion (<10%) of
> > > > > "write but not dirty" out of all the write operations, while few
> > > > > workloads have higher proportion (one even as high as 50%). Now
> > > > > we are not sure why "write but not dirty" would happen, it just
> happened.
> > > > >
> > > > > So these two reasons contribute to the false dirty pages. To
> > > > > optimize, I compute and store the SHA1 hash before transferring
> > > > > each page. Next time, if one page needs retransmission, its
> > > > > SHA1 hash is computed again, and compared to the old hash. If
> > > > > the hash is the same, it's a false dirty page, and we just skip
> > > > > this page; Otherwise, the page is transferred, and the new hash
> > > > > replaces the old one for next comparison.
> > > > > The reason to use SHA1 hash but not byte-by-byte comparison is
> > > > > the memory overheads. One SHA1 hash is 20 bytes. So we need
> > > > > extra
> > > > > 20/4096 (<1/200) memory space of the whole VM memory, which is
> > > > > relatively small.
> > > > > As far as I know, SHA1 hash is widely used in the scenes of
> > > > > deduplication for backup systems.
> > > > > They have proven that the probability of hash collision is far
> > > > > smaller than disk hardware fault, so it's secure hash, that is,
> > > > > if the hashes of two chunks are the same, the content must be the
> same.
> > > > > So I think the SHA1 hash could replace byte-to-byte comparison
> > > > > in the VM memory scenery.
> > > > >
> > > > > Then I do the same migration experiments using the SHA1 hash.
> > > > > For the 4 workloads which have big proportions of false dirty
> > > > > pages, the improvement is remarkable. Without optimization, they
> > > > > either can not converge to stop-and-copy, or take a very long time to
> complete.
> > > > > With the
> > > > > SHA1 hash method, all of them now complete in a relatively short
> time.
> > > > > For the reason I have talked above, the other workloads don't
> > > > > get notable improvements from the optimization. So below, I only
> > > > > show the exact number after optimization for the 4 workloads
> > > > > with remarkable improvements.
> > > > >
> > > > > Any comments or suggestions?
> > > >
> > > > Maybe you can compare the performance of your solution as that of
> > > XBZRLE to see which one is better.
> > > > The merit of using SHA1 is that it can avoid data copy as that in
> > > > XBZRLE, and
> > > need less buffer.
> > > > How about the overhead of calculating the SHA1? Is it faster than
> > > > copying a
> > > page?
> > > >
> > > > Liang
> > > >
> > > >
> > >
> > > Yes, XBZRLE is able to handle the false dirty pages. However, if we
> > > want to avoid transferring all of the false dirty pages using
> > > XBZRLE, we need a buffer as big as the whole VM memory, while SHA1
> > > needs a much small buffer. Of course, if we have a buffer as big as
> > > the whole VM memory using XBZRLE, we could transfer less data on
> > > network than SHA1, because XBZRLE is able to compress similar pages.
> > > In a word, yes, the merit of using SHA1 is that it needs much less
> > > buffer, and leads to nice improvement if there are many false dirty pages.
> > >
> >
> > The current implementation of XBZRLE begins to buffer page from the
> > second iteration, Maybe it's worth to make it start to work from the first
> iteration based on your finding.
> >
> > > In terms of the overhead of calculating the SHA1 compared with
> > > transferring a page, it's related to the CPU and network
> > > performance. In my test environment(Intel Xeon
> > > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra
> > > computing overhead caused by calculating the SHA1, because the
> > > throughput of network (got by "info migrate") remains almost the same.
> >
> > You can check the CPU usage, or to measure the time spend on a local
> > live migration which use SHA1/ XBZRLE.
> >
> > Liang
> >
> >
>
> I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> 1. Begins to buffer pages from iteration 1; 2. As current implementation,
> begins to buffer pages from iteration 2.
>
> I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> memcached.
> I set the cache size as 256MB for zeusmp & mcf (they run in VM with 1GB
> ram), and set the cache size as 1GB for memcached (it run in VM with 6GB
> ram, and memcached takes 4GB as cache).
>
> As you can read from the data below, beginning to buffer pages from
> iteration 1 is better than the current implementation(from iteration 2),
> because the total migration time is shorter.
>
> SHA1 is better than the XBZRLE with the cache size I choose, because it leads
> to shorter migration time, and consumes far less memory overhead (<1/200
> of the total VM memory).
>
Hi Chunguang,
Have you tried using an XBZRLE cache size equal to the guest's RAM size?
Is SHA1 still faster in that case?
Thanks!
Liang
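The memory-overhead comparison above (a 20-byte digest per 4 KiB page versus an XBZRLE cache sized to all of guest RAM) can be checked with a quick calculation; the numbers below are illustrative only:

```python
PAGE_SIZE = 4096
SHA1_DIGEST = 20  # bytes of metadata kept per guest page


def sha1_metadata_bytes(ram_bytes):
    """Size of the per-page SHA-1 digest table for a guest of ram_bytes."""
    return (ram_bytes // PAGE_SIZE) * SHA1_DIGEST


GiB = 1024 ** 3
# For the 6 GiB memcached guest in the thread: about 30 MiB of digests,
# versus a 6 GiB XBZRLE cache if it were sized to cover the whole guest.
overhead = sha1_metadata_bytes(6 * GiB)
```

This is why SHA1 needs far less buffer than a full-RAM XBZRLE cache, though XBZRLE with such a cache can additionally compress similar pages.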
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-07 13:52 ` Chunguang Li
2016-11-07 14:17 ` Li, Liang Z
@ 2016-11-07 14:44 ` Li, Liang Z
1 sibling, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-07 14:44 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> > I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> > 1. Begins to buffer pages from iteration 1; 2. As current
> > implementation, begins to buffer pages from iteration 2.
> >
> > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> > memcached.
> > I set the cache size as 256MB for zeusmp & mcf (they run in VM with
> > 1GB ram), and set the cache size as 1GB for memcached (it run in VM
> > with 6GB ram, and memcached takes 4GB as cache).
> >
> > As you can read from the data below, beginning to buffer pages from
> > iteration 1 is better than the current implementation(from iteration
> > 2), because the total migration time is shorter.
> >
> > SHA1 is better than the XBZRLE with the cache size I choose, because
> > it leads to shorter migration time, and consumes far less memory
> > overhead (<1/200 of the total VM memory).
> >
>
> Hi Chunguang,
>
> Have you tried to use a large XBZRLE cache size which equals to the guest's
> RAM size?
> Is SHA1 faster in that case?
>
> Thanks!
> Liang
Intel's future chipsets will contain hardware engines which support SHA-x and MD5.
We can use these engines to offload the SHA/MD5 calculation from the CPU.
Liang
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-07 14:17 ` Li, Liang Z
@ 2016-11-08 5:27 ` Chunguang Li
0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08 5:27 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel,
stefanha, quintela
> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Monday, November 7, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> > > > > > > > > > I think this is "very" wasteful. Assume the workload
> > > > > > > > > > writes the pages
> > > > > > dirty randomly within the guest address space, and the transfer
> > > > > > speed is constant. Intuitively, I think nearly half of the dirty
> > > > > > pages produced in Iteration 1 is not really dirty. This means
> > > > > > the time of Iteration 2 is double of that to send only really dirty pages.
> > > > > > > > >
> > > > > > > > > It makes sense, can you get some perf numbers to show what
> > > > > > > > > kinds of workloads get impacted the most? That would also
> > > > > > > > > help us to figure out what kinds of speed improvements we
> > > > > > > > > can
> > > > expect.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Amit
> > > > > > > >
> > > > > > > > I have picked up 6 workloads and got the following
> > > > > > > > statistics numbers of every iteration (except the last
> > > > > > > > stop-copy one) during
> > > > precopy.
> > > > > > > > These numbers are obtained with the basic precopy migration,
> > > > > > > > without the capabilities like xbzrle or compression, etc.
> > > > > > > > The network for the migration is exclusive, with a separate
> > > > > > > > network for
> > > > the workloads.
> > > > > > > > They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > > > >
> > > > > > > > Three (booting, idle, web server) of them converged to the
> > > > > > > > stop-copy
> > > > > > phase,
> > > > > > > > with the given bandwidth and default downtime (300ms), while
> > > > > > > > the other three (kernel compilation, zeusmp, memcached) did not.
> > > > > > > >
> > > > > > > > One page is "not-really-dirty", if it is written first and
> > > > > > > > is sent later (and not written again after that) during one
> > > > > > > > iteration. I guess this would not happen so often during the
> > > > > > > > other iterations as during the 1st iteration. Because all
> > > > > > > > the pages of the VM are sent to the dest node
> > > > > > during
> > > > > > > > the 1st iteration, while during the others, only part of the
> > > > > > > > pages are
> > > > sent.
> > > > > > > > So I think the "not-really-dirty" pages should be produced
> > > > > > > > mainly during the 1st iteration , and maybe very little
> > > > > > > > during the other
> > > > iterations.
> > > > > > > >
> > > > > > > > If we could avoid resending the "not-really-dirty" pages,
> > > > > > > > intuitively, I think the time spent on Iteration 2 would be
> > > > > > > > halved. This is a chain
> > > > > > reaction,
> > > > > > > > because the dirty pages produced during Iteration 2 is
> > > > > > > > halved, which
> > > > > > incurs
> > > > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > > > >
> > > > > > > Yes; these numbers don't show how many of them are false dirty
> > > > though.
> > > > > > >
> > > > > > > One problem is thinking about pages that have been redirtied,
> > > > > > > if the page is
> > > > > > dirtied
> > > > > > > after the sync but before the network write then it's the
> > > > > > > false-dirty that you're describing.
> > > > > > >
> > > > > > > However, if the page is being written a few times, and so it
> > > > > > > would have
> > > > > > been written
> > > > > > > after the network write then it isn't a false-dirty.
> > > > > > >
> > > > > > > You might be able to figure that out with some kernel tracing
> > > > > > > of when the
> > > > > > dirtying
> > > > > > > happens, but it might be easier to write the fix!
> > > > > > >
> > > > > > > Dave
> > > > > >
> > > > > > Hi, I have made some new progress now.
> > > > > >
> > > > > > To tell how many false dirty pages there are exactly in each
> > > > > > iteration, I malloc a buffer in memory as big as the size of the
> > > > > > whole VM memory. When a page is transferred to the dest node, it
> > > > > > is copied to the buffer; During the next iteration, if one page
> > > > > > is transferred, it is compared to the old one in the buffer, and
> > > > > > the old one will be replaced for next comparison if it is really dirty.
> > > > > > Thus, we are now able to get the exact number of false dirty pages.
> > > > > >
> > > > > > This time, I use 15 workloads to get the statistic number. They are:
> > > > > >
> > > > > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They
> > > > > > are all scientific
> > > > > > computing workloads like Quantum Chromodynamics, Fluid
> > > > > > Dynamics,
> > > > etc.
> > > > > > I pick
> > > > > > up these 11 benchmarks because compared to others, they
> > > > > > have bigger memory
> > > > > > occupation and higher memory dirty rate. Thus most of them
> > > > > > could not converge
> > > > > > to stop-and-copy using the default migration speed (32MB/s).
> > > > > > 2. kernel compilation
> > > > > > 3. idle VM
> > > > > > 4. Apache web server which serves static content
> > > > > >
> > > > > > (the above workloads are all running in VM with 1 vcpu and 1GB
> > > > > > memory, and the
> > > > > > migration speed is the default 32MB/s)
> > > > > >
> > > > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB
> > > > > > are used as the cache.
> > > > > > After filling up the 4GB cache, a client writes the cache
> > > > > > at a constant
> > > > speed
> > > > > > during migration. This time, migration speed has no limit,
> > > > > > and is up to
> > > > the
> > > > > > capability of 1Gbps Ethernet.
> > > > > >
> > > > > > Summarize the results first: (and you can read the precise
> > > > > > number
> > > > > > below)
> > > > > >
> > > > > > 1. 4 of these 15 workloads have a big proportion (>60%, even
> > > > > > >80% during some iterations)
> > > > > > of false dirty pages out of all the dirty pages since
> > > > > > iteration 2 (and the
> > > > big
> > > > > > proportion lasts during the following iterations). They are
> > > > cpu2006.zeusmp,
> > > > > > cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > > > 2. 2 workloads (idle, webserver) spend most of the migration
> > > > > > time on iteration 1, even
> > > > > > though the proportion of false dirty pages is big since
> > > > > > iteration 2, the space to
> > > > > > optimize is small.
> > > > > > 3. 1 workload (kernel compilation) only have a big proportion
> > > > > > during iteration 2, not
> > > > > > in the other iterations.
> > > > > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little
> > > > > > proportion of false
> > > > > > dirty pages since iteration 2. So the spaces to optimize
> > > > > > for them are
> > > > small.
> > > > > >
> > > > > > Now I want to talk a little more about the reasons why false
> > > > > > dirty pages are produced.
> > > > > > The first reason is what we have discussed before---the
> > > > > > mechanism to track the dirty pages.
> > > > > > And then I come up with another reason. Here is the situation: a
> > > > > > write operation to one memory page happens, but it doesn't
> > > > > > change any content of the page. So it's "write but not dirty",
> > > > > > and kernel still marks it as dirty. One guy in our lab has done
> > > > > > some experiments to figure out the proportion of "write but not
> > dirty"
> > > > > > operations, and he uses the cpu2006 benchmark suit. According to
> > > > > > his results, general workloads has a little proportion (<10%) of
> > > > > > "write but not dirty" out of all the write operations, while few
> > > > > > workloads have higher proportion (one even as high as 50%). Now
> > > > > > we are not sure why "write but not dirty" would happen, it just
> > happened.
> > > > > >
> > > > > > So these two reasons contribute to the false dirty pages. To
> > > > > > optimize, I compute and store the SHA1 hash before transferring
> > > > > > each page. Next time, if one page needs retransmission, its
> > > > > > SHA1 hash is computed again, and compared to the old hash. If
> > > > > > the hash is the same, it's a false dirty page, and we just skip
> > > > > > this page; Otherwise, the page is transferred, and the new hash
> > > > > > replaces the old one for next comparison.
> > > > > > The reason to use a SHA1 hash rather than byte-by-byte comparison
> > > > > > is the memory overhead. One SHA1 hash is 20 bytes, so we need an
> > > > > > extra 20/4096 (<1/200) of the whole VM memory, which is
> > > > > > relatively small.
> > > > > > As far as I know, SHA1 hashes are widely used for deduplication
> > > > > > in backup systems.
> > > > > > It has been shown that the probability of a hash collision is far
> > > > > > smaller than that of a disk hardware fault, so it is a secure
> > > > > > hash, that is, if the hashes of two chunks are the same, the
> > > > > > content must be the same.
> > > > > > So I think the SHA1 hash could replace byte-by-byte comparison
> > > > > > in the VM memory scenario.
> > > > > >
> > > > > > Then I do the same migration experiments using the SHA1 hash.
> > > > > > For the 4 workloads which have big proportions of false dirty
> > > > > > pages, the improvement is remarkable. Without optimization, they
> > > > > > either cannot converge to stop-and-copy, or take a very long time
> > > > > > to complete. With the SHA1 hash method, all of them now complete
> > > > > > in a relatively short time.
> > > > > > For the reason I have talked above, the other workloads don't
> > > > > > get notable improvements from the optimization. So below, I only
> > > > > > show the exact number after optimization for the 4 workloads
> > > > > > with remarkable improvements.
> > > > > >
> > > > > > Any comments or suggestions?
> > > > >
> > > > > Maybe you can compare the performance of your solution with that of
> > > > > XBZRLE to see which one is better.
> > > > > The merit of using SHA1 is that it can avoid the data copy done in
> > > > > XBZRLE, and needs a smaller buffer.
> > > > > How about the overhead of calculating the SHA1? Is it faster than
> > > > > copying a page?
> > > > >
> > > > > Liang
> > > > >
> > > > >
> > > >
> > > > Yes, XBZRLE is able to handle the false dirty pages. However, if we
> > > > want to avoid transferring all of the false dirty pages using
> > > > XBZRLE, we need a buffer as big as the whole VM memory, while SHA1
> > > > needs a much smaller buffer. Of course, with a buffer as big as the
> > > > whole VM memory, XBZRLE could transfer less data over the network
> > > > than SHA1, because XBZRLE is able to compress similar pages.
> > > > In a word, yes, the merit of using SHA1 is that it needs a much
> > > > smaller buffer, and leads to a nice improvement if there are many
> > > > false dirty pages.
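
For concreteness, the buffer-size trade-off described above can be sketched with some simple arithmetic (illustrative only, assuming 4 KiB pages; these are not QEMU internals):

```python
# Rough per-page metadata overhead of the two approaches, assuming
# 4 KiB guest pages (illustrative numbers, not QEMU code).
PAGE_SIZE = 4096
SHA1_DIGEST = 20          # bytes stored per page by the SHA1 method

def overhead_fraction(per_page_bytes, page_size=PAGE_SIZE):
    """Extra memory needed, as a fraction of guest RAM."""
    return per_page_bytes / page_size

# SHA1: one 20-byte digest per page -> < 1/200 of guest RAM.
sha1_frac = overhead_fraction(SHA1_DIGEST)      # 20/4096 ~= 0.0049

# XBZRLE: to catch every false-dirty page it must cache the full
# previous copy of each page -> 100% of guest RAM.
xbzrle_frac = overhead_fraction(PAGE_SIZE)      # 4096/4096 = 1.0

gb = 1 << 30
print(f"SHA1 overhead for a 6 GiB guest: {6 * gb * sha1_frac / (1 << 20):.0f} MiB")
print(f"XBZRLE cache for a 6 GiB guest:  {6 * gb * xbzrle_frac / (1 << 30):.0f} GiB")
```

So for the 6 GB memcached guest in the thread, the SHA1 method needs on the order of 30 MiB of digests, versus gigabytes of XBZRLE cache.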
> > > >
> > >
> > > The current implementation of XBZRLE begins to buffer pages from the
> > > second iteration. Maybe it's worth making it start from the first
> > > iteration, based on your finding.
> > >
> > > > In terms of the overhead of calculating the SHA1 compared with
> > > > transferring a page, it depends on the CPU and network
> > > > performance. In my test environment (Intel Xeon
> > > > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra
> > > > computing overhead caused by calculating the SHA1, because the
> > > > network throughput (obtained via "info migrate") remains almost the same.
> > >
> > > You can check the CPU usage, or measure the time spent on a local
> > > live migration which uses SHA1/XBZRLE.
> > >
> > > Liang
> > >
> > >
> >
> > I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> > 1. It begins to buffer pages from iteration 1; 2. As in the current
> > implementation, it begins to buffer pages from iteration 2.
> >
> > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> > memcached.
> > I set the cache size to 256MB for zeusmp & mcf (they run in a VM with 1GB
> > RAM), and to 1GB for memcached (it runs in a VM with 6GB
> > RAM, and memcached takes 4GB as cache).
> >
> > As you can read from the data below, beginning to buffer pages from
> > iteration 1 is better than the current implementation (from iteration 2),
> > because the total migration time is shorter.
> >
> > SHA1 is better than XBZRLE with the cache sizes I chose, because it leads
> > to shorter migration time and incurs far less memory overhead (<1/200
> > of the total VM memory).
> >
>
> Hi Chunguang,
>
> Have you tried to use a large XBZRLE cache size which equals the guest's RAM size?
> Is SHA1 faster in that case?
>
> Thanks!
> Liang
You can check the data below. For zeusmp and mcf, when the XBZRLE cache size equals
the guest's RAM size (in fact, the 1024MB cache is a little smaller than the RAM
size, because the guest has a little extra RAM beyond the 1GB we set),
XBZRLE is faster than SHA1.
For memcached, I am not able to set the cache size to the 6GB RAM size, because the
cache size has to be a power of 2; and I am not able to set it larger than the RAM size,
because the current implementation doesn't allow that. So I set the cache size to 4GB,
and XBZRLE with this cache size is almost the same as SHA1 in terms of migration time.
Note that XBZRLE begins to buffer pages from iteration 1.
zeusmp 1024MB cache
Iteration 1, duration: 21604 ms , transferred pages: 266450 (dup: 89509, n: 176941, x: 0) , new dirty pages: 129647 , remaining dirty pages: 129647
Iteration 2, duration: 652 ms , transferred pages: 89270 (dup: 78176, n: 1085, x: 10009) , new dirty pages: 46438 , remaining dirty pages: 46438
Iteration 3, duration: 400 ms , transferred pages: 35789 (dup: 30536, n: 0, x: 5253) , new dirty pages: 33569 , remaining dirty pages: 33569
Iteration 4, duration: 470 ms , transferred pages: 19106 (dup: 10317, n: 75, x: 8714) , new dirty pages: 39307 , remaining dirty pages: 39307
Iteration 5, duration: 72 ms , transferred pages: 17853 (dup: 15904, n: 0, x: 1949) , new dirty pages: 4078 , remaining dirty pages: 4078
Iteration 6, duration: 10 ms , transferred pages: 3280 (dup: 2910, n: 0, x: 370) , new dirty pages: 521 , remaining dirty pages: 521
Iteration 7, duration: 254 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 0 , remaining dirty pages: 521
total time: 23481 milliseconds (vs. 27225 milliseconds for SHA1)
mcf 1024MB cache
Iteration 1, duration: 31704 ms , transferred pages: 266450 (dup: 6794, n: 259656, x: 0) , new dirty pages: 233250 , remaining dirty pages: 233250
Iteration 2, duration: 544 ms , transferred pages: 34186 (dup: 182, n: 423, x: 33581) , new dirty pages: 32757 , remaining dirty pages: 32757
Iteration 3, duration: 67 ms , transferred pages: 8536 (dup: 0, n: 0, x: 8536) , new dirty pages: 5305 , remaining dirty pages: 5305
Iteration 4, duration: 13 ms , transferred pages: 2125 (dup: 0, n: 0, x: 2125) , new dirty pages: 1632 , remaining dirty pages: 1632
Iteration 5, duration: 9 ms , transferred pages: 1038 (dup: 0, n: 0, x: 1038) , new dirty pages: 1095 , remaining dirty pages: 1095
Iteration 6, duration: 3 ms , transferred pages: 592 (dup: 0, n: 0, x: 592) , new dirty pages: 1148 , remaining dirty pages: 1148
Iteration 7, duration: 2 ms , transferred pages: 136 (dup: 0, n: 0, x: 136) , new dirty pages: 1123 , remaining dirty pages: 1123
Iteration 8, duration: 2 ms , transferred pages: 2 (dup: 0, n: 0, x: 2) , new dirty pages: 985 , remaining dirty pages: 985
Iteration 9, duration: 2 ms , transferred pages: 14 (dup: 0, n: 0, x: 14) , new dirty pages: 640 , remaining dirty pages: 640
Iteration 10, duration: 2 ms , transferred pages: 16 (dup: 0, n: 0, x: 16) , new dirty pages: 622 , remaining dirty pages: 622
Iteration 11, duration: 1 ms , transferred pages: 1 (dup: 0, n: 0, x: 1) , new dirty pages: 693 , remaining dirty pages: 693
Iteration 12, duration: 1 ms , transferred pages: 122 (dup: 0, n: 0, x: 122) , new dirty pages: 639 , remaining dirty pages: 639
Iteration 13, duration: 2 ms , transferred pages: 475 (dup: 0, n: 0, x: 475) , new dirty pages: 522 , remaining dirty pages: 522
Iteration 14, duration: 22 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 27 , remaining dirty pages: 549
total time: 32393 milliseconds (vs. 97919 milliseconds for SHA1)
memcached 4096MB cache
Iteration 1, duration: 41025 ms , transferred pages: 1569059 (dup: 395085, n: 1173974, x: 0) , new dirty pages: 560788 , remaining dirty pages: 568899
Iteration 2, duration: 8218 ms , transferred pages: 300889 (dup: 3963, n: 142928, x: 153998) , new dirty pages: 158832 , remaining dirty pages: 167022
Iteration 3, duration: 2408 ms , transferred pages: 98923 (dup: 285, n: 33854, x: 64784) , new dirty pages: 68647 , remaining dirty pages: 77338
Iteration 4, duration: 869 ms , transferred pages: 43408 (dup: 64, n: 17911, x: 25433) , new dirty pages: 26087 , remaining dirty pages: 33845
Iteration 5, duration: 455 ms , transferred pages: 23048 (dup: 55, n: 10156, x: 12837) , new dirty pages: 15275 , remaining dirty pages: 16636
Iteration 6, duration: 162 ms , transferred pages: 7939 (dup: 55, n: 2425, x: 5459) , new dirty pages: 6009 , remaining dirty pages: 10051
Iteration 7, duration: 52 ms , transferred pages: 5761 (dup: 212, n: 707, x: 4842) , new dirty pages: 2204 , remaining dirty pages: 4027
Iteration 8, duration: 1 ms , transferred pages: 0 (dup: 0, n: 0, x: 0) , new dirty pages: 0 , remaining dirty pages: 4027
total time: 53255 milliseconds (vs. 54693 milliseconds for SHA1)
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-03 8:25 ` Chunguang Li
2016-11-03 9:59 ` Li, Liang Z
2016-11-03 10:13 ` Li, Liang Z
@ 2016-11-08 11:05 ` Dr. David Alan Gilbert
2016-11-08 13:40 ` Chunguang Li
2 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-11-08 11:05 UTC (permalink / raw)
To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela
* Chunguang Li (lichunguang@hust.edu.cn) wrote:
>
>
>
> > -----Original Messages-----
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Sent Time: Friday, October 14, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> >
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > Sent: Friday, September 30, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > >
> > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > Sent: Monday, September 26, 2016
> > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > >
> > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > Hi all!
> > > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code. I figured out that every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG), and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to resending some non-dirty pages.
> > > > > > >
> > > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When pages that haven't been sent yet are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent during one iteration.
> > > > > > >
> > > > > > >
> > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > >
> > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > case you're talking about is:
> > > > > >
> > > > > > Iteration 1
> > > > > > sync bitmap
> > > > > > start sending pages
> > > > > > page 'n' is modified - but hasn't been sent yet
> > > > > > page 'n' gets sent
> > > > > > Iteration 2
> > > > > > sync bitmap
> > > > > > 'page n is shown as modified'
> > > > > > send page 'n' again
> > > > > >
> > > > >
> > > > > Yes,this is right the case I am talking about.
> > > > >
> > > > > > So you're right that is wasteful; I guess it's more wasteful
> > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > is large.
> > > > >
> > > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double what it would take to send only the really dirty pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most? That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > >
> > > > Amit
> > >
> > > I have picked up 6 workloads and got the following statistics numbers
> > > of every iteration (except the last stop-copy one) during precopy.
> > > These numbers are obtained with the basic precopy migration, without
> > > the capabilities like xbzrle or compression, etc. The network for the
> > > migration is exclusive, with a separate network for the workloads.
> > > They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three (booting, idle, web server) of them converged to the stop-copy phase,
> > > with the given bandwidth and default downtime (300ms), while the other
> > > three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty" if it is written first and is sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen as often during the other iterations as during the 1st
> > > iteration, because all the pages of the VM are sent to the dest node during
> > > the 1st iteration, while during the others, only part of the pages are sent.
> > > So I think the "not-really-dirty" pages should be produced mainly during
> > > the 1st iteration, and maybe very few during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > think the time spent on Iteration 2 would be halved. This is a chain reaction,
> > > because the dirty pages produced during Iteration 2 are halved, which in turn
> > > halves the time spent on Iteration 3, then Iteration 4, 5...
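
The chain-reaction intuition above can be illustrated with a toy model (all assumptions mine, not measured data: each iteration sends the pages dirtied during the previous one, iteration time is pages/bandwidth, and the dirty rate is constant):

```python
# Toy model of precopy convergence (illustrative assumptions, not
# measured data): iteration time = pages_to_send / bandwidth, and the
# next iteration must send the pages dirtied while this one ran.
def iteration_times(first_iter_pages, dirty_rate, bandwidth, n_iters):
    """dirty_rate and bandwidth are both in pages per second."""
    times = []
    pages = first_iter_pages
    for _ in range(n_iters):
        t = pages / bandwidth
        times.append(t)
        pages = dirty_rate * t  # dirty pages produced during this iteration
    return times

# If skipping false-dirty resends halves the effective dirty rate,
# each later iteration shrinks geometrically faster.
base = iteration_times(262144, dirty_rate=8000, bandwidth=16000, n_iters=4)
opt = iteration_times(262144, dirty_rate=4000, bandwidth=16000, n_iters=4)
# opt[1] is half of base[1], opt[2] a quarter of base[2], and so on.
```

In this simple model, halving the dirty rate halves iteration 2, quarters iteration 3, etc., matching the "chain reaction" described.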
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied, if the page is dirtied
> > after the sync but before the network write then it's the false-dirty that
> > you're describing.
> >
> > However, if the page is being written a few times, and so it would have been written
> > after the network write then it isn't a false-dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the dirtying
> > happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell how many false dirty pages there are exactly in each iteration, I malloc a
> buffer in memory as big as the size of the whole VM memory. When a page is
> transferred to the dest node, it is copied to the buffer; during the next iteration,
> if one page is transferred, it is compared to the old one in the buffer, and the
> old one will be replaced for next comparison if it is really dirty. Thus, we are now
> able to get the exact number of false dirty pages.
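
The measurement described here could be sketched like this (illustrative Python, not the actual QEMU instrumentation; all names are mine):

```python
# Sketch of the false-dirty measurement: keep a shadow copy of every
# transmitted page; on retransmission, compare contents to classify
# the page as really dirty or false dirty.
PAGE_SIZE = 4096

class FalseDirtyCounter:
    def __init__(self):
        self.shadow = {}          # page number -> last transmitted content
        self.really_dirty = 0
        self.false_dirty = 0

    def on_transfer(self, pfn, data):
        old = self.shadow.get(pfn)
        if old is None:           # first transmission: nothing to compare
            self.shadow[pfn] = bytes(data)
        elif old == data:         # marked dirty, but content unchanged
            self.false_dirty += 1
        else:                     # genuinely modified since last send
            self.really_dirty += 1
            self.shadow[pfn] = bytes(data)

c = FalseDirtyCounter()
c.on_transfer(0, b"\x00" * PAGE_SIZE)   # iteration 1: first send
c.on_transfer(0, b"\x00" * PAGE_SIZE)   # iteration 2: false dirty
c.on_transfer(0, b"\x01" * PAGE_SIZE)   # iteration 3: really dirty
print(c.false_dirty, c.really_dirty)    # 1 1
```

The shadow buffer is only replaced when a page is really dirty, so each comparison is always against the content that was actually sent last.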
>
> This time, I use 15 workloads to get the statistic number. They are:
>
> 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific
> computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked
> these 11 benchmarks because, compared to the others, they have a larger memory
> footprint and a higher memory dirty rate. Thus most of them could not converge
> to stop-and-copy using the default migration speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server which serves static content
>
> (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the
> migration speed is the default 32MB/s)
>
> 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache.
> After filling up the 4GB cache, a client writes the cache at a constant speed
> during migration. This time, migration speed has no limit, and is up to the
> capability of 1Gbps Ethernet.
>
> Summarize the results first: (and you can read the precise number below)
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations)
> of false dirty pages out of all the dirty pages since iteration 2 (and the big
> proportion lasts during the following iterations). They are cpu2006.zeusmp,
> cpu2006.bzip2, cpu2006.mcf, and memcached.
> 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1, so even
> though the proportion of false dirty pages is big since iteration 2, the room to
> optimize is small.
> 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not
> in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false
> dirty pages since iteration 2, so the room to optimize for them is small.
>
> Now I want to talk a little more about the reasons why false dirty pages are produced.
> The first reason is what we have discussed before---the mechanism to track the dirty
> pages.
> And then I come up with another reason. Here is the situation: a write operation to one
> memory page happens, but it doesn't change any content of the page. So it's "write but
> not dirty", and the kernel still marks it as dirty. A colleague in our lab has done some
> experiments to figure out the proportion of "write but not dirty" operations, using the
> cpu2006 benchmark suite. According to his results, most workloads have a small proportion
> (<10%) of "write but not dirty" out of all the write operations, while a few workloads
> have a higher proportion (one even as high as 50%). We are not yet sure why "write but
> not dirty" happens.
There are a few different reasons I can think of:
a) You have a flag or mutex that's set and cleared; so it gets set (marked
dirty) and cleared around some operation. By the time we come to migrate
it then it's back to cleared again.
Similarly with other temporary data structures.
b) Some system operation causes the page to be moved - e.g. swap or the kernel
reorganising memory.
However, it's a shame that you can't tell in your experiment which of the
two cases we're hitting. I'd like to know if it's worth working on
making the page sync mechanism better or if it's more important to deal
with the second reason you show.
> So these two reasons contribute to the false dirty pages. To optimize, I compute and store
> the SHA1 hash before transferring each page. Next time, if one page needs retransmission, its
> SHA1 hash is computed again, and compared to the old hash. If the hash is the same, it's a
> false dirty page, and we just skip this page; Otherwise, the page is transferred, and the new
> hash replaces the old one for next comparison.
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One
> SHA1 hash is 20 bytes, so we need an extra 20/4096 (<1/200) of the whole VM memory, which
> is relatively small.
> As far as I know, SHA1 hashes are widely used for deduplication in backup systems.
> It has been shown that the probability of a hash collision is far smaller than that of a disk
> hardware fault, so it is a secure hash, that is, if the hashes of two chunks are the same, the
> content must be the same. So I think the SHA1 hash could replace byte-by-byte comparison in
> the VM memory scenario.
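
A minimal sketch of this skip logic (illustrative Python, not the actual QEMU patch; the function and dictionary names are mine):

```python
import hashlib

# Store a 20-byte SHA1 digest per transmitted page; on retransmission,
# skip the page when its hash is unchanged (i.e. a false dirty page).
def should_send(hashes, pfn, data):
    """Return True if the page really changed; update the stored hash."""
    digest = hashlib.sha1(data).digest()   # 20 bytes: < 1/200 of a 4 KiB page
    if hashes.get(pfn) == digest:
        return False                       # false dirty: skip retransmission
    hashes[pfn] = digest                   # really dirty (or first send)
    return True

hashes = {}
page = b"\x07" * 4096
assert should_send(hashes, 0, page)            # first send: always transmit
assert not should_send(hashes, 0, page)        # marked dirty but unchanged: skipped
assert should_send(hashes, 0, b"\x08" * 4096)  # really modified: transmitted
```

Only the digest is retained between iterations, which is what keeps the extra memory below 1/200 of guest RAM.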
There was a proposal ( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html )
to do a migration system where
a copy of the migration RAM is stored on disc on the destination for cases where similar VMs
are migrated, and it used a checksum for each page to find the matching page
in the cache; that originally used a smaller hash, I think in the end they used a SHA-256.
(Hash-based checks still make me nervous about intentional collisions, but that's probably
me being paranoid?)
> Then I do the same migration experiments using the SHA1 hash. For the 4 workloads which have
> big proportions of false dirty pages, the improvement is remarkable. Without optimization,
> they either cannot converge to stop-and-copy, or take a very long time to complete. With the
> SHA1 hash method, all of them now complete in a relatively short time.
> For the reason I have talked above, the other workloads don't get notable improvements from the
> optimization. So below, I only show the exact number after optimization for the 4 workloads with
> remarkable improvements.
>
> Any comments or suggestions?
You might be able to save some of the CPU time; we've
got a test that checks if a page is all-zero; if you're doing
the SHA calculation you could avoid doing the all-zero check
and replace it by comparing the output of the SHA.
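
This suggestion could look roughly like the following sketch (my naming, not QEMU code): precompute the SHA1 of an all-zero page once, and then a page is all-zero exactly when its digest matches, so no separate byte scan is needed.

```python
import hashlib

# Once a SHA1 is computed per page anyway, the all-zero check can
# become a 20-byte digest comparison instead of a 4096-byte scan.
PAGE_SIZE = 4096
ZERO_DIGEST = hashlib.sha1(b"\x00" * PAGE_SIZE).digest()  # computed once

def is_zero_page(digest):
    """True iff the page whose SHA1 is `digest` is all zeroes."""
    return digest == ZERO_DIGEST

zero = hashlib.sha1(b"\x00" * PAGE_SIZE).digest()
nonzero = hashlib.sha1(b"\x01" * PAGE_SIZE).digest()
assert is_zero_page(zero)
assert not is_zero_page(nonzero)
```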
>
> Below is the experiments data:
> (
> "dup" means zero page, this kind of pages takes very little migration time and network
> resources, so they are always not regard as dirty pages in my numbers;
> "rd" means really dirty pages;
> "fd" means false dirty pages;
> The numbers refer to the quantities of pages.
> )
>
> ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
>
> 1. memcached
>
> ----- original pre-copy (can not converge): -----
> Iteration 1, duration: 42111 ms , transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0) , new dirty pages: 499015 , remaining dirty pages: 507397
> Iteration 2, duration: 17208 ms , transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284) , new dirty pages: 261237 , remaining dirty pages: 269688
> Iteration 3, duration: 9134 ms , transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958) , new dirty pages: 170281 , remaining dirty pages: 177592
> Iteration 4, duration: 5920 ms , transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392) , new dirty pages: 121154 , remaining dirty pages: 128780
> Iteration 5, duration: 4239 ms , transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350) , new dirty pages: 100976 , remaining dirty pages: 108205
> Iteration 6, duration: 3495 ms , transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242) , new dirty pages: 74547 , remaining dirty pages: 82399
> Iteration 7, duration: 2583 ms , transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088) , new dirty pages: 58209 , remaining dirty pages: 66448
> Iteration 8, duration: 2039 ms , transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568) , new dirty pages: 43511 , remaining dirty pages: 51425
> Iteration 9, duration: 1513 ms , transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788) , new dirty pages: 43722 , remaining dirty pages: 51663
> Iteration 10, duration: 1521 ms , transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151) , new dirty pages: 35347 , remaining dirty pages: 43334
> Iteration 11, duration: 1230 ms , transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994) , new dirty pages: 28851 , remaining dirty pages: 36898
> Iteration 12, duration: 1031 ms , transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426) , new dirty pages: 27062 , remaining dirty pages: 34309
> Iteration 13, duration: 917 ms , transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180) , new dirty pages: 22723 , remaining dirty pages: 30647
> Iteration 14, duration: 762 ms , transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492) , new dirty pages: 18208 , remaining dirty pages: 26953
> Iteration 15, duration: 650 ms , transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693) , new dirty pages: 16085 , remaining dirty pages: 24402
> Iteration 16, duration: 554 ms , transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363) , new dirty pages: 14766 , remaining dirty pages: 23222
> Iteration 17, duration: 538 ms , transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655) , new dirty pages: 13381 , remaining dirty pages: 21169
> Iteration 18, duration: 487 ms , transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271) , new dirty pages: 13325 , remaining dirty pages: 20405
> Iteration 19, duration: 428 ms , transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744) , new dirty pages: 10274 , remaining dirty pages: 18447
> Iteration 20, duration: 377 ms , transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469) , new dirty pages: 9708 , remaining dirty pages: 17268
> Iteration 21, duration: 320 ms , transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433) , new dirty pages: 8015 , remaining dirty pages: 16061
> Iteration 22, duration: 268 ms , transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417) , new dirty pages: 7227 , remaining dirty pages: 15667
> Iteration 23, duration: 269 ms , transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347) , new dirty pages: 7591 , remaining dirty pages: 15445
> Iteration 24, duration: 271 ms , transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184) , new dirty pages: 15126 , remaining dirty pages: 22822
> Iteration 25, duration: 549 ms , transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213) , new dirty pages: 14559 , remaining dirty pages: 21563
> Iteration 26, duration: 499 ms , transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518) , new dirty pages: 11947 , remaining dirty pages: 19229
> Iteration 27, duration: 376 ms , transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248) , new dirty pages: 8561 , remaining dirty pages: 16967
> Iteration 28, duration: 324 ms , transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027) , new dirty pages: 8655 , remaining dirty pages: 16272
> Iteration 29, duration: 274 ms , transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725) , new dirty pages: 6300 , remaining dirty pages: 14759
> Iteration 30, duration: 218 ms , transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099) , new dirty pages: 5143 , remaining dirty pages: 13562
>
> ----- after optimization: -----
> Iteration 1, duration: 40664 ms , transferred pages: 1569037 (dup: 405940, rd: 1163097) , new dirty pages: 506846 , remaining dirty pages: 514979
> Iteration 2, duration: 8032 ms , transferred pages: 161130 (dup: 4007, rd: 157123) , new dirty pages: 153479 , remaining dirty pages: 153479
Big difference.
> Iteration 3, duration: 2620 ms , transferred pages: 65260 (dup: 20, rd: 65240) , new dirty pages: 64014 , remaining dirty pages: 67100
> Iteration 4, duration: 1160 ms , transferred pages: 30227 (dup: 60, rd: 30167) , new dirty pages: 34031 , remaining dirty pages: 41414
> Iteration 5, duration: 648 ms , transferred pages: 18700 (dup: 56, rd: 18644) , new dirty pages: 18375 , remaining dirty pages: 25536
> Iteration 6, duration: 389 ms , transferred pages: 11399 (dup: 55, rd: 11344) , new dirty pages: 12536 , remaining dirty pages: 17516
> Iteration 7, duration: 292 ms , transferred pages: 8197 (dup: 0, rd: 8197) , new dirty pages: 8387 , remaining dirty pages: 16802
> Iteration 8, duration: 171 ms , transferred pages: 4931 (dup: 39, rd: 4892) , new dirty pages: 6182 , remaining dirty pages: 14060
> Iteration 9, duration: 163 ms , transferred pages: 4355 (dup: 16, rd: 4339) , new dirty pages: 5530 , remaining dirty pages: 11973
> Iteration 10, duration: 104 ms , transferred pages: 3266 (dup: 0, rd: 3266) , new dirty pages: 2893 , remaining dirty pages: 11014
> Iteration 11, duration: 52 ms , transferred pages: 1153 (dup: 0, rd: 1153) , new dirty pages: 1586 , remaining dirty pages: 10516
> Iteration 12, duration: 52 ms , transferred pages: 1921 (dup: 39, rd: 1882) , new dirty pages: 1619 , remaining dirty pages: 8842
> Iteration 13, duration: 62 ms , transferred pages: 1537 (dup: 0, rd: 1537) , new dirty pages: 2052 , remaining dirty pages: 8871
> Iteration 14, duration: 58 ms , transferred pages: 1665 (dup: 0, rd: 1665) , new dirty pages: 1947 , remaining dirty pages: 7989
> Iteration 15, duration: 2 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 7989
> total time: 54693 milliseconds
Very nice.
Dave
> 2. cpu2006.zeusmp
>
> ----- original pre-copy (can not converge): -----
> Iteration 1, duration: 21112 ms , transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0) , new dirty pages: 127866 , remaining dirty pages: 127866
> Iteration 2, duration: 6192 ms , transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511) , new dirty pages: 131655 , remaining dirty pages: 133859
> Iteration 3, duration: 6699 ms , transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319) , new dirty pages: 121027 , remaining dirty pages: 122949
> Iteration 4, duration: 5999 ms , transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688) , new dirty pages: 122759 , remaining dirty pages: 123196
> Iteration 5, duration: 5804 ms , transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265) , new dirty pages: 123697 , remaining dirty pages: 124176
> Iteration 6, duration: 5698 ms , transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437) , new dirty pages: 121838 , remaining dirty pages: 122306
> Iteration 7, duration: 5515 ms , transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760) , new dirty pages: 122382 , remaining dirty pages: 122382
> Iteration 8, duration: 6086 ms , transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004) , new dirty pages: 121587 , remaining dirty pages: 123144
> Iteration 9, duration: 5899 ms , transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913) , new dirty pages: 120391 , remaining dirty pages: 122571
> Iteration 10, duration: 5801 ms , transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563) , new dirty pages: 122302 , remaining dirty pages: 123448
> Iteration 11, duration: 5909 ms , transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035) , new dirty pages: 123258 , remaining dirty pages: 123258
> Iteration 12, duration: 6293 ms , transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641) , new dirty pages: 123623 , remaining dirty pages: 125670
> Iteration 13, duration: 6398 ms , transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062) , new dirty pages: 122355 , remaining dirty pages: 123128
> Iteration 14, duration: 6301 ms , transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909) , new dirty pages: 120980 , remaining dirty pages: 122215
> Iteration 15, duration: 6304 ms , transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653) , new dirty pages: 121628 , remaining dirty pages: 122454
> Iteration 16, duration: 6398 ms , transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826) , new dirty pages: 122246 , remaining dirty pages: 122536
> Iteration 17, duration: 6201 ms , transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649) , new dirty pages: 121460 , remaining dirty pages: 122448
> Iteration 18, duration: 6401 ms , transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939) , new dirty pages: 123518 , remaining dirty pages: 123694
> Iteration 19, duration: 7003 ms , transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794) , new dirty pages: 120568 , remaining dirty pages: 122389
> Iteration 20, duration: 6400 ms , transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901) , new dirty pages: 121229 , remaining dirty pages: 122196
> Iteration 21, duration: 6703 ms , transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786) , new dirty pages: 123284 , remaining dirty pages: 125585
> Iteration 22, duration: 6902 ms , transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282) , new dirty pages: 122057 , remaining dirty pages: 123758
> Iteration 23, duration: 6800 ms , transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837) , new dirty pages: 120916 , remaining dirty pages: 122664
> Iteration 24, duration: 7202 ms , transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658) , new dirty pages: 122715 , remaining dirty pages: 123428
> Iteration 25, duration: 7202 ms , transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695) , new dirty pages: 123180 , remaining dirty pages: 123689
> Iteration 26, duration: 7404 ms , transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939) , new dirty pages: 122149 , remaining dirty pages: 122746
> Iteration 27, duration: 7205 ms , transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419) , new dirty pages: 120299 , remaining dirty pages: 122618
> Iteration 28, duration: 7100 ms , transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541) , new dirty pages: 122984 , remaining dirty pages: 124528
> Iteration 29, duration: 7904 ms , transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865) , new dirty pages: 127080 , remaining dirty pages: 127548
> Iteration 30, duration: 7906 ms , transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073) , new dirty pages: 125028 , remaining dirty pages: 125058
>
> ----- after optimization: -----
> Iteration 1, duration: 21601 ms , transferred pages: 266450 (dup: 89731, rd: 176719) , new dirty pages: 139843 , remaining dirty pages: 139843
> Iteration 2, duration: 1747 ms , transferred pages: 92077 (dup: 78364, rd: 13713) , new dirty pages: 90945 , remaining dirty pages: 90945
> Iteration 3, duration: 1592 ms , transferred pages: 62253 (dup: 49435, rd: 12818) , new dirty pages: 76929 , remaining dirty pages: 76929
> Iteration 4, duration: 992 ms , transferred pages: 44837 (dup: 37886, rd: 6951) , new dirty pages: 71331 , remaining dirty pages: 72916
> Iteration 5, duration: 998 ms , transferred pages: 55229 (dup: 47150, rd: 8079) , new dirty pages: 21703 , remaining dirty pages: 23302
> Iteration 6, duration: 211 ms , transferred pages: 20337 (dup: 18516, rd: 1821) , new dirty pages: 14500 , remaining dirty pages: 14500
> Iteration 7, duration: 31 ms , transferred pages: 12933 (dup: 12627, rd: 306) , new dirty pages: 1520 , remaining dirty pages: 1520
> Iteration 8, duration: 30 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 4 , remaining dirty pages: 1524
> total time: 27225 milliseconds
>
> 3. cpu2006.bzip2
>
> ----- original pre-copy: -----
> Iteration 1, duration: 18306 ms , transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0) , new dirty pages: 106299 , remaining dirty pages: 106299
> Iteration 2, duration: 10694 ms , transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525) , new dirty pages: 34394 , remaining dirty pages: 36082
> Iteration 3, duration: 2998 ms , transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264) , new dirty pages: 6419 , remaining dirty pages: 8059
> Iteration 4, duration: 699 ms , transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143) , new dirty pages: 1226 , remaining dirty pages: 3537
> Iteration 5, duration: 200 ms , transferred pages: 1636 (dup: 0, rd: 1194, fd: 442) , new dirty pages: 478 , remaining dirty pages: 2379
> Iteration 6, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2379
>
> ----- after optimization: -----
> Iteration 1, duration: 13995 ms , transferred pages: 266314 (dup: 152118, rd: 114196) , new dirty pages: 97009 , remaining dirty pages: 97145
> Iteration 2, duration: 1215 ms , transferred pages: 33400 (dup: 26745, rd: 6655) , new dirty pages: 12866 , remaining dirty pages: 14017
> Iteration 3, duration: 701 ms , transferred pages: 5774 (dup: 48, rd: 5726) , new dirty pages: 6342 , remaining dirty pages: 8761
> Iteration 4, duration: 500 ms , transferred pages: 4111 (dup: 21, rd: 4090) , new dirty pages: 4311 , remaining dirty pages: 6485
> Iteration 5, duration: 400 ms , transferred pages: 3273 (dup: 1, rd: 3272) , new dirty pages: 3034 , remaining dirty pages: 5431
> Iteration 6, duration: 301 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2094 , remaining dirty pages: 4472
> Iteration 7, duration: 299 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2066 , remaining dirty pages: 4082
> Iteration 8, duration: 202 ms , transferred pages: 1636 (dup: 0, rd: 1636) , new dirty pages: 2881 , remaining dirty pages: 4648
> Iteration 9, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 4775 , remaining dirty pages: 6778
> Iteration 10, duration: 400 ms , transferred pages: 3281 (dup: 9, rd: 3272) , new dirty pages: 3757 , remaining dirty pages: 5576
> Iteration 11, duration: 401 ms , transferred pages: 3279 (dup: 7, rd: 3272) , new dirty pages: 6980 , remaining dirty pages: 8906
> Iteration 12, duration: 500 ms , transferred pages: 7118 (dup: 3035, rd: 4083) , new dirty pages: 10774 , remaining dirty pages: 11922
> Iteration 13, duration: 116 ms , transferred pages: 11706 (dup: 10152, rd: 1554) , new dirty pages: 1326 , remaining dirty pages: 1326
> Iteration 14, duration: 117 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 1326
> total time: 19479 milliseconds
>
> 4. cpu2006.mcf
>
> ----- original pre-copy: -----
> Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0) , new dirty pages: 244403 , remaining dirty pages: 244403
> Iteration 2, duration: 29603 ms , transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897) , new dirty pages: 227335 , remaining dirty pages: 229463
> Iteration 3, duration: 27806 ms , transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723) , new dirty pages: 195593 , remaining dirty pages: 197483
> Iteration 4, duration: 23907 ms , transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664) , new dirty pages: 215066 , remaining dirty pages: 217006
> Iteration 5, duration: 26305 ms , transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052) , new dirty pages: 111098 , remaining dirty pages: 112815
> Iteration 6, duration: 13502 ms , transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637) , new dirty pages: 161054 , remaining dirty pages: 163417
> Iteration 7, duration: 19705 ms , transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328) , new dirty pages: 220562 , remaining dirty pages: 222713
> Iteration 8, duration: 27003 ms , transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139) , new dirty pages: 219787 , remaining dirty pages: 221619
> Iteration 9, duration: 26802 ms , transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576) , new dirty pages: 207959 , remaining dirty pages: 210330
> Iteration 10, duration: 25411 ms , transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930) , new dirty pages: 144442 , remaining dirty pages: 146856
> Iteration 11, duration: 17714 ms , transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372) , new dirty pages: 205127 , remaining dirty pages: 207179
> Iteration 12, duration: 25112 ms , transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121) , new dirty pages: 167319 , remaining dirty pages: 169052
> Iteration 13, duration: 20411 ms , transferred pages: 166886 (dup: 14, rd: 21960, fd: 144912) , new dirty pages: 221592 , remaining dirty pages: 223758
> Iteration 14, duration: 27126 ms , transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310) , new dirty pages: 233630 , remaining dirty pages: 235588
> Iteration 15, duration: 28517 ms , transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602) , new dirty pages: 224282 , remaining dirty pages: 226549
> Iteration 16, duration: 27422 ms , transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359) , new dirty pages: 209815 , remaining dirty pages: 212177
> Iteration 17, duration: 25723 ms , transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821) , new dirty pages: 220297 , remaining dirty pages: 222214
> Iteration 18, duration: 26920 ms , transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914) , new dirty pages: 192015 , remaining dirty pages: 194173
> Iteration 19, duration: 23520 ms , transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090) , new dirty pages: 96450 , remaining dirty pages: 98384
> Iteration 20, duration: 11805 ms , transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100) , new dirty pages: 6978 , remaining dirty pages: 8824
> Iteration 21, duration: 799 ms , transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742) , new dirty pages: 138 , remaining dirty pages: 2417
> Iteration 22, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2417
>
> ----- after optimization: -----
> Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6831, rd: 259619) , new dirty pages: 240209 , remaining dirty pages: 240209
> Iteration 2, duration: 6250 ms , transferred pages: 51244 (dup: 211, rd: 51033) , new dirty pages: 226651 , remaining dirty pages: 228571
> Iteration 3, duration: 4395 ms , transferred pages: 36008 (dup: 80, rd: 35928) , new dirty pages: 110719 , remaining dirty pages: 111478
> Iteration 4, duration: 3390 ms , transferred pages: 28068 (dup: 28, rd: 28040) , new dirty pages: 185172 , remaining dirty pages: 185172
> Iteration 5, duration: 2986 ms , transferred pages: 23780 (dup: 45, rd: 23735) , new dirty pages: 64357 , remaining dirty pages: 66305
> Iteration 6, duration: 2727 ms , transferred pages: 22800 (dup: 12, rd: 22788) , new dirty pages: 61675 , remaining dirty pages: 61675
> Iteration 7, duration: 2372 ms , transferred pages: 18943 (dup: 13, rd: 18930) , new dirty pages: 55144 , remaining dirty pages: 55265
> Iteration 8, duration: 2100 ms , transferred pages: 17189 (dup: 11, rd: 17178) , new dirty pages: 55244 , remaining dirty pages: 55668
> Iteration 9, duration: 2003 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 107058 , remaining dirty pages: 108014
> Iteration 10, duration: 2132 ms , transferred pages: 17825 (dup: 24, rd: 17801) , new dirty pages: 126214 , remaining dirty pages: 126214
> Iteration 11, duration: 2229 ms , transferred pages: 18156 (dup: 22, rd: 18134) , new dirty pages: 65725 , remaining dirty pages: 65725
> Iteration 12, duration: 2315 ms , transferred pages: 18651 (dup: 21, rd: 18630) , new dirty pages: 52575 , remaining dirty pages: 53903
> Iteration 13, duration: 2147 ms , transferred pages: 17435 (dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260
> Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266
> Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792
> Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639
> Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170
> Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 103547
> Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521
> Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054
> Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624
> Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265
> Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813
> Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608
> Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311
> Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294
> Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936
> Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423
> Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292
> Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457
> Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304
> Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815
> Iteration 33, duration: 747 ms , transferred pages: 6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670
> Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422
> Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222
> Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337
> Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251
> Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251
> total time: 97919 milliseconds
>
> ------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)-------------------
>
> 5. idle
>
> Iteration 1, duration: 14702 ms , transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0) , new dirty pages: 14595 , remaining dirty pages: 14595
> Iteration 2, duration: 1592 ms , transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029) , new dirty pages: 218 , remaining dirty pages: 2401
> Iteration 3, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2401
>
> 6. kernel compilation (cannot converge)
>
> Iteration 1, duration: 20607 ms , transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0) , new dirty pages: 19293 , remaining dirty pages: 19293
> Iteration 2, duration: 2092 ms , transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954) , new dirty pages: 8318 , remaining dirty pages: 10435
> Iteration 3, duration: 1000 ms , transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924) , new dirty pages: 8736 , remaining dirty pages: 10687
> Iteration 4, duration: 1000 ms , transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091) , new dirty pages: 7627 , remaining dirty pages: 9879
> Iteration 5, duration: 900 ms , transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760) , new dirty pages: 7287 , remaining dirty pages: 9613
> Iteration 6, duration: 900 ms , transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601) , new dirty pages: 8958 , remaining dirty pages: 10951
> Iteration 7, duration: 1099 ms , transferred pages: 9309 (dup: 311, rd: 8051, fd: 947) , new dirty pages: 7189 , remaining dirty pages: 8831
> Iteration 8, duration: 800 ms , transferred pages: 6832 (dup: 288, rd: 5717, fd: 827) , new dirty pages: 5782 , remaining dirty pages: 7781
> Iteration 9, duration: 701 ms , transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721) , new dirty pages: 4587 , remaining dirty pages: 6493
> Iteration 10, duration: 500 ms , transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033) , new dirty pages: 7352 , remaining dirty pages: 9611
> Iteration 11, duration: 900 ms , transferred pages: 7759 (dup: 397, rd: 6563, fd: 799) , new dirty pages: 6686 , remaining dirty pages: 8538
> Iteration 12, duration: 800 ms , transferred pages: 6808 (dup: 264, rd: 6017, fd: 527) , new dirty pages: 6871 , remaining dirty pages: 8601
> Iteration 13, duration: 800 ms , transferred pages: 6775 (dup: 231, rd: 5722, fd: 822) , new dirty pages: 7540 , remaining dirty pages: 9366
> Iteration 14, duration: 900 ms , transferred pages: 7507 (dup: 145, rd: 5900, fd: 1462) , new dirty pages: 7581 , remaining dirty pages: 9440
> Iteration 15, duration: 900 ms , transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151) , new dirty pages: 7268 , remaining dirty pages: 9078
> Iteration 16, duration: 800 ms , transferred pages: 6759 (dup: 215, rd: 5763, fd: 781) , new dirty pages: 6861 , remaining dirty pages: 9180
> Iteration 17, duration: 800 ms , transferred pages: 6838 (dup: 294, rd: 6037, fd: 507) , new dirty pages: 6196 , remaining dirty pages: 8538
> Iteration 18, duration: 800 ms , transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639) , new dirty pages: 5947 , remaining dirty pages: 7633
> Iteration 19, duration: 700 ms , transferred pages: 5919 (dup: 193, rd: 4853, fd: 873) , new dirty pages: 5861 , remaining dirty pages: 7575
> Iteration 20, duration: 600 ms , transferred pages: 5284 (dup: 376, rd: 4408, fd: 500) , new dirty pages: 5206 , remaining dirty pages: 7497
> Iteration 21, duration: 600 ms , transferred pages: 5147 (dup: 239, rd: 4308, fd: 600) , new dirty pages: 5031 , remaining dirty pages: 7381
> Iteration 22, duration: 599 ms , transferred pages: 5064 (dup: 156, rd: 4026, fd: 882) , new dirty pages: 5601 , remaining dirty pages: 7918
> Iteration 23, duration: 702 ms , transferred pages: 5965 (dup: 239, rd: 5028, fd: 698) , new dirty pages: 6079 , remaining dirty pages: 8032
> Iteration 24, duration: 700 ms , transferred pages: 6175 (dup: 449, rd: 5146, fd: 580) , new dirty pages: 10932 , remaining dirty pages: 12789
> Iteration 25, duration: 1300 ms , transferred pages: 10936 (dup: 302, rd: 6205, fd: 4429) , new dirty pages: 8713 , remaining dirty pages: 10566
> Iteration 26, duration: 1000 ms , transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518) , new dirty pages: 5119 , remaining dirty pages: 7403
> Iteration 27, duration: 600 ms , transferred pages: 5007 (dup: 99, rd: 4099, fd: 809) , new dirty pages: 2226 , remaining dirty pages: 4622
> Iteration 28, duration: 300 ms , transferred pages: 2491 (dup: 37, rd: 1794, fd: 660) , new dirty pages: 6746 , remaining dirty pages: 8877
> Iteration 29, duration: 800 ms , transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012) , new dirty pages: 6070 , remaining dirty pages: 8190
> Iteration 30, duration: 700 ms , transferred pages: 6052 (dup: 326, rd: 5107, fd: 619) , new dirty pages: 5177 , remaining dirty pages: 7315
>
> 7. web server
>
> Iteration 1, duration: 20902 ms , transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0) , new dirty pages: 8528 , remaining dirty pages: 8528
> Iteration 2, duration: 796 ms , transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456) , new dirty pages: 650 , remaining dirty pages: 2706
> Iteration 3, duration: 100 ms , transferred pages: 818 (dup: 0, rd: 383, fd: 435) , new dirty pages: 328 , remaining dirty pages: 2216
> Iteration 4, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2216
>
> 8. cpu2006.bwaves (cannot converge)
>
> Iteration 1, duration: 31715 ms , transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0) , new dirty pages: 242702 , remaining dirty pages: 242702
> Iteration 2, duration: 29397 ms , transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515) , new dirty pages: 230889 , remaining dirty pages: 233083
> Iteration 3, duration: 28205 ms , transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080) , new dirty pages: 226998 , remaining dirty pages: 229223
> Iteration 4, duration: 27805 ms , transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359) , new dirty pages: 227360 , remaining dirty pages: 229009
> Iteration 5, duration: 27703 ms , transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456) , new dirty pages: 225885 , remaining dirty pages: 228108
> Iteration 6, duration: 27600 ms , transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265) , new dirty pages: 223555 , remaining dirty pages: 225740
> Iteration 7, duration: 27309 ms , transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673) , new dirty pages: 231975 , remaining dirty pages: 234141
> Iteration 8, duration: 28403 ms , transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226) , new dirty pages: 222170 , remaining dirty pages: 223914
> Iteration 9, duration: 27105 ms , transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690) , new dirty pages: 230065 , remaining dirty pages: 232170
> Iteration 10, duration: 28104 ms , transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327) , new dirty pages: 227590 , remaining dirty pages: 229559
> Iteration 11, duration: 27801 ms , transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996) , new dirty pages: 228457 , remaining dirty pages: 230299
> Iteration 12, duration: 27916 ms , transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562) , new dirty pages: 238326 , remaining dirty pages: 240065
>
> 9. cpu2006.lbm (cannot converge)
>
> Iteration 1, duration: 31012 ms , transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0) , new dirty pages: 108960 , remaining dirty pages: 108960
> Iteration 2, duration: 13095 ms , transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474) , new dirty pages: 129292 , remaining dirty pages: 131730
> Iteration 3, duration: 15802 ms , transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384) , new dirty pages: 116682 , remaining dirty pages: 118724
> Iteration 4, duration: 14204 ms , transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205) , new dirty pages: 107246 , remaining dirty pages: 109654
> Iteration 5, duration: 13208 ms , transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142) , new dirty pages: 105371 , remaining dirty pages: 107048
> Iteration 6, duration: 12804 ms , transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075) , new dirty pages: 103841 , remaining dirty pages: 106184
> Iteration 7, duration: 12709 ms , transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674) , new dirty pages: 106692 , remaining dirty pages: 108985
> Iteration 8, duration: 13105 ms , transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033) , new dirty pages: 103132 , remaining dirty pages: 104948
> Iteration 9, duration: 12607 ms , transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608) , new dirty pages: 102511 , remaining dirty pages: 104391
> Iteration 10, duration: 12514 ms , transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156) , new dirty pages: 102888 , remaining dirty pages: 105029
>
> 10. cpu2006.astar (cannot converge)
>
> Iteration 1, duration: 28402 ms , transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0) , new dirty pages: 62078 , remaining dirty pages: 62078
> Iteration 2, duration: 7393 ms , transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375) , new dirty pages: 48854 , remaining dirty pages: 50825
> Iteration 3, duration: 6001 ms , transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540) , new dirty pages: 48137 , remaining dirty pages: 49868
> Iteration 4, duration: 5800 ms , transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055) , new dirty pages: 49147 , remaining dirty pages: 51571
> Iteration 5, duration: 6102 ms , transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682) , new dirty pages: 55606 , remaining dirty pages: 57265
> Iteration 6, duration: 6699 ms , transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061) , new dirty pages: 9166 , remaining dirty pages: 11482
> Iteration 7, duration: 1200 ms , transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805) , new dirty pages: 8294 , remaining dirty pages: 9946
> Iteration 8, duration: 1000 ms , transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002) , new dirty pages: 5475 , remaining dirty pages: 7227
> Iteration 9, duration: 600 ms , transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438) , new dirty pages: 4175 , remaining dirty pages: 6494
> Iteration 10, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3856, fd: 234) , new dirty pages: 4095 , remaining dirty pages: 6499
> Iteration 11, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3313, fd: 777) , new dirty pages: 3371 , remaining dirty pages: 5780
> Iteration 12, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3823, fd: 267) , new dirty pages: 7518 , remaining dirty pages: 9208
> Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763
> Iteration 14, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 4346 , remaining dirty pages: 6019
> Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983
> Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585
> Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598
> Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775
> Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964
> Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703
>
> 11. cpu2006.xalancbmk (cannot converge)
>
> Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169
> Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771
> Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869
> Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589
> Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181
> Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022
> Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new dirty pages: 64880 , remaining dirty pages: 67179
> Iteration 8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586
> Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320
> Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286
>
> 12. cpu2006.milc (cannot converge)
>
> Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860
> Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050
> Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723
> Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513
> Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008
> Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416
> Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577
> Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422
> Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720
> Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747
>
> 13. cpu2006.cactusADM (cannot converge)
>
> Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869
> Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235
> Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273
> Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416
> Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425
> Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664
> Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952
> Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553
> Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273
> Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928
>
> 14. cpu2006.GemsFDTD (cannot converge)
>
> Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802
> Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064
> Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783
> Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063
> Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576
> Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618
> Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672
> Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147
>
> 15. cpu2006.wrf (can not converge)
>
> Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392
> Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222
> Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892
> Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946
> Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760
> Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119
> Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844
> Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996
> Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699
> Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905
>
>
>
> >
> > > So I think "booting" and "kernel compilation" should benefit a lot from this
> > > improvement. The reason "kernel compilation" would benefit is that some
> > > iterations take around 600 ms; if they were halved to 300 ms, the precopy
> > > might get the chance to step into the stop-and-copy phase.
> > >
> > > On the other hand, "idle" and "web server" would not benefit much, because
> > > most of the time is spent on the 1st iteration and little on the others.
> > >
> > > As for "zeusmp" and "memcached", although the time spent on iterations other
> > > than the 1st may be halved, they still could not converge to stop-and-copy
> > > with the 300 ms downtime.
> > >
> > > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> > >
> > > 1. booting : begin to migrate when the VM is booting
> > >
> > > Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
> > > Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
> > > Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
> > > Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
> > > Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
> > > (note: When the workload does converge, the output of the last iteration is "fake"; it just indicates that precopy steps into the stop-and-copy phase at that point.
> > > "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> > >
> > > 2. idle
> > >
> > > Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
> > > Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
> > > Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
> > > Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
> > >
> > > 3. kernel compilation (can not converge)
> > >
> > > Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
> > > Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
> > > Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
> > > Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
> > > Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
> > > Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
> > > Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
> > > Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
> > > Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
> > > Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
> > > Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
> > > Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
> > > Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
> > > Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
> > > Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
> > >
> > > 4. cpu2006.zeusmp (can not converge)
> > >
> > > Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
> > > Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
> > > Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
> > > Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
> > > Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
> > > Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
> > > Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
> > > Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
> > > Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
> > > Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
> > > Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
> > > Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
> > > Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
> > > Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
> > > Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
> > >
> > > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> > >
> > > Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
> > > Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
> > > Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
> > > Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
> > > Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
> > > Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
> > >
> > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > >
> > > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> > >
> > > Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
> > > Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
> > > Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
> > > Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
> > > Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
> > > Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
> > > Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
> > > Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
> > > Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
> > > Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
> > > Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
> > > Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
> > > Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
> > > Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
> > > Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
> > > Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
> > > Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
> > > Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
> > > Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
> > > Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
> > >
> > >
> > > --
> > > Chunguang Li, Ph.D. Candidate
> > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > Huazhong University of Science & Technology (HUST)
> > > Wuhan, Hubei Prov., China
> > >
> > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-08 11:05 ` Dr. David Alan Gilbert
@ 2016-11-08 13:40 ` Chunguang Li
0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08 13:40 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela
> -----Original Messages-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent Time: Tuesday, November 8, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> >
> >
> >
> > > -----Original Messages-----
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Sent Time: Friday, October 14, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > >
> > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > > Sent Time: Friday, September 30, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > >
> > > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > Sent Time: Monday, September 26, 2016
> > > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > > >
> > > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > > Hi all!
> > > > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code and figured out that, every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending some NON-dirty pages.
> > > > > > > >
> > > > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that haven't been sent yet are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent during that iteration.
> > > > > > > >
> > > > > > > >
> > > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > > >
> > > > > > > I think you're right that this can happen; to clarify I think the
> > > > > > > case you're talking about is:
> > > > > > >
> > > > > > > Iteration 1
> > > > > > > sync bitmap
> > > > > > > start sending pages
> > > > > > > page 'n' is modified - but hasn't been sent yet
> > > > > > > page 'n' gets sent
> > > > > > > Iteration 2
> > > > > > > sync bitmap
> > > > > > > 'page n is shown as modified'
> > > > > > > send page 'n' again
> > > > > > >
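The timeline above can be sketched as a toy C model (hypothetical names, one byte standing in for a whole page; this is not QEMU code). It shows why a page written *before* its first send is still flagged at the next bitmap sync:

```c
#include <stdint.h>
#include <string.h>

/* Toy model of the race: the dirty log is synced *before* pages are
 * sent, so a write that lands before the send still flags the page for
 * the next iteration even though the copy that was sent is up to date. */
enum { NPAGES = 4 };

typedef struct {
    uint8_t mem[NPAGES];        /* one byte stands in for a whole page */
    uint8_t kvm_dirty[NPAGES];  /* kernel-side dirty log               */
} Vm;

/* A guest store: the kernel marks the page dirty. */
void guest_write(Vm *vm, int page, uint8_t val)
{
    vm->mem[page] = val;
    vm->kvm_dirty[page] = 1;
}

/* Fetch-and-clear the kernel log, as ioctl(KVM_GET_DIRTY_LOG) does. */
void sync_bitmap(Vm *vm, uint8_t *migr_dirty)
{
    memcpy(migr_dirty, vm->kvm_dirty, NPAGES);
    memset(vm->kvm_dirty, 0, NPAGES);
}
```

After a sync, writing page n and then sending it leaves page n flagged at the next sync, so it gets resent even though the destination copy is already current.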
> > > > > >
> > > > > > Yes,this is right the case I am talking about.
> > > > > >
> > > > > > > So you're right that this is wasteful; I guess it's more wasteful
> > > > > > > on big VMs with slow networks, where the length of each iteration
> > > > > > > is large.
> > > > > >
> > > > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time for Iteration 2 is double what it would take to send only the really dirty pages.
> > > > >
> > > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > > workloads get impacted the most? That would also help us to figure
> > > > > out what kinds of speed improvements we can expect.
> > > > >
> > > > >
> > > > > Amit
> > > >
> > > > I have picked 6 workloads and collected the following statistics
> > > > for every iteration (except the last stop-and-copy one) during precopy.
> > > > These numbers were obtained with basic precopy migration, without
> > > > capabilities like xbzrle or compression, etc. The network for the
> > > > migration is exclusive, with a separate network for the workloads;
> > > > both are gigabit ethernet. I use qemu-2.5.1.
> > > >
> > > > Three of them (booting, idle, web server) converged to the stop-and-copy
> > > > phase with the given bandwidth and default downtime (300 ms), while the
> > > > other three (kernel compilation, zeusmp, memcached) did not.
> > > >
> > > > A page is "not really dirty" if it is written first and sent later
> > > > (and not written again after that) during one iteration. I guess this
> > > > happens less often during the other iterations than during the 1st
> > > > iteration, because all the pages of the VM are sent to the dest node
> > > > during the 1st iteration, while during the others only part of the
> > > > pages are sent. So I think the "not-really-dirty" pages are produced
> > > > mainly during the 1st iteration, and maybe very few during the others.
> > > >
> > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I
> > > > think the time spent on Iteration 2 would be halved. This is a chain reaction,
> > > > because the dirty pages produced during Iteration 2 are then halved, which
> > > > halves the time spent on Iteration 3, then Iteration 4, 5...
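The chain reaction can be illustrated with a back-of-the-envelope model (all numbers assumed, not measured): each iteration sends its dirty set at `rate` pages/ms while the guest dirties `wps` pages/ms, and a fraction `f` of those turn out to be false dirty and could be skipped.

```c
/* Dirty pages left for the next iteration under the assumed model:
 * sending 'pages' at 'rate' pages/ms takes pages/rate ms, during which
 * the guest dirties 'wps' pages/ms; a fraction 'f' of them are false
 * dirty and would be skipped by the proposed optimization. */
double next_dirty_pages(double pages, double rate, double wps, double f)
{
    double duration_ms = pages / rate;
    return wps * duration_ms * (1.0 - f);
}
```

With f = 0.5, each iteration's dirty set (and hence its duration) is half of what plain precopy would resend, and the halving compounds across iterations.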
> > >
> > > Yes; these numbers don't show how many of them are false dirty though.
> > >
> > > One problem is thinking about pages that have been redirtied, if the page is dirtied
> > > after the sync but before the network write then it's the false-dirty that
> > > you're describing.
> > >
> > > However, if the page is being written a few times, and so it would have been written
> > > after the network write then it isn't a false-dirty.
> > >
> > > You might be able to figure that out with some kernel tracing of when the dirtying
> > > happens, but it might be easier to write the fix!
> > >
> > > Dave
> >
> > Hi, I have made some new progress now.
> >
> > To count exactly how many false dirty pages there are in each iteration, I
> > malloc a buffer as big as the whole VM memory. When a page is
> > transferred to the dest node, it is copied into the buffer; during the next
> > iteration, if a page is to be transferred, it is compared to the old copy in
> > the buffer, and the old copy is replaced for the next comparison if the page
> > is really dirty. Thus we can now get the exact number of false dirty pages.
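A minimal sketch of that accounting step (hypothetical names; `shadow` is the malloc'd buffer described above, one slot per guest page):

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* 'shadow' holds a copy of every page as it was last transferred.
 * Returns 1 if 'page' is a false dirty page: marked dirty by the log,
 * but byte-identical to what was already sent.  A really dirty page
 * refreshes its shadow copy for the next comparison. */
int is_false_dirty(uint8_t *shadow, const uint8_t *page, size_t page_index)
{
    uint8_t *old = shadow + page_index * PAGE_SIZE;
    if (memcmp(old, page, PAGE_SIZE) == 0)
        return 1;                    /* false dirty */
    memcpy(old, page, PAGE_SIZE);    /* really dirty: remember new content */
    return 0;
}
```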
> >
> > This time, I use 15 workloads to get the statistic number. They are:
> >
> > 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific
> > computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked
> > these 11 benchmarks because, compared to the others, they have larger memory
> > footprints and higher memory dirty rates. Thus most of them could not converge
> > to stop-and-copy at the default migration speed (32MB/s).
> > 2. kernel compilation
> > 3. idle VM
> > 4. Apache web server which serves static content
> >
> > (the above workloads are all running in VM with 1 vcpu and 1GB memory, and the
> > migration speed is the default 32MB/s)
> >
> > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache.
> > After filling up the 4GB cache, a client writes the cache at a constant speed
> > during migration. This time, migration speed has no limit, and is up to the
> > capability of 1Gbps Ethernet.
> >
> > Summarize the results first: (and you can read the precise number below)
> >
> > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations)
> > of false dirty pages out of all the dirty pages from iteration 2 onward (and the big
> > proportion persists during the following iterations). They are cpu2006.zeusmp,
> > cpu2006.bzip2, cpu2006.mcf, and memcached.
> > 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even
> > though the proportion of false dirty pages is big from iteration 2 onward, the room
> > for optimization is small.
> > 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not
> > in the other iterations.
> > 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false
> > dirty pages from iteration 2 onward, so the room for optimization is also small.
> >
> > Now I want to say a little more about why false dirty pages are produced.
> > The first reason is what we discussed before---the mechanism used to track
> > dirty pages.
> > Then I came up with another reason. Here is the situation: a write operation to a
> > memory page happens, but it doesn't change any content of the page. So it is "written but
> > not dirty", yet the kernel still marks it as dirty. A colleague in our lab has done some
> > experiments to figure out the proportion of "written but not dirty" operations, using the
> > cpu2006 benchmark suite. According to his results, most workloads have a small proportion (<10%)
> > of "written but not dirty" out of all write operations, while a few workloads have a higher
> > proportion (one even as high as 50%). We are not yet sure why "written but not dirty"
> > happens; it just does.
>
> I can think of a few different reasons:
> a) You have a flag or mutex that's set and cleared; so it gets set (marked
> dirty) and cleared around some operation. By the time we come to migrate
> it, it's back to cleared again.
> Similarly with other temporary data structures.
> b) Some system operation causes the page to be moved - e.g. swap or the kernel
> reorganising memory.
Sorry, I don't quite understand reason (b). Take swap as an example: do you mean a page
is swapped out and then swapped back in to the same address, so its content remains unchanged?
>
> However, it's a shame that I don't think you can tell in your experiment which of the
> two cases we're hitting. I'd like to know whether it's worth working on
> making the page sync mechanism better or whether it's more important to deal
> with the second reason you show.
Yes, you are right, it's hard to tell which case we're hitting (including the cases you
thought of). However, since I use the SHA1 method, I don't have to tell them apart,
because it handles all the cases we have thought of.
>
> > So these two reasons contribute to the false dirty pages. To optimize, I compute and store
> > the SHA1 hash before transferring each page. Next time, if a page needs retransmission, its
> > SHA1 hash is computed again and compared to the old hash. If the hashes are the same, it's a
> > false dirty page, and we just skip it; otherwise, the page is transferred, and the new
> > hash replaces the old one for the next comparison.
> > The reason to use a SHA1 hash rather than byte-by-byte comparison is memory overhead. One SHA1
> > hash is 20 bytes, so we need an extra 20/4096 (<1/200) of the whole VM memory, which
> > is relatively small.
> > As far as I know, SHA1 hashes are widely used for deduplication in backup systems.
> > It has been shown that the probability of a hash collision is far smaller than that of a disk
> > hardware fault, so it is treated as a secure hash: if the hashes of two chunks are the same,
> > the content must be the same. So I think the SHA1 hash could replace byte-by-byte comparison
> > in the VM memory scenario as well.
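The per-page skip logic might look roughly like this (a sketch only: FNV-1a stands in for SHA-1 so the example has no external dependency, and the names `page_hash`/`should_send` are made up; the real scheme stores a 20-byte SHA-1 digest per page):

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Placeholder content hash (FNV-1a); the described method uses SHA-1. */
uint64_t page_hash(const uint8_t *page)
{
    uint64_t h = 1469598103934665603ULL;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* 'stored' keeps one hash per guest page.  A page whose hash matches
 * the stored one is a false dirty page and is skipped; otherwise the
 * hash is updated and the page is sent. */
int should_send(uint64_t *stored, size_t page_index, const uint8_t *page)
{
    uint64_t h = page_hash(page);
    if (stored[page_index] == h)
        return 0;                /* false dirty: skip retransmission */
    stored[page_index] = h;      /* really dirty: send and remember */
    return 1;
}
```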
>
> There was a proposal ( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html )
> for a migration system where
> a copy of the migration RAM is stored on disk on the destination for cases where similar VMs
> are migrated; it used a checksum for each page to find the matching page
> in the cache. That originally used a smaller hash; I think in the end they used SHA-256.
> (Hash-based checks still make me nervous about intentional collisions, but that's probably
> me being paranoid?)
Hmm... I don't know whether most people would accept hash-based checks.
Maybe it needs some more mathematical proof, like what has been done in the
field of deduplication for backup systems.
>
> > Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads with
> > big proportions of false dirty pages, the improvement is remarkable. Without optimization,
> > they either could not converge to stop-and-copy, or took a very long time to complete. With the
> > SHA1 hash method, all of them now complete in a relatively short time.
> > For the reasons discussed above, the other workloads don't get notable improvements from the
> > optimization. So below, I only show the exact numbers after optimization for the 4 workloads with
> > remarkable improvements.
> >
> > Any comments or suggestions?
>
> You might be able to save some of the CPU time; we've
> got a test that checks if a page is all-zero; if you're doing
> the SHA calculation you could avoid doing the all-zero check
> and replace it by comparing the output of the SHA.
Yes, that is one way. However, right now I do the opposite. I first
calculate the SHA1 of the all-zero page and remember it. Then, whenever
I recognize an all-zero page via the check, I just store the
SHA1 I computed earlier, avoiding recalculating the SHA1 of the all-zero
page. I think this is better, because the current implementation
of the all-zero check is faster than calculating a SHA1.
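That zero-page shortcut could be sketched as follows (hypothetical names throughout: FNV-1a stands in for SHA-1, and `is_zero_page` plays the role of QEMU's fast all-zero check):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Placeholder content hash (FNV-1a); the described method uses SHA-1. */
uint64_t page_hash(const uint8_t *page)
{
    uint64_t h = 1469598103934665603ULL;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Stand-in for QEMU's fast all-zero page check. */
int is_zero_page(const uint8_t *page)
{
    static const uint8_t zero[PAGE_SIZE];
    return memcmp(page, zero, PAGE_SIZE) == 0;
}

/* The zero page's digest is computed once up front; every all-zero page
 * reuses it, so the (cheaper) zero check spares a full hash computation. */
uint64_t record_hash(const uint8_t *page, uint64_t zero_hash)
{
    return is_zero_page(page) ? zero_hash : page_hash(page);
}
```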
Thanks,
Chunguang
>
> >
> > Below is the experiments data:
> > (
> > "dup" means zero pages; this kind of page takes very little migration time and network
> > resources, so they are never regarded as dirty pages in my numbers;
> > "rd" means really dirty pages;
> > "fd" means false dirty pages;
> > The numbers refer to quantities of pages.
> > )
> >
> > ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
> >
> > 1. memcached
> >
> > ----- original pre-copy (can not converge): -----
> > Iteration 1, duration: 42111 ms , transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0) , new dirty pages: 499015 , remaining dirty pages: 507397
> > Iteration 2, duration: 17208 ms , transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284) , new dirty pages: 261237 , remaining dirty pages: 269688
> > Iteration 3, duration: 9134 ms , transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958) , new dirty pages: 170281 , remaining dirty pages: 177592
> > Iteration 4, duration: 5920 ms , transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392) , new dirty pages: 121154 , remaining dirty pages: 128780
> > Iteration 5, duration: 4239 ms , transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350) , new dirty pages: 100976 , remaining dirty pages: 108205
> > Iteration 6, duration: 3495 ms , transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242) , new dirty pages: 74547 , remaining dirty pages: 82399
> > Iteration 7, duration: 2583 ms , transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088) , new dirty pages: 58209 , remaining dirty pages: 66448
> > Iteration 8, duration: 2039 ms , transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568) , new dirty pages: 43511 , remaining dirty pages: 51425
> > Iteration 9, duration: 1513 ms , transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788) , new dirty pages: 43722 , remaining dirty pages: 51663
> > Iteration 10, duration: 1521 ms , transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151) , new dirty pages: 35347 , remaining dirty pages: 43334
> > Iteration 11, duration: 1230 ms , transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994) , new dirty pages: 28851 , remaining dirty pages: 36898
> > Iteration 12, duration: 1031 ms , transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426) , new dirty pages: 27062 , remaining dirty pages: 34309
> > Iteration 13, duration: 917 ms , transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180) , new dirty pages: 22723 , remaining dirty pages: 30647
> > Iteration 14, duration: 762 ms , transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492) , new dirty pages: 18208 , remaining dirty pages: 26953
> > Iteration 15, duration: 650 ms , transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693) , new dirty pages: 16085 , remaining dirty pages: 24402
> > Iteration 16, duration: 554 ms , transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363) , new dirty pages: 14766 , remaining dirty pages: 23222
> > Iteration 17, duration: 538 ms , transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655) , new dirty pages: 13381 , remaining dirty pages: 21169
> > Iteration 18, duration: 487 ms , transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271) , new dirty pages: 13325 , remaining dirty pages: 20405
> > Iteration 19, duration: 428 ms , transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744) , new dirty pages: 10274 , remaining dirty pages: 18447
> > Iteration 20, duration: 377 ms , transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469) , new dirty pages: 9708 , remaining dirty pages: 17268
> > Iteration 21, duration: 320 ms , transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433) , new dirty pages: 8015 , remaining dirty pages: 16061
> > Iteration 22, duration: 268 ms , transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417) , new dirty pages: 7227 , remaining dirty pages: 15667
> > Iteration 23, duration: 269 ms , transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347) , new dirty pages: 7591 , remaining dirty pages: 15445
> > Iteration 24, duration: 271 ms , transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184) , new dirty pages: 15126 , remaining dirty pages: 22822
> > Iteration 25, duration: 549 ms , transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213) , new dirty pages: 14559 , remaining dirty pages: 21563
> > Iteration 26, duration: 499 ms , transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518) , new dirty pages: 11947 , remaining dirty pages: 19229
> > Iteration 27, duration: 376 ms , transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248) , new dirty pages: 8561 , remaining dirty pages: 16967
> > Iteration 28, duration: 324 ms , transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027) , new dirty pages: 8655 , remaining dirty pages: 16272
> > Iteration 29, duration: 274 ms , transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725) , new dirty pages: 6300 , remaining dirty pages: 14759
> > Iteration 30, duration: 218 ms , transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099) , new dirty pages: 5143 , remaining dirty pages: 13562
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 40664 ms , transferred pages: 1569037 (dup: 405940, rd: 1163097) , new dirty pages: 506846 , remaining dirty pages: 514979
> > Iteration 2, duration: 8032 ms , transferred pages: 161130 (dup: 4007, rd: 157123) , new dirty pages: 153479 , remaining dirty pages: 153479
>
> Big difference.
>
> > Iteration 3, duration: 2620 ms , transferred pages: 65260 (dup: 20, rd: 65240) , new dirty pages: 64014 , remaining dirty pages: 67100
> > Iteration 4, duration: 1160 ms , transferred pages: 30227 (dup: 60, rd: 30167) , new dirty pages: 34031 , remaining dirty pages: 41414
> > Iteration 5, duration: 648 ms , transferred pages: 18700 (dup: 56, rd: 18644) , new dirty pages: 18375 , remaining dirty pages: 25536
> > Iteration 6, duration: 389 ms , transferred pages: 11399 (dup: 55, rd: 11344) , new dirty pages: 12536 , remaining dirty pages: 17516
> > Iteration 7, duration: 292 ms , transferred pages: 8197 (dup: 0, rd: 8197) , new dirty pages: 8387 , remaining dirty pages: 16802
> > Iteration 8, duration: 171 ms , transferred pages: 4931 (dup: 39, rd: 4892) , new dirty pages: 6182 , remaining dirty pages: 14060
> > Iteration 9, duration: 163 ms , transferred pages: 4355 (dup: 16, rd: 4339) , new dirty pages: 5530 , remaining dirty pages: 11973
> > Iteration 10, duration: 104 ms , transferred pages: 3266 (dup: 0, rd: 3266) , new dirty pages: 2893 , remaining dirty pages: 11014
> > Iteration 11, duration: 52 ms , transferred pages: 1153 (dup: 0, rd: 1153) , new dirty pages: 1586 , remaining dirty pages: 10516
> > Iteration 12, duration: 52 ms , transferred pages: 1921 (dup: 39, rd: 1882) , new dirty pages: 1619 , remaining dirty pages: 8842
> > Iteration 13, duration: 62 ms , transferred pages: 1537 (dup: 0, rd: 1537) , new dirty pages: 2052 , remaining dirty pages: 8871
> > Iteration 14, duration: 58 ms , transferred pages: 1665 (dup: 0, rd: 1665) , new dirty pages: 1947 , remaining dirty pages: 7989
> > Iteration 15, duration: 2 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 7989
> > total time: 54693 milliseconds
>
> Very nice.
>
> Dave
>
> > 2. cpu2006.zeusmp
> >
> > ----- original pre-copy (cannot converge): -----
> > Iteration 1, duration: 21112 ms , transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0) , new dirty pages: 127866 , remaining dirty pages: 127866
> > Iteration 2, duration: 6192 ms , transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511) , new dirty pages: 131655 , remaining dirty pages: 133859
> > Iteration 3, duration: 6699 ms , transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319) , new dirty pages: 121027 , remaining dirty pages: 122949
> > Iteration 4, duration: 5999 ms , transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688) , new dirty pages: 122759 , remaining dirty pages: 123196
> > Iteration 5, duration: 5804 ms , transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265) , new dirty pages: 123697 , remaining dirty pages: 124176
> > Iteration 6, duration: 5698 ms , transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437) , new dirty pages: 121838 , remaining dirty pages: 122306
> > Iteration 7, duration: 5515 ms , transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760) , new dirty pages: 122382 , remaining dirty pages: 122382
> > Iteration 8, duration: 6086 ms , transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004) , new dirty pages: 121587 , remaining dirty pages: 123144
> > Iteration 9, duration: 5899 ms , transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913) , new dirty pages: 120391 , remaining dirty pages: 122571
> > Iteration 10, duration: 5801 ms , transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563) , new dirty pages: 122302 , remaining dirty pages: 123448
> > Iteration 11, duration: 5909 ms , transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035) , new dirty pages: 123258 , remaining dirty pages: 123258
> > Iteration 12, duration: 6293 ms , transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641) , new dirty pages: 123623 , remaining dirty pages: 125670
> > Iteration 13, duration: 6398 ms , transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062) , new dirty pages: 122355 , remaining dirty pages: 123128
> > Iteration 14, duration: 6301 ms , transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909) , new dirty pages: 120980 , remaining dirty pages: 122215
> > Iteration 15, duration: 6304 ms , transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653) , new dirty pages: 121628 , remaining dirty pages: 122454
> > Iteration 16, duration: 6398 ms , transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826) , new dirty pages: 122246 , remaining dirty pages: 122536
> > Iteration 17, duration: 6201 ms , transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649) , new dirty pages: 121460 , remaining dirty pages: 122448
> > Iteration 18, duration: 6401 ms , transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939) , new dirty pages: 123518 , remaining dirty pages: 123694
> > Iteration 19, duration: 7003 ms , transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794) , new dirty pages: 120568 , remaining dirty pages: 122389
> > Iteration 20, duration: 6400 ms , transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901) , new dirty pages: 121229 , remaining dirty pages: 122196
> > Iteration 21, duration: 6703 ms , transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786) , new dirty pages: 123284 , remaining dirty pages: 125585
> > Iteration 22, duration: 6902 ms , transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282) , new dirty pages: 122057 , remaining dirty pages: 123758
> > Iteration 23, duration: 6800 ms , transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837) , new dirty pages: 120916 , remaining dirty pages: 122664
> > Iteration 24, duration: 7202 ms , transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658) , new dirty pages: 122715 , remaining dirty pages: 123428
> > Iteration 25, duration: 7202 ms , transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695) , new dirty pages: 123180 , remaining dirty pages: 123689
> > Iteration 26, duration: 7404 ms , transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939) , new dirty pages: 122149 , remaining dirty pages: 122746
> > Iteration 27, duration: 7205 ms , transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419) , new dirty pages: 120299 , remaining dirty pages: 122618
> > Iteration 28, duration: 7100 ms , transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541) , new dirty pages: 122984 , remaining dirty pages: 124528
> > Iteration 29, duration: 7904 ms , transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865) , new dirty pages: 127080 , remaining dirty pages: 127548
> > Iteration 30, duration: 7906 ms , transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073) , new dirty pages: 125028 , remaining dirty pages: 125058
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 21601 ms , transferred pages: 266450 (dup: 89731, rd: 176719) , new dirty pages: 139843 , remaining dirty pages: 139843
> > Iteration 2, duration: 1747 ms , transferred pages: 92077 (dup: 78364, rd: 13713) , new dirty pages: 90945 , remaining dirty pages: 90945
> > Iteration 3, duration: 1592 ms , transferred pages: 62253 (dup: 49435, rd: 12818) , new dirty pages: 76929 , remaining dirty pages: 76929
> > Iteration 4, duration: 992 ms , transferred pages: 44837 (dup: 37886, rd: 6951) , new dirty pages: 71331 , remaining dirty pages: 72916
> > Iteration 5, duration: 998 ms , transferred pages: 55229 (dup: 47150, rd: 8079) , new dirty pages: 21703 , remaining dirty pages: 23302
> > Iteration 6, duration: 211 ms , transferred pages: 20337 (dup: 18516, rd: 1821) , new dirty pages: 14500 , remaining dirty pages: 14500
> > Iteration 7, duration: 31 ms , transferred pages: 12933 (dup: 12627, rd: 306) , new dirty pages: 1520 , remaining dirty pages: 1520
> > Iteration 8, duration: 30 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 4 , remaining dirty pages: 1524
> > total time: 27225 milliseconds
> >
> > 3. cpu2006.bzip2
> >
> > ----- original pre-copy: -----
> > Iteration 1, duration: 18306 ms , transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0) , new dirty pages: 106299 , remaining dirty pages: 106299
> > Iteration 2, duration: 10694 ms , transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525) , new dirty pages: 34394 , remaining dirty pages: 36082
> > Iteration 3, duration: 2998 ms , transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264) , new dirty pages: 6419 , remaining dirty pages: 8059
> > Iteration 4, duration: 699 ms , transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143) , new dirty pages: 1226 , remaining dirty pages: 3537
> > Iteration 5, duration: 200 ms , transferred pages: 1636 (dup: 0, rd: 1194, fd: 442) , new dirty pages: 478 , remaining dirty pages: 2379
> > Iteration 6, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2379
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 13995 ms , transferred pages: 266314 (dup: 152118, rd: 114196) , new dirty pages: 97009 , remaining dirty pages: 97145
> > Iteration 2, duration: 1215 ms , transferred pages: 33400 (dup: 26745, rd: 6655) , new dirty pages: 12866 , remaining dirty pages: 14017
> > Iteration 3, duration: 701 ms , transferred pages: 5774 (dup: 48, rd: 5726) , new dirty pages: 6342 , remaining dirty pages: 8761
> > Iteration 4, duration: 500 ms , transferred pages: 4111 (dup: 21, rd: 4090) , new dirty pages: 4311 , remaining dirty pages: 6485
> > Iteration 5, duration: 400 ms , transferred pages: 3273 (dup: 1, rd: 3272) , new dirty pages: 3034 , remaining dirty pages: 5431
> > Iteration 6, duration: 301 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2094 , remaining dirty pages: 4472
> > Iteration 7, duration: 299 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2066 , remaining dirty pages: 4082
> > Iteration 8, duration: 202 ms , transferred pages: 1636 (dup: 0, rd: 1636) , new dirty pages: 2881 , remaining dirty pages: 4648
> > Iteration 9, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 4775 , remaining dirty pages: 6778
> > Iteration 10, duration: 400 ms , transferred pages: 3281 (dup: 9, rd: 3272) , new dirty pages: 3757 , remaining dirty pages: 5576
> > Iteration 11, duration: 401 ms , transferred pages: 3279 (dup: 7, rd: 3272) , new dirty pages: 6980 , remaining dirty pages: 8906
> > Iteration 12, duration: 500 ms , transferred pages: 7118 (dup: 3035, rd: 4083) , new dirty pages: 10774 , remaining dirty pages: 11922
> > Iteration 13, duration: 116 ms , transferred pages: 11706 (dup: 10152, rd: 1554) , new dirty pages: 1326 , remaining dirty pages: 1326
> > Iteration 14, duration: 117 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 1326
> > total time: 19479 milliseconds
> >
> > 4. cpu2006.mcf
> >
> > ----- original pre-copy: -----
> > Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0) , new dirty pages: 244403 , remaining dirty pages: 244403
> > Iteration 2, duration: 29603 ms , transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897) , new dirty pages: 227335 , remaining dirty pages: 229463
> > Iteration 3, duration: 27806 ms , transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723) , new dirty pages: 195593 , remaining dirty pages: 197483
> > Iteration 4, duration: 23907 ms , transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664) , new dirty pages: 215066 , remaining dirty pages: 217006
> > Iteration 5, duration: 26305 ms , transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052) , new dirty pages: 111098 , remaining dirty pages: 112815
> > Iteration 6, duration: 13502 ms , transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637) , new dirty pages: 161054 , remaining dirty pages: 163417
> > Iteration 7, duration: 19705 ms , transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328) , new dirty pages: 220562 , remaining dirty pages: 222713
> > Iteration 8, duration: 27003 ms , transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139) , new dirty pages: 219787 , remaining dirty pages: 221619
> > Iteration 9, duration: 26802 ms , transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576) , new dirty pages: 207959 , remaining dirty pages: 210330
> > Iteration 10, duration: 25411 ms , transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930) , new dirty pages: 144442 , remaining dirty pages: 146856
> > Iteration 11, duration: 17714 ms , transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372) , new dirty pages: 205127 , remaining dirty pages: 207179
> > Iteration 12, duration: 25112 ms , transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121) , new dirty pages: 167319 , remaining dirty pages: 169052
> > Iteration 13, duration: 20411 ms , transferred pages: 166886 (dup: 14, rd: 21960, fd: 144912) , new dirty pages: 221592 , remaining dirty pages: 223758
> > Iteration 14, duration: 27126 ms , transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310) , new dirty pages: 233630 , remaining dirty pages: 235588
> > Iteration 15, duration: 28517 ms , transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602) , new dirty pages: 224282 , remaining dirty pages: 226549
> > Iteration 16, duration: 27422 ms , transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359) , new dirty pages: 209815 , remaining dirty pages: 212177
> > Iteration 17, duration: 25723 ms , transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821) , new dirty pages: 220297 , remaining dirty pages: 222214
> > Iteration 18, duration: 26920 ms , transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914) , new dirty pages: 192015 , remaining dirty pages: 194173
> > Iteration 19, duration: 23520 ms , transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090) , new dirty pages: 96450 , remaining dirty pages: 98384
> > Iteration 20, duration: 11805 ms , transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100) , new dirty pages: 6978 , remaining dirty pages: 8824
> > Iteration 21, duration: 799 ms , transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742) , new dirty pages: 138 , remaining dirty pages: 2417
> > Iteration 22, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2417
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6831, rd: 259619) , new dirty pages: 240209 , remaining dirty pages: 240209
> > Iteration 2, duration: 6250 ms , transferred pages: 51244 (dup: 211, rd: 51033) , new dirty pages: 226651 , remaining dirty pages: 228571
> > Iteration 3, duration: 4395 ms , transferred pages: 36008 (dup: 80, rd: 35928) , new dirty pages: 110719 , remaining dirty pages: 111478
> > Iteration 4, duration: 3390 ms , transferred pages: 28068 (dup: 28, rd: 28040) , new dirty pages: 185172 , remaining dirty pages: 185172
> > Iteration 5, duration: 2986 ms , transferred pages: 23780 (dup: 45, rd: 23735) , new dirty pages: 64357 , remaining dirty pages: 66305
> > Iteration 6, duration: 2727 ms , transferred pages: 22800 (dup: 12, rd: 22788) , new dirty pages: 61675 , remaining dirty pages: 61675
> > Iteration 7, duration: 2372 ms , transferred pages: 18943 (dup: 13, rd: 18930) , new dirty pages: 55144 , remaining dirty pages: 55265
> > Iteration 8, duration: 2100 ms , transferred pages: 17189 (dup: 11, rd: 17178) , new dirty pages: 55244 , remaining dirty pages: 55668
> > Iteration 9, duration: 2003 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 107058 , remaining dirty pages: 108014
> > Iteration 10, duration: 2132 ms , transferred pages: 17825 (dup: 24, rd: 17801) , new dirty pages: 126214 , remaining dirty pages: 126214
> > Iteration 11, duration: 2229 ms , transferred pages: 18156 (dup: 22, rd: 18134) , new dirty pages: 65725 , remaining dirty pages: 65725
> > Iteration 12, duration: 2315 ms , transferred pages: 18651 (dup: 21, rd: 18630) , new dirty pages: 52575 , remaining dirty pages: 53903
> > Iteration 13, duration: 2147 ms , transferred pages: 17435 (dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260
> > Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266
> > Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792
> > Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639
> > Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170
> > Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 103547
> > Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521
> > Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054
> > Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624
> > Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265
> > Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813
> > Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608
> > Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311
> > Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294
> > Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936
> > Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423
> > Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292
> > Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457
> > Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304
> > Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815
> > Iteration 33, duration: 747 ms , transferred pages: 6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670
> > Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422
> > Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222
> > Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337
> > Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251
> > Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251
> > total time: 97919 milliseconds
> >
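[Editor's note: the per-iteration lines quoted above all share one format, so the totals and convergence trends of two runs can be compared mechanically. Below is a minimal, hypothetical Python sketch (not part of QEMU; the `LINE_RE` pattern and `summarize` helper are my own, written to match the quoted log format) that tallies duration, transferred pages, and the final remaining-dirty count for one run:]

```python
import re

# Matches the quoted log lines, e.g.:
# "Iteration 1, duration: 31711 ms , transferred pages: 266450 (...)
#  , new dirty pages: 244403 , remaining dirty pages: 244403"
LINE_RE = re.compile(
    r"Iteration (?P<it>\d+), duration: (?P<ms>\d+) ms , "
    r"transferred pages: (?P<sent>\d+) .*?"
    r"remaining dirty pages: (?P<remaining>\d+)"
)

def summarize(log_lines):
    """Return (total_ms, total_pages_sent, final_remaining) for one run."""
    total_ms = total_pages = 0
    remaining = None
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip headers, blank quote lines, etc.
        total_ms += int(m.group("ms"))
        total_pages += int(m.group("sent"))
        remaining = int(m.group("remaining"))  # value from the last iteration
    return total_ms, total_pages, remaining
```

[Running `summarize` over the "original" and "after optimization" blocks of a workload gives the total migration time and total pages sent side by side, which is how the per-workload "total time" figures above can be cross-checked.]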
> > ------------------ The other 11 workloads, which show no notable improvement (only the original pre-copy results are shown) ------------------
> >
> > 5. idle
> >
> > Iteration 1, duration: 14702 ms , transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0) , new dirty pages: 14595 , remaining dirty pages: 14595
> > Iteration 2, duration: 1592 ms , transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029) , new dirty pages: 218 , remaining dirty pages: 2401
> > Iteration 3, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2401
> >
> > 6. kernel compilation (cannot converge)
> >
> > Iteration 1, duration: 20607 ms , transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0) , new dirty pages: 19293 , remaining dirty pages: 19293
> > Iteration 2, duration: 2092 ms , transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954) , new dirty pages: 8318 , remaining dirty pages: 10435
> > Iteration 3, duration: 1000 ms , transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924) , new dirty pages: 8736 , remaining dirty pages: 10687
> > Iteration 4, duration: 1000 ms , transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091) , new dirty pages: 7627 , remaining dirty pages: 9879
> > Iteration 5, duration: 900 ms , transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760) , new dirty pages: 7287 , remaining dirty pages: 9613
> > Iteration 6, duration: 900 ms , transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601) , new dirty pages: 8958 , remaining dirty pages: 10951
> > Iteration 7, duration: 1099 ms , transferred pages: 9309 (dup: 311, rd: 8051, fd: 947) , new dirty pages: 7189 , remaining dirty pages: 8831
> > Iteration 8, duration: 800 ms , transferred pages: 6832 (dup: 288, rd: 5717, fd: 827) , new dirty pages: 5782 , remaining dirty pages: 7781
> > Iteration 9, duration: 701 ms , transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721) , new dirty pages: 4587 , remaining dirty pages: 6493
> > Iteration 10, duration: 500 ms , transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033) , new dirty pages: 7352 , remaining dirty pages: 9611
> > Iteration 11, duration: 900 ms , transferred pages: 7759 (dup: 397, rd: 6563, fd: 799) , new dirty pages: 6686 , remaining dirty pages: 8538
> > Iteration 12, duration: 800 ms , transferred pages: 6808 (dup: 264, rd: 6017, fd: 527) , new dirty pages: 6871 , remaining dirty pages: 8601
> > Iteration 13, duration: 800 ms , transferred pages: 6775 (dup: 231, rd: 5722, fd: 822) , new dirty pages: 7540 , remaining dirty pages: 9366
> > Iteration 14, duration: 900 ms , transferred pages: 7507 (dup: 145, rd: 5900, fd: 1462) , new dirty pages: 7581 , remaining dirty pages: 9440
> > Iteration 15, duration: 900 ms , transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151) , new dirty pages: 7268 , remaining dirty pages: 9078
> > Iteration 16, duration: 800 ms , transferred pages: 6759 (dup: 215, rd: 5763, fd: 781) , new dirty pages: 6861 , remaining dirty pages: 9180
> > Iteration 17, duration: 800 ms , transferred pages: 6838 (dup: 294, rd: 6037, fd: 507) , new dirty pages: 6196 , remaining dirty pages: 8538
> > Iteration 18, duration: 800 ms , transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639) , new dirty pages: 5947 , remaining dirty pages: 7633
> > Iteration 19, duration: 700 ms , transferred pages: 5919 (dup: 193, rd: 4853, fd: 873) , new dirty pages: 5861 , remaining dirty pages: 7575
> > Iteration 20, duration: 600 ms , transferred pages: 5284 (dup: 376, rd: 4408, fd: 500) , new dirty pages: 5206 , remaining dirty pages: 7497
> > Iteration 21, duration: 600 ms , transferred pages: 5147 (dup: 239, rd: 4308, fd: 600) , new dirty pages: 5031 , remaining dirty pages: 7381
> > Iteration 22, duration: 599 ms , transferred pages: 5064 (dup: 156, rd: 4026, fd: 882) , new dirty pages: 5601 , remaining dirty pages: 7918
> > Iteration 23, duration: 702 ms , transferred pages: 5965 (dup: 239, rd: 5028, fd: 698) , new dirty pages: 6079 , remaining dirty pages: 8032
> > Iteration 24, duration: 700 ms , transferred pages: 6175 (dup: 449, rd: 5146, fd: 580) , new dirty pages: 10932 , remaining dirty pages: 12789
> > Iteration 25, duration: 1300 ms , transferred pages: 10936 (dup: 302, rd: 6205, fd: 4429) , new dirty pages: 8713 , remaining dirty pages: 10566
> > Iteration 26, duration: 1000 ms , transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518) , new dirty pages: 5119 , remaining dirty pages: 7403
> > Iteration 27, duration: 600 ms , transferred pages: 5007 (dup: 99, rd: 4099, fd: 809) , new dirty pages: 2226 , remaining dirty pages: 4622
> > Iteration 28, duration: 300 ms , transferred pages: 2491 (dup: 37, rd: 1794, fd: 660) , new dirty pages: 6746 , remaining dirty pages: 8877
> > Iteration 29, duration: 800 ms , transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012) , new dirty pages: 6070 , remaining dirty pages: 8190
> > Iteration 30, duration: 700 ms , transferred pages: 6052 (dup: 326, rd: 5107, fd: 619) , new dirty pages: 5177 , remaining dirty pages: 7315
> >
> > 7. web server
> >
> > Iteration 1, duration: 20902 ms , transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0) , new dirty pages: 8528 , remaining dirty pages: 8528
> > Iteration 2, duration: 796 ms , transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456) , new dirty pages: 650 , remaining dirty pages: 2706
> > Iteration 3, duration: 100 ms , transferred pages: 818 (dup: 0, rd: 383, fd: 435) , new dirty pages: 328 , remaining dirty pages: 2216
> > Iteration 4, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2216
> >
> > 8. cpu2006.bwaves (cannot converge)
> >
> > Iteration 1, duration: 31715 ms , transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0) , new dirty pages: 242702 , remaining dirty pages: 242702
> > Iteration 2, duration: 29397 ms , transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515) , new dirty pages: 230889 , remaining dirty pages: 233083
> > Iteration 3, duration: 28205 ms , transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080) , new dirty pages: 226998 , remaining dirty pages: 229223
> > Iteration 4, duration: 27805 ms , transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359) , new dirty pages: 227360 , remaining dirty pages: 229009
> > Iteration 5, duration: 27703 ms , transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456) , new dirty pages: 225885 , remaining dirty pages: 228108
> > Iteration 6, duration: 27600 ms , transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265) , new dirty pages: 223555 , remaining dirty pages: 225740
> > Iteration 7, duration: 27309 ms , transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673) , new dirty pages: 231975 , remaining dirty pages: 234141
> > Iteration 8, duration: 28403 ms , transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226) , new dirty pages: 222170 , remaining dirty pages: 223914
> > Iteration 9, duration: 27105 ms , transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690) , new dirty pages: 230065 , remaining dirty pages: 232170
> > Iteration 10, duration: 28104 ms , transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327) , new dirty pages: 227590 , remaining dirty pages: 229559
> > Iteration 11, duration: 27801 ms , transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996) , new dirty pages: 228457 , remaining dirty pages: 230299
> > Iteration 12, duration: 27916 ms , transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562) , new dirty pages: 238326 , remaining dirty pages: 240065
> >
> > 9. cpu2006.lbm (cannot converge)
> >
> > Iteration 1, duration: 31012 ms , transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0) , new dirty pages: 108960 , remaining dirty pages: 108960
> > Iteration 2, duration: 13095 ms , transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474) , new dirty pages: 129292 , remaining dirty pages: 131730
> > Iteration 3, duration: 15802 ms , transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384) , new dirty pages: 116682 , remaining dirty pages: 118724
> > Iteration 4, duration: 14204 ms , transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205) , new dirty pages: 107246 , remaining dirty pages: 109654
> > Iteration 5, duration: 13208 ms , transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142) , new dirty pages: 105371 , remaining dirty pages: 107048
> > Iteration 6, duration: 12804 ms , transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075) , new dirty pages: 103841 , remaining dirty pages: 106184
> > Iteration 7, duration: 12709 ms , transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674) , new dirty pages: 106692 , remaining dirty pages: 108985
> > Iteration 8, duration: 13105 ms , transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033) , new dirty pages: 103132 , remaining dirty pages: 104948
> > Iteration 9, duration: 12607 ms , transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608) , new dirty pages: 102511 , remaining dirty pages: 104391
> > Iteration 10, duration: 12514 ms , transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156) , new dirty pages: 102888 , remaining dirty pages: 105029
> >
> > 10. cpu2006.astar (cannot converge)
> >
> > Iteration 1, duration: 28402 ms , transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0) , new dirty pages: 62078 , remaining dirty pages: 62078
> > Iteration 2, duration: 7393 ms , transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375) , new dirty pages: 48854 , remaining dirty pages: 50825
> > Iteration 3, duration: 6001 ms , transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540) , new dirty pages: 48137 , remaining dirty pages: 49868
> > Iteration 4, duration: 5800 ms , transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055) , new dirty pages: 49147 , remaining dirty pages: 51571
> > Iteration 5, duration: 6102 ms , transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682) , new dirty pages: 55606 , remaining dirty pages: 57265
> > Iteration 6, duration: 6699 ms , transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061) , new dirty pages: 9166 , remaining dirty pages: 11482
> > Iteration 7, duration: 1200 ms , transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805) , new dirty pages: 8294 , remaining dirty pages: 9946
> > Iteration 8, duration: 1000 ms , transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002) , new dirty pages: 5475 , remaining dirty pages: 7227
> > Iteration 9, duration: 600 ms , transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438) , new dirty pages: 4175 , remaining dirty pages: 6494
> > Iteration 10, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3856, fd: 234) , new dirty pages: 4095 , remaining dirty pages: 6499
> > Iteration 11, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3313, fd: 777) , new dirty pages: 3371 , remaining dirty pages: 5780
> > Iteration 12, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3823, fd: 267) , new dirty pages: 7518 , remaining dirty pages: 9208
> > Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763
> > Iteration 14, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 4346 , remaining dirty pages: 6019
> > Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983
> > Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585
> > Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598
> > Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775
> > Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964
> > Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703
> >
> > 11. cpu2006.xalancbmk (can not converge)
> >
> > Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169
> > Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771
> > Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869
> > Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589
> > Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181
> > Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022
> > Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new dirty pages: 64880 , remaining dirty pages: 67179
> > Iteration 8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586
> > Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320
> > Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286
> >
> > 12. cpu2006.milc (can not converge)
> >
> > Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860
> > Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050
> > Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723
> > Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513
> > Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008
> > Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416
> > Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577
> > Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422
> > Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720
> > Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747
> >
> > 13. cpu2006.cactusADM (can not converge)
> >
> > Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869
> > Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235
> > Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273
> > Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416
> > Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425
> > Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664
> > Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952
> > Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553
> > Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273
> > Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928
> >
> > 14. cpu2006.GmesFDTD (can not converge)
> >
> > Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802
> > Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064
> > Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783
> > Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063
> > Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576
> > Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618
> > Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672
> > Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147
> >
> > 15. cpu2006.wrf (can not converge)
> >
> > Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392
> > Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222
> > Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892
> > Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946
> > Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760
> > Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119
> > Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844
> > Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996
> > Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699
> > Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905
> >
> >
> >
> > >
> > > > So I think "booting" and "kernel compilation" should benefit a lot from this
> > > > improvement. The reason "kernel compilation" would benefit is that some
> > > > iterations take around 600 ms, and if they were halved to 300 ms, the precopy
> > > > would have a chance to step into the stop-and-copy phase.
> > > >
> > > > On the other hand, "idle" and "web server" would not benefit much, because
> > > > most of the time is spent on the 1st iteration and little on the others.
> > > >
> > > > As for "zeusmp" and "memcached", although the time spent on iterations other
> > > > than the 1st may be halved, they still could not converge to stop-and-copy
> > > > within the 300 ms downtime.
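[Editorial note: the convergence reasoning above can be sketched as a small model. This is a hypothetical helper, not QEMU's actual code; `precopy_converges` and `PAGE_SIZE` are names invented for illustration. Precopy may enter stop-and-copy only when the remaining dirty pages can be flushed within the allowed downtime at the available bandwidth.]

```python
PAGE_SIZE = 4096  # bytes per guest page (x86 default)

def precopy_converges(remaining_dirty_pages, bandwidth_bytes_per_s,
                      max_downtime_s=0.3):
    """Model of the stop-and-copy decision discussed above: the guest
    can be paused only if flushing the remaining dirty pages fits
    within the downtime budget (illustrative sketch, not QEMU code)."""
    stop_copy_time = remaining_dirty_pages * PAGE_SIZE / bandwidth_bytes_per_s
    return stop_copy_time <= max_downtime_s

# With the default 32 MB/s bandwidth used in the logs, a 300 ms budget
# allows roughly 0.3 * 32 MiB / 4 KiB ~= 2457 pages.
bw = 32 * 1024 * 1024
print(precopy_converges(2430, bw))    # "booting" residue of 2430 pages -> True
print(precopy_converges(122727, bw))  # "zeusmp" residue -> False
```

This is why halving iteration times helps only workloads whose dirty-page residue is already near the budget: "zeusmp" remains two orders of magnitude above it.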
> > > >
> > > > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> > > >
> > > > 1. booting : begin to migrate when the VM is booting
> > > >
> > > > Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
> > > > Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
> > > > Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
> > > > Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
> > > > Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
> > > > (Note: when the workload does converge, the output of the last iteration is "fake"; it just indicates that precopy has stepped into the stop-and-copy phase.
> > > > "n" means "normal pages" and "d" means "duplicate (zero) pages".)
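[Editorial note: the "duplicate (zero) page" classification above can be sketched as follows. QEMU's migration code detects pages whose bytes are all identical (in practice almost always zero) and sends a one-byte marker instead of the full 4 KiB page; the function name here mirrors that logic but is an assumption, not a verbatim copy of the implementation.]

```python
def is_dup_page(page: bytes) -> bool:
    """A page counts as 'duplicate' when every byte has the same value;
    precopy then transmits a short marker plus that byte value instead
    of the whole page (sketch of the zero-page optimization)."""
    return len(page) > 0 and page.count(page[0]) == len(page)

print(is_dup_page(bytes(4096)))             # freshly zeroed page -> True
print(is_dup_page(b"\x00" * 10 + b"\x01"))  # mixed content -> False
```

This explains the large "d" counts in the first iterations above: memory the guest has never touched is still all-zero and compresses to almost nothing.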
> > > >
> > > > 2. idle
> > > >
> > > > Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
> > > > Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
> > > > Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
> > > > Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
> > > >
> > > > 3. kernel compilation (can not converge)
> > > >
> > > > Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
> > > > Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
> > > > Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
> > > > Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
> > > > Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
> > > > Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
> > > > Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
> > > > Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
> > > > Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
> > > > Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
> > > > Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
> > > > Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
> > > > Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
> > > > Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
> > > > Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
> > > >
> > > > 4. cpu2006.zeusmp (can not converge)
> > > >
> > > > Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
> > > > Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
> > > > Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
> > > > Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
> > > > Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
> > > > Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
> > > > Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
> > > > Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
> > > > Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
> > > > Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
> > > > Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
> > > > Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
> > > > Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
> > > > Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
> > > > Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
> > > >
> > > > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> > > >
> > > > Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
> > > > Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
> > > > Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
> > > > Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
> > > > Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
> > > > Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
> > > >
> > > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > > >
> > > > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> > > >
> > > > Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
> > > > Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
> > > > Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
> > > > Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
> > > > Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
> > > > Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
> > > > Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
> > > > Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
> > > > Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
> > > > Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
> > > > Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
> > > > Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
> > > > Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
> > > > Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
> > > > Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
> > > > Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
> > > > Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
> > > > Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
> > > > Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
> > > > Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
> > > >
> > > >
> > > > --
> > > > Chunguang Li, Ph.D. Candidate
> > > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > > Huazhong University of Science & Technology (HUST)
> > > > Wuhan, Hubei Prov., China
> > > >
> > > >
> > > >
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
> >
> >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China