* [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
@ 2016-09-25  8:22 Chunguang Li
  2016-09-26 11:23 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 21+ messages in thread

From: Chunguang Li @ 2016-09-25 8:22 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, amit.shah, pbonzini, stefanha

Hi all!

I am a little confused about the dirty bitmap during migration, so I have dug into the code. I figured out that every now and then during migration, the dirty bitmap is grabbed from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to re-sending of some non-dirty pages.

Take the first iteration of precopy, during which all pages are sent, for instance. Before that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment on. When pages that have not yet been sent are written, the kernel marks them as dirty. I don't think this is correct, because those pages will be sent during this and the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty if it is written after it has already been sent within an iteration.

Am I right about this? If so, is there any advice on how to improve it?

Thanks,
Chunguang Li

^ permalink raw reply	[flat|nested] 21+ messages in thread
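[Editorial note: the sync-then-send cycle described above can be sketched as a toy model. This is not QEMU's actual code; `sync_dirty_bitmap` and `send_page` are hypothetical stand-ins for the KVM_GET_DIRTY_LOG ioctl plus bitmap update, and for the RAM save path.]

```python
# Minimal sketch of the precopy loop under discussion (hypothetical names,
# not QEMU code). The point of contention: pages written while the inner
# send loop runs are logged as dirty by the kernel even if they had not
# been sent yet in this pass.

def precopy(pages, sync_dirty_bitmap, send_page, threshold):
    """Iterate until the dirty set is small enough to stop-and-copy."""
    dirty = set(pages)                   # iteration 1 sends every page
    while len(dirty) > threshold:
        for p in sorted(dirty):
            send_page(p)                 # writes racing with this loop are
                                         # marked dirty regardless of order
        dirty = sync_dirty_bitmap()      # pages dirtied since the last sync
    return dirty                         # remainder goes in stop-and-copy
```
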
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Dr. David Alan Gilbert @ 2016-09-26 11:23 UTC (permalink / raw)
To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> Hi all!
> I am a little confused about the dirty bitmap during migration, so I have dug into the code.
> I figured out that every now and then during migration, the dirty bitmap is grabbed from the
> kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However,
> I think this mechanism leads to re-sending of some non-dirty pages.
>
> Take the first iteration of precopy, during which all pages are sent, for instance. Before
> that, during migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins
> to produce the dirty bitmap from that moment on. When pages that have not yet been sent are
> written, the kernel marks them as dirty. I don't think this is correct, because those pages
> will be sent during this and the next iteration with the same content (if they are not
> written again after they are sent). It only makes sense to mark a page as dirty if it is
> written after it has already been sent within an iteration.
>
> Am I right about this? If so, is there any advice on how to improve it?
I think you're right that this can happen; to clarify, I think the case you're talking about is:

  Iteration 1
     sync bitmap
     start sending pages
        page 'n' is modified - but hasn't been sent yet
        page 'n' gets sent
  Iteration 2
     sync bitmap
     'page n is shown as modified'
     send page 'n' again

So you're right that it is wasteful; I guess it's more wasteful on big VMs with slow networks, where the length of each iteration is large.

Fixing it is not easy, because you have to be really careful never to miss a page modification, even if the page is sent at about the same time it's dirtied.

One way would be to sync the dirty log from the kernel in smaller chunks.

Dave

> Thanks,
> Chunguang Li
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
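[Editorial note: the timeline above can be reproduced with a small event model. This is a toy sketch, not QEMU code, and all names are made up. A page counts as "wastefully dirty" when its last write precedes its send time within the iteration, because the copy that goes out already carries the write.]

```python
def run_iteration(send_order, writes):
    """send_order: pages sent one per time unit; writes: (time, page) events.
    Returns (pages dirty at the next sync, the wastefully-dirty subset)."""
    send_time = {page: t for t, page in enumerate(send_order)}
    dirty, wasteful = set(), set()
    for t, page in sorted(writes):
        dirty.add(page)
        if t < send_time[page]:
            wasteful.add(page)      # written before being sent: the sent
        else:                       # copy already holds this write
            wasteful.discard(page)  # written again after being sent:
    return dirty, wasteful          # genuinely dirty
```

For example, with send order [0, 1, 2, 3], a single write to page 2 at time 0 leaves page 2 marked dirty at the next sync even though the content sent at time 2 already included that write; a write landing after the page was sent is genuinely dirty.
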
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Chunguang Li @ 2016-09-26 14:55 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent: Monday, 26 September 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > Hi all!
> > I have some confusion about the dirty bitmap during migration. [...]
> >
> > Am I right about this consideration? If I am right, is there some advice to improve this?
>
> I think you're right that this can happen; to clarify I think the
> case you're talking about is:
>
> Iteration 1
>    sync bitmap
>    start sending pages
>       page 'n' is modified - but hasn't been sent yet
>       page 'n' gets sent
> Iteration 2
>    sync bitmap
>    'page n is shown as modified'
>    send page 'n' again
>

Yes, this is exactly the case I am talking about.

> So you're right that is wasteful; I guess it's more wasteful
> on big VMs with slow networks where the length of each iteration
> is large.

I think this is "very" wasteful. Assume the workload writes pages randomly across the guest address space, and the transfer speed is constant. Intuitively, nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as sending only the really dirty pages would.

Thanks,

Chunguang

> Fixing it is not easy, because you have to be really careful
> never to miss a page modification, even if the page is sent
> about the same time it's dirtied.
>
> One way would be to sync the dirty log from the kernel
> in smaller chunks.
>
> Dave

-- 
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
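[Editorial note: the "nearly half" intuition can be checked with a toy Monte Carlo model. This illustrates the argument above under exactly its stated assumptions (uniformly random writes, constant send rate); it is not a measurement, and the function name is made up.]

```python
import random

def wasted_fraction(num_pages=20000, num_writes=10000, seed=42):
    """Fraction of singly-written pages whose write lands before the page
    is sent, so the page is resent next iteration with unchanged content."""
    random.seed(seed)
    touched = random.sample(range(num_pages), num_writes)
    wasted = 0
    for page in touched:
        write_time = random.random()      # write at a uniform time in [0, 1)
        send_time = page / num_pages      # constant send rate over the pass
        if write_time < send_time:
            wasted += 1                   # written before being sent: the page
                                          # goes out up to date, yet iteration
                                          # 2 resends it anyway
    return wasted / num_writes
```

Under this model the wasted fraction converges to about one half (the write time is uniform, and the chance it precedes the send time averages to 1/2 across pages), which is where the "Iteration 2 takes about twice as long" estimate comes from.
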
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Dr. David Alan Gilbert @ 2016-09-26 18:52 UTC (permalink / raw)
To: Chunguang Li; +Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > Hi all!
> > > I have some confusion about the dirty bitmap during migration. [...]
> > > Am I right about this consideration? If I am right, is there some advice to improve this?
> >
> > I think you're right that this can happen; to clarify I think the
> > case you're talking about is: [...]
> >
> > So you're right that is wasteful; I guess it's more wasteful
> > on big VMs with slow networks where the length of each iteration
> > is large.
>
> I think this is "very" wasteful. Assume the workload writes pages randomly across the guest address space, and the transfer speed is constant. Intuitively, nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as sending only the really dirty pages would.

Yes, it's probably pretty bad; and we really need to do something like split the sync into smaller chunks. There are other suggestions for how to improve it (e.g. there are the page-modification-logging changes).

However, I don't think you usually get really random writes; if you do, precopy rarely converges at all, because even without your observation it changes lots and lots of pages.

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Chunguang Li @ 2016-09-27 12:28 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: qemu-devel, amit.shah, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent: Tuesday, 27 September 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> Yes, it's probably pretty bad; and we really need to do something like
> split the sync into smaller chunks; there are other suggestions
> for how to improve it (e.g. there's the page-modification-logging
> changes).
>
> However, I don't think you usually get really random writes, if you
> do precopy rarely converges at all, because even without your
> observation it changes lots and lots of pages.
>
> Dave

I have read a little about page-modification logging. I think it is just a more efficient dirty-logging mechanism, with better performance than write protection; it will not solve the problem we are talking about.

The only idea I have come up with so far to handle this is the one you mentioned: splitting the sync into smaller chunks. Maybe I will start from that idea and try to fix it. If you come up with some other idea or suggestion, please let me know. Thank you~

Chunguang

-- 
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
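[Editorial note: a rough sketch of what "splitting the sync into smaller chunks" might look like, with hypothetical names throughout; this is not how QEMU's migration code is structured. Syncing the dirty log for a chunk immediately before sending that chunk shrinks the window in which a not-yet-sent page can be marked dirty.]

```python
def send_pass(chunks, sync_chunk, send_page):
    """One migration pass over guest RAM split into chunks.
    sync_chunk(chunk) returns the pages of `chunk` that the kernel has
    logged as dirty since that chunk's last sync, and clears the log."""
    sent = 0
    for chunk in chunks:
        # Sync right before sending: a write that landed in this chunk
        # earlier in the pass is picked up now, instead of forcing a
        # wasteful resend in the *next* pass.
        for page in sync_chunk(chunk):
            send_page(page)
            sent += 1
    return sent
```

The caveat raised earlier in the thread still applies: a write racing with the send of its own page must never be lost, so the per-chunk sync and the sends it triggers have to be ordered carefully.
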
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Amit Shah @ 2016-09-30 5:46 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela

On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > Hi all!
> > > I have some confusion about the dirty bitmap during migration. [...]
> >
> > I think you're right that this can happen; to clarify I think the
> > case you're talking about is: [...]
> >
> > So you're right that is wasteful; I guess it's more wasteful
> > on big VMs with slow networks where the length of each iteration
> > is large.
>
> I think this is "very" wasteful. Assume the workload writes pages randomly across the guest address space, and the transfer speed is constant. Intuitively, nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means Iteration 2 takes about twice as long as sending only the really dirty pages would.

It makes sense; can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.

		Amit
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Chunguang Li @ 2016-09-30 8:18 UTC (permalink / raw)
To: Amit Shah
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Sent: Friday, 30 September 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> It makes sense, can you get some perf numbers to show what kinds of
> workloads get impacted the most? That would also help us to figure
> out what kinds of speed improvements we can expect.
>
> Amit

Yes, I can pick some workloads and get some perf numbers. However, I don't know how to measure the quantity of non-dirty pages we resend in each iteration. Instead, I can get the numbers below:

1. The time consumed by each iteration;
2. The quantity of pages transferred during each iteration;
3. The quantity of dirty pages (including not-really-dirty pages) produced during each iteration.

With these numbers, we can only estimate the quantity of not-really-dirty pages to some extent. What do you think of this test plan? Any suggestions?

Chunguang

-- 
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China
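[Editorial note: given only the three measurable numbers above, a rough model-based estimate is possible. This is a toy heuristic under strong assumptions — a constant write rate, a constant send rate, and the uniform-writes argument from earlier in the thread — not a measurement tool, and the function name is made up. In a full sweep a dirtied page is not-really-dirty with probability about one half; when only a fraction of pages is sent in an iteration, the estimate scales down accordingly.]

```python
def estimate_not_really_dirty(new_dirty, pages_sent, total_pages):
    """Heuristic estimate of not-really-dirty pages produced in one
    iteration. new_dirty: dirty pages logged during the iteration;
    pages_sent: pages transferred during it."""
    # A page can only be not-really-dirty if it is (re)sent after the
    # write, i.e. it lies in the swept portion of RAM; within that
    # portion, a uniform write precedes the send about half the time.
    fraction_swept = min(1.0, pages_sent / total_pages)
    return new_dirty * fraction_swept / 2
```

For example, iteration 1 of the "booting" run below (a full sweep of 266450 pages producing 56414 new dirty pages) would give an estimate of about 28207 not-really-dirty pages.
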
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent

From: Chunguang Li @ 2016-10-08 7:55 UTC (permalink / raw)
To: Amit Shah
Cc: Dr. David Alan Gilbert, qemu-devel, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Amit Shah" <amit.shah@redhat.com>
> Sent: Friday, 30 September 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> It makes sense, can you get some perf numbers to show what kinds of
> workloads get impacted the most? That would also help us to figure
> out what kinds of speed improvements we can expect.
>
> Amit

I have picked 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy. These numbers were obtained with basic precopy migration, without capabilities like xbzrle or compression. The migration network is exclusive, with a separate network for the workloads; both are gigabit Ethernet. I use qemu-2.5.1.

Three of the workloads (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and the default downtime (300 ms), while the other three (kernel compilation, zeusmp, memcached) did not.

A page is "not-really-dirty" if, during one iteration, it is written first and sent later (and not written again after that). I guess this happens much less often during the later iterations than during the 1st one, because all of the VM's pages are sent during the 1st iteration, while during the others only part of the pages are sent. So I think the not-really-dirty pages are produced mainly during the 1st iteration, and maybe very few during the other iterations.

If we could avoid resending the not-really-dirty pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction: the dirty pages produced during Iteration 2 would then be halved, which would halve the time spent on Iteration 3, then Iteration 4, 5...

So I think "booting" and "kernel compilation" would benefit a lot from this improvement. "Kernel compilation" would benefit because some iterations take around 600 ms; if they were halved to 300 ms, the precopy might get the chance to step into the stop-and-copy phase. On the other hand, "idle" and "web server" would not benefit much, because most of the time is spent on the 1st iteration and little on the others. As for "zeusmp" and "memcached", although the time spent on the iterations after the 1st might be halved, they still could not converge to stop-and-copy with the 300 ms downtime.

--------------------1 vcpu, 1 GB ram, default bandwidth (32 MB/s):--------------------

1.
booting: begin to migrate while the VM is booting

Iteration 1, duration: 6997 ms, transferred pages: 266450 (n: 57269, d: 209181), new dirty pages: 56414, remaining dirty pages: 56414
Iteration 2, duration: 6497 ms, transferred pages: 54008 (n: 52701, d: 1307), new dirty pages: 48053, remaining dirty pages: 50459
Iteration 3, duration: 5800 ms, transferred pages: 48232 (n: 47444, d: 788), new dirty pages: 9129, remaining dirty pages: 11356
Iteration 4, duration: 1100 ms, transferred pages: 9091 (n: 8998, d: 93), new dirty pages: 165, remaining dirty pages: 2430
Iteration 5, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2430

(Note: when the workload does converge, the output of the last iteration is "fake"; it just indicates that precopy steps into the stop-copy phase at that point. "n" means normal pages and "d" means duplicate (zero) pages.)

2. idle

Iteration 1, duration: 14496 ms, transferred pages: 266450 (n: 118980, d: 147470), new dirty pages: 17398, remaining dirty pages: 17398
Iteration 2, duration: 1896 ms, transferred pages: 14953 (n: 14854, d: 99), new dirty pages: 1849, remaining dirty pages: 4294
Iteration 3, duration: 300 ms, transferred pages: 2454 (n: 2454, d: 0), new dirty pages: 9, remaining dirty pages: 1849
Iteration 4, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 1849

3.
kernel compilation (cannot converge)

Iteration 1, duration: 20700 ms, transferred pages: 266450 (n: 169778, d: 96672), new dirty pages: 40067, remaining dirty pages: 40067
Iteration 2, duration: 4696 ms, transferred pages: 38401 (n: 37787, d: 614), new dirty pages: 8852, remaining dirty pages: 10518
Iteration 3, duration: 1000 ms, transferred pages: 8642 (n: 8180, d: 462), new dirty pages: 6331, remaining dirty pages: 8207
Iteration 4, duration: 700 ms, transferred pages: 6110 (n: 5726, d: 384), new dirty pages: 5242, remaining dirty pages: 7339
Iteration 5, duration: 600 ms, transferred pages: 5007 (n: 4908, d: 99), new dirty pages: 4868, remaining dirty pages: 7200
Iteration 6, duration: 600 ms, transferred pages: 5226 (n: 4908, d: 318), new dirty pages: 6142, remaining dirty pages: 8116
Iteration 7, duration: 700 ms, transferred pages: 5985 (n: 5726, d: 259), new dirty pages: 5902, remaining dirty pages: 8033
Iteration 8, duration: 701 ms, transferred pages: 5893 (n: 5726, d: 167), new dirty pages: 7502, remaining dirty pages: 9642
Iteration 9, duration: 900 ms, transferred pages: 7623 (n: 7362, d: 261), new dirty pages: 6408, remaining dirty pages: 8427
Iteration 10, duration: 700 ms, transferred pages: 6008 (n: 5726, d: 282), new dirty pages: 8312, remaining dirty pages: 10731
Iteration 11, duration: 1000 ms, transferred pages: 8353 (n: 8180, d: 173), new dirty pages: 6874, remaining dirty pages: 9252
Iteration 12, duration: 899 ms, transferred pages: 7477 (n: 7362, d: 115), new dirty pages: 5573, remaining dirty pages: 7348
Iteration 13, duration: 601 ms, transferred pages: 5099 (n: 4908, d: 191), new dirty pages: 7671, remaining dirty pages: 9920
Iteration 14, duration: 900 ms, transferred pages: 7586 (n: 7362, d: 224), new dirty pages: 7359, remaining dirty pages: 9693
Iteration 15, duration: 900 ms, transferred pages: 7682 (n: 7362, d: 320), new dirty pages: 7371, remaining dirty pages: 9382

4.
cpu2006.zeusmp (cannot converge)

Iteration 1, duration: 21603 ms, transferred pages: 266450 (n: 176660, d: 89790), new dirty pages: 145625, remaining dirty pages: 145625
Iteration 2, duration: 8696 ms, transferred pages: 144389 (n: 70862, d: 73527), new dirty pages: 125124, remaining dirty pages: 126360
Iteration 3, duration: 6301 ms, transferred pages: 124057 (n: 51379, d: 72678), new dirty pages: 122528, remaining dirty pages: 124831
Iteration 4, duration: 6400 ms, transferred pages: 124330 (n: 52196, d: 72134), new dirty pages: 124267, remaining dirty pages: 124768
Iteration 5, duration: 6703 ms, transferred pages: 124034 (n: 54656, d: 69378), new dirty pages: 124151, remaining dirty pages: 124885
Iteration 6, duration: 6703 ms, transferred pages: 124357 (n: 54658, d: 69699), new dirty pages: 124106, remaining dirty pages: 124634
Iteration 7, duration: 6602 ms, transferred pages: 124568 (n: 53838, d: 70730), new dirty pages: 133828, remaining dirty pages: 133894
Iteration 8, duration: 7600 ms, transferred pages: 133030 (n: 62021, d: 71009), new dirty pages: 126612, remaining dirty pages: 127476
Iteration 9, duration: 7299 ms, transferred pages: 126511 (n: 59569, d: 66942), new dirty pages: 122727, remaining dirty pages: 123692
Iteration 10, duration: 6609 ms, transferred pages: 123692 (n: 54539, d: 69153), new dirty pages: 122727, remaining dirty pages: 122727
Iteration 11, duration: 6995 ms, transferred pages: 120347 (n: 56423, d: 63924), new dirty pages: 121430, remaining dirty pages: 123810
Iteration 12, duration: 6703 ms, transferred pages: 123040 (n: 54657, d: 68383), new dirty pages: 122043, remaining dirty pages: 122813
Iteration 13, duration: 7006 ms, transferred pages: 122353 (n: 57121, d: 65232), new dirty pages: 133869, remaining dirty pages: 134329
Iteration 14, duration: 8209 ms, transferred pages: 132325 (n: 66932, d: 65393), new dirty pages: 126914, remaining dirty pages: 128918
Iteration 15, duration: 7802 ms, transferred pages: 126931 (n: 63671, d: 63260), new dirty pages: 122351, remaining dirty pages: 124338

5. web server: an Apache web server; the client is configured with 50 concurrent connections.

Iteration 1, duration: 30697 ms, transferred pages: 266450 (n: 251215, d: 15235), new dirty pages: 30628, remaining dirty pages: 30628
Iteration 2, duration: 3496 ms, transferred pages: 28859 (n: 28513, d: 346), new dirty pages: 5805, remaining dirty pages: 7574
Iteration 3, duration: 701 ms, transferred pages: 5746 (n: 5726, d: 20), new dirty pages: 3433, remaining dirty pages: 5261
Iteration 4, duration: 400 ms, transferred pages: 3281 (n: 3272, d: 9), new dirty pages: 1539, remaining dirty pages: 3519
Iteration 5, duration: 199 ms, transferred pages: 1653 (n: 1636, d: 17), new dirty pages: 301, remaining dirty pages: 2167
Iteration 6, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2167

--------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):--------------------

6.
memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge) Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023 Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013 Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551 Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638 Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712 Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787 Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942 Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432 Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096 Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418 Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321 Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169 Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989 Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061 Iteration 
15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717 Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257 Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576 Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963 Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888 Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047 -- Chunguang Li, Ph.D. Candidate Wuhan National Laboratory for Optoelectronics (WNLO) Huazhong University of Science & Technology (HUST) Wuhan, Hubei Prov., China ^ permalink raw reply [flat|nested] 21+ messages in thread
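[Editorial note: the waste discussed in this thread comes from pages that are written before their copy goes on the wire, yet still show up at the next bitmap sync. A toy model can reproduce the effect; this is illustrative Python only, not QEMU code, and the page count and one-write-per-page-sent pattern are invented assumptions:]

```python
import random

def simulate(pages=1000, seed=1):
    """Toy model of precopy iteration 1: pages 0..pages-1 are sent in
    order, and between two page sends the guest writes one random page.
    The kernel bitmap marks every written page dirty, but only a page
    written AFTER its copy went on the wire is stale at the destination."""
    rng = random.Random(seed)
    kernel_dirty = set()  # what the next KVM_GET_DIRTY_LOG sync will report
    truly_dirty = set()   # pages whose transferred copy is actually stale
    for page in range(pages):
        w = rng.randrange(pages)  # guest write just before 'page' is sent
        kernel_dirty.add(w)       # the kernel logs it as dirty either way
        if w < page:
            truly_dirty.add(w)    # already sent: destination copy is stale
        # w >= page: the pending send will carry the new content anyway
        # (page 'page' is transferred here with its current contents)
    false_dirty = kernel_dirty - truly_dirty
    return len(kernel_dirty), len(truly_dirty), len(false_dirty)
```

In this model every written page lands in the kernel bitmap, but only writes that hit an already-sent page make the destination copy stale; the remainder are the "not-really-dirty" pages debated above, and with uniform random writes their share of the reported dirty set is substantial.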
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-10-08  7:55 ` Chunguang Li
@ 2016-10-14 11:15 ` Dr. David Alan Gilbert
  2016-11-03  8:25   ` Chunguang Li
  0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-14 11:15 UTC (permalink / raw)
To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> 
> > -----Original Message-----
> > From: "Amit Shah" <amit.shah@redhat.com>
> > Sent: Friday, September 30, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > 
> > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > 
> > > > -----Original Message-----
> > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > Sent: Monday, September 26, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > 
> > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > Hi all!
> > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code. I figured out that every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to the resending of some NON-dirty pages.
> > > > >
> > > > > Take the first iteration of precopy for instance, during which all the pages will be sent.
> > > > > Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that haven't been sent yet are written, the kernel marks them as dirty. I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark the pages that have already been sent during one iteration as dirty when they are written.
> > > > >
> > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > 
> > > > I think you're right that this can happen; to clarify I think the case you're talking about is:
> > > > 
> > > >   Iteration 1
> > > >      sync bitmap
> > > >      start sending pages
> > > >      page 'n' is modified - but hasn't been sent yet
> > > >      page 'n' gets sent
> > > >   Iteration 2
> > > >      sync bitmap
> > > >      'page n is shown as modified'
> > > >      send page 'n' again
> > > 
> > > Yes, this is exactly the case I am talking about.
> > > 
> > > > So you're right that is wasteful; I guess it's more wasteful on big VMs with slow networks where the length of each iteration is large.
> > > 
> > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that of sending only the really dirty pages.
> > 
> > It makes sense, can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > 
> > Amit
> 
> I have picked up 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy.
> These numbers are obtained with basic precopy migration, without capabilities like xbzrle or compression, etc. The network for the migration is exclusive, with a separate network for the workloads. Both are gigabit Ethernet. I use qemu-2.5.1.
> 
> Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and the default downtime (300 ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> 
> A page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this would not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others only part of the pages are sent. So I think the "not-really-dirty" pages should be produced mainly during the 1st iteration, and maybe very few during the other iterations.
> 
> If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 are halved, which means the time spent on Iteration 3 is halved, then Iteration 4, 5...

Yes; these numbers don't show how many of them are false dirty though.

One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false-dirty that you're describing.

However, if the page is being written a few times, so that it would also have been written after the network write, then it isn't a false-dirty.

You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!

Dave

> So I think "booting" and "kernel compilation" should benefit a lot from this improvement.
> The reason "kernel compilation" would benefit is that some iterations take around 600 ms, and if they are halved to 300 ms, the precopy may get the chance to step into the stop-and-copy phase.
> 
> On the other hand, "idle" and "web server" would not benefit a lot, because most of the time is spent on the 1st iteration and little on the others.
> 
> As to "zeusmp" and "memcached", although the time spent on the iterations other than the 1st one may be halved, they still could not converge to stop-and-copy with the 300 ms downtime.
> 
> --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> 
> 1. booting : begin to migrate when the VM is booting
> 
> Iteration 1, duration: 6997 ms, transferred pages: 266450 (n: 57269, d: 209181), new dirty pages: 56414, remaining dirty pages: 56414
> Iteration 2, duration: 6497 ms, transferred pages: 54008 (n: 52701, d: 1307), new dirty pages: 48053, remaining dirty pages: 50459
> Iteration 3, duration: 5800 ms, transferred pages: 48232 (n: 47444, d: 788), new dirty pages: 9129, remaining dirty pages: 11356
> Iteration 4, duration: 1100 ms, transferred pages: 9091 (n: 8998, d: 93), new dirty pages: 165, remaining dirty pages: 2430
> Iteration 5, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2430
> 
> (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into the stop-copy phase now. "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> 
> 2. idle
> 
> Iteration 1, duration: 14496 ms, transferred pages: 266450 (n: 118980, d: 147470), new dirty pages: 17398, remaining dirty pages: 17398
> Iteration 2, duration: 1896 ms, transferred pages: 14953 (n: 14854, d: 99), new dirty pages: 1849, remaining dirty pages: 4294
> Iteration 3, duration: 300 ms, transferred pages: 2454 (n: 2454, d: 0), new dirty pages: 9, remaining dirty pages: 1849
> Iteration 4, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 1849
> 
> 3. kernel compilation (can not converge)
> 
> Iteration 1, duration: 20700 ms, transferred pages: 266450 (n: 169778, d: 96672), new dirty pages: 40067, remaining dirty pages: 40067
> Iteration 2, duration: 4696 ms, transferred pages: 38401 (n: 37787, d: 614), new dirty pages: 8852, remaining dirty pages: 10518
> Iteration 3, duration: 1000 ms, transferred pages: 8642 (n: 8180, d: 462), new dirty pages: 6331, remaining dirty pages: 8207
> Iteration 4, duration: 700 ms, transferred pages: 6110 (n: 5726, d: 384), new dirty pages: 5242, remaining dirty pages: 7339
> Iteration 5, duration: 600 ms, transferred pages: 5007 (n: 4908, d: 99), new dirty pages: 4868, remaining dirty pages: 7200
> Iteration 6, duration: 600 ms, transferred pages: 5226 (n: 4908, d: 318), new dirty pages: 6142, remaining dirty pages: 8116
> Iteration 7, duration: 700 ms, transferred pages: 5985 (n: 5726, d: 259), new dirty pages: 5902, remaining dirty pages: 8033
> Iteration 8, duration: 701 ms, transferred pages: 5893 (n: 5726, d: 167), new dirty pages: 7502, remaining dirty pages: 9642
> Iteration 9, duration: 900 ms, transferred pages: 7623 (n: 7362, d: 261), new dirty pages: 6408, remaining dirty pages: 8427
> Iteration 10, duration: 700 ms, transferred pages: 6008 (n: 5726, d: 282), new dirty pages: 8312, remaining dirty pages: 10731
> Iteration 11, duration: 1000 ms, transferred pages: 8353 (n: 8180, d: 173), new dirty pages: 6874, remaining dirty pages: 9252
> Iteration 12, duration: 899 ms, transferred pages: 7477 (n: 7362, d: 115), new dirty pages: 5573, remaining dirty pages: 7348
> Iteration 13, duration: 601 ms, transferred pages: 5099 (n: 4908, d: 191), new dirty pages: 7671, remaining dirty pages: 9920
> Iteration 14, duration: 900 ms, transferred pages: 7586 (n: 7362, d: 224), new dirty pages: 7359, remaining dirty pages: 9693
> Iteration 15, duration: 900 ms, transferred pages: 7682 (n: 7362, d: 320), new dirty pages: 7371, remaining dirty pages: 9382
> 
> 4. cpu2006.zeusmp (can not converge)
> 
> Iteration 1, duration: 21603 ms, transferred pages: 266450 (n: 176660, d: 89790), new dirty pages: 145625, remaining dirty pages: 145625
> Iteration 2, duration: 8696 ms, transferred pages: 144389 (n: 70862, d: 73527), new dirty pages: 125124, remaining dirty pages: 126360
> Iteration 3, duration: 6301 ms, transferred pages: 124057 (n: 51379, d: 72678), new dirty pages: 122528, remaining dirty pages: 124831
> Iteration 4, duration: 6400 ms, transferred pages: 124330 (n: 52196, d: 72134), new dirty pages: 124267, remaining dirty pages: 124768
> Iteration 5, duration: 6703 ms, transferred pages: 124034 (n: 54656, d: 69378), new dirty pages: 124151, remaining dirty pages: 124885
> Iteration 6, duration: 6703 ms, transferred pages: 124357 (n: 54658, d: 69699), new dirty pages: 124106, remaining dirty pages: 124634
> Iteration 7, duration: 6602 ms, transferred pages: 124568 (n: 53838, d: 70730), new dirty pages: 133828, remaining dirty pages: 133894
> Iteration 8, duration: 7600 ms, transferred pages: 133030 (n: 62021, d: 71009), new dirty pages: 126612, remaining dirty pages: 127476
> Iteration 9, duration: 7299 ms, transferred pages: 126511 (n: 59569, d: 66942), new dirty pages: 122727, remaining dirty pages: 123692
> Iteration 10, duration: 6609 ms, transferred pages: 123692 (n: 54539, d: 69153), new dirty pages: 122727, remaining dirty pages: 122727
> Iteration 11, duration: 6995 ms, transferred pages: 120347 (n: 56423, d: 63924), new dirty pages: 121430, remaining dirty pages: 123810
> Iteration 12, duration: 6703 ms, transferred pages: 123040 (n: 54657, d: 68383), new dirty pages: 122043, remaining dirty pages: 122813
> Iteration 13, duration: 7006 ms, transferred pages: 122353 (n: 57121, d: 65232), new dirty pages: 133869, remaining dirty pages: 134329
> Iteration 14, duration: 8209 ms, transferred pages: 132325 (n: 66932, d: 65393), new dirty pages: 126914, remaining dirty pages: 128918
> Iteration 15, duration: 7802 ms, transferred pages: 126931 (n: 63671, d: 63260), new dirty pages: 122351, remaining dirty pages: 124338
> 
> 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> 
> Iteration 1, duration: 30697 ms, transferred pages: 266450 (n: 251215, d: 15235), new dirty pages: 30628, remaining dirty pages: 30628
> Iteration 2, duration: 3496 ms, transferred pages: 28859 (n: 28513, d: 346), new dirty pages: 5805, remaining dirty pages: 7574
> Iteration 3, duration: 701 ms, transferred pages: 5746 (n: 5726, d: 20), new dirty pages: 3433, remaining dirty pages: 5261
> Iteration 4, duration: 400 ms, transferred pages: 3281 (n: 3272, d: 9), new dirty pages: 1539, remaining dirty pages: 3519
> Iteration 5, duration: 199 ms, transferred pages: 1653 (n: 1636, d: 17), new dirty pages: 301, remaining dirty pages: 2167
> Iteration 6, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2167
> 
> --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> 
> 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> 
> Iteration 1, duration: 42486 ms, transferred pages: 1568087 (n: 1216079, d: 352008), new dirty pages: 571940, remaining dirty pages: 581023
> Iteration 2, duration: 19774 ms, transferred pages: 571700 (n: 567416, d: 4284), new dirty pages: 331690, remaining dirty pages: 341013
> Iteration 3, duration: 11589 ms, transferred pages: 332187 (n: 332095, d: 92), new dirty pages: 222725, remaining dirty pages: 231551
> Iteration 4, duration: 7790 ms, transferred pages: 223571 (n: 223499, d: 72), new dirty pages: 157658, remaining dirty pages: 165638
> Iteration 5, duration: 5518 ms, transferred pages: 158056 (n: 157998, d: 58), new dirty pages: 128130, remaining dirty pages: 135712
> Iteration 6, duration: 4442 ms, transferred pages: 127764 (n: 127701, d: 63), new dirty pages: 104839, remaining dirty pages: 112787
> Iteration 7, duration: 3649 ms, transferred pages: 104581 (n: 104523, d: 58), new dirty pages: 100736, remaining dirty pages: 108942
> Iteration 8, duration: 3532 ms, transferred pages: 101379 (n: 101315, d: 64), new dirty pages: 87869, remaining dirty pages: 95432
> Iteration 9, duration: 3030 ms, transferred pages: 86841 (n: 86786, d: 55), new dirty pages: 77505, remaining dirty pages: 86096
> Iteration 10, duration: 2709 ms, transferred pages: 77875 (n: 77814, d: 61), new dirty pages: 77197, remaining dirty pages: 85418
> Iteration 11, duration: 2696 ms, transferred pages: 77107 (n: 77044, d: 63), new dirty pages: 65010, remaining dirty pages: 73321
> Iteration 12, duration: 2308 ms, transferred pages: 66540 (n: 66484, d: 56), new dirty pages: 64388, remaining dirty pages: 71169
> Iteration 13, duration: 2198 ms, transferred pages: 62953 (n: 62897, d: 56), new dirty pages: 62773, remaining dirty pages: 70989
> Iteration 14, duration: 2214 ms, transferred pages: 63466 (n: 63411, d: 55), new dirty pages: 67538, remaining dirty pages: 75061
> Iteration 15, duration: 2329 ms, transferred pages: 66924 (n: 66875, d: 49), new dirty pages: 63580, remaining dirty pages: 71717
> Iteration 16, duration: 2252 ms, transferred pages: 64554 (n: 64539, d: 15), new dirty pages: 63094, remaining dirty pages: 70257
> Iteration 17, duration: 2188 ms, transferred pages: 62697 (n: 62641, d: 56), new dirty pages: 63016, remaining dirty pages: 70576
> Iteration 18, duration: 2171 ms, transferred pages: 62377 (n: 62322, d: 55), new dirty pages: 56764, remaining dirty pages: 64963
> Iteration 19, duration: 2003 ms, transferred pages: 57382 (n: 57324, d: 58), new dirty pages: 65307, remaining dirty pages: 72888
> Iteration 20, duration: 2240 ms, transferred pages: 64426 (n: 64364, d: 62), new dirty pages: 61585, remaining dirty pages: 70047
> 
> -- 
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-10-14 11:15 ` Dr. David Alan Gilbert
@ 2016-11-03  8:25 ` Chunguang Li
  2016-11-03  9:59   ` Li, Liang Z
                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-03 8:25 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

> -----Original Message-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent: Friday, October 14, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> 
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > 
> > > -----Original Message-----
> > > From: "Amit Shah" <amit.shah@redhat.com>
> > > Sent: Friday, September 30, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > 
> > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > 
> > > > > -----Original Message-----
> > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > Sent: Monday, September 26, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > 
> > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > Hi all!
> > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code. I figured out that every now and then during migration, the dirty bitmap is grabbed from kernel space through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to the resending of some NON-dirty pages.
> > > > > >
> > > > > > Take the first iteration of precopy for instance, during which all the pages will be sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that haven't been sent yet are written, the kernel marks them as dirty. I don't think this is correct, because these pages will be sent during this or the next iteration with the same content (if they are not written again after they are sent). It only makes sense to mark the pages that have already been sent during one iteration as dirty when they are written.
> > > > > >
> > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > 
> > > > > I think you're right that this can happen; to clarify I think the case you're talking about is:
> > > > > 
> > > > >   Iteration 1
> > > > >      sync bitmap
> > > > >      start sending pages
> > > > >      page 'n' is modified - but hasn't been sent yet
> > > > >      page 'n' gets sent
> > > > >   Iteration 2
> > > > >      sync bitmap
> > > > >      'page n is shown as modified'
> > > > >      send page 'n' again
> > > > 
> > > > Yes, this is exactly the case I am talking about.
> > > > 
> > > > > So you're right that is wasteful; I guess it's more wasteful on big VMs with slow networks where the length of each iteration is large.
> > > > 
> > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that of sending only the really dirty pages.
> > > 
> > > It makes sense, can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > > 
> > > Amit
> > 
> > I have picked up 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy. These numbers are obtained with basic precopy migration, without capabilities like xbzrle or compression, etc. The network for the migration is exclusive, with a separate network for the workloads. Both are gigabit Ethernet. I use qemu-2.5.1.
> > 
> > Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and the default downtime (300 ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> > 
> > A page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this would not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others only part of the pages are sent. So I think the "not-really-dirty" pages should be produced mainly during the 1st iteration, and maybe very few during the other iterations.
> > 
> > If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 are halved, which means the time spent on Iteration 3 is halved, then Iteration 4, 5...
> 
> Yes; these numbers don't show how many of them are false dirty though.
> 
> One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false-dirty that you're describing.
> 
> However, if the page is being written a few times, so that it would also have been written after the network write, then it isn't a false-dirty.
> 
> You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> 
> Dave

Hi, I have made some new progress now.

To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied into the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.

This time, I use 15 workloads to get the statistics. They are:

1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates; thus most of them could not converge to stop-and-copy at the default migration speed (32MB/s).
2. kernel compilation
3. idle VM
4. Apache web server which serves static content
(the above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s)
5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes to the cache at a constant speed during migration. This time, the migration speed is not limited, and is up to the capability of 1Gbps Ethernet.

Summarize the results first (the precise numbers are below):

1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages from iteration 2 on (and the big proportion lasts through the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big from iteration 2 on, the room for optimization is small.
3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not during the other iterations.
4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages from iteration 2 on, so the room for optimization is small for them too.

Now I want to say a little more about the reasons why false dirty pages are produced. The first reason is what we discussed before---the mechanism used to track dirty pages. Then I came up with another reason. Here is the situation: a write operation to a memory page happens, but it doesn't change any content of the page. So it is "write but not dirty", and the kernel still marks the page as dirty. One guy in our lab has run some experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not sure yet why "write but not dirty" happens; it just does. So these two reasons together produce the false dirty pages.

To optimize, I compute and store the SHA1 hash before transferring each page. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hash is the same, it's a false dirty page and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
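[Editorial note: the hash-based skip described above can be sketched roughly as follows. This is a minimal Python model, not the actual patch; the per-page digest table and the `send` callback are illustrative assumptions:]

```python
import hashlib

# One stored SHA1 digest (20 bytes) per 4096-byte page: the extra memory
# is 20/4096 (< 1/200) of guest RAM, as estimated in the thread.
page_hashes = {}

def maybe_send_page(pfn, data, send):
    """Transfer page `pfn` only if its content changed since the last
    transfer; return True if it was sent, False if skipped."""
    digest = hashlib.sha1(data).digest()
    if page_hashes.get(pfn) == digest:
        return False               # false dirty: same content, skip resend
    page_hashes[pfn] = digest      # remember content for the next iteration
    send(pfn, data)
    return True
```

A page flagged dirty in the bitmap whose bytes are unchanged hashes to the same digest and is skipped; a genuine write changes the digest and the page is resent.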
The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory equal to 20/4096 (<1/200) of the whole VM memory, which is relatively small. As far as I know, SHA1 hashing is widely used for deduplication in backup systems. It has been shown that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash; that is, if the hashes of two chunks are the same, the content is taken to be the same. So I think the SHA1 hash can replace byte-by-byte comparison in the VM memory scenario as well.

Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads with big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either can not converge to stop-and-copy or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time. For the reason discussed above, the other workloads do not get notable improvements from the optimization. So below, I only show the exact numbers after optimization for the 4 workloads with remarkable improvements.

Any comments or suggestions?

Below is the experiments data:
("dup" means zero pages; they take very little migration time and network resources, so they are never counted as dirty pages in my numbers; "rd" means really dirty pages; "fd" means false dirty pages; the numbers are page counts.)

------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------

1.
memcached
----- original pre-copy (can not converge): -----
Iteration 1, duration: 42111 ms, transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0), new dirty pages: 499015, remaining dirty pages: 507397
Iteration 2, duration: 17208 ms, transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284), new dirty pages: 261237, remaining dirty pages: 269688
Iteration 3, duration: 9134 ms, transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958), new dirty pages: 170281, remaining dirty pages: 177592
Iteration 4, duration: 5920 ms, transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392), new dirty pages: 121154, remaining dirty pages: 128780
Iteration 5, duration: 4239 ms, transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350), new dirty pages: 100976, remaining dirty pages: 108205
Iteration 6, duration: 3495 ms, transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242), new dirty pages: 74547, remaining dirty pages: 82399
Iteration 7, duration: 2583 ms, transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088), new dirty pages: 58209, remaining dirty pages: 66448
Iteration 8, duration: 2039 ms, transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568), new dirty pages: 43511, remaining dirty pages: 51425
Iteration 9, duration: 1513 ms, transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788), new dirty pages: 43722, remaining dirty pages: 51663
Iteration 10, duration: 1521 ms, transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151), new dirty pages: 35347, remaining dirty pages: 43334
Iteration 11, duration: 1230 ms, transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994), new dirty pages: 28851, remaining dirty pages: 36898
Iteration 12, duration: 1031 ms, transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426), new dirty pages: 27062, remaining dirty pages: 34309
Iteration 13, duration: 917 ms, transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180), new dirty pages: 22723, remaining dirty pages: 30647
Iteration 14, duration: 762 ms, transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492), new dirty pages: 18208, remaining dirty pages: 26953
Iteration 15, duration: 650 ms, transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693), new dirty pages: 16085, remaining dirty pages: 24402
Iteration 16, duration: 554 ms, transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363), new dirty pages: 14766, remaining dirty pages: 23222
Iteration 17, duration: 538 ms, transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655), new dirty pages: 13381, remaining dirty pages: 21169
Iteration 18, duration: 487 ms, transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271), new dirty pages: 13325, remaining dirty pages: 20405
Iteration 19, duration: 428 ms, transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744), new dirty pages: 10274, remaining dirty pages: 18447
Iteration 20, duration: 377 ms, transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469), new dirty pages: 9708, remaining dirty pages: 17268
Iteration 21, duration: 320 ms, transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433), new dirty pages: 8015, remaining dirty pages: 16061
Iteration 22, duration: 268 ms, transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417), new dirty pages: 7227, remaining dirty pages: 15667
Iteration 23, duration: 269 ms, transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347), new dirty pages: 7591, remaining dirty pages: 15445
Iteration 24, duration: 271 ms, transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184), new dirty pages: 15126, remaining dirty pages: 22822
Iteration 25, duration: 549 ms, transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213), new dirty pages: 14559, remaining dirty pages: 21563
Iteration 26, duration: 499 ms, transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518), new dirty pages: 11947, remaining dirty pages: 19229
Iteration 27, duration: 376 ms, transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248), new dirty pages: 8561, remaining dirty pages: 16967
Iteration 28, duration: 324 ms, transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027), new dirty pages: 8655, remaining dirty pages: 16272
Iteration 29, duration: 274 ms, transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725), new dirty pages: 6300, remaining dirty pages: 14759
Iteration 30, duration: 218 ms, transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099), new dirty pages: 5143, remaining dirty pages: 13562
----- after optimization: -----
Iteration 1, duration: 40664 ms, transferred pages: 1569037 (dup: 405940, rd: 1163097), new dirty pages: 506846, remaining dirty pages: 514979
Iteration 2, duration: 8032 ms, transferred pages: 161130 (dup: 4007, rd: 157123), new dirty pages: 153479, remaining dirty pages: 153479
Iteration 3, duration: 2620 ms, transferred pages: 65260 (dup: 20, rd: 65240), new dirty pages: 64014, remaining dirty pages: 67100
Iteration 4, duration: 1160 ms, transferred pages: 30227 (dup: 60, rd: 30167), new dirty pages: 34031, remaining dirty pages: 41414
Iteration 5, duration: 648 ms, transferred pages: 18700 (dup: 56, rd: 18644), new dirty pages: 18375, remaining dirty pages: 25536
Iteration 6, duration: 389 ms, transferred pages: 11399 (dup: 55, rd: 11344), new dirty pages: 12536, remaining dirty pages: 17516
Iteration 7, duration: 292 ms, transferred pages: 8197 (dup: 0, rd: 8197), new dirty pages: 8387, remaining dirty pages: 16802
Iteration 8, duration: 171 ms, transferred pages: 4931 (dup: 39, rd: 4892), new dirty pages: 6182, remaining dirty pages: 14060
Iteration 9, duration: 163 ms, transferred pages: 4355 (dup: 16, rd: 4339), new dirty pages: 5530, remaining dirty pages: 11973
Iteration 10, duration: 104 ms, transferred pages: 3266 (dup: 0, rd: 3266), new dirty pages: 2893, remaining dirty pages: 11014
Iteration 11, duration: 52 ms, transferred pages: 1153 (dup: 0, rd: 1153), new dirty pages: 1586, remaining dirty pages: 10516
Iteration 12, duration: 52 ms, transferred pages: 1921 (dup: 39, rd: 1882), new dirty pages: 1619, remaining dirty pages: 8842
Iteration 13, duration: 62 ms, transferred pages: 1537 (dup: 0, rd: 1537), new dirty pages: 2052, remaining dirty pages: 8871
Iteration 14, duration: 58 ms, transferred pages: 1665 (dup: 0, rd: 1665), new dirty pages: 1947, remaining dirty pages: 7989
Iteration 15, duration: 2 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 7989
total time: 54693 milliseconds

2. cpu2006.zeusmp
----- original pre-copy (can not converge): -----
Iteration 1, duration: 21112 ms, transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0), new dirty pages: 127866, remaining dirty pages: 127866
Iteration 2, duration: 6192 ms, transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511), new dirty pages: 131655, remaining dirty pages: 133859
Iteration 3, duration: 6699 ms, transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319), new dirty pages: 121027, remaining dirty pages: 122949
Iteration 4, duration: 5999 ms, transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688), new dirty pages: 122759, remaining dirty pages: 123196
Iteration 5, duration: 5804 ms, transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265), new dirty pages: 123697, remaining dirty pages: 124176
Iteration 6, duration: 5698 ms, transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437), new dirty pages: 121838, remaining dirty pages: 122306
Iteration 7, duration: 5515 ms, transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760), new dirty pages: 122382, remaining dirty pages: 122382
Iteration 8, duration: 6086 ms, transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004), new dirty pages: 121587, remaining dirty pages: 123144
Iteration 9, duration: 5899 ms, transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913), new dirty pages: 120391, remaining dirty pages: 122571
Iteration 10, duration: 5801 ms, transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563), new dirty pages: 122302, remaining dirty pages: 123448
Iteration 11, duration: 5909 ms, transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035), new dirty pages: 123258, remaining dirty pages: 123258
Iteration 12, duration: 6293 ms, transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641), new dirty pages: 123623, remaining dirty pages: 125670
Iteration 13, duration: 6398 ms, transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062), new dirty pages: 122355, remaining dirty pages: 123128
Iteration 14, duration: 6301 ms, transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909), new dirty pages: 120980, remaining dirty pages: 122215
Iteration 15, duration: 6304 ms, transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653), new dirty pages: 121628, remaining dirty pages: 122454
Iteration 16, duration: 6398 ms, transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826), new dirty pages: 122246, remaining dirty pages: 122536
Iteration 17, duration: 6201 ms, transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649), new dirty pages: 121460, remaining dirty pages: 122448
Iteration 18, duration: 6401 ms, transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939), new dirty pages: 123518, remaining dirty pages: 123694
Iteration 19, duration: 7003 ms, transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794), new dirty pages: 120568, remaining dirty pages: 122389
Iteration 20, duration: 6400 ms, transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901), new dirty pages: 121229, remaining dirty pages: 122196
Iteration 21, duration: 6703 ms, transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786), new dirty pages: 123284, remaining dirty pages: 125585
Iteration 22, duration: 6902 ms, transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282), new dirty pages: 122057, remaining dirty pages: 123758
Iteration 23, duration: 6800 ms, transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837), new dirty pages: 120916, remaining dirty pages: 122664
Iteration 24, duration: 7202 ms, transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658), new dirty pages: 122715, remaining dirty pages: 123428
Iteration 25, duration: 7202 ms, transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695), new dirty pages: 123180, remaining dirty pages: 123689
Iteration 26, duration: 7404 ms, transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939), new dirty pages: 122149, remaining dirty pages: 122746
Iteration 27, duration: 7205 ms, transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419), new dirty pages: 120299, remaining dirty pages: 122618
Iteration 28, duration: 7100 ms, transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541), new dirty pages: 122984, remaining dirty pages: 124528
Iteration 29, duration: 7904 ms, transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865), new dirty pages: 127080, remaining dirty pages: 127548
Iteration 30, duration: 7906 ms, transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073), new dirty pages: 125028, remaining dirty pages: 125058
----- after optimization: -----
Iteration 1, duration: 21601 ms, transferred pages: 266450 (dup: 89731, rd: 176719), new dirty pages: 139843, remaining dirty pages: 139843
Iteration 2, duration: 1747 ms, transferred pages: 92077 (dup: 78364, rd: 13713), new dirty pages: 90945, remaining dirty pages: 90945
Iteration 3, duration: 1592 ms, transferred pages: 62253 (dup: 49435, rd: 12818), new dirty pages: 76929, remaining dirty pages: 76929
Iteration 4, duration: 992 ms, transferred pages: 44837 (dup: 37886, rd: 6951), new dirty pages: 71331, remaining dirty pages: 72916
Iteration 5, duration: 998 ms, transferred pages: 55229 (dup: 47150, rd: 8079), new dirty pages: 21703, remaining dirty pages: 23302
Iteration 6, duration: 211 ms, transferred pages: 20337 (dup: 18516, rd: 1821), new dirty pages: 14500, remaining dirty pages: 14500
Iteration 7, duration: 31 ms, transferred pages: 12933 (dup: 12627, rd: 306), new dirty pages: 1520, remaining dirty pages: 1520
Iteration 8, duration: 30 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 4, remaining dirty pages: 1524
total time: 27225 milliseconds

3. cpu2006.bzip2
----- original pre-copy: -----
Iteration 1, duration: 18306 ms, transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0), new dirty pages: 106299, remaining dirty pages: 106299
Iteration 2, duration: 10694 ms, transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525), new dirty pages: 34394, remaining dirty pages: 36082
Iteration 3, duration: 2998 ms, transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264), new dirty pages: 6419, remaining dirty pages: 8059
Iteration 4, duration: 699 ms, transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143), new dirty pages: 1226, remaining dirty pages: 3537
Iteration 5, duration: 200 ms, transferred pages: 1636 (dup: 0, rd: 1194, fd: 442), new dirty pages: 478, remaining dirty pages: 2379
Iteration 6, duration: 1 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2379
----- after optimization: -----
Iteration 1, duration: 13995 ms, transferred pages: 266314 (dup: 152118, rd: 114196), new dirty pages: 97009, remaining dirty pages: 97145
Iteration 2, duration: 1215 ms, transferred pages: 33400 (dup: 26745, rd: 6655), new dirty pages: 12866, remaining dirty pages: 14017
Iteration 3, duration: 701 ms, transferred pages: 5774 (dup: 48, rd: 5726), new dirty pages: 6342, remaining dirty pages: 8761
Iteration 4, duration: 500 ms, transferred pages: 4111 (dup: 21, rd: 4090), new dirty pages: 4311, remaining dirty pages: 6485
Iteration 5, duration: 400 ms, transferred pages: 3273 (dup: 1, rd: 3272), new dirty pages: 3034, remaining dirty pages: 5431
Iteration 6, duration: 301 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2094, remaining dirty pages: 4472
Iteration 7, duration: 299 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2066, remaining dirty pages: 4082
Iteration 8, duration: 202 ms, transferred pages: 1636 (dup: 0, rd: 1636), new dirty pages: 2881, remaining dirty pages: 4648
Iteration 9, duration: 300 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 4775, remaining dirty pages: 6778
Iteration 10, duration: 400 ms, transferred pages: 3281 (dup: 9, rd: 3272), new dirty pages: 3757, remaining dirty pages: 5576
Iteration 11, duration: 401 ms, transferred pages: 3279 (dup: 7, rd: 3272), new dirty pages: 6980, remaining dirty pages: 8906
Iteration 12, duration: 500 ms, transferred pages: 7118 (dup: 3035, rd: 4083), new dirty pages: 10774, remaining dirty pages: 11922
Iteration 13, duration: 116 ms, transferred pages: 11706 (dup: 10152, rd: 1554), new dirty pages: 1326, remaining dirty pages: 1326
Iteration 14, duration: 117 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 1326
total time: 19479 milliseconds

4. cpu2006.mcf
----- original pre-copy: -----
Iteration 1, duration: 31711 ms, transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0), new dirty pages: 244403, remaining dirty pages: 244403
Iteration 2, duration: 29603 ms, transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897), new dirty pages: 227335, remaining dirty pages: 229463
Iteration 3, duration: 27806 ms, transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723), new dirty pages: 195593, remaining dirty pages: 197483
Iteration 4, duration: 23907 ms, transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664), new dirty pages: 215066, remaining dirty pages: 217006
Iteration 5, duration: 26305 ms, transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052), new dirty pages: 111098, remaining dirty pages: 112815
Iteration 6, duration: 13502 ms, transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637), new dirty pages: 161054, remaining dirty pages: 163417
Iteration 7, duration: 19705 ms, transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328), new dirty pages: 220562, remaining dirty pages: 222713
Iteration 8, duration: 27003 ms, transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139), new dirty pages: 219787, remaining dirty pages: 221619
Iteration 9, duration: 26802 ms, transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576), new dirty pages: 207959, remaining dirty pages: 210330
Iteration 10, duration: 25411 ms, transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930), new dirty pages: 144442, remaining dirty pages: 146856
Iteration 11, duration: 17714 ms, transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372), new dirty pages: 205127, remaining dirty pages: 207179
Iteration 12, duration: 25112 ms, transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121), new dirty pages: 167319, remaining dirty pages: 169052
Iteration 13, duration: 20411 ms, transferred pages: 166886 (dup: 14, rd: 21960, fd: 144912), new dirty pages: 221592, remaining dirty pages: 223758
Iteration 14, duration: 27126 ms, transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310), new dirty pages: 233630, remaining dirty pages: 235588
Iteration 15, duration: 28517 ms, transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602), new dirty pages: 224282, remaining dirty pages: 226549
Iteration 16, duration: 27422 ms, transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359), new dirty pages: 209815, remaining dirty pages: 212177
Iteration 17, duration: 25723 ms, transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821), new dirty pages: 220297, remaining dirty pages: 222214
Iteration 18, duration: 26920 ms, transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914), new dirty pages: 192015, remaining dirty pages: 194173
Iteration 19, duration: 23520 ms, transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090), new dirty pages: 96450, remaining dirty pages: 98384
Iteration 20, duration: 11805 ms, transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100), new dirty pages: 6978, remaining dirty pages: 8824
Iteration 21, duration: 799 ms, transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742), new dirty pages: 138, remaining dirty pages: 2417
Iteration 22, duration: 1 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2417
----- after optimization: -----
Iteration 1, duration: 31711 ms, transferred pages: 266450 (dup: 6831, rd: 259619), new dirty pages: 240209, remaining dirty pages: 240209
Iteration 2, duration: 6250 ms, transferred pages: 51244 (dup: 211, rd: 51033), new dirty pages: 226651, remaining dirty pages: 228571
Iteration 3, duration: 4395 ms, transferred pages: 36008 (dup: 80, rd: 35928), new dirty pages: 110719, remaining dirty pages: 111478
Iteration 4, duration: 3390 ms, transferred pages: 28068 (dup: 28, rd: 28040), new dirty pages: 185172, remaining dirty pages: 185172
Iteration 5, duration: 2986 ms, transferred pages: 23780 (dup: 45, rd: 23735), new dirty pages: 64357, remaining dirty pages: 66305
Iteration 6, duration: 2727 ms, transferred pages: 22800 (dup: 12, rd: 22788), new dirty pages: 61675, remaining dirty pages: 61675
Iteration 7, duration: 2372 ms, transferred pages: 18943 (dup: 13, rd: 18930), new dirty pages: 55144, remaining dirty pages: 55265
Iteration 8, duration: 2100 ms, transferred pages: 17189 (dup: 11, rd: 17178), new dirty pages: 55244, remaining dirty pages: 55668
Iteration 9, duration: 2003 ms, transferred pages: 16371 (dup: 11, rd: 16360), new dirty pages: 107058, remaining dirty pages: 108014
Iteration 10, duration: 2132 ms, transferred pages: 17825 (dup: 24, rd: 17801), new dirty pages: 126214, remaining dirty pages: 126214
Iteration 11, duration: 2229 ms, transferred pages: 18156 (dup: 22, rd: 18134), new dirty pages: 65725, remaining dirty pages: 65725
Iteration 12, duration: 2315 ms, transferred pages: 18651 (dup: 21, rd: 18630), new dirty pages: 52575, remaining dirty pages: 53903
Iteration 13, duration: 2147 ms, transferred pages: 17435 (dup: 16, rd: 17419), new dirty pages: 46652, remaining dirty pages: 47260
Iteration 14, duration: 2000 ms, transferred pages: 16371 (dup: 11, rd: 16360), new dirty pages: 42721, remaining dirty pages: 43266
Iteration 15, duration: 1901 ms, transferred pages: 15552 (dup: 10, rd: 15542), new dirty pages: 38593, remaining dirty pages: 40792
Iteration 16, duration: 1801 ms, transferred pages: 14735 (dup: 11, rd: 14724), new dirty pages: 54252, remaining dirty pages: 55639
Iteration 17, duration: 1708 ms, transferred pages: 13860 (dup: 2, rd: 13858), new dirty pages: 72379, remaining dirty pages: 74170
Iteration 18, duration: 1923 ms, transferred pages: 15442 (dup: 12, rd: 15430), new dirty pages: 101911, remaining dirty pages: 103547
Iteration 19, duration: 2311 ms, transferred pages: 18823 (dup: 9, rd: 18814), new dirty pages: 80534, remaining dirty pages: 82521
Iteration 20, duration: 2081 ms, transferred pages: 17156 (dup: 34, rd: 17122), new dirty pages: 36054, remaining dirty pages: 36054
Iteration 21, duration: 1665 ms, transferred pages: 13777 (dup: 10, rd: 13767), new dirty pages: 29624, remaining dirty pages: 29624
Iteration 22, duration: 1657 ms, transferred pages: 13290 (dup: 7, rd: 13283), new dirty pages: 25949, remaining dirty pages: 28265
Iteration 23, duration: 1599 ms, transferred pages: 13088 (dup: 0, rd: 13088), new dirty pages: 22356, remaining dirty pages: 24813
Iteration 24, duration: 1500 ms, transferred pages: 12280 (dup: 10, rd: 12270), new dirty pages: 21181, remaining dirty pages: 22608
Iteration 25, duration: 1400 ms, transferred pages: 11457 (dup: 5, rd: 11452), new dirty pages: 18657, remaining dirty pages: 20311
Iteration 26, duration: 1200 ms, transferred pages: 9822 (dup: 6, rd: 9816), new dirty pages: 15690, remaining dirty pages: 17294
Iteration 27, duration: 1201 ms, transferred pages: 9822 (dup: 6, rd: 9816), new dirty pages: 14810, remaining dirty pages: 15936
Iteration 28, duration: 1000 ms, transferred pages: 8183 (dup: 3, rd: 8180), new dirty pages: 15387, remaining dirty pages: 16423
Iteration 29, duration: 900 ms, transferred pages: 7372 (dup: 10, rd: 7362), new dirty pages: 13303, remaining dirty pages: 15292
Iteration 30, duration: 1000 ms, transferred pages: 8181 (dup: 1, rd: 8180), new dirty pages: 17879, remaining dirty pages: 18457
Iteration 31, duration: 951 ms, transferred pages: 8140 (dup: 9, rd: 8131), new dirty pages: 21738, remaining dirty pages: 23304
Iteration 32, duration: 946 ms, transferred pages: 6946 (dup: 1, rd: 6945), new dirty pages: 15815, remaining dirty pages: 15815
Iteration 33, duration: 747 ms, transferred pages: 6192 (dup: 0, rd: 6192), new dirty pages: 6249, remaining dirty pages: 7670
Iteration 34, duration: 501 ms, transferred pages: 4090 (dup: 0, rd: 4090), new dirty pages: 6163, remaining dirty pages: 8422
Iteration 35, duration: 600 ms, transferred pages: 4910 (dup: 2, rd: 4908), new dirty pages: 3673, remaining dirty pages: 5222
Iteration 36, duration: 300 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2132, remaining dirty pages: 4337
Iteration 37, duration: 200 ms, transferred pages: 1637 (dup: 1, rd: 1636), new dirty pages: 544, remaining dirty pages: 2251
Iteration 38, duration: 0 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 2251
total time: 97919 milliseconds

------------------ The other 11 workloads without notable improvements (only the result of original pre-copy is shown) ------------------

5. idle
Iteration 1, duration: 14702 ms, transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0), new dirty pages: 14595, remaining dirty pages: 14595
Iteration 2, duration: 1592 ms, transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029), new dirty pages: 218, remaining dirty pages: 2401
Iteration 3, duration: 0 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2401

6. kernel compilation (can not converge)
Iteration 1, duration: 20607 ms, transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0), new dirty pages: 19293, remaining dirty pages: 19293
Iteration 2, duration: 2092 ms, transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954), new dirty pages: 8318, remaining dirty pages: 10435
Iteration 3, duration: 1000 ms, transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924), new dirty pages: 8736, remaining dirty pages: 10687
Iteration 4, duration: 1000 ms, transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091), new dirty pages: 7627, remaining dirty pages: 9879
Iteration 5, duration: 900 ms, transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760), new dirty pages: 7287, remaining dirty pages: 9613
Iteration 6, duration: 900 ms, transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601), new dirty pages: 8958, remaining dirty pages: 10951
Iteration 7, duration: 1099 ms, transferred pages: 9309 (dup: 311, rd: 8051, fd: 947), new dirty pages: 7189, remaining dirty pages: 8831
Iteration 8, duration: 800 ms, transferred pages: 6832 (dup: 288, rd: 5717, fd: 827), new dirty pages: 5782, remaining dirty pages: 7781
Iteration 9, duration: 701 ms, transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721), new dirty pages: 4587, remaining dirty pages: 6493
Iteration 10, duration: 500 ms, transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033), new dirty pages: 7352, remaining dirty pages: 9611
Iteration 11, duration: 900 ms, transferred pages: 7759 (dup: 397, rd: 6563, fd: 799), new dirty pages: 6686, remaining dirty pages: 8538
Iteration 12, duration: 800 ms, transferred pages: 6808 (dup: 264, rd: 6017, fd: 527), new dirty pages: 6871, remaining dirty pages: 8601
Iteration 13, duration: 800 ms, transferred pages: 6775 (dup: 231, rd: 5722, fd: 822), new dirty pages: 7540, remaining dirty pages: 9366
Iteration 14, duration: 900 ms, transferred pages: 7507 (dup: 145, rd: 5900, fd: 1462), new dirty pages: 7581, remaining dirty pages: 9440
Iteration 15, duration: 900 ms, transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151), new dirty pages: 7268, remaining dirty pages: 9078
Iteration 16, duration: 800 ms, transferred pages: 6759 (dup: 215, rd: 5763, fd: 781), new dirty pages: 6861, remaining dirty pages: 9180
Iteration 17, duration: 800 ms, transferred pages: 6838 (dup: 294, rd: 6037, fd: 507), new dirty pages: 6196, remaining dirty pages: 8538
Iteration 18, duration: 800 ms, transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639), new dirty pages: 5947, remaining dirty pages: 7633
Iteration 19, duration: 700 ms, transferred pages: 5919 (dup: 193, rd: 4853, fd: 873), new dirty pages: 5861, remaining dirty pages: 7575
Iteration 20, duration: 600 ms, transferred pages: 5284 (dup: 376, rd: 4408, fd: 500), new dirty pages: 5206, remaining dirty pages: 7497
Iteration 21, duration: 600 ms, transferred pages: 5147 (dup: 239, rd: 4308, fd: 600), new dirty pages: 5031, remaining dirty pages: 7381
Iteration 22, duration: 599 ms, transferred pages: 5064 (dup: 156, rd: 4026, fd: 882), new dirty pages: 5601, remaining dirty pages: 7918
Iteration 23, duration: 702 ms, transferred pages: 5965 (dup: 239, rd: 5028, fd: 698), new dirty pages: 6079, remaining dirty pages: 8032
Iteration 24, duration: 700 ms, transferred pages: 6175 (dup: 449, rd: 5146, fd: 580), new dirty pages: 10932, remaining dirty pages: 12789
Iteration 25, duration: 1300 ms, transferred pages: 10936 (dup: 302, rd: 6205, fd: 4429), new dirty pages: 8713, remaining dirty pages: 10566
Iteration 26, duration: 1000 ms, transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518), new dirty pages: 5119, remaining dirty pages: 7403
Iteration 27, duration: 600 ms, transferred pages: 5007 (dup: 99, rd: 4099, fd: 809), new dirty pages: 2226, remaining dirty pages: 4622
Iteration 28, duration: 300 ms, transferred pages: 2491 (dup: 37, rd: 1794, fd: 660), new dirty pages: 6746, remaining dirty pages: 8877
Iteration 29, duration: 800 ms, transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012), new dirty pages: 6070, remaining dirty pages: 8190
Iteration 30, duration: 700 ms, transferred pages: 6052 (dup: 326, rd: 5107, fd: 619), new dirty pages: 5177, remaining dirty pages: 7315

7. web server
Iteration 1, duration: 20902 ms, transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0), new dirty pages: 8528, remaining dirty pages: 8528
Iteration 2, duration: 796 ms, transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456), new dirty pages: 650, remaining dirty pages: 2706
Iteration 3, duration: 100 ms, transferred pages: 818 (dup: 0, rd: 383, fd: 435), new dirty pages: 328, remaining dirty pages: 2216
Iteration 4, duration: 0 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2216

8. cpu2006.bwaves (can not converge)
Iteration 1, duration: 31715 ms, transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0), new dirty pages: 242702, remaining dirty pages: 242702
Iteration 2, duration: 29397 ms, transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515), new dirty pages: 230889, remaining dirty pages: 233083
Iteration 3, duration: 28205 ms, transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080), new dirty pages: 226998, remaining dirty pages: 229223
Iteration 4, duration: 27805 ms, transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359), new dirty pages: 227360, remaining dirty pages: 229009
Iteration 5, duration: 27703 ms, transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456), new dirty pages: 225885, remaining dirty pages: 228108
Iteration 6, duration: 27600 ms, transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265), new dirty pages: 223555, remaining dirty pages: 225740
Iteration 7, duration: 27309 ms, transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673), new dirty pages: 231975, remaining dirty pages: 234141
Iteration 8, duration: 28403 ms, transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226), new dirty pages: 222170, remaining dirty pages: 223914
Iteration 9, duration: 27105 ms, transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690), new dirty pages: 230065, remaining dirty pages: 232170
Iteration 10, duration: 28104 ms, transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327), new dirty pages: 227590, remaining dirty pages: 229559
Iteration 11, duration: 27801 ms, transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996), new dirty pages: 228457, remaining dirty pages: 230299
Iteration 12, duration: 27916 ms, transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562), new dirty pages: 238326, remaining dirty pages: 240065

9. cpu2006.lbm (can not converge)
Iteration 1, duration: 31012 ms, transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0), new dirty pages: 108960, remaining dirty pages: 108960
Iteration 2, duration: 13095 ms, transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474), new dirty pages: 129292, remaining dirty pages: 131730
Iteration 3, duration: 15802 ms, transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384), new dirty pages: 116682, remaining dirty pages: 118724
Iteration 4, duration: 14204 ms, transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205), new dirty pages: 107246, remaining dirty pages: 109654
Iteration 5, duration: 13208 ms, transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142), new dirty pages: 105371, remaining dirty pages: 107048
Iteration 6, duration: 12804 ms, transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075), new dirty pages: 103841, remaining dirty pages: 106184
Iteration 7, duration: 12709 ms, transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674), new dirty pages: 106692, remaining dirty pages: 108985
Iteration 8, duration: 13105 ms, transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033), new dirty pages: 103132, remaining dirty pages: 104948
Iteration 9, duration: 12607 ms, transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608), new dirty pages: 102511, remaining dirty pages: 104391
Iteration 10, duration: 12514 ms, transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156), new dirty pages: 102888, remaining dirty pages: 105029

10. cpu2006.astar (can not converge)
Iteration 1, duration: 28402 ms, transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0), new dirty pages: 62078, remaining dirty pages: 62078
Iteration 2, duration: 7393 ms, transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375), new dirty pages: 48854, remaining dirty pages: 50825
Iteration 3, duration: 6001 ms, transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540), new dirty pages: 48137, remaining dirty pages: 49868
Iteration 4, duration: 5800 ms, transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055), new dirty pages: 49147, remaining dirty pages: 51571
Iteration 5, duration: 6102 ms, transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682), new dirty pages: 55606, remaining dirty pages: 57265
Iteration 6, duration: 6699 ms, transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061), new dirty pages: 9166, remaining dirty pages: 11482
Iteration 7, duration: 1200 ms, transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805), new dirty pages: 8294, remaining dirty pages: 9946
Iteration 8, duration: 1000 ms, transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002), new dirty pages: 5475, remaining dirty pages: 7227
Iteration 9, duration: 600 ms, transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438), new dirty pages: 4175, remaining dirty pages: 6494
Iteration 10, duration: 500 ms, transferred pages: 4090 (dup: 0, rd: 3856, fd: 234), new dirty pages: 4095, remaining dirty pages: 6499
Iteration 11, duration: 500 ms, transferred pages: 4090 (dup: 0, rd: 3313, fd: 777), new dirty pages: 3371, remaining dirty pages: 5780
Iteration 12, duration: 502 ms, transferred pages: 4090 (dup: 0, rd: 3823, fd: 267), new dirty pages: 7518, remaining dirty pages: 9208
Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763 Iteration 14, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 4346 , remaining dirty pages: 6019 Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983 Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585 Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598 Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775 Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964 Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703 11. 
cpu2006.xalancbmk (can not converge) Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169 Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771 Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869 Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589 Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181 Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022 Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new dirty pages: 64880 , remaining dirty pages: 67179 Iteration 8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586 Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320 Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286 12. 
cpu2006.milc (can not converge) Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860 Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050 Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723 Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513 Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008 Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416 Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577 Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422 Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720 Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747 13. 
cpu2006.cactusADM (can not converge) Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869 Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235 Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273 Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416 Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425 Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664 Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952 Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553 Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273 Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928 14. 
cpu2006.GmesFDTD (can not converge) Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802 Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064 Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783 Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063 Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576 Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618 Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672 Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147 15. 
cpu2006.wrf (can not converge) Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392 Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222 Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892 Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946 Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760 Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119 Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844 Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996 Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699 Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905 > > > So I think "booting" and "kernel compilation" should benefit a lot from this > > improvement. The reason of "kernel compilation" would benefit is that some > > iterations take around 600ms, and if they are halved into 300ms, then the precopy > > may have the chance to step into stop and copy phase. > > > > On the other hand, "idle" and "web server" would not benefit a lot, because > > most of the time are spent on the 1st iteration and little on the others. 
> > As to "zeusmp" and "memcached", although the time spent on the iterations other than the 1st may be halved, they still could not converge to stop-and-copy with the 300ms downtime.
> >
> > --------------------1 vcpu, 1 GB ram, default bandwidth (32MB/s):------------------
> >
> > 1. booting : begin to migrate when the VM is booting
> >
> > Iteration 1, duration: 6997 ms, transferred pages: 266450 (n: 57269, d: 209181), new dirty pages: 56414, remaining dirty pages: 56414
> > Iteration 2, duration: 6497 ms, transferred pages: 54008 (n: 52701, d: 1307), new dirty pages: 48053, remaining dirty pages: 50459
> > Iteration 3, duration: 5800 ms, transferred pages: 48232 (n: 47444, d: 788), new dirty pages: 9129, remaining dirty pages: 11356
> > Iteration 4, duration: 1100 ms, transferred pages: 9091 (n: 8998, d: 93), new dirty pages: 165, remaining dirty pages: 2430
> > Iteration 5, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2430
> >
> > (note: When the workload does converge, the output of the last iteration is "fake". It just indicates that the precopy steps into the stop-copy phase now. "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> >
> > 2. idle
> >
> > Iteration 1, duration: 14496 ms, transferred pages: 266450 (n: 118980, d: 147470), new dirty pages: 17398, remaining dirty pages: 17398
> > Iteration 2, duration: 1896 ms, transferred pages: 14953 (n: 14854, d: 99), new dirty pages: 1849, remaining dirty pages: 4294
> > Iteration 3, duration: 300 ms, transferred pages: 2454 (n: 2454, d: 0), new dirty pages: 9, remaining dirty pages: 1849
> > Iteration 4, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 1849
> >
> > 3. kernel compilation (can not converge)
> >
> > Iteration 1, duration: 20700 ms, transferred pages: 266450 (n: 169778, d: 96672), new dirty pages: 40067, remaining dirty pages: 40067
> > Iteration 2, duration: 4696 ms, transferred pages: 38401 (n: 37787, d: 614), new dirty pages: 8852, remaining dirty pages: 10518
> > Iteration 3, duration: 1000 ms, transferred pages: 8642 (n: 8180, d: 462), new dirty pages: 6331, remaining dirty pages: 8207
> > Iteration 4, duration: 700 ms, transferred pages: 6110 (n: 5726, d: 384), new dirty pages: 5242, remaining dirty pages: 7339
> > Iteration 5, duration: 600 ms, transferred pages: 5007 (n: 4908, d: 99), new dirty pages: 4868, remaining dirty pages: 7200
> > Iteration 6, duration: 600 ms, transferred pages: 5226 (n: 4908, d: 318), new dirty pages: 6142, remaining dirty pages: 8116
> > Iteration 7, duration: 700 ms, transferred pages: 5985 (n: 5726, d: 259), new dirty pages: 5902, remaining dirty pages: 8033
> > Iteration 8, duration: 701 ms, transferred pages: 5893 (n: 5726, d: 167), new dirty pages: 7502, remaining dirty pages: 9642
> > Iteration 9, duration: 900 ms, transferred pages: 7623 (n: 7362, d: 261), new dirty pages: 6408, remaining dirty pages: 8427
> > Iteration 10, duration: 700 ms, transferred pages: 6008 (n: 5726, d: 282), new dirty pages: 8312, remaining dirty pages: 10731
> > Iteration 11, duration: 1000 ms, transferred pages: 8353 (n: 8180, d: 173), new dirty pages: 6874, remaining dirty pages: 9252
> > Iteration 12, duration: 899 ms, transferred pages: 7477 (n: 7362, d: 115), new dirty pages: 5573, remaining dirty pages: 7348
> > Iteration 13, duration: 601 ms, transferred pages: 5099 (n: 4908, d: 191), new dirty pages: 7671, remaining dirty pages: 9920
> > Iteration 14, duration: 900 ms, transferred pages: 7586 (n: 7362, d: 224), new dirty pages: 7359, remaining dirty pages: 9693
> > Iteration 15, duration: 900 ms, transferred pages: 7682 (n: 7362, d: 320), new dirty pages: 7371, remaining dirty pages: 9382
> >
> > 4. cpu2006.zeusmp (can not converge)
> >
> > Iteration 1, duration: 21603 ms, transferred pages: 266450 (n: 176660, d: 89790), new dirty pages: 145625, remaining dirty pages: 145625
> > Iteration 2, duration: 8696 ms, transferred pages: 144389 (n: 70862, d: 73527), new dirty pages: 125124, remaining dirty pages: 126360
> > Iteration 3, duration: 6301 ms, transferred pages: 124057 (n: 51379, d: 72678), new dirty pages: 122528, remaining dirty pages: 124831
> > Iteration 4, duration: 6400 ms, transferred pages: 124330 (n: 52196, d: 72134), new dirty pages: 124267, remaining dirty pages: 124768
> > Iteration 5, duration: 6703 ms, transferred pages: 124034 (n: 54656, d: 69378), new dirty pages: 124151, remaining dirty pages: 124885
> > Iteration 6, duration: 6703 ms, transferred pages: 124357 (n: 54658, d: 69699), new dirty pages: 124106, remaining dirty pages: 124634
> > Iteration 7, duration: 6602 ms, transferred pages: 124568 (n: 53838, d: 70730), new dirty pages: 133828, remaining dirty pages: 133894
> > Iteration 8, duration: 7600 ms, transferred pages: 133030 (n: 62021, d: 71009), new dirty pages: 126612, remaining dirty pages: 127476
> > Iteration 9, duration: 7299 ms, transferred pages: 126511 (n: 59569, d: 66942), new dirty pages: 122727, remaining dirty pages: 123692
> > Iteration 10, duration: 6609 ms, transferred pages: 123692 (n: 54539, d: 69153), new dirty pages: 122727, remaining dirty pages: 122727
> > Iteration 11, duration: 6995 ms, transferred pages: 120347 (n: 56423, d: 63924), new dirty pages: 121430, remaining dirty pages: 123810
> > Iteration 12, duration: 6703 ms, transferred pages: 123040 (n: 54657, d: 68383), new dirty pages: 122043, remaining dirty pages: 122813
> > Iteration 13, duration: 7006 ms, transferred pages: 122353 (n: 57121, d: 65232), new dirty pages: 133869, remaining dirty pages: 134329
> > Iteration 14, duration: 8209 ms, transferred pages: 132325 (n: 66932, d: 65393), new dirty pages: 126914, remaining dirty pages: 128918
> > Iteration 15, duration: 7802 ms, transferred pages: 126931 (n: 63671, d: 63260), new dirty pages: 122351, remaining dirty pages: 124338
> >
> > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> >
> > Iteration 1, duration: 30697 ms, transferred pages: 266450 (n: 251215, d: 15235), new dirty pages: 30628, remaining dirty pages: 30628
> > Iteration 2, duration: 3496 ms, transferred pages: 28859 (n: 28513, d: 346), new dirty pages: 5805, remaining dirty pages: 7574
> > Iteration 3, duration: 701 ms, transferred pages: 5746 (n: 5726, d: 20), new dirty pages: 3433, remaining dirty pages: 5261
> > Iteration 4, duration: 400 ms, transferred pages: 3281 (n: 3272, d: 9), new dirty pages: 1539, remaining dirty pages: 3519
> > Iteration 5, duration: 199 ms, transferred pages: 1653 (n: 1636, d: 17), new dirty pages: 301, remaining dirty pages: 2167
> > Iteration 6, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2167
> >
> > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> >
> > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> >
> > Iteration 1, duration: 42486 ms, transferred pages: 1568087 (n: 1216079, d: 352008), new dirty pages: 571940, remaining dirty pages: 581023
> > Iteration 2, duration: 19774 ms, transferred pages: 571700 (n: 567416, d: 4284), new dirty pages: 331690, remaining dirty pages: 341013
> > Iteration 3, duration: 11589 ms, transferred pages: 332187 (n: 332095, d: 92), new dirty pages: 222725, remaining dirty pages: 231551
> > Iteration 4, duration: 7790 ms, transferred pages: 223571 (n: 223499, d: 72), new dirty pages: 157658, remaining dirty pages: 165638
> > Iteration 5, duration: 5518 ms, transferred pages: 158056 (n: 157998, d: 58), new dirty pages: 128130, remaining dirty pages: 135712
> > Iteration 6, duration: 4442 ms, transferred pages: 127764 (n: 127701, d: 63), new dirty pages: 104839, remaining dirty pages: 112787
> > Iteration 7, duration: 3649 ms, transferred pages: 104581 (n: 104523, d: 58), new dirty pages: 100736, remaining dirty pages: 108942
> > Iteration 8, duration: 3532 ms, transferred pages: 101379 (n: 101315, d: 64), new dirty pages: 87869, remaining dirty pages: 95432
> > Iteration 9, duration: 3030 ms, transferred pages: 86841 (n: 86786, d: 55), new dirty pages: 77505, remaining dirty pages: 86096
> > Iteration 10, duration: 2709 ms, transferred pages: 77875 (n: 77814, d: 61), new dirty pages: 77197, remaining dirty pages: 85418
> > Iteration 11, duration: 2696 ms, transferred pages: 77107 (n: 77044, d: 63), new dirty pages: 65010, remaining dirty pages: 73321
> > Iteration 12, duration: 2308 ms, transferred pages: 66540 (n: 66484, d: 56), new dirty pages: 64388, remaining dirty pages: 71169
> > Iteration 13, duration: 2198 ms, transferred pages: 62953 (n: 62897, d: 56), new dirty pages: 62773, remaining dirty pages: 70989
> > Iteration 14, duration: 2214 ms, transferred pages: 63466 (n: 63411, d: 55), new dirty pages: 67538, remaining dirty pages: 75061
> > Iteration 15, duration: 2329 ms, transferred pages: 66924 (n: 66875, d: 49), new dirty pages: 63580, remaining dirty pages: 71717
> > Iteration 16, duration: 2252 ms, transferred pages: 64554 (n: 64539, d: 15), new dirty pages: 63094, remaining dirty pages: 70257
> > Iteration 17, duration: 2188 ms, transferred pages: 62697 (n: 62641, d: 56), new dirty pages: 63016, remaining dirty pages: 70576
> > Iteration 18, duration: 2171 ms, transferred pages: 62377 (n: 62322, d: 55), new dirty pages: 56764, remaining dirty pages: 64963
> > Iteration 19, duration: 2003 ms, transferred pages: 57382 (n: 57324, d: 58), new dirty pages: 65307, remaining dirty pages: 72888
> > Iteration 20, duration: 2240 ms, transferred pages: 64426 (n: 64364, d: 62), new dirty pages: 61585, remaining dirty pages: 70047
> >
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China

^ permalink raw reply [flat|nested] 21+ messages in thread
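[Editor's note: the timing window discussed in this thread (a page written after the bitmap sync but before it is sent, so the next sync re-flags a page whose transferred copy was already current) can be sketched with a toy model. This is illustrative only, not QEMU code; the function name, the page-send ordering, and the write trace are invented for the example.]

```python
# Toy model of one precopy iteration. Pages are sent in order 0..N-1,
# and page i goes on the wire at "time" i. A write that lands on a page
# BEFORE that page is sent is still caught by the next bitmap sync, so
# the page gets re-sent even though the copy on the wire was current:
# that is the "false dirty" (not-really-dirty) page from the thread.

def classify_writes(num_pages, writes):
    """writes: list of (time, page). Returns (false_dirty, true_dirty)."""
    false_dirty, true_dirty = set(), set()
    for t, page in writes:
        send_time = page              # page i is transferred at time i
        if t < send_time:
            false_dirty.add(page)     # written before its send: the
                                      # transferred copy already has it
        else:
            true_dirty.add(page)      # written after its send: must re-send
    # a page written both before and after its send really is dirty
    false_dirty -= true_dirty
    return false_dirty, true_dirty

false_d, true_d = classify_writes(8, [(0, 5), (3, 1), (6, 2), (7, 7)])
print(sorted(false_d), sorted(true_d))  # [5] [1, 2, 7]
```

A kernel dirty bitmap synced at the start of the next iteration would report all four written pages as dirty; in this trace only three of them actually need re-sending.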
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03 8:25 ` Chunguang Li
@ 2016-11-03 9:59 ` Li, Liang Z
  2016-11-03 10:13 ` Li, Liang Z
  2016-11-08 11:05 ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03 9:59 UTC (permalink / raw)
To: Chunguang Li, Dr. David Alan Gilbert
Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> pages will be sent. Before that, during the migration setup, the ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When the pages "that haven't been sent" are written, the kernel space marks them as dirty. However, I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark the pages which have already been sent during one iteration as dirty when they are written.
>
> > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > >
> > > > > > I think you're right that this can happen; to clarify, I think the case you're talking about is:
> > > > > >
> > > > > > Iteration 1
> > > > > >   sync bitmap
> > > > > >   start sending pages
> > > > > >   page 'n' is modified - but hasn't been sent yet
> > > > > >   page 'n' gets sent
> > > > > > Iteration 2
> > > > > >   sync bitmap
> > > > > >   'page n is shown as modified'
> > > > > >   send page 'n' again
> > > > >
> > > > > Yes, this is exactly the case I am talking about.
> > > > >
> > > > > > So you're right that it is wasteful; I guess it's more wasteful on big VMs with slow networks, where the length of each iteration is large.
> > > > >
> > > > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that needed to send only the really dirty pages.
> > > >
> > > > It makes sense; can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > > >
> > > > Amit
> > >
> > > I have picked 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy. These numbers are obtained with basic precopy migration, without capabilities like xbzrle or compression, etc. The network for the migration is exclusive, with a separate network for the workloads. They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and default downtime (300ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this would not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others only part of the pages are sent. So I think the "not-really-dirty" pages are produced mainly during the 1st iteration, and maybe very few during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 are halved, which means the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false dirty that you're describing.
> >
> > However, if the page is being written a few times, and so would have been written after the network write, then it isn't a false dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied to the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.
>
> This time, I use 15 workloads to get the statistics. They are:
>
> 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I pick these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates; thus most of them could not converge to stop-and-copy at the default migration speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server which serves static content
>
> (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s)
>
> 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes the cache at a constant speed during migration. This time, the migration speed has no limit and is up to the capability of 1Gbps Ethernet.
>
> To summarize the results first (the precise numbers can be read below):
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages from iteration 2 on (and the big proportion lasts through the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big from iteration 2 on, the space to optimize is small.
> 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages from iteration 2 on, so the space to optimize for them is small.
>
> Now I want to talk a little more about the reasons why false dirty pages are produced. The first reason is what we have discussed before: the mechanism used to track the dirty pages. Then I come up with another reason. Here is the situation: a write operation to one memory page happens, but it doesn't change any content of the page. So it is "write but not dirty", and the kernel still marks the page as dirty. One guy in our lab has done some experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not sure why "write but not dirty" happens; it just does.
> So these two reasons contribute to the false dirty pages. To optimize, I compute and store the SHA1 hash of each page before transferring it. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hashes are the same, it's a false dirty page, and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
>
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory of 20/4096 (<1/200) of the whole VM memory, which is relatively small.
>
> As far as I know, SHA1 hashes are widely used in deduplication for backup systems. It has been shown that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash; that is, if the hashes of two chunks are the same, the content must be the same. So I think the SHA1 hash could replace byte-by-byte comparison in the VM memory scenario.
>
> Then I do the same migration experiments using the SHA1 hash. For the 4 workloads which have big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either can not converge to stop-and-copy, or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time.
>
> For the reason discussed above, the other workloads don't get notable improvements from the optimization. So below, I only show the exact numbers after optimization for the 4 workloads with remarkable improvements.
>
> Any comments or suggestions?

It seems the current XBZRLE feature can be used to solve the false-dirty issue, no?

Liang

^ permalink raw reply [flat|nested] 21+ messages in thread
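[Editor's note: a minimal sketch of the hash-and-skip idea described in the quoted mail, using Python's hashlib for SHA-1. The names `sent_hashes` and `should_send`, and the in-memory page representation, are invented for illustration; QEMU's actual page transfer path is in C and works differently.]

```python
import hashlib

PAGE_SIZE = 4096
sent_hashes = {}  # page number -> 20-byte SHA-1 digest of the last copy sent

def should_send(page_no, data):
    """Return False for a page flagged dirty whose content is unchanged
    since the last transfer (a false dirty page); otherwise record the
    new hash and report that the page must be sent."""
    h = hashlib.sha1(data).digest()
    if sent_hashes.get(page_no) == h:
        return False              # false dirty: identical content, skip
    sent_hashes[page_no] = h      # really dirty (or first transfer)
    return True

page = b"\x00" * PAGE_SIZE
print(should_send(0, page))   # True  - first transfer of page 0
print(should_send(0, page))   # False - flagged dirty, but content unchanged
print(should_send(0, b"\x01" + b"\x00" * (PAGE_SIZE - 1)))  # True
```

Per the overhead argument in the mail, the table costs 20 bytes per 4096-byte page, i.e. under 0.5% of guest RAM.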
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-03 8:25 ` Chunguang Li
  2016-11-03 9:59 ` Li, Liang Z
@ 2016-11-03 10:13 ` Li, Liang Z
  2016-11-04 3:07 ` Chunguang Li
  2016-11-08 11:05 ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 21+ messages in thread
From: Li, Liang Z @ 2016-11-03 10:13 UTC (permalink / raw)
To: Chunguang Li, Dr. David Alan Gilbert
Cc: Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> > > > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that needed to send only the really dirty pages.
> > > >
> > > > It makes sense; can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > > >
> > > > Amit
> > >
> > > I have picked 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy. These numbers are obtained with basic precopy migration, without capabilities like xbzrle or compression, etc. The network for the migration is exclusive, with a separate network for the workloads. They are both gigabit ethernet. I use qemu-2.5.1.
> > >
> > > Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and default downtime (300ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> > >
> > > One page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this would not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others only part of the pages are sent. So I think the "not-really-dirty" pages are produced mainly during the 1st iteration, and maybe very few during the other iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 are halved, which means the time spent on Iteration 3 is halved, then Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false dirty that you're describing.
> >
> > However, if the page is being written a few times, and so would have been written after the network write, then it isn't a false dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied to the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.
>
> This time, I use 15 workloads to get the statistics. They are:
>
> 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I pick these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates; thus most of them could not converge to stop-and-copy at the default migration speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server which serves static content
>
> (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s)
>
> 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes the cache at a constant speed during migration. This time, the migration speed has no limit and is up to the capability of 1Gbps Ethernet.
>
> To summarize the results first (the precise numbers can be read below):
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages from iteration 2 on (and the big proportion lasts through the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big from iteration 2 on, the space to optimize is small.
> 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages from iteration 2 on, so the space to optimize for them is small.
>
> Now I want to talk a little more about the reasons why false dirty pages are produced. The first reason is what we have discussed before: the mechanism used to track the dirty pages. Then I come up with another reason. Here is the situation: a write operation to one memory page happens, but it doesn't change any content of the page. So it is "write but not dirty", and the kernel still marks the page as dirty. One guy in our lab has done some experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not sure why "write but not dirty" happens; it just does.
>
> So these two reasons contribute to the false dirty pages. To optimize, I compute and store the SHA1 hash of each page before transferring it. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hashes are the same, it's a false dirty page, and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
>
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory of 20/4096 (<1/200) of the whole VM memory, which is relatively small.
>
> As far as I know, SHA1 hashes are widely used in deduplication for backup systems. It has been shown that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash; that is, if the hashes of two chunks are the same, the content must be the same. So I think the SHA1 hash could replace byte-by-byte comparison in the VM memory scenario.
>
> Then I do the same migration experiments using the SHA1 hash. For the 4 workloads which have big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either can not converge to stop-and-copy, or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time.
> For the reason I have talked above, the other workloads don't get notable > improvements from the > optimization. So below, I only show the exact number after optimization for > the 4 workloads with > remarkable improvements. > > Any comments or suggestions? Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better. The merit of using SHA1 is that it avoids the data copying that XBZRLE does, and it needs a smaller buffer. What is the overhead of calculating the SHA1? Is it faster than copying a page? Liang ^ permalink raw reply [flat|nested] 21+ messages in thread
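The SHA1 scheme Chunguang describes in the message above (store a per-page digest when a page is sent; on retransmission, skip the page if its digest is unchanged) can be sketched as follows. This is a minimal Python illustration, not QEMU code; the class and method names are invented for the sketch.

```python
import hashlib

PAGE_SIZE = 4096  # guest page size in bytes


class FalseDirtyFilter:
    """Per-page SHA1 cache: skip pages whose content is unchanged since
    they were last sent (the 'false dirty' pages from the thread)."""

    def __init__(self):
        self.sent_hashes = {}  # page frame number -> 20-byte SHA1 digest

    def should_send(self, pfn, page_bytes):
        digest = hashlib.sha1(page_bytes).digest()
        if self.sent_hashes.get(pfn) == digest:
            return False  # false dirty: content identical to last send
        self.sent_hashes[pfn] = digest  # page will be sent: remember hash
        return True
```

In this sketch a real transmission would happen only when `should_send` returns `True`; the 20-byte digest replaces the full-page shadow copy that a byte-by-byte comparison would require.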
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent 2016-11-03 10:13 ` Li, Liang Z @ 2016-11-04 3:07 ` Chunguang Li 2016-11-04 4:50 ` Li, Liang Z 0 siblings, 1 reply; 21+ messages in thread From: Chunguang Li @ 2016-11-04 3:07 UTC (permalink / raw) To: Li, Liang Z Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela > -----Original Messages----- > From: "Li, Liang Z" <liang.z.li@intel.com> > Sent Time: Thursday, November 3, 2016 > To: "Chunguang Li" <lichunguang@hust.edu.cn>, "Dr. David Alan Gilbert" <dgilbert@redhat.com> > Cc: "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com> > Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent > > > > > > > I think this is "very" wasteful. Assume the workload writes the pages > > dirty randomly within the guest address space, and the transfer speed is > > constant. Intuitively, I think nearly half of the dirty pages produced in > > Iteration 1 is not really dirty. This means the time of Iteration 2 is double of > > that to send only really dirty pages. > > > > > > > > > > It makes sense, can you get some perf numbers to show what kinds of > > > > > workloads get impacted the most? That would also help us to figure > > > > > out what kinds of speed improvements we can expect. > > > > > > > > > > > > > > > Amit > > > > > > > > I have picked up 6 workloads and got the following statistics numbers > > > > of every iteration (except the last stop-copy one) during precopy. > > > > These numbers are obtained with the basic precopy migration, without > > > > the capabilities like xbzrle or compression, etc. The network for the > > > > migration is exclusive, with a separate network for the workloads. > > > > They are both gigabit ethernet. 
I use qemu-2.5.1. > > > > > > > > Three (booting, idle, web server) of them converged to the stop-copy > > phase, > > > > with the given bandwidth and default downtime (300ms), while the other > > > > three (kernel compilation, zeusmp, memcached) did not. > > > > > > > > One page is "not-really-dirty", if it is written first and is sent later > > > > (and not written again after that) during one iteration. I guess this > > > > would not happen so often during the other iterations as during the 1st > > > > iteration. Because all the pages of the VM are sent to the dest node > > during > > > > the 1st iteration, while during the others, only part of the pages are sent. > > > > So I think the "not-really-dirty" pages should be produced mainly during > > > > the 1st iteration , and maybe very little during the other iterations. > > > > > > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I > > > > think the time spent on Iteration 2 would be halved. This is a chain > > reaction, > > > > because the dirty pages produced during Iteration 2 is halved, which > > incurs > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5... > > > > > > Yes; these numbers don't show how many of them are false dirty though. > > > > > > One problem is thinking about pages that have been redirtied, if the page is > > dirtied > > > after the sync but before the network write then it's the false-dirty that > > > you're describing. > > > > > > However, if the page is being written a few times, and so it would have > > been written > > > after the network write then it isn't a false-dirty. > > > > > > You might be able to figure that out with some kernel tracing of when the > > dirtying > > > happens, but it might be easier to write the fix! > > > > > > Dave > > > > Hi, I have made some new progress now. 
> > > > To tell how many false dirty pages there are exactly in each iteration, I malloc > > a > > buffer in memory as big as the size of the whole VM memory. When a page > > is > > transferred to the dest node, it is copied to the buffer; During the next > > iteration, > > if one page is transferred, it is compared to the old one in the buffer, and the > > old one will be replaced for next comparison if it is really dirty. Thus, we are > > now > > able to get the exact number of false dirty pages. > > > > This time, I use 15 workloads to get the statistic number. They are: > > > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They are all > > scientific > > computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. > > I pick > > up these 11 benchmarks because compared to others, they have bigger > > memory > > occupation and higher memory dirty rate. Thus most of them could not > > converge > > to stop-and-copy using the default migration speed (32MB/s). > > 2. kernel compilation > > 3. idle VM > > 4. Apache web server which serves static content > > > > (the above workloads are all running in VM with 1 vcpu and 1GB memory, > > and the > > migration speed is the default 32MB/s) > > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used > > as the cache. > > After filling up the 4GB cache, a client writes the cache at a constant speed > > during migration. This time, migration speed has no limit, and is up to the > > capability of 1Gbps Ethernet. > > > > Summarize the results first: (and you can read the precise number below) > > > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during > > some iterations) > > of false dirty pages out of all the dirty pages since iteration 2 (and the big > > proportion lasts during the following iterations). They are cpu2006.zeusmp, > > cpu2006.bzip2, cpu2006.mcf, and memcached. > > 2. 
2 workloads (idle, webserver) spend most of the migration time on > > iteration 1, even > > though the proportion of false dirty pages is big since iteration 2, the space > > to > > optimize is small. > > 3. 1 workload (kernel compilation) only have a big proportion during > > iteration 2, not > > in the other iterations. > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little proportion of > > false > > dirty pages since iteration 2. So the spaces to optimize for them are small. > > > > Now I want to talk a little more about the reasons why false dirty pages are > > produced. > > The first reason is what we have discussed before---the mechanism to track > > the dirty > > pages. > > And then I come up with another reason. Here is the situation: a write > > operation to one > > memory page happens, but it doesn't change any content of the page. So it's > > "write but > > not dirty", and kernel still marks it as dirty. One guy in our lab has done some > > experiments > > to figure out the proportion of "write but not dirty" operations, and he uses > > the cpu2006 > > benchmark suit. According to his results, general workloads has a little > > proportion (<10%) > > of "write but not dirty" out of all the write operations, while few workloads > > have higher > > proportion (one even as high as 50%). Now we are not sure why "write but > > not dirty" would > > happen, it just happened. > > > > So these two reasons contribute to the false dirty pages. To optimize, I > > compute and store > > the SHA1 hash before transferring each page. Next time, if one page needs > > retransmission, its > > SHA1 hash is computed again, and compared to the old hash. If the hash is > > the same, it's a > > false dirty page, and we just skip this page; Otherwise, the page is > > transferred, and the new > > hash replaces the old one for next comparison. > > The reason to use SHA1 hash but not byte-by-byte comparison is the > > memory overheads. One SHA1 > > hash is 20 bytes. 
So we need extra 20/4096 (<1/200) memory space of the > > whole VM memory, which > > is relatively small. > > As far as I know, SHA1 hash is widely used in the scenes of deduplication for > > backup systems. > > They have proven that the probability of hash collision is far smaller than disk > > hardware fault, > > so it's secure hash, that is, if the hashes of two chunks are the same, the > > content must be the > > same. So I think the SHA1 hash could replace byte-to-byte comparison in the > > VM memory scenery. > > > > Then I do the same migration experiments using the SHA1 hash. For the 4 > > workloads which have > > big proportions of false dirty pages, the improvement is remarkable. Without > > optimization, > > they either can not converge to stop-and-copy, or take a very long time to > > complete. With the > > SHA1 hash method, all of them now complete in a relatively short time. > > For the reason I have talked above, the other workloads don't get notable > > improvements from the > > optimization. So below, I only show the exact number after optimization for > > the 4 workloads with > > remarkable improvements. > > > > Any comments or suggestions? > > Maybe you can compare the performance of your solution as that of XBZRLE to see which one is better. > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and need less buffer. > How about the overhead of calculating the SHA1? Is it faster than copying a page? > > Liang > > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to avoid transferring all of the false dirty pages using XBZRLE, we need a buffer as big as the whole VM memory, while SHA1 needs a much smaller buffer. Of course, if we had a buffer as big as the whole VM memory for XBZRLE, we could transfer less data over the network than with SHA1, because XBZRLE is able to compress similar pages.
In a word, yes, the merit of using SHA1 is that it needs a much smaller buffer, and it leads to a nice improvement if there are many false dirty pages. As for the overhead of calculating the SHA1 compared with transferring a page, it depends on the CPU and network performance. In my test environment (Intel Xeon E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing overhead caused by calculating the SHA1, because the network throughput (obtained via "info migrate") remains almost the same. -- Chunguang Li, Ph.D. Candidate Wuhan National Laboratory for Optoelectronics (WNLO) Huazhong University of Science & Technology (HUST) Wuhan, Hubei Prov., China ^ permalink raw reply [flat|nested] 21+ messages in thread
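The 20/4096 overhead figure quoted in the thread can be sanity-checked with a few lines of arithmetic, assuming 4 KiB pages and 20-byte SHA1 digests:

```python
PAGE_SIZE = 4096    # guest page size in bytes
DIGEST_SIZE = 20    # SHA1 digest size in bytes


def digest_table_bytes(vm_mem_bytes):
    """Memory needed to keep one SHA1 digest per guest page."""
    return (vm_mem_bytes // PAGE_SIZE) * DIGEST_SIZE


one_gib = 1 << 30
overhead = digest_table_bytes(one_gib)  # 262144 pages * 20 B = 5 MiB
ratio = overhead / one_gib              # exactly 20/4096, about 0.49%
```

So the 1 GiB guests used in the experiments need about 5 MiB of digest storage, versus a full 1 GiB shadow copy for byte-by-byte comparison.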
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent 2016-11-04 3:07 ` Chunguang Li @ 2016-11-04 4:50 ` Li, Liang Z 2016-11-04 7:03 ` Chunguang Li 2016-11-07 13:52 ` Chunguang Li 0 siblings, 2 replies; 21+ messages in thread From: Li, Liang Z @ 2016-11-04 4:50 UTC (permalink / raw) To: Chunguang Li Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela > > > > > > > I think this is "very" wasteful. Assume the workload writes > > > > > > > the pages > > > dirty randomly within the guest address space, and the transfer > > > speed is constant. Intuitively, I think nearly half of the dirty > > > pages produced in Iteration 1 is not really dirty. This means the > > > time of Iteration 2 is double of that to send only really dirty pages. > > > > > > > > > > > > It makes sense, can you get some perf numbers to show what > > > > > > kinds of workloads get impacted the most? That would also > > > > > > help us to figure out what kinds of speed improvements we can > expect. > > > > > > > > > > > > > > > > > > Amit > > > > > > > > > > I have picked up 6 workloads and got the following statistics > > > > > numbers of every iteration (except the last stop-copy one) during > precopy. > > > > > These numbers are obtained with the basic precopy migration, > > > > > without the capabilities like xbzrle or compression, etc. The > > > > > network for the migration is exclusive, with a separate network for > the workloads. > > > > > They are both gigabit ethernet. I use qemu-2.5.1. > > > > > > > > > > Three (booting, idle, web server) of them converged to the > > > > > stop-copy > > > phase, > > > > > with the given bandwidth and default downtime (300ms), while the > > > > > other three (kernel compilation, zeusmp, memcached) did not. > > > > > > > > > > One page is "not-really-dirty", if it is written first and is > > > > > sent later (and not written again after that) during one > > > > > iteration. 
I guess this would not happen so often during the > > > > > other iterations as during the 1st iteration. Because all the > > > > > pages of the VM are sent to the dest node > > > during > > > > > the 1st iteration, while during the others, only part of the pages are > sent. > > > > > So I think the "not-really-dirty" pages should be produced > > > > > mainly during the 1st iteration , and maybe very little during the other > iterations. > > > > > > > > > > If we could avoid resending the "not-really-dirty" pages, > > > > > intuitively, I think the time spent on Iteration 2 would be > > > > > halved. This is a chain > > > reaction, > > > > > because the dirty pages produced during Iteration 2 is halved, > > > > > which > > > incurs > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5... > > > > > > > > Yes; these numbers don't show how many of them are false dirty > though. > > > > > > > > One problem is thinking about pages that have been redirtied, if > > > > the page is > > > dirtied > > > > after the sync but before the network write then it's the > > > > false-dirty that you're describing. > > > > > > > > However, if the page is being written a few times, and so it would > > > > have > > > been written > > > > after the network write then it isn't a false-dirty. > > > > > > > > You might be able to figure that out with some kernel tracing of > > > > when the > > > dirtying > > > > happens, but it might be easier to write the fix! > > > > > > > > Dave > > > > > > Hi, I have made some new progress now. > > > > > > To tell how many false dirty pages there are exactly in each > > > iteration, I malloc a buffer in memory as big as the size of the > > > whole VM memory. When a page is transferred to the dest node, it is > > > copied to the buffer; During the next iteration, if one page is > > > transferred, it is compared to the old one in the buffer, and the > > > old one will be replaced for next comparison if it is really dirty. 
> > > Thus, we are now able to get the exact number of false dirty pages. > > > > > > This time, I use 15 workloads to get the statistic number. They are: > > > > > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They are > > > all scientific > > > computing workloads like Quantum Chromodynamics, Fluid Dynamics, > etc. > > > I pick > > > up these 11 benchmarks because compared to others, they have > > > bigger memory > > > occupation and higher memory dirty rate. Thus most of them > > > could not converge > > > to stop-and-copy using the default migration speed (32MB/s). > > > 2. kernel compilation > > > 3. idle VM > > > 4. Apache web server which serves static content > > > > > > (the above workloads are all running in VM with 1 vcpu and 1GB > > > memory, and the > > > migration speed is the default 32MB/s) > > > > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are > > > used as the cache. > > > After filling up the 4GB cache, a client writes the cache at a constant > speed > > > during migration. This time, migration speed has no limit, and is up to > the > > > capability of 1Gbps Ethernet. > > > > > > Summarize the results first: (and you can read the precise number > > > below) > > > > > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% > > > during some iterations) > > > of false dirty pages out of all the dirty pages since iteration 2 (and the > big > > > proportion lasts during the following iterations). They are > cpu2006.zeusmp, > > > cpu2006.bzip2, cpu2006.mcf, and memcached. > > > 2. 2 workloads (idle, webserver) spend most of the migration time > > > on iteration 1, even > > > though the proportion of false dirty pages is big since > > > iteration 2, the space to > > > optimize is small. > > > 3. 1 workload (kernel compilation) only have a big proportion > > > during iteration 2, not > > > in the other iterations. > > > 4. 
8 workloads (the other 8 benchmarks of cpu2006) have little > > > proportion of false > > > dirty pages since iteration 2. So the spaces to optimize for them are > small. > > > > > > Now I want to talk a little more about the reasons why false dirty > > > pages are produced. > > > The first reason is what we have discussed before---the mechanism to > > > track the dirty pages. > > > And then I come up with another reason. Here is the situation: a > > > write operation to one memory page happens, but it doesn't change > > > any content of the page. So it's "write but not dirty", and kernel > > > still marks it as dirty. One guy in our lab has done some > > > experiments to figure out the proportion of "write but not dirty" > > > operations, and he uses the cpu2006 benchmark suit. According to his > > > results, general workloads has a little proportion (<10%) of "write > > > but not dirty" out of all the write operations, while few workloads > > > have higher proportion (one even as high as 50%). Now we are not > > > sure why "write but not dirty" would happen, it just happened. > > > > > > So these two reasons contribute to the false dirty pages. To > > > optimize, I compute and store the SHA1 hash before transferring each > > > page. Next time, if one page needs retransmission, its > > > SHA1 hash is computed again, and compared to the old hash. If the > > > hash is the same, it's a false dirty page, and we just skip this > > > page; Otherwise, the page is transferred, and the new hash replaces > > > the old one for next comparison. > > > The reason to use SHA1 hash but not byte-by-byte comparison is the > > > memory overheads. One SHA1 hash is 20 bytes. So we need extra > > > 20/4096 (<1/200) memory space of the whole VM memory, which is > > > relatively small. > > > As far as I know, SHA1 hash is widely used in the scenes of > > > deduplication for backup systems. 
> > > They have proven that the probability of hash collision is far > > > smaller than disk hardware fault, so it's secure hash, that is, if > > > the hashes of two chunks are the same, the content must be the same. > > > So I think the SHA1 hash could replace byte-to-byte comparison in > > > the VM memory scenery. > > > > > > Then I do the same migration experiments using the SHA1 hash. For > > > the 4 workloads which have big proportions of false dirty pages, the > > > improvement is remarkable. Without optimization, they either can not > > > converge to stop-and-copy, or take a very long time to complete. > > > With the > > > SHA1 hash method, all of them now complete in a relatively short time. > > > For the reason I have talked above, the other workloads don't get > > > notable improvements from the optimization. So below, I only show > > > the exact number after optimization for the 4 workloads with > > > remarkable improvements. > > > > > > Any comments or suggestions? > > > > Maybe you can compare the performance of your solution as that of > XBZRLE to see which one is better. > > The merit of using SHA1 is that it can avoid data copy as that in XBZRLE, and > need less buffer. > > How about the overhead of calculating the SHA1? Is it faster than copying a > page? > > > > Liang > > > > > > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to > avoid transferring all of the false dirty pages using XBZRLE, we need a buffer > as big as the whole VM memory, while SHA1 needs a much small buffer. Of > course, if we have a buffer as big as the whole VM memory using XBZRLE, we > could transfer less data on network than SHA1, because XBZRLE is able to > compress similar pages. In a word, yes, the merit of using SHA1 is that it > needs much less buffer, and leads to nice improvement if there are many > false dirty pages. 
The current implementation of XBZRLE begins to buffer pages only from the second iteration; maybe it's worth making it start working from the first iteration, based on your finding. > In terms of the overhead of calculating the SHA1 compared with transferring > a page, it's related to the CPU and network performance. In my test > environment(Intel Xeon > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing > overhead caused by calculating the SHA1, because the throughput of > network (got by "info migrate") remains almost the same. You can check the CPU usage, or measure the time spent on a local live migration that uses SHA1/XBZRLE. Liang ^ permalink raw reply [flat|nested] 21+ messages in thread
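Liang's question (is hashing a page faster than copying it?) can be estimated with a quick micro-benchmark. This is only a rough Python proxy for the C-level costs inside QEMU, so the absolute numbers are not meaningful, but the relative order of magnitude is indicative:

```python
import hashlib
import timeit

PAGE = bytes(4096)  # one zero-filled 4 KiB page
N = 20_000          # repetitions per measurement

# Time N SHA1 digests of a page versus N copies of the same page.
sha1_time = timeit.timeit(lambda: hashlib.sha1(PAGE).digest(), number=N)
copy_time = timeit.timeit(lambda: PAGE[:], number=N)

print(f"sha1: {sha1_time:.4f}s  copy: {copy_time:.4f}s  "
      f"sha1/copy ratio: {sha1_time / copy_time:.1f}x")
```

As Liang suggests, the more faithful measurement is CPU usage or total time of a local live migration with and without the SHA1 check, since migration overlaps hashing with network transfer.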
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent 2016-11-04 4:50 ` Li, Liang Z @ 2016-11-04 7:03 ` Chunguang Li 2016-11-07 13:52 ` Chunguang Li 1 sibling, 0 replies; 21+ messages in thread From: Chunguang Li @ 2016-11-04 7:03 UTC (permalink / raw) To: Li, Liang Z Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela > -----Original Messages----- > From: "Li, Liang Z" <liang.z.li@intel.com> > Sent Time: Friday, November 4, 2016 > To: "Chunguang Li" <lichunguang@hust.edu.cn> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com> > Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent > > > > > > > > > I think this is "very" wasteful. Assume the workload writes > > > > > > > > the pages > > > > dirty randomly within the guest address space, and the transfer > > > > speed is constant. Intuitively, I think nearly half of the dirty > > > > pages produced in Iteration 1 is not really dirty. This means the > > > > time of Iteration 2 is double of that to send only really dirty pages. > > > > > > > > > > > > > > It makes sense, can you get some perf numbers to show what > > > > > > > kinds of workloads get impacted the most? That would also > > > > > > > help us to figure out what kinds of speed improvements we can > > expect. > > > > > > > > > > > > > > > > > > > > > Amit > > > > > > > > > > > > I have picked up 6 workloads and got the following statistics > > > > > > numbers of every iteration (except the last stop-copy one) during > > precopy. > > > > > > These numbers are obtained with the basic precopy migration, > > > > > > without the capabilities like xbzrle or compression, etc. 
The > > > > > > network for the migration is exclusive, with a separate network for > > the workloads. > > > > > > They are both gigabit ethernet. I use qemu-2.5.1. > > > > > > > > > > > > Three (booting, idle, web server) of them converged to the > > > > > > stop-copy > > > > phase, > > > > > > with the given bandwidth and default downtime (300ms), while the > > > > > > other three (kernel compilation, zeusmp, memcached) did not. > > > > > > > > > > > > One page is "not-really-dirty", if it is written first and is > > > > > > sent later (and not written again after that) during one > > > > > > iteration. I guess this would not happen so often during the > > > > > > other iterations as during the 1st iteration. Because all the > > > > > > pages of the VM are sent to the dest node > > > > during > > > > > > the 1st iteration, while during the others, only part of the pages are > > sent. > > > > > > So I think the "not-really-dirty" pages should be produced > > > > > > mainly during the 1st iteration , and maybe very little during the other > > iterations. > > > > > > > > > > > > If we could avoid resending the "not-really-dirty" pages, > > > > > > intuitively, I think the time spent on Iteration 2 would be > > > > > > halved. This is a chain > > > > reaction, > > > > > > because the dirty pages produced during Iteration 2 is halved, > > > > > > which > > > > incurs > > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5... > > > > > > > > > > Yes; these numbers don't show how many of them are false dirty > > though. > > > > > > > > > > One problem is thinking about pages that have been redirtied, if > > > > > the page is > > > > dirtied > > > > > after the sync but before the network write then it's the > > > > > false-dirty that you're describing. > > > > > > > > > > However, if the page is being written a few times, and so it would > > > > > have > > > > been written > > > > > after the network write then it isn't a false-dirty. 
> > > > > You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> > > > >
> > > > > Dave
> > > >
> > > > Hi, I have made some new progress now.
> > > >
> > > > To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied to the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.
> > > >
> > > > This time, I use 15 workloads to get the statistics. They are:
> > > >
> > > > 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates. Thus most of them could not converge to stop-and-copy at the default migration speed (32MB/s).
> > > > 2. kernel compilation
> > > > 3. idle VM
> > > > 4. Apache web server serving static content
> > > >
> > > > (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s)
> > > >
> > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes the cache at a constant speed during migration. This time, the migration speed has no limit, and is up to the capability of 1Gbps Ethernet.
> > > >
> > > > Summarize the results first (the precise numbers are below):
> > > >
> > > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages since iteration 2 (and the big proportion lasts during the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big since iteration 2, the space to optimize is small.
> > > > 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not in the other iterations.
> > > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages since iteration 2, so the space to optimize for them is small.
> > > >
> > > > Now I want to say a little more about why false dirty pages are produced. The first reason is what we discussed before: the mechanism used to track dirty pages. I have also come up with another reason. Here is the situation: a write to a memory page happens, but it doesn't change any content of the page. So it is "written but not dirty", yet the kernel still marks it as dirty. Someone in our lab has run experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not yet sure why "write but not dirty" happens; it just does.
> > > >
> > > > So these two reasons contribute to the false dirty pages. To optimize, I compute and store the SHA1 hash before transferring each page. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hash is the same, it is a false dirty page, and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
> > > > The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory of 20/4096 (<1/200) of the whole VM memory, which is relatively small.
> > > > As far as I know, SHA1 hashes are widely used for deduplication in backup systems. It has been shown that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash: if the hashes of two chunks are the same, the content must be the same. So I think the SHA1 hash can replace byte-by-byte comparison in the VM memory setting.
> > > >
> > > > Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads with big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either can not converge to stop-and-copy, or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time. For the reason discussed above, the other workloads don't get notable improvements from the optimization. So below, I only show the exact numbers after optimization for the 4 workloads with remarkable improvements.
> > > >
> > > > Any comments or suggestions?
> > >
> > > Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better.
> > > The merit of using SHA1 is that it can avoid the data copy done in XBZRLE, and needs less buffer.
> > > How about the overhead of calculating the SHA1? Is it faster than copying a page?
> > >
> > > Liang
> >
> > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to avoid transferring all of the false dirty pages using XBZRLE, we need a buffer as big as the whole VM memory, while SHA1 needs a much smaller buffer. Of course, if we had a buffer as big as the whole VM memory using XBZRLE, we could transfer less data over the network than SHA1, because XBZRLE is able to compress similar pages. In a word, yes, the merit of using SHA1 is that it needs a much smaller buffer, and it leads to a nice improvement if there are many false dirty pages.
>
> The current implementation of XBZRLE begins to buffer pages from the second iteration.
> Maybe it's worth making it start to work from the first iteration based on your finding.

Yes, I noticed that. If we make it start to work from the first iteration, I think the buffer would need to be large enough to obtain an obvious effect.

> > In terms of the overhead of calculating the SHA1 compared with transferring a page, it's related to the CPU and network performance. In my test environment (Intel Xeon E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing overhead caused by calculating the SHA1, because the network throughput (got by "info migrate") remains almost the same.
>
> You can check the CPU usage, or measure the time spent on a local live migration which uses SHA1/XBZRLE.
>
> Liang

Yes, I can compare SHA1 with XBZRLE. Maybe I will post the results later.

Chunguang

^ permalink raw reply [flat|nested] 21+ messages in thread
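For readers following along, the hash-based filter described in this exchange can be sketched outside of QEMU. This is a minimal Python illustration, not the actual patch; the names `maybe_send_page`, `sent_hashes`, and `send_page` are invented for the sketch:

```python
import hashlib

PAGE_SIZE = 4096  # x86 page size assumed by the thread's 20/4096 figure

# One 20-byte SHA1 digest per page ever sent: < 1/200 of guest RAM.
sent_hashes = {}

def maybe_send_page(page_index, page_bytes, send_page):
    """Send the page unless its SHA1 matches what was already sent.

    Returns True if the page was transferred, False if it was a
    "false dirty" page (content unchanged since the last send) and
    was therefore skipped.
    """
    digest = hashlib.sha1(page_bytes).digest()
    if sent_hashes.get(page_index) == digest:
        return False  # false dirty: skip retransmission
    sent_hashes[page_index] = digest  # remember what the dest now has
    send_page(page_index, page_bytes)
    return True
```

A page marked dirty by the kernel but carrying unchanged content takes the early-return path, which is exactly the saving measured later in the thread.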
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-04  4:50 ` Li, Liang Z
  2016-11-04  7:03 ` Chunguang Li
@ 2016-11-07 13:52 ` Chunguang Li
  2016-11-07 14:17 ` Li, Liang Z
  2016-11-07 14:44 ` Li, Liang Z
  1 sibling, 2 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-07 13:52 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Friday, November 4, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> > > > > > > > > I think this is "very" wasteful. Assume the workload writes the pages dirty randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that needed to send only the really dirty pages.
> > > > > > > >
> > > > > > > It makes sense, can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > > > > > >
> > > > > > > Amit
> > > > > >
> > > > > > I have picked 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy.
> > > > > > These numbers are obtained with the basic precopy migration, without capabilities like xbzrle or compression, etc. The network for the migration is exclusive, with a separate network for the workloads. They are both gigabit ethernet. I use qemu-2.5.1.
> > > > > >
> > > > > > Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and default downtime (300ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> > > > > >
> > > > > > One page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this would not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others, only part of the pages are sent. So I think the "not-really-dirty" pages should be produced mainly during the 1st iteration, and maybe very few during the other iterations.
> > > > > >
> > > > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 are halved, which means the time spent on Iteration 3 is halved, then Iteration 4, 5...
> > > > >
> > > > > Yes; these numbers don't show how many of them are false dirty though.
> > > > >
> > > > > One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false-dirty that you're describing.
> > > > >
> > > > > However, if the page is being written a few times, and so would have been written after the network write, then it isn't a false-dirty.
> > > > >
> > > > > You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> > > > >
> > > > > Dave
> > > >
> > > > Hi, I have made some new progress now.
> > > >
> > > > To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied to the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.
> > > >
> > > > This time, I use 15 workloads to get the statistics. They are:
> > > >
> > > > 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates. Thus most of them could not converge to stop-and-copy at the default migration speed (32MB/s).
> > > > 2. kernel compilation
> > > > 3. idle VM
> > > > 4. Apache web server serving static content
> > > >
> > > > (the above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s)
> > > >
> > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes the cache at a constant speed during migration. This time, the migration speed has no limit, and is up to the capability of 1Gbps Ethernet.
> > > >
> > > > Summarize the results first (the precise numbers are below):
> > > >
> > > > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages since iteration 2 (and the big proportion lasts during the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> > > > 2. 2 workloads (idle, webserver) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big since iteration 2, the space to optimize is small.
> > > > 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not in the other iterations.
> > > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages since iteration 2, so the space to optimize for them is small.
> > > >
> > > > Now I want to say a little more about why false dirty pages are produced. The first reason is what we discussed before: the mechanism used to track dirty pages. I have also come up with another reason. Here is the situation: a write to a memory page happens, but it doesn't change any content of the page. So it is "written but not dirty", yet the kernel still marks it as dirty. Someone in our lab has run experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not yet sure why "write but not dirty" happens; it just does.
> > > >
> > > > So these two reasons contribute to the false dirty pages. To optimize, I compute and store the SHA1 hash before transferring each page. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hash is the same, it is a false dirty page, and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
> > > > The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory of 20/4096 (<1/200) of the whole VM memory, which is relatively small.
> > > > As far as I know, SHA1 hashes are widely used for deduplication in backup systems. It has been shown that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash: if the hashes of two chunks are the same, the content must be the same. So I think the SHA1 hash can replace byte-by-byte comparison in the VM memory setting.
> > > >
> > > > Then I ran the same migration experiments using the SHA1 hash. For the 4 workloads with big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either can not converge to stop-and-copy, or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time. For the reason discussed above, the other workloads don't get notable improvements from the optimization. So below, I only show the exact numbers after optimization for the 4 workloads with remarkable improvements.
> > > >
> > > > Any comments or suggestions?
> > > Maybe you can compare the performance of your solution with that of XBZRLE to see which one is better.
> > > The merit of using SHA1 is that it can avoid the data copy done in XBZRLE, and needs less buffer.
> > > How about the overhead of calculating the SHA1? Is it faster than copying a page?
> > >
> > > Liang
> >
> > Yes, XBZRLE is able to handle the false dirty pages. However, if we want to avoid transferring all of the false dirty pages using XBZRLE, we need a buffer as big as the whole VM memory, while SHA1 needs a much smaller buffer. Of course, if we had a buffer as big as the whole VM memory using XBZRLE, we could transfer less data over the network than SHA1, because XBZRLE is able to compress similar pages. In a word, yes, the merit of using SHA1 is that it needs a much smaller buffer, and it leads to a nice improvement if there are many false dirty pages.
>
> The current implementation of XBZRLE begins to buffer pages from the second iteration.
> Maybe it's worth making it start to work from the first iteration based on your finding.
>
> > In terms of the overhead of calculating the SHA1 compared with transferring a page, it's related to the CPU and network performance. In my test environment (Intel Xeon E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra computing overhead caused by calculating the SHA1, because the network throughput (got by "info migrate") remains almost the same.
>
> You can check the CPU usage, or measure the time spent on a local live migration which uses SHA1/XBZRLE.
>
> Liang

I compared SHA1 with XBZRLE. I used XBZRLE in two ways: 1. beginning to buffer pages from iteration 1; 2. as in the current implementation, beginning to buffer pages from iteration 2. I post the results for three workloads: cpu2006.zeusmp, cpu2006.mcf, memcached.
I set the cache size to 256MB for zeusmp & mcf (they run in a VM with 1GB ram), and to 1GB for memcached (it runs in a VM with 6GB ram, of which memcached takes 4GB as cache). As you can read from the data below, beginning to buffer pages from iteration 1 is better than the current implementation (from iteration 2), because the total migration time is shorter. SHA1 is better than XBZRLE with the cache sizes I chose, because it leads to shorter migration time and incurs far less memory overhead (<1/200 of the total VM memory).

1. zeusmp

(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 21402 ms, transferred pages: 266450 (dup: 91456, n: 174994, x: 0), new dirty pages: 129225, remaining dirty pages: 129225
Iteration 2, duration: 2295 ms, transferred pages: 101471 (dup: 77921, n: 16665, x: 6885), new dirty pages: 76125, remaining dirty pages: 77424
Iteration 3, duration: 2000 ms, transferred pages: 56092 (dup: 36345, n: 11249, x: 8498), new dirty pages: 111498, remaining dirty pages: 112836
Iteration 4, duration: 1604 ms, transferred pages: 87335 (dup: 69441, n: 10018, x: 7876), new dirty pages: 19982, remaining dirty pages: 19982
Iteration 5, duration: 302 ms, transferred pages: 19850 (dup: 16718, n: 2547, x: 585), new dirty pages: 14084, remaining dirty pages: 14084
Iteration 6, duration: 194 ms, transferred pages: 13403 (dup: 12338, n: 846, x: 219), new dirty pages: 3900, remaining dirty pages: 4243
Iteration 7, duration: 8 ms, transferred pages: 3938 (dup: 3425, n: 239, x: 274), new dirty pages: 372, remaining dirty pages: 372
Iteration 8, duration: 71 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 0, remaining dirty pages: 372
total time: 27891 milliseconds

(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2 (did not converge)
Iteration 1, duration: 21698 ms, transferred pages: 266331 (dup: 89009, n: 177322, x: 0), new dirty pages: 125990, remaining dirty pages: 126109
Iteration 2, duration: 5909 ms, transferred pages: 126109 (dup: 77248, n: 48861, x: 0), new dirty pages: 124870, remaining dirty pages: 124870
Iteration 3, duration: 3197 ms, transferred pages: 110583 (dup: 75471, n: 23129, x: 11983), new dirty pages: 118035, remaining dirty pages: 118035
Iteration 4, duration: 3195 ms, transferred pages: 102787 (dup: 72708, n: 22158, x: 7921), new dirty pages: 86576, remaining dirty pages: 86773
Iteration 5, duration: 3111 ms, transferred pages: 79563 (dup: 52073, n: 21289, x: 6201), new dirty pages: 97402, remaining dirty pages: 97402
Iteration 6, duration: 2407 ms, transferred pages: 79567 (dup: 56415, n: 16013, x: 7139), new dirty pages: 101193, remaining dirty pages: 101193
Iteration 7, duration: 2896 ms, transferred pages: 83278 (dup: 55778, n: 20652, x: 6848), new dirty pages: 90683, remaining dirty pages: 92977
Iteration 8, duration: 2701 ms, transferred pages: 89112 (dup: 62579, n: 18699, x: 7834), new dirty pages: 109827, remaining dirty pages: 110008
Iteration 9, duration: 3602 ms, transferred pages: 95866 (dup: 61631, n: 25632, x: 8603), new dirty pages: 94551, remaining dirty pages: 96227
Iteration 10, duration: 3802 ms, transferred pages: 83693 (dup: 50558, n: 26427, x: 6708), new dirty pages: 123537, remaining dirty pages: 124170
Iteration 11, duration: 3399 ms, transferred pages: 108770 (dup: 75144, n: 23952, x: 9674), new dirty pages: 103934, remaining dirty pages: 104981
Iteration 12, duration: 2700 ms, transferred pages: 91080 (dup: 62981, n: 16600, x: 11499), new dirty pages: 88314, remaining dirty pages: 88948
Iteration 13, duration: 3102 ms, transferred pages: 78406 (dup: 50165, n: 21409, x: 6832), new dirty pages: 73586, remaining dirty pages: 74025
Iteration 14, duration: 806 ms, transferred pages: 66530 (dup: 51013, n: 3973, x: 11544), new dirty pages: 67941, remaining dirty pages: 67941
Iteration 15, duration: 2398 ms, transferred pages: 53117 (dup: 33312, n: 18436, x: 1369), new dirty pages: 116502, remaining dirty pages: 118956
Iteration 16, duration: 3200 ms, transferred pages: 103009 (dup: 71642, n: 21378, x: 9989), new dirty pages: 81777, remaining dirty pages: 83724
Iteration 17, duration: 3005 ms, transferred pages: 73096 (dup: 45738, n: 19016, x: 8342), new dirty pages: 116671, remaining dirty pages: 118397
Iteration 18, duration: 3302 ms, transferred pages: 101507 (dup: 67290, n: 22721, x: 11496), new dirty pages: 104163, remaining dirty pages: 105921
Iteration 19, duration: 3705 ms, transferred pages: 90516 (dup: 56932, n: 26394, x: 7190), new dirty pages: 118139, remaining dirty pages: 120170
Iteration 20, duration: 3903 ms, transferred pages: 102710 (dup: 67623, n: 25811, x: 9276), new dirty pages: 103608, remaining dirty pages: 105496

(3) SHA1
Iteration 1, duration: 21601 ms, transferred pages: 266450 (dup: 89731, rd: 176719), new dirty pages: 139843, remaining dirty pages: 139843
Iteration 2, duration: 1747 ms, transferred pages: 92077 (dup: 78364, rd: 13713), new dirty pages: 90945, remaining dirty pages: 90945
Iteration 3, duration: 1592 ms, transferred pages: 62253 (dup: 49435, rd: 12818), new dirty pages: 76929, remaining dirty pages: 76929
Iteration 4, duration: 992 ms, transferred pages: 44837 (dup: 37886, rd: 6951), new dirty pages: 71331, remaining dirty pages: 72916
Iteration 5, duration: 998 ms, transferred pages: 55229 (dup: 47150, rd: 8079), new dirty pages: 21703, remaining dirty pages: 23302
Iteration 6, duration: 211 ms, transferred pages: 20337 (dup: 18516, rd: 1821), new dirty pages: 14500, remaining dirty pages: 14500
Iteration 7, duration: 31 ms, transferred pages: 12933 (dup: 12627, rd: 306), new dirty pages: 1520, remaining dirty pages: 1520
Iteration 8, duration: 30 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 4, remaining dirty pages: 1524
total time: 27225 milliseconds

2.
mcf

(1) XBZRLE 256MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 31706 ms, transferred pages: 266325 (dup: 7032, n: 259293, x: 0), new dirty pages: 238215, remaining dirty pages: 238340
Iteration 2, duration: 21807 ms, transferred pages: 186619 (dup: 335, n: 176826, x: 9458), new dirty pages: 226886, remaining dirty pages: 228857
Iteration 3, duration: 21300 ms, transferred pages: 181925 (dup: 201, n: 172974, x: 8750), new dirty pages: 202288, remaining dirty pages: 204100
Iteration 4, duration: 17300 ms, transferred pages: 148972 (dup: 38, n: 141113, x: 7821), new dirty pages: 136220, remaining dirty pages: 137992
Iteration 5, duration: 13699 ms, transferred pages: 118247 (dup: 38, n: 112030, x: 6179), new dirty pages: 48397, remaining dirty pages: 50466
Iteration 6, duration: 4499 ms, transferred pages: 41719 (dup: 24, n: 36790, x: 4905), new dirty pages: 3753, remaining dirty pages: 5690
Iteration 7, duration: 399 ms, transferred pages: 3826 (dup: 4, n: 3265, x: 557), new dirty pages: 1261, remaining dirty pages: 2437
Iteration 8, duration: 72 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 0, remaining dirty pages: 2437
total time: 110812 milliseconds

(2) XBZRLE 256MB cache, begins to buffer pages from iteration 2
Iteration 1, duration: 31606 ms, transferred pages: 266450 (dup: 7267, n: 259183, x: 0), new dirty pages: 233582, remaining dirty pages: 233582
Iteration 2, duration: 28413 ms, transferred pages: 231693 (dup: 89, n: 231604, x: 0), new dirty pages: 216962, remaining dirty pages: 218851
Iteration 3, duration: 18618 ms, transferred pages: 159936 (dup: 3, n: 151579, x: 8354), new dirty pages: 216400, remaining dirty pages: 218790
Iteration 4, duration: 18621 ms, transferred pages: 159665 (dup: 0, n: 152102, x: 7563), new dirty pages: 209860, remaining dirty pages: 211611
Iteration 5, duration: 17709 ms, transferred pages: 151672 (dup: 4, n: 144493, x: 7175), new dirty pages: 146273, remaining dirty pages: 148006
Iteration 6, duration: 9911 ms, transferred pages: 86971 (dup: 2, n: 80842, x: 6127), new dirty pages: 118364, remaining dirty pages: 120396
Iteration 7, duration: 14212 ms, transferred pages: 117460 (dup: 0, n: 116149, x: 1311), new dirty pages: 213993, remaining dirty pages: 216107
Iteration 8, duration: 22913 ms, transferred pages: 213698 (dup: 4, n: 161520, x: 52174), new dirty pages: 217947, remaining dirty pages: 219955
Iteration 9, duration: 23808 ms, transferred pages: 217375 (dup: 3, n: 152315, x: 65057), new dirty pages: 172615, remaining dirty pages: 174859
Iteration 10, duration: 15099 ms, transferred pages: 131265 (dup: 0, n: 123463, x: 7802), new dirty pages: 113946, remaining dirty pages: 116026
Iteration 11, duration: 10002 ms, transferred pages: 88477 (dup: 8, n: 81753, x: 6716), new dirty pages: 97006, remaining dirty pages: 99110
Iteration 12, duration: 6898 ms, transferred pages: 62861 (dup: 4, n: 56392, x: 6465), new dirty pages: 45164, remaining dirty pages: 47297
Iteration 13, duration: 3601 ms, transferred pages: 35360 (dup: 0, n: 29390, x: 5970), new dirty pages: 24581, remaining dirty pages: 26779
Iteration 14, duration: 1902 ms, transferred pages: 19794 (dup: 0, n: 15475, x: 4319), new dirty pages: 66153, remaining dirty pages: 67850
Iteration 15, duration: 5504 ms, transferred pages: 50369 (dup: 0, n: 44902, x: 5467), new dirty pages: 49279, remaining dirty pages: 51198
Iteration 16, duration: 3699 ms, transferred pages: 36519 (dup: 2, n: 30184, x: 6333), new dirty pages: 23672, remaining dirty pages: 25914
Iteration 17, duration: 1601 ms, transferred pages: 17628 (dup: 0, n: 12972, x: 4656), new dirty pages: 8685, remaining dirty pages: 10646
Iteration 18, duration: 599 ms, transferred pages: 7835 (dup: 0, n: 4825, x: 3010), new dirty pages: 6167, remaining dirty pages: 6266
Iteration 19, duration: 200 ms, transferred pages: 3590 (dup: 0, n: 1576, x: 2014), new dirty pages: 3873, remaining dirty pages: 5709
Iteration 20, duration: 200 ms, transferred pages: 4134 (dup: 0, n: 1574, x: 2560), new dirty pages: 3609, remaining dirty pages: 4099
Iteration 21, duration: 100 ms, transferred pages: 2785 (dup: 0, n: 787, x: 1998), new dirty pages: 1900, remaining dirty pages: 2585
Iteration 22, duration: 9 ms, transferred pages: 2191 (dup: 0, n: 539, x: 1652), new dirty pages: 596, remaining dirty pages: 596
Iteration 23, duration: 41 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 0, remaining dirty pages: 596
total time: 235286 milliseconds

(3) SHA1
Iteration 1, duration: 31711 ms, transferred pages: 266450 (dup: 6831, rd: 259619), new dirty pages: 240209, remaining dirty pages: 240209
Iteration 2, duration: 6250 ms, transferred pages: 51244 (dup: 211, rd: 51033), new dirty pages: 226651, remaining dirty pages: 228571
Iteration 3, duration: 4395 ms, transferred pages: 36008 (dup: 80, rd: 35928), new dirty pages: 110719, remaining dirty pages: 111478
Iteration 4, duration: 3390 ms, transferred pages: 28068 (dup: 28, rd: 28040), new dirty pages: 185172, remaining dirty pages: 185172
Iteration 5, duration: 2986 ms, transferred pages: 23780 (dup: 45, rd: 23735), new dirty pages: 64357, remaining dirty pages: 66305
Iteration 6, duration: 2727 ms, transferred pages: 22800 (dup: 12, rd: 22788), new dirty pages: 61675, remaining dirty pages: 61675
Iteration 7, duration: 2372 ms, transferred pages: 18943 (dup: 13, rd: 18930), new dirty pages: 55144, remaining dirty pages: 55265
Iteration 8, duration: 2100 ms, transferred pages: 17189 (dup: 11, rd: 17178), new dirty pages: 55244, remaining dirty pages: 55668
Iteration 9, duration: 2003 ms, transferred pages: 16371 (dup: 11, rd: 16360), new dirty pages: 107058, remaining dirty pages: 108014
Iteration 10, duration: 2132 ms, transferred pages: 17825 (dup: 24, rd: 17801), new dirty pages: 126214, remaining dirty pages: 126214
Iteration 11, duration: 2229 ms, transferred pages: 18156 (dup: 22, rd: 18134), new dirty pages: 65725, remaining dirty pages: 65725
Iteration 12, duration: 2315 ms, transferred pages: 18651 (dup: 21, rd: 18630), new dirty pages: 52575, remaining dirty pages: 53903
Iteration 13, duration: 2147 ms, transferred pages: 17435 (dup: 16, rd: 17419), new dirty pages: 46652, remaining dirty pages: 47260
Iteration 14, duration: 2000 ms, transferred pages: 16371 (dup: 11, rd: 16360), new dirty pages: 42721, remaining dirty pages: 43266
Iteration 15, duration: 1901 ms, transferred pages: 15552 (dup: 10, rd: 15542), new dirty pages: 38593, remaining dirty pages: 40792
Iteration 16, duration: 1801 ms, transferred pages: 14735 (dup: 11, rd: 14724), new dirty pages: 54252, remaining dirty pages: 55639
Iteration 17, duration: 1708 ms, transferred pages: 13860 (dup: 2, rd: 13858), new dirty pages: 72379, remaining dirty pages: 74170
Iteration 18, duration: 1923 ms, transferred pages: 15442 (dup: 12, rd: 15430), new dirty pages: 101911, remaining dirty pages: 103547
Iteration 19, duration: 2311 ms, transferred pages: 18823 (dup: 9, rd: 18814), new dirty pages: 80534, remaining dirty pages: 82521
Iteration 20, duration: 2081 ms, transferred pages: 17156 (dup: 34, rd: 17122), new dirty pages: 36054, remaining dirty pages: 36054
Iteration 21, duration: 1665 ms, transferred pages: 13777 (dup: 10, rd: 13767), new dirty pages: 29624, remaining dirty pages: 29624
Iteration 22, duration: 1657 ms, transferred pages: 13290 (dup: 7, rd: 13283), new dirty pages: 25949, remaining dirty pages: 28265
Iteration 23, duration: 1599 ms, transferred pages: 13088 (dup: 0, rd: 13088), new dirty pages: 22356, remaining dirty pages: 24813
Iteration 24, duration: 1500 ms, transferred pages: 12280 (dup: 10, rd: 12270), new dirty pages: 21181, remaining dirty pages: 22608
Iteration 25, duration: 1400 ms, transferred pages: 11457 (dup: 5, rd: 11452), new dirty pages: 18657, remaining dirty pages: 20311
Iteration 26, duration: 1200 ms, transferred pages: 9822 (dup: 6, rd: 9816), new dirty pages: 15690, remaining dirty pages: 17294
Iteration 27, duration: 1201 ms, transferred pages: 9822 (dup: 6, rd: 9816), new dirty pages: 14810, remaining dirty pages: 15936
Iteration 28, duration: 1000 ms, transferred pages: 8183 (dup: 3, rd: 8180), new dirty pages: 15387, remaining dirty pages: 16423
Iteration 29, duration: 900 ms, transferred pages: 7372 (dup: 10, rd: 7362), new dirty pages: 13303, remaining dirty pages: 15292
Iteration 30, duration: 1000 ms, transferred pages: 8181 (dup: 1, rd: 8180), new dirty pages: 17879, remaining dirty pages: 18457
Iteration 31, duration: 951 ms, transferred pages: 8140 (dup: 9, rd: 8131), new dirty pages: 21738, remaining dirty pages: 23304
Iteration 32, duration: 946 ms, transferred pages: 6946 (dup: 1, rd: 6945), new dirty pages: 15815, remaining dirty pages: 15815
Iteration 33, duration: 747 ms, transferred pages: 6192 (dup: 0, rd: 6192), new dirty pages: 6249, remaining dirty pages: 7670
Iteration 34, duration: 501 ms, transferred pages: 4090 (dup: 0, rd: 4090), new dirty pages: 6163, remaining dirty pages: 8422
Iteration 35, duration: 600 ms, transferred pages: 4910 (dup: 2, rd: 4908), new dirty pages: 3673, remaining dirty pages: 5222
Iteration 36, duration: 300 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2132, remaining dirty pages: 4337
Iteration 37, duration: 200 ms, transferred pages: 1637 (dup: 1, rd: 1636), new dirty pages: 544, remaining dirty pages: 2251
Iteration 38, duration: 0 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 2251
total time: 97919 milliseconds

3.
memcached

(1) XBZRLE 1024MB cache, begins to buffer pages from iteration 1
Iteration 1, duration: 40763 ms, transferred pages: 1570149 (dup: 404139, n: 1166010, x: 0), new dirty pages: 526462, remaining dirty pages: 533483
Iteration 2, duration: 15741 ms, transferred pages: 461867 (dup: 4070, n: 437704, x: 20093), new dirty pages: 256841, remaining dirty pages: 265501
Iteration 3, duration: 7874 ms, transferred pages: 231950 (dup: 280, n: 207569, x: 24101), new dirty pages: 153526, remaining dirty pages: 160865
Iteration 4, duration: 4260 ms, transferred pages: 135181 (dup: 135, n: 116768, x: 18278), new dirty pages: 100298, remaining dirty pages: 107278
Iteration 5, duration: 2506 ms, transferred pages: 87596 (dup: 180, n: 67600, x: 19816), new dirty pages: 63685, remaining dirty pages: 71790
Iteration 6, duration: 1373 ms, transferred pages: 51800 (dup: 128, n: 37336, x: 14336), new dirty pages: 38785, remaining dirty pages: 46064
Iteration 7, duration: 872 ms, transferred pages: 32015 (dup: 56, n: 23414, x: 8545), new dirty pages: 23580, remaining dirty pages: 31629
Iteration 8, duration: 527 ms, transferred pages: 21833 (dup: 40, n: 14372, x: 7421), new dirty pages: 16624, remaining dirty pages: 23482
Iteration 9, duration: 291 ms, transferred pages: 14917 (dup: 16, n: 6572, x: 8329), new dirty pages: 10039, remaining dirty pages: 16753
Iteration 10, duration: 113 ms, transferred pages: 6082 (dup: 111, n: 3300, x: 2671), new dirty pages: 4081, remaining dirty pages: 12703
Iteration 11, duration: 119 ms, transferred pages: 3970 (dup: 16, n: 2953, x: 1001), new dirty pages: 3824, remaining dirty pages: 11936
Iteration 12, duration: 51 ms, transferred pages: 3585 (dup: 0, n: 1154, x: 2431), new dirty pages: 1711, remaining dirty pages: 9900
Iteration 13, duration: 62 ms, transferred pages: 2945 (dup: 0, n: 1589, x: 1356), new dirty pages: 1909, remaining dirty pages: 8503
Iteration 14, duration: 2 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 1, remaining dirty pages: 8504
total time: 74738 milliseconds

(2) XBZRLE 1024MB cache, begins to buffer pages from iteration 2
Iteration 1, duration: 40375 ms, transferred pages: 1570347 (dup: 415923, n: 1154424, x: 0), new dirty pages: 511859, remaining dirty pages: 518682
Iteration 2, duration: 17580 ms, transferred pages: 510145 (dup: 5970, n: 504175, x: 0), new dirty pages: 291686, remaining dirty pages: 300223
Iteration 3, duration: 8259 ms, transferred pages: 253656 (dup: 929, n: 230020, x: 22707), new dirty pages: 166721, remaining dirty pages: 174231
Iteration 4, duration: 4733 ms, transferred pages: 147925 (dup: 257, n: 132454, x: 15214), new dirty pages: 103965, remaining dirty pages: 111436
Iteration 5, duration: 2587 ms, transferred pages: 90734 (dup: 251, n: 70008, x: 20475), new dirty pages: 61266, remaining dirty pages: 69202
Iteration 6, duration: 1377 ms, transferred pages: 51416 (dup: 55, n: 37776, x: 13585), new dirty pages: 45236, remaining dirty pages: 52106
Iteration 7, duration: 1126 ms, transferred pages: 40020 (dup: 259, n: 30064, x: 9697), new dirty pages: 28433, remaining dirty pages: 35358
Iteration 8, duration: 574 ms, transferred pages: 23754 (dup: 40, n: 16066, x: 7648), new dirty pages: 18067, remaining dirty pages: 26353
Iteration 9, duration: 395 ms, transferred pages: 17607 (dup: 16, n: 9463, x: 8128), new dirty pages: 11507, remaining dirty pages: 18488
Iteration 10, duration: 171 ms, transferred pages: 8195 (dup: 40, n: 4726, x: 3429), new dirty pages: 5482, remaining dirty pages: 13898
Iteration 11, duration: 116 ms, transferred pages: 6594 (dup: 16, n: 2679, x: 3899), new dirty pages: 3884, remaining dirty pages: 9581
Iteration 12, duration: 54 ms, transferred pages: 1793 (dup: 0, n: 1634, x: 159), new dirty pages: 1515, remaining dirty pages: 9189
Iteration 13, duration: 62 ms, transferred pages: 1793 (dup: 0, n: 1643, x: 150), new dirty pages: 1657, remaining dirty pages: 8871
Iteration 14, duration: 3 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 1, remaining dirty pages: 8872
total time: 77578 milliseconds

(3) SHA1
Iteration 1, duration: 40664 ms, transferred pages: 1569037 (dup: 405940, rd: 1163097), new dirty pages: 506846, remaining dirty pages: 514979
Iteration 2, duration: 8032 ms, transferred pages: 161130 (dup: 4007, rd: 157123), new dirty pages: 153479, remaining dirty pages: 153479
Iteration 3, duration: 2620 ms, transferred pages: 65260 (dup: 20, rd: 65240), new dirty pages: 64014, remaining dirty pages: 67100
Iteration 4, duration: 1160 ms, transferred pages: 30227 (dup: 60, rd: 30167), new dirty pages: 34031, remaining dirty pages: 41414
Iteration 5, duration: 648 ms, transferred pages: 18700 (dup: 56, rd: 18644), new dirty pages: 18375, remaining dirty pages: 25536
Iteration 6, duration: 389 ms, transferred pages: 11399 (dup: 55, rd: 11344), new dirty pages: 12536, remaining dirty pages: 17516
Iteration 7, duration: 292 ms, transferred pages: 8197 (dup: 0, rd: 8197), new dirty pages: 8387, remaining dirty pages: 16802
Iteration 8, duration: 171 ms, transferred pages: 4931 (dup: 39, rd: 4892), new dirty pages: 6182, remaining dirty pages: 14060
Iteration 9, duration: 163 ms, transferred pages: 4355 (dup: 16, rd: 4339), new dirty pages: 5530, remaining dirty pages: 11973
Iteration 10, duration: 104 ms, transferred pages: 3266 (dup: 0, rd: 3266), new dirty pages: 2893, remaining dirty pages: 11014
Iteration 11, duration: 52 ms, transferred pages: 1153 (dup: 0, rd: 1153), new dirty pages: 1586, remaining dirty pages: 10516
Iteration 12, duration: 52 ms, transferred pages: 1921 (dup: 39, rd: 1882), new dirty pages: 1619, remaining dirty pages: 8842
Iteration 13, duration: 62 ms, transferred pages: 1537 (dup: 0, rd: 1537), new dirty pages: 2052, remaining dirty pages: 8871
Iteration 14, duration: 58 ms, transferred pages: 1665 (dup:
0, rd: 1665) , new dirty pages: 1947 , remaining dirty pages: 7989 Iteration 15, duration: 2 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 7989 total time: 54693 milliseconds ^ permalink raw reply [flat|nested] 21+ messages in thread
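The core claim of this thread — that a page written *before* it is sent within an iteration still gets marked dirty, and is then resent with unchanged content — can be illustrated with a toy simulation. This is a sketch under assumed uniform distributions (random write times and targets, constant-rate sequential sending), not a measurement of QEMU:

```python
import random

random.seed(1)

PAGES = 100_000     # guest pages (illustrative size)
WRITES = 20_000     # guest writes landing during iteration 1

# Iteration 1 sends every page once, sequentially at a constant rate,
# so page p goes out at "time" p.  Each guest write hits a uniformly
# random page at a uniformly random time during the iteration.
last_write = {}
for _ in range(WRITES):
    t = random.uniform(0, PAGES)
    p = random.randrange(PAGES)
    last_write[p] = max(t, last_write.get(p, 0.0))

dirty = len(last_write)  # what the dirty bitmap reports after the sync
# A page is "false dirty" if its last write happened before it was sent:
# the copy already transmitted is up to date, yet it will be resent.
false_dirty = sum(1 for p, t in last_write.items() if t < p)

print(f"dirty: {dirty}, false dirty: {false_dirty}")
```

Under these uniform assumptions a page's last write precedes its send time about half the time, which matches the "nearly half" intuition argued in the thread; real workloads with skewed write patterns deviate from this, as the measurements above show.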
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent 2016-11-07 13:52 ` Chunguang Li @ 2016-11-07 14:17 ` Li, Liang Z 2016-11-08 5:27 ` Chunguang Li 2016-11-07 14:44 ` Li, Liang Z 1 sibling, 1 reply; 21+ messages in thread From: Li, Liang Z @ 2016-11-07 14:17 UTC (permalink / raw) To: Chunguang Li Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela > > > > > > > > > I think this is "very" wasteful. Assume the workload > > > > > > > > > writes the pages > > > > > dirty randomly within the guest address space, and the transfer > > > > > speed is constant. Intuitively, I think nearly half of the dirty > > > > > pages produced in Iteration 1 is not really dirty. This means > > > > > the time of Iteration 2 is double of that to send only really dirty pages. > > > > > > > > > > > > > > > > It makes sense, can you get some perf numbers to show what > > > > > > > > kinds of workloads get impacted the most? That would also > > > > > > > > help us to figure out what kinds of speed improvements we > > > > > > > > can > > > expect. > > > > > > > > > > > > > > > > > > > > > > > > Amit > > > > > > > > > > > > > > I have picked up 6 workloads and got the following > > > > > > > statistics numbers of every iteration (except the last > > > > > > > stop-copy one) during > > > precopy. > > > > > > > These numbers are obtained with the basic precopy migration, > > > > > > > without the capabilities like xbzrle or compression, etc. > > > > > > > The network for the migration is exclusive, with a separate > > > > > > > network for > > > the workloads. > > > > > > > They are both gigabit ethernet. I use qemu-2.5.1. > > > > > > > > > > > > > > Three (booting, idle, web server) of them converged to the > > > > > > > stop-copy > > > > > phase, > > > > > > > with the given bandwidth and default downtime (300ms), while > > > > > > > the other three (kernel compilation, zeusmp, memcached) did not. 
> > > > > > > > > > > > > > One page is "not-really-dirty", if it is written first and > > > > > > > is sent later (and not written again after that) during one > > > > > > > iteration. I guess this would not happen so often during the > > > > > > > other iterations as during the 1st iteration. Because all > > > > > > > the pages of the VM are sent to the dest node > > > > > during > > > > > > > the 1st iteration, while during the others, only part of the > > > > > > > pages are > > > sent. > > > > > > > So I think the "not-really-dirty" pages should be produced > > > > > > > mainly during the 1st iteration , and maybe very little > > > > > > > during the other > > > iterations. > > > > > > > > > > > > > > If we could avoid resending the "not-really-dirty" pages, > > > > > > > intuitively, I think the time spent on Iteration 2 would be > > > > > > > halved. This is a chain > > > > > reaction, > > > > > > > because the dirty pages produced during Iteration 2 is > > > > > > > halved, which > > > > > incurs > > > > > > > that the time spent on Iteration 3 is halved, then Iteration 4, 5... > > > > > > > > > > > > Yes; these numbers don't show how many of them are false dirty > > > though. > > > > > > > > > > > > One problem is thinking about pages that have been redirtied, > > > > > > if the page is > > > > > dirtied > > > > > > after the sync but before the network write then it's the > > > > > > false-dirty that you're describing. > > > > > > > > > > > > However, if the page is being written a few times, and so it > > > > > > would have > > > > > been written > > > > > > after the network write then it isn't a false-dirty. > > > > > > > > > > > > You might be able to figure that out with some kernel tracing > > > > > > of when the > > > > > dirtying > > > > > > happens, but it might be easier to write the fix! > > > > > > > > > > > > Dave > > > > > > > > > > Hi, I have made some new progress now. 
> > > > > > > > > > To tell how many false dirty pages there are exactly in each > > > > > iteration, I malloc a buffer in memory as big as the size of the > > > > > whole VM memory. When a page is transferred to the dest node, it > > > > > is copied to the buffer; During the next iteration, if one page > > > > > is transferred, it is compared to the old one in the buffer, and > > > > > the old one will be replaced for next comparison if it is really dirty. > > > > > Thus, we are now able to get the exact number of false dirty pages. > > > > > > > > > > This time, I use 15 workloads to get the statistic number. They are: > > > > > > > > > > 1. 11 benchmarks picked up from cpu2006 benchmark suit. They > > > > > are all scientific > > > > > computing workloads like Quantum Chromodynamics, Fluid > > > > > Dynamics, > > > etc. > > > > > I pick > > > > > up these 11 benchmarks because compared to others, they > > > > > have bigger memory > > > > > occupation and higher memory dirty rate. Thus most of them > > > > > could not converge > > > > > to stop-and-copy using the default migration speed (32MB/s). > > > > > 2. kernel compilation > > > > > 3. idle VM > > > > > 4. Apache web server which serves static content > > > > > > > > > > (the above workloads are all running in VM with 1 vcpu and 1GB > > > > > memory, and the > > > > > migration speed is the default 32MB/s) > > > > > > > > > > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB > > > > > are used as the cache. > > > > > After filling up the 4GB cache, a client writes the cache > > > > > at a constant > > > speed > > > > > during migration. This time, migration speed has no limit, > > > > > and is up to > > > the > > > > > capability of 1Gbps Ethernet. > > > > > > > > > > Summarize the results first: (and you can read the precise > > > > > number > > > > > below) > > > > > > > > > > 1. 
4 of these 15 workloads have a big proportion (>60%, even > > > > > >80% during some iterations) > > > > > of false dirty pages out of all the dirty pages since > > > > > iteration 2 (and the > > > big > > > > > proportion lasts during the following iterations). They are > > > cpu2006.zeusmp, > > > > > cpu2006.bzip2, cpu2006.mcf, and memcached. > > > > > 2. 2 workloads (idle, webserver) spend most of the migration > > > > > time on iteration 1, even > > > > > though the proportion of false dirty pages is big since > > > > > iteration 2, the space to > > > > > optimize is small. > > > > > 3. 1 workload (kernel compilation) only have a big proportion > > > > > during iteration 2, not > > > > > in the other iterations. > > > > > 4. 8 workloads (the other 8 benchmarks of cpu2006) have little > > > > > proportion of false > > > > > dirty pages since iteration 2. So the spaces to optimize > > > > > for them are > > > small. > > > > > > > > > > Now I want to talk a little more about the reasons why false > > > > > dirty pages are produced. > > > > > The first reason is what we have discussed before---the > > > > > mechanism to track the dirty pages. > > > > > And then I come up with another reason. Here is the situation: a > > > > > write operation to one memory page happens, but it doesn't > > > > > change any content of the page. So it's "write but not dirty", > > > > > and kernel still marks it as dirty. One guy in our lab has done > > > > > some experiments to figure out the proportion of "write but not > dirty" > > > > > operations, and he uses the cpu2006 benchmark suit. According to > > > > > his results, general workloads has a little proportion (<10%) of > > > > > "write but not dirty" out of all the write operations, while few > > > > > workloads have higher proportion (one even as high as 50%). Now > > > > > we are not sure why "write but not dirty" would happen, it just > happened. > > > > > > > > > > So these two reasons contribute to the false dirty pages. 
To > > > > > optimize, I compute and store the SHA1 hash before transferring > > > > > each page. Next time, if one page needs retransmission, its > > > > > SHA1 hash is computed again, and compared to the old hash. If > > > > > the hash is the same, it's a false dirty page, and we just skip > > > > > this page; Otherwise, the page is transferred, and the new hash > > > > > replaces the old one for next comparison. > > > > > The reason to use SHA1 hash but not byte-by-byte comparison is > > > > > the memory overheads. One SHA1 hash is 20 bytes. So we need > > > > > extra > > > > > 20/4096 (<1/200) memory space of the whole VM memory, which is > > > > > relatively small. > > > > > As far as I know, SHA1 hash is widely used in the scenes of > > > > > deduplication for backup systems. > > > > > They have proven that the probability of hash collision is far > > > > > smaller than disk hardware fault, so it's secure hash, that is, > > > > > if the hashes of two chunks are the same, the content must be the > same. > > > > > So I think the SHA1 hash could replace byte-to-byte comparison > > > > > in the VM memory scenery. > > > > > > > > > > Then I do the same migration experiments using the SHA1 hash. > > > > > For the 4 workloads which have big proportions of false dirty > > > > > pages, the improvement is remarkable. Without optimization, they > > > > > either can not converge to stop-and-copy, or take a very long time to > complete. > > > > > With the > > > > > SHA1 hash method, all of them now complete in a relatively short > time. > > > > > For the reason I have talked above, the other workloads don't > > > > > get notable improvements from the optimization. So below, I only > > > > > show the exact number after optimization for the 4 workloads > > > > > with remarkable improvements. > > > > > > > > > > Any comments or suggestions? > > > > > > > > Maybe you can compare the performance of your solution as that of > > > XBZRLE to see which one is better. 
> > > > The merit of using SHA1 is that it can avoid data copy as that in > > > > XBZRLE, and > > > need less buffer. > > > > How about the overhead of calculating the SHA1? Is it faster than > > > > copying a > > > page? > > > > > > > > Liang > > > > > > > > > > > > > > Yes, XBZRLE is able to handle the false dirty pages. However, if we > > > want to avoid transferring all of the false dirty pages using > > > XBZRLE, we need a buffer as big as the whole VM memory, while SHA1 > > > needs a much small buffer. Of course, if we have a buffer as big as > > > the whole VM memory using XBZRLE, we could transfer less data on > > > network than SHA1, because XBZRLE is able to compress similar pages. > > > In a word, yes, the merit of using SHA1 is that it needs much less > > > buffer, and leads to nice improvement if there are many false dirty pages. > > > > > > > The current implementation of XBZRLE begins to buffer page from the > > second iteration, Maybe it's worth to make it start to work from the first > iteration based on your finding. > > > > > In terms of the overhead of calculating the SHA1 compared with > > > transferring a page, it's related to the CPU and network > > > performance. In my test environment(Intel Xeon > > > E5620 @2.4GHz, 1Gbps Ethernet), I didn't observe obvious extra > > > computing overhead caused by calculating the SHA1, because the > > > throughput of network (got by "info migrate") remains almost the same. > > > > You can check the CPU usage, or to measure the time spend on a local > > live migration which use SHA1/ XBZRLE. > > > > Liang > > > > > > I compare SHA1 with XBZRLE. I use XBZRLE in two ways: > 1. Begins to buffer pages from iteration 1; 2. As current implementation, > begins to buffer pages from iteration 2. > > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf, > memcached. 
> I set the cache size as 256MB for zeusmp & mcf (they run in VM with 1GB > ram), and set the cache size as 1GB for memcached (it run in VM with 6GB > ram, and memcached takes 4GB as cache). > > As you can read from the data below, beginning to buffer pages from > iteration 1 is better than the current implementation(from iteration 2), > because the total migration time is shorter. > > SHA1 is better than the XBZRLE with the cache size I choose, because it leads > to shorter migration time, and consumes far less memory overhead (<1/200 > of the total VM memory). > Hi Chunguang, Have you tried to use a large XBZRLE cache size which equals to the guest's RAM size? Is SHA1 faster in that case? Thanks! Liang ^ permalink raw reply [flat|nested] 21+ messages in thread
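The per-page SHA1 filter discussed above can be sketched as follows. The names (`page_hashes`, `send_page`, the list-based transport) are illustrative, not QEMU's actual migration code; the point is the bookkeeping: one 20-byte digest per page (20/4096, under 1/200 of guest RAM) instead of a full shadow copy for byte-by-byte comparison.

```python
import hashlib

PAGE_SIZE = 4096

# One 20-byte SHA-1 digest per guest page, updated on every real send.
page_hashes = {}

def send_page(page_index, content, transport):
    """Transmit a dirty page only if its content changed since the last send."""
    digest = hashlib.sha1(content).digest()
    if page_hashes.get(page_index) == digest:
        return False                      # false dirty page: skip it
    page_hashes[page_index] = digest      # remember content for next round
    transport.append((page_index, content))
    return True

wire = []
send_page(7, b"\x00" * PAGE_SIZE, wire)  # first send: transmitted
send_page(7, b"\x00" * PAGE_SIZE, wire)  # re-marked dirty, unchanged: skipped
send_page(7, b"\x01" * PAGE_SIZE, wire)  # really dirty: transmitted
print(len(wire))  # 2
```

The hash comparison trades CPU for memory: it avoids XBZRLE's RAM-sized cache, but unlike XBZRLE it cannot compress a page that changed only slightly — it either skips the page or sends it whole.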
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-07 14:17 ` Li, Liang Z
@ 2016-11-08  5:27 ` Chunguang Li
  0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08 5:27 UTC (permalink / raw)
To: Li, Liang Z
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> -----Original Messages-----
> From: "Li, Liang Z" <liang.z.li@intel.com>
> Sent Time: Monday, November 7, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, "Amit Shah" <amit.shah@redhat.com>, "pbonzini@redhat.com" <pbonzini@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "stefanha@redhat.com" <stefanha@redhat.com>, "quintela@redhat.com" <quintela@redhat.com>
> Subject: RE: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> [snip]
>
> Hi Chunguang,
>
> Have you tried to use a large XBZRLE cache size which equals to the guest's RAM size?
> Is SHA1 faster in that case?
>
> Thanks!
> Liang

You can check the data below.
For zeusmp and mcf, when the XBZRLE cache size equals the guest's RAM size (in fact, the 1024MB cache is a little smaller than the RAM size, because the guest has a little extra RAM besides the 1GB we set), XBZRLE is faster than SHA1. For memcached, I am not able to set the cache size to the 6GB RAM size, because the cache size has to be a power of 2; and I am not able to set it larger than the RAM size, because the current implementation doesn't allow that. So I set the cache size to 4GB, and XBZRLE with this cache size takes almost the same migration time as SHA1. Note that XBZRLE begins to buffer pages from iteration 1.

zeusmp 1024MB cache
Iteration 1, duration: 21604 ms, transferred pages: 266450 (dup: 89509, n: 176941, x: 0), new dirty pages: 129647, remaining dirty pages: 129647
Iteration 2, duration: 652 ms, transferred pages: 89270 (dup: 78176, n: 1085, x: 10009), new dirty pages: 46438, remaining dirty pages: 46438
Iteration 3, duration: 400 ms, transferred pages: 35789 (dup: 30536, n: 0, x: 5253), new dirty pages: 33569, remaining dirty pages: 33569
Iteration 4, duration: 470 ms, transferred pages: 19106 (dup: 10317, n: 75, x: 8714), new dirty pages: 39307, remaining dirty pages: 39307
Iteration 5, duration: 72 ms, transferred pages: 17853 (dup: 15904, n: 0, x: 1949), new dirty pages: 4078, remaining dirty pages: 4078
Iteration 6, duration: 10 ms, transferred pages: 3280 (dup: 2910, n: 0, x: 370), new dirty pages: 521, remaining dirty pages: 521
Iteration 7, duration: 254 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 0, remaining dirty pages: 521
total time: 23481 milliseconds (v.s. 27225 milliseconds for SHA1)

mcf 1024MB cache
Iteration 1, duration: 31704 ms, transferred pages: 266450 (dup: 6794, n: 259656, x: 0), new dirty pages: 233250, remaining dirty pages: 233250
Iteration 2, duration: 544 ms, transferred pages: 34186 (dup: 182, n: 423, x: 33581), new dirty pages: 32757, remaining dirty pages: 32757
Iteration 3, duration: 67 ms, transferred pages: 8536 (dup: 0, n: 0, x: 8536), new dirty pages: 5305, remaining dirty pages: 5305
Iteration 4, duration: 13 ms, transferred pages: 2125 (dup: 0, n: 0, x: 2125), new dirty pages: 1632, remaining dirty pages: 1632
Iteration 5, duration: 9 ms, transferred pages: 1038 (dup: 0, n: 0, x: 1038), new dirty pages: 1095, remaining dirty pages: 1095
Iteration 6, duration: 3 ms, transferred pages: 592 (dup: 0, n: 0, x: 592), new dirty pages: 1148, remaining dirty pages: 1148
Iteration 7, duration: 2 ms, transferred pages: 136 (dup: 0, n: 0, x: 136), new dirty pages: 1123, remaining dirty pages: 1123
Iteration 8, duration: 2 ms, transferred pages: 2 (dup: 0, n: 0, x: 2), new dirty pages: 985, remaining dirty pages: 985
Iteration 9, duration: 2 ms, transferred pages: 14 (dup: 0, n: 0, x: 14), new dirty pages: 640, remaining dirty pages: 640
Iteration 10, duration: 2 ms, transferred pages: 16 (dup: 0, n: 0, x: 16), new dirty pages: 622, remaining dirty pages: 622
Iteration 11, duration: 1 ms, transferred pages: 1 (dup: 0, n: 0, x: 1), new dirty pages: 693, remaining dirty pages: 693
Iteration 12, duration: 1 ms, transferred pages: 122 (dup: 0, n: 0, x: 122), new dirty pages: 639, remaining dirty pages: 639
Iteration 13, duration: 2 ms, transferred pages: 475 (dup: 0, n: 0, x: 475), new dirty pages: 522, remaining dirty pages: 522
Iteration 14, duration: 22 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 27, remaining dirty pages: 549
total time: 32393 milliseconds (v.s. 97919 milliseconds for SHA1)

memcached 4096MB cache
Iteration 1, duration: 41025 ms, transferred pages: 1569059 (dup: 395085, n: 1173974, x: 0), new dirty pages: 560788, remaining dirty pages: 568899
Iteration 2, duration: 8218 ms, transferred pages: 300889 (dup: 3963, n: 142928, x: 153998), new dirty pages: 158832, remaining dirty pages: 167022
Iteration 3, duration: 2408 ms, transferred pages: 98923 (dup: 285, n: 33854, x: 64784), new dirty pages: 68647, remaining dirty pages: 77338
Iteration 4, duration: 869 ms, transferred pages: 43408 (dup: 64, n: 17911, x: 25433), new dirty pages: 26087, remaining dirty pages: 33845
Iteration 5, duration: 455 ms, transferred pages: 23048 (dup: 55, n: 10156, x: 12837), new dirty pages: 15275, remaining dirty pages: 16636
Iteration 6, duration: 162 ms, transferred pages: 7939 (dup: 55, n: 2425, x: 5459), new dirty pages: 6009, remaining dirty pages: 10051
Iteration 7, duration: 52 ms, transferred pages: 5761 (dup: 212, n: 707, x: 4842), new dirty pages: 2204, remaining dirty pages: 4027
Iteration 8, duration: 1 ms, transferred pages: 0 (dup: 0, n: 0, x: 0), new dirty pages: 0, remaining dirty pages: 4027
total time: 53255 milliseconds (v.s. 54693 milliseconds for SHA1)

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-07 13:52 ` Chunguang Li
2016-11-07 14:17 ` Li, Liang Z
@ 2016-11-07 14:44 ` Li, Liang Z
1 sibling, 0 replies; 21+ messages in thread
From: Li, Liang Z @ 2016-11-07 14:44 UTC (permalink / raw)
To: Chunguang Li
Cc: Dr. David Alan Gilbert, Amit Shah, pbonzini, qemu-devel, stefanha, quintela

> > I compare SHA1 with XBZRLE. I use XBZRLE in two ways:
> > 1. Begins to buffer pages from iteration 1; 2. As in the current
> > implementation, begins to buffer pages from iteration 2.
> >
> > I post the results of three workloads: cpu2006.zeusmp, cpu2006.mcf,
> > memcached.
> > I set the cache size to 256MB for zeusmp & mcf (they run in a VM with
> > 1GB RAM), and to 1GB for memcached (it runs in a VM with 6GB RAM, and
> > memcached takes 4GB as cache).
> >
> > As you can read from the data below, beginning to buffer pages from
> > iteration 1 is better than the current implementation (from iteration
> > 2), because the total migration time is shorter.
> >
> > SHA1 is better than XBZRLE with the cache sizes I chose, because it
> > leads to shorter migration time and consumes far less memory
> > overhead (<1/200 of the total VM memory).
>
> Hi Chunguang,
>
> Have you tried to use a large XBZRLE cache size which equals the guest's
> RAM size?
> Is SHA1 faster in that case?
>
> Thanks!
> Liang

Intel's future chipsets will contain hardware engines that support SHA-x and MD5. We can make use of these engines to offload the overhead of SHA/MD5 calculation from the CPU.

Liang
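The XBZRLE trade-off being discussed hinges on its delta encoding: the dirty page is XORed with the cached copy of its previous contents, and the mostly-zero XOR result is run-length encoded. Below is a simplified Python sketch of that idea; it is not QEMU's actual encoder (which uses a more compact on-the-wire byte format), and all function names are invented for illustration.

```python
def xbzrle_encode(old: bytes, new: bytes):
    """Encode `new` against the cached `old` page as (zero_run, changed) pairs.

    XOR of identical bytes is zero, so pages that changed in only a few
    places compress to a handful of short non-zero runs.
    """
    assert len(old) == len(new)
    xor = bytes(a ^ b for a, b in zip(old, new))
    out, i = [], 0
    while i < len(xor):
        # count the run of unchanged (zero) bytes
        start = i
        while i < len(xor) and xor[i] == 0:
            i += 1
        zero_run = i - start
        # collect the following run of changed (non-zero XOR) bytes
        start = i
        while i < len(xor) and xor[i] != 0:
            i += 1
        out.append((zero_run, xor[start:i]))
    return out

def xbzrle_decode(old: bytes, encoded):
    """Rebuild the new page from the cached old page and the delta."""
    page = bytearray(old)
    pos = 0
    for zero_run, changed in encoded:
        pos += zero_run          # skip unchanged bytes
        for b in changed:
            page[pos] ^= b       # apply the XOR delta in place
            pos += 1
    return bytes(page)
```

This also makes the cache-size discussion concrete: the encoder only helps when the previous contents of the page are still in the cache, which is why a cache smaller than the working set degrades XBZRLE.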
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
2016-11-03 8:25 ` Chunguang Li
2016-11-03 9:59 ` Li, Liang Z
2016-11-03 10:13 ` Li, Liang Z
@ 2016-11-08 11:05 ` Dr. David Alan Gilbert
2016-11-08 13:40 ` Chunguang Li
2 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-11-08 11:05 UTC (permalink / raw)
To: Chunguang Li; +Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

* Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > -----Original Message-----
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Sent: Friday, October 14, 2016
> > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> >
> > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > -----Original Message-----
> > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > Sent: Friday, September 30, 2016
> > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > >
> > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > > > -----Original Message-----
> > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > Sent: Monday, September 26, 2016
> > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > >
> > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > Hi all!
> > > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code. I figured out that every now and then during migration, the dirty bitmap is grabbed from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to resending some NON-dirty pages.
> > > > > > >
> > > > > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from this moment. When pages that haven't been sent yet are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent during an iteration.
> > > > > > >
> > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > > I think you're right that this can happen; to clarify, I think the
> > > > > > case you're talking about is:
> > > > > >
> > > > > >   Iteration 1
> > > > > >     sync bitmap
> > > > > >     start sending pages
> > > > > >     page 'n' is modified - but hasn't been sent yet
> > > > > >     page 'n' gets sent
> > > > > >   Iteration 2
> > > > > >     sync bitmap
> > > > > >     'page n is shown as modified'
> > > > > >     send page 'n' again
> > > > >
> > > > > Yes, this is exactly the case I am talking about.
> > > > >
> > > > > > So you're right that it's wasteful; I guess it's more wasteful
> > > > > > on big VMs with slow networks where the length of each iteration
> > > > > > is large.
> > > > >
> > > > > I think this is "very" wasteful. Assume the workload dirties pages
> > > > > randomly within the guest address space, and the transfer speed is
> > > > > constant. Intuitively, I think nearly half of the dirty pages
> > > > > produced in Iteration 1 are not really dirty. This means the time of
> > > > > Iteration 2 is double the time needed to send only the really dirty
> > > > > pages.
> > > >
> > > > It makes sense, can you get some perf numbers to show what kinds of
> > > > workloads get impacted the most? That would also help us to figure
> > > > out what kinds of speed improvements we can expect.
> > > >
> > > > Amit
> > >
> > > I have picked 6 workloads and got the following statistics for every
> > > iteration (except the last stop-copy one) during precopy. These numbers
> > > are obtained with basic precopy migration, without capabilities like
> > > XBZRLE or compression. The network for the migration is exclusive, with
> > > a separate network for the workloads. Both are gigabit Ethernet. I use
> > > qemu-2.5.1.
> > >
> > > Three of them (booting, idle, web server) converged to the stop-copy
> > > phase with the given bandwidth and default downtime (300ms), while the
> > > other three (kernel compilation, zeusmp, memcached) did not.
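The race sketched in the exchange above can be reproduced with a toy model: the kernel marks a page dirty whenever it is written, regardless of whether the write landed before or after the page was sent in the current iteration. The names below are invented for illustration; this is a Python model of the bookkeeping, not QEMU code.

```python
def run_iterations(mem, events):
    """mem: {page: content}. events: one list per iteration of
    (page, new_content, written_before_send) tuples. Returns the number of
    pages resent with content identical to what the destination already has
    (the "false dirty" resends discussed in the thread)."""
    received = {}                   # the destination's view of memory
    false_resends = 0
    to_send = set(mem)              # iteration 1 sends every page
    for writes in events:
        kernel_dirty = set()
        for page, content, before_send in writes:
            kernel_dirty.add(page)  # the kernel marks it dirty either way
            if before_send:
                mem[page] = content
        for page in to_send:
            if received.get(page) == mem[page]:
                false_resends += 1  # unchanged bytes cross the wire again
            received[page] = mem[page]
        for page, content, before_send in writes:
            if not before_send:
                mem[page] = content
        to_send = kernel_dirty      # the next bitmap sync returns these
    return false_resends
```

With page 'n' written before it is sent in iteration 1, the model resends identical content in iteration 2; if the write lands after the send, the iteration-2 resend carries genuinely new content and is not counted.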
> > > A page is "not really dirty" if it is written first and sent later
> > > (and not written again after that) during one iteration. I guess this
> > > would not happen as often during the other iterations as during the 1st
> > > iteration, because all the pages of the VM are sent to the dest node
> > > during the 1st iteration, while during the others only part of the
> > > pages are sent. So I think the "not-really-dirty" pages are produced
> > > mainly during the 1st iteration, and perhaps very few during the other
> > > iterations.
> > >
> > > If we could avoid resending the "not-really-dirty" pages, intuitively I
> > > think the time spent on Iteration 2 would be halved. This is a chain
> > > reaction, because the dirty pages produced during Iteration 2 are
> > > halved, which in turn halves the time spent on Iteration 3, then
> > > Iteration 4, 5...
> >
> > Yes; these numbers don't show how many of them are false dirty though.
> >
> > One problem is thinking about pages that have been redirtied: if the page
> > is dirtied after the sync but before the network write, then it's the
> > false dirty that you're describing.
> >
> > However, if the page is being written a few times, so that it would also
> > have been written after the network write, then it isn't a false dirty.
> >
> > You might be able to figure that out with some kernel tracing of when the
> > dirtying happens, but it might be easier to write the fix!
> >
> > Dave
>
> Hi, I have made some new progress now.
>
> To tell exactly how many false dirty pages there are in each iteration, I
> malloc a buffer as big as the whole VM memory. When a page is transferred
> to the dest node, it is copied to the buffer; during the next iteration,
> if a page is transferred again, it is compared to the old copy in the
> buffer, and the old copy is replaced for the next comparison if the page
> is really dirty. Thus, we are now able to get the exact number of false
> dirty pages.
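The measurement described above amounts to keeping a shadow copy of each page as it was last transferred, then classifying every retransmission by a byte comparison against that copy. A minimal sketch with invented names (Python standing in for the C instrumentation):

```python
def classify_retransmissions(shadow, pages_to_send, mem):
    """shadow: {page: bytes last sent} (updated in place).
    Returns (really_dirty, false_dirty) counts for this iteration."""
    really = false = 0
    for page in pages_to_send:
        if shadow.get(page) == mem[page]:
            false += 1                  # content unchanged since last send
        else:
            really += 1
            shadow[page] = mem[page]    # keep the new content for next round
    return really, false
```

The shadow buffer doubles the memory footprint of the guest, which is exactly why the thread later moves to storing a 20-byte hash per page instead.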
>
> This time, I use 15 workloads to get the statistics. They are:
>
> 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all
> scientific computing workloads like Quantum Chromodynamics, Fluid
> Dynamics, etc. I picked these 11 benchmarks because, compared to the
> others, they have larger memory footprints and higher memory dirty rates.
> Thus most of them could not converge to stop-and-copy at the default
> migration speed (32MB/s).
> 2. kernel compilation
> 3. idle VM
> 4. Apache web server serving static content
>
> (The above workloads all run in a VM with 1 vcpu and 1GB memory, and the
> migration speed is the default 32MB/s.)
>
> 5. Memcached. The VM has 6 CPU cores and 6GB memory, and 4GB are used as
> the cache. After filling up the 4GB cache, a client writes to the cache at
> a constant speed during migration. This time, the migration speed is not
> limited, and is up to the capability of 1Gbps Ethernet.
>
> Summarize the results first (the precise numbers are below):
>
> 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during
> some iterations) of false dirty pages out of all the dirty pages from
> iteration 2 on (and the big proportion lasts through the following
> iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and
> memcached.
> 2. 2 workloads (idle, web server) spend most of the migration time on
> iteration 1; even though the proportion of false dirty pages is big from
> iteration 2 on, the space to optimize is small.
> 3. 1 workload (kernel compilation) only has a big proportion during
> iteration 2, not in the other iterations.
> 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion
> of false dirty pages from iteration 2 on, so the space to optimize for
> them is small.
>
> Now I want to talk a little more about the reasons why false dirty pages
> are produced. The first reason is what we have discussed before: the
> mechanism used to track dirty pages.
> And then I come up with another reason. Here is the situation: a write
> operation to a memory page happens, but it doesn't change any content of
> the page. So it's "write but not dirty", and the kernel still marks the
> page as dirty. One guy in our lab has done some experiments to figure out
> the proportion of "write but not dirty" operations, using the cpu2006
> benchmark suite. According to his results, most workloads have a small
> proportion (<10%) of "write but not dirty" out of all the write
> operations, while a few workloads have a higher proportion (one even as
> high as 50%). We are not yet sure why "write but not dirty" happens; it
> just does.

There are a few different reasons I can think of:

  a) You have a flag or mutex that's set and cleared; so it gets set
     (marked dirty) and cleared around some operation. By the time we come
     to migrate it, it's back to cleared again. Similarly with other
     temporary data structures.

  b) Some system operation causes the page to be moved - e.g. swap or the
     kernel reorganising memory.

However, it's a shame that I don't think you can tell in your experiment
which of the two cases we're hitting. I'd like to know whether it's worth
working on making the page sync mechanism better, or whether it's more
important to deal with the second reason you show.

> So these two reasons contribute to the false dirty pages. To optimize, I
> compute and store the SHA1 hash before transferring each page. Next time,
> if a page needs retransmission, its SHA1 hash is computed again and
> compared to the old hash. If the hashes are the same, it's a false dirty
> page, and we just skip it; otherwise, the page is transferred, and the new
> hash replaces the old one for the next comparison.
> The reason to use a SHA1 hash rather than byte-by-byte comparison is the
> memory overhead. One SHA1 hash is 20 bytes, so we need extra memory equal
> to 20/4096 (<1/200) of the whole VM memory, which is relatively small.
> As far as I know, SHA1 hashes are widely used in deduplication for backup
> systems. It has been shown that the probability of a hash collision is far
> smaller than that of a disk hardware fault, so SHA1 is treated as a secure
> hash: if the hashes of two chunks are the same, the contents must be the
> same. So I think the SHA1 hash can replace byte-by-byte comparison in the
> VM memory scenario.

There was a proposal
( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html )
to do a migration system where a copy of the migration RAM is stored on
disc on the destination for cases where similar VMs are migrated, and it
used a checksum for each page to find the matching page in the cache; that
originally used a smaller hash, I think in the end they used SHA-256.
(Hash based checks still make me nervous for intentional collisions, but
that's probably me being paranoid?)

> Then I did the same migration experiments using the SHA1 hash. For the 4
> workloads with big proportions of false dirty pages, the improvement is
> remarkable. Without the optimization, they either could not converge to
> stop-and-copy, or took a very long time to complete. With the SHA1 hash
> method, all of them now complete in a relatively short time.
> For the reason discussed above, the other workloads don't get notable
> improvements from the optimization. So below, I only show the exact
> numbers after optimization for the 4 workloads with remarkable
> improvements.
>
> Any comments or suggestions?

You might be able to save some of the CPU time; we've got a test that
checks if a page is all-zero; if you're doing the SHA calculation you could
avoid doing the all-zero check and replace it by comparing the output of
the SHA.
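The SHA1 bookkeeping described above can be sketched as follows: keep one 20-byte digest per transferred page (so 20/4096 of guest RAM) and skip any page the bitmap reports dirty whose digest has not changed. This is a hedged Python sketch with invented names, not the actual QEMU patch:

```python
import hashlib

PAGE_SIZE = 4096

def send_if_really_dirty(digests, page, content, send):
    """digests: {page: 20-byte SHA1 of last-sent content}.
    send: callback taking (page, content). Returns True if the page was sent."""
    h = hashlib.sha1(content).digest()   # 20 bytes per page
    if digests.get(page) == h:
        return False                     # false dirty: skip retransmission
    digests[page] = h                    # remember what the destination has
    send(page, content)
    return True
```

Following Dave's suggestion, the all-zero-page check could be folded into this by precomputing the digest of a zero page once and comparing against it, instead of scanning every page for zero bytes separately.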
>
> Below is the experiment data:
> (
> "dup" means zero pages; these pages take very little migration time and
> network resources, so they are never regarded as dirty pages in my numbers;
> "rd" means really dirty pages;
> "fd" means false dirty pages;
> The numbers refer to quantities of pages.
> )
>
> ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
>
> 1. memcached
>
> ----- original pre-copy (can not converge): -----
> Iteration 1, duration: 42111 ms, transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0), new dirty pages: 499015, remaining dirty pages: 507397
> Iteration 2, duration: 17208 ms, transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284), new dirty pages: 261237, remaining dirty pages: 269688
> Iteration 3, duration: 9134 ms, transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958), new dirty pages: 170281, remaining dirty pages: 177592
> Iteration 4, duration: 5920 ms, transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392), new dirty pages: 121154, remaining dirty pages: 128780
> Iteration 5, duration: 4239 ms, transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350), new dirty pages: 100976, remaining dirty pages: 108205
> Iteration 6, duration: 3495 ms, transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242), new dirty pages: 74547, remaining dirty pages: 82399
> Iteration 7, duration: 2583 ms, transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088), new dirty pages: 58209, remaining dirty pages: 66448
> Iteration 8, duration: 2039 ms, transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568), new dirty pages: 43511, remaining dirty pages: 51425
> Iteration 9, duration: 1513 ms, transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788), new dirty pages: 43722, remaining dirty pages: 51663
> Iteration 10, duration: 1521 ms, transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151), new dirty pages: 35347, remaining dirty pages: 43334
> Iteration 11, duration: 1230 ms, transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994), new dirty pages: 28851, remaining dirty pages: 36898
> Iteration 12, duration: 1031 ms, transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426), new dirty pages: 27062, remaining dirty pages: 34309
> Iteration 13, duration: 917 ms, transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180), new dirty pages: 22723, remaining dirty pages: 30647
> Iteration 14, duration: 762 ms, transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492), new dirty pages: 18208, remaining dirty pages: 26953
> Iteration 15, duration: 650 ms, transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693), new dirty pages: 16085, remaining dirty pages: 24402
> Iteration 16, duration: 554 ms, transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363), new dirty pages: 14766, remaining dirty pages: 23222
> Iteration 17, duration: 538 ms, transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655), new dirty pages: 13381, remaining dirty pages: 21169
> Iteration 18, duration: 487 ms, transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271), new dirty pages: 13325, remaining dirty pages: 20405
> Iteration 19, duration: 428 ms, transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744), new dirty pages: 10274, remaining dirty pages: 18447
> Iteration 20, duration: 377 ms, transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469), new dirty pages: 9708, remaining dirty pages: 17268
> Iteration 21, duration: 320 ms, transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433), new dirty pages: 8015, remaining dirty pages: 16061
> Iteration 22, duration: 268 ms, transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417), new dirty pages: 7227, remaining dirty pages: 15667
> Iteration 23, duration: 269 ms, transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347), new dirty pages: 7591, remaining dirty pages: 15445
> Iteration 24, duration: 271 ms, transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184), new dirty pages: 15126, remaining dirty pages: 22822
> Iteration 25, duration: 549 ms, transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213), new dirty pages: 14559, remaining dirty pages: 21563
> Iteration 26, duration: 499 ms, transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518), new dirty pages: 11947, remaining dirty pages: 19229
> Iteration 27, duration: 376 ms, transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248), new dirty pages: 8561, remaining dirty pages: 16967
> Iteration 28, duration: 324 ms, transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027), new dirty pages: 8655, remaining dirty pages: 16272
> Iteration 29, duration: 274 ms, transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725), new dirty pages: 6300, remaining dirty pages: 14759
> Iteration 30, duration: 218 ms, transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099), new dirty pages: 5143, remaining dirty pages: 13562
>
> ----- after optimization: -----
> Iteration 1, duration: 40664 ms, transferred pages: 1569037 (dup: 405940, rd: 1163097), new dirty pages: 506846, remaining dirty pages: 514979
> Iteration 2, duration: 8032 ms, transferred pages: 161130 (dup: 4007, rd: 157123), new dirty pages: 153479, remaining dirty pages: 153479

Big difference.
> Iteration 3, duration: 2620 ms, transferred pages: 65260 (dup: 20, rd: 65240), new dirty pages: 64014, remaining dirty pages: 67100
> Iteration 4, duration: 1160 ms, transferred pages: 30227 (dup: 60, rd: 30167), new dirty pages: 34031, remaining dirty pages: 41414
> Iteration 5, duration: 648 ms, transferred pages: 18700 (dup: 56, rd: 18644), new dirty pages: 18375, remaining dirty pages: 25536
> Iteration 6, duration: 389 ms, transferred pages: 11399 (dup: 55, rd: 11344), new dirty pages: 12536, remaining dirty pages: 17516
> Iteration 7, duration: 292 ms, transferred pages: 8197 (dup: 0, rd: 8197), new dirty pages: 8387, remaining dirty pages: 16802
> Iteration 8, duration: 171 ms, transferred pages: 4931 (dup: 39, rd: 4892), new dirty pages: 6182, remaining dirty pages: 14060
> Iteration 9, duration: 163 ms, transferred pages: 4355 (dup: 16, rd: 4339), new dirty pages: 5530, remaining dirty pages: 11973
> Iteration 10, duration: 104 ms, transferred pages: 3266 (dup: 0, rd: 3266), new dirty pages: 2893, remaining dirty pages: 11014
> Iteration 11, duration: 52 ms, transferred pages: 1153 (dup: 0, rd: 1153), new dirty pages: 1586, remaining dirty pages: 10516
> Iteration 12, duration: 52 ms, transferred pages: 1921 (dup: 39, rd: 1882), new dirty pages: 1619, remaining dirty pages: 8842
> Iteration 13, duration: 62 ms, transferred pages: 1537 (dup: 0, rd: 1537), new dirty pages: 2052, remaining dirty pages: 8871
> Iteration 14, duration: 58 ms, transferred pages: 1665 (dup: 0, rd: 1665), new dirty pages: 1947, remaining dirty pages: 7989
> Iteration 15, duration: 2 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 7989
> total time: 54693 milliseconds

Very nice.

Dave

> 2.
cpu2006.zeusmp
>
> ----- original pre-copy (can not converge): -----
> Iteration 1, duration: 21112 ms, transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0), new dirty pages: 127866, remaining dirty pages: 127866
> Iteration 2, duration: 6192 ms, transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511), new dirty pages: 131655, remaining dirty pages: 133859
> Iteration 3, duration: 6699 ms, transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319), new dirty pages: 121027, remaining dirty pages: 122949
> Iteration 4, duration: 5999 ms, transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688), new dirty pages: 122759, remaining dirty pages: 123196
> Iteration 5, duration: 5804 ms, transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265), new dirty pages: 123697, remaining dirty pages: 124176
> Iteration 6, duration: 5698 ms, transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437), new dirty pages: 121838, remaining dirty pages: 122306
> Iteration 7, duration: 5515 ms, transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760), new dirty pages: 122382, remaining dirty pages: 122382
> Iteration 8, duration: 6086 ms, transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004), new dirty pages: 121587, remaining dirty pages: 123144
> Iteration 9, duration: 5899 ms, transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913), new dirty pages: 120391, remaining dirty pages: 122571
> Iteration 10, duration: 5801 ms, transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563), new dirty pages: 122302, remaining dirty pages: 123448
> Iteration 11, duration: 5909 ms, transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035), new dirty pages: 123258, remaining dirty pages: 123258
> Iteration 12, duration: 6293 ms, transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641), new dirty pages: 123623, remaining dirty pages: 125670
> Iteration 13, duration: 6398 ms, transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062), new dirty pages: 122355, remaining dirty pages: 123128
> Iteration 14, duration: 6301 ms, transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909), new dirty pages: 120980, remaining dirty pages: 122215
> Iteration 15, duration: 6304 ms, transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653), new dirty pages: 121628, remaining dirty pages: 122454
> Iteration 16, duration: 6398 ms, transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826), new dirty pages: 122246, remaining dirty pages: 122536
> Iteration 17, duration: 6201 ms, transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649), new dirty pages: 121460, remaining dirty pages: 122448
> Iteration 18, duration: 6401 ms, transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939), new dirty pages: 123518, remaining dirty pages: 123694
> Iteration 19, duration: 7003 ms, transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794), new dirty pages: 120568, remaining dirty pages: 122389
> Iteration 20, duration: 6400 ms, transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901), new dirty pages: 121229, remaining dirty pages: 122196
> Iteration 21, duration: 6703 ms, transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786), new dirty pages: 123284, remaining dirty pages: 125585
> Iteration 22, duration: 6902 ms, transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282), new dirty pages: 122057, remaining dirty pages: 123758
> Iteration 23, duration: 6800 ms, transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837), new dirty pages: 120916, remaining dirty pages: 122664
> Iteration 24, duration: 7202 ms, transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658), new dirty pages: 122715, remaining dirty pages: 123428
> Iteration 25, duration: 7202 ms, transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695), new dirty pages: 123180, remaining dirty pages: 123689
> Iteration 26, duration: 7404 ms, transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939), new dirty pages: 122149, remaining dirty pages: 122746
> Iteration 27, duration: 7205 ms, transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419), new dirty pages: 120299, remaining dirty pages: 122618
> Iteration 28, duration: 7100 ms, transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541), new dirty pages: 122984, remaining dirty pages: 124528
> Iteration 29, duration: 7904 ms, transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865), new dirty pages: 127080, remaining dirty pages: 127548
> Iteration 30, duration: 7906 ms, transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073), new dirty pages: 125028, remaining dirty pages: 125058
>
> ----- after optimization: -----
> Iteration 1, duration: 21601 ms, transferred pages: 266450 (dup: 89731, rd: 176719), new dirty pages: 139843, remaining dirty pages: 139843
> Iteration 2, duration: 1747 ms, transferred pages: 92077 (dup: 78364, rd: 13713), new dirty pages: 90945, remaining dirty pages: 90945
> Iteration 3, duration: 1592 ms, transferred pages: 62253 (dup: 49435, rd: 12818), new dirty pages: 76929, remaining dirty pages: 76929
> Iteration 4, duration: 992 ms, transferred pages: 44837 (dup: 37886, rd: 6951), new dirty pages: 71331, remaining dirty pages: 72916
> Iteration 5, duration: 998 ms, transferred pages: 55229 (dup: 47150, rd: 8079), new dirty pages: 21703, remaining dirty pages: 23302
> Iteration 6, duration: 211 ms, transferred pages: 20337 (dup: 18516, rd: 1821), new dirty pages: 14500, remaining dirty pages: 14500
> Iteration 7, duration: 31 ms, transferred pages: 12933 (dup: 12627, rd: 306), new dirty pages: 1520, remaining dirty pages: 1520
> Iteration 8, duration: 30 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 4, remaining dirty pages: 1524
> total time: 27225 milliseconds
>
> 3.
cpu2006.bzip2
>
> ----- original pre-copy: -----
> Iteration 1, duration: 18306 ms, transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0), new dirty pages: 106299, remaining dirty pages: 106299
> Iteration 2, duration: 10694 ms, transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525), new dirty pages: 34394, remaining dirty pages: 36082
> Iteration 3, duration: 2998 ms, transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264), new dirty pages: 6419, remaining dirty pages: 8059
> Iteration 4, duration: 699 ms, transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143), new dirty pages: 1226, remaining dirty pages: 3537
> Iteration 5, duration: 200 ms, transferred pages: 1636 (dup: 0, rd: 1194, fd: 442), new dirty pages: 478, remaining dirty pages: 2379
> Iteration 6, duration: 1 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2379
>
> ----- after optimization: -----
> Iteration 1, duration: 13995 ms, transferred pages: 266314 (dup: 152118, rd: 114196), new dirty pages: 97009, remaining dirty pages: 97145
> Iteration 2, duration: 1215 ms, transferred pages: 33400 (dup: 26745, rd: 6655), new dirty pages: 12866, remaining dirty pages: 14017
> Iteration 3, duration: 701 ms, transferred pages: 5774 (dup: 48, rd: 5726), new dirty pages: 6342, remaining dirty pages: 8761
> Iteration 4, duration: 500 ms, transferred pages: 4111 (dup: 21, rd: 4090), new dirty pages: 4311, remaining dirty pages: 6485
> Iteration 5, duration: 400 ms, transferred pages: 3273 (dup: 1, rd: 3272), new dirty pages: 3034, remaining dirty pages: 5431
> Iteration 6, duration: 301 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2094, remaining dirty pages: 4472
> Iteration 7, duration: 299 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 2066, remaining dirty pages: 4082
> Iteration 8, duration: 202 ms, transferred pages: 1636 (dup: 0, rd: 1636), new dirty pages: 2881, remaining dirty pages: 4648
> Iteration 9, duration: 300 ms, transferred pages: 2454 (dup: 0, rd: 2454), new dirty pages: 4775, remaining dirty pages: 6778
> Iteration 10, duration: 400 ms, transferred pages: 3281 (dup: 9, rd: 3272), new dirty pages: 3757, remaining dirty pages: 5576
> Iteration 11, duration: 401 ms, transferred pages: 3279 (dup: 7, rd: 3272), new dirty pages: 6980, remaining dirty pages: 8906
> Iteration 12, duration: 500 ms, transferred pages: 7118 (dup: 3035, rd: 4083), new dirty pages: 10774, remaining dirty pages: 11922
> Iteration 13, duration: 116 ms, transferred pages: 11706 (dup: 10152, rd: 1554), new dirty pages: 1326, remaining dirty pages: 1326
> Iteration 14, duration: 117 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 1326
> total time: 19479 milliseconds
>
> 4. cpu2006.mcf
>
> ----- original pre-copy: -----
> Iteration 1, duration: 31711 ms, transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0), new dirty pages: 244403, remaining dirty pages: 244403
> Iteration 2, duration: 29603 ms, transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897), new dirty pages: 227335, remaining dirty pages: 229463
> Iteration 3, duration: 27806 ms, transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723), new dirty pages: 195593, remaining dirty pages: 197483
> Iteration 4, duration: 23907 ms, transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664), new dirty pages: 215066, remaining dirty pages: 217006
> Iteration 5, duration: 26305 ms, transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052), new dirty pages: 111098, remaining dirty pages: 112815
> Iteration 6, duration: 13502 ms, transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637), new dirty pages: 161054, remaining dirty pages: 163417
> Iteration 7, duration: 19705 ms, transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328), new dirty pages: 220562, remaining dirty pages: 222713
> Iteration 8, duration: 27003 ms, transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139), new dirty pages: 219787, remaining dirty pages: 221619
> Iteration 9, duration: 26802 ms, transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576), new dirty pages: 207959, remaining dirty pages: 210330
> Iteration 10, duration: 25411 ms, transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930), new dirty pages: 144442, remaining dirty pages: 146856
> Iteration 11, duration: 17714 ms, transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372), new dirty pages: 205127, remaining dirty pages: 207179
> Iteration 12, duration: 25112 ms, transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121), new dirty pages: 167319, remaining dirty pages: 169052
> Iteration 13, duration: 20411 ms, transferred pages: 166886 (dup: 14, rd: 21960, fd: 144912), new dirty pages: 221592, remaining dirty pages: 223758
> Iteration 14, duration: 27126 ms, transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310), new dirty pages: 233630, remaining dirty pages: 235588
> Iteration 15, duration: 28517 ms, transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602), new dirty pages: 224282, remaining dirty pages: 226549
> Iteration 16, duration: 27422 ms, transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359), new dirty pages: 209815, remaining dirty pages: 212177
> Iteration 17, duration: 25723 ms, transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821), new dirty pages: 220297, remaining dirty pages: 222214
> Iteration 18, duration: 26920 ms, transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914), new dirty pages: 192015, remaining dirty pages: 194173
> Iteration 19, duration: 23520 ms, transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090), new dirty pages: 96450, remaining dirty pages: 98384
> Iteration 20, duration: 11805 ms, transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100), new dirty pages: 6978, remaining dirty pages: 8824
> Iteration 21, duration: 799 ms, transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742), new dirty pages: 138, remaining dirty pages: 2417
> Iteration 22, duration: 1 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2417
>
> ----- after optimization: -----
> Iteration 1, duration: 31711 ms, transferred pages: 266450 (dup: 6831, rd: 259619), new dirty pages: 240209, remaining dirty pages: 240209
> Iteration 2, duration: 6250 ms, transferred pages: 51244 (dup: 211, rd: 51033), new dirty pages: 226651, remaining dirty pages: 228571
> Iteration 3, duration: 4395 ms, transferred pages: 36008 (dup: 80, rd: 35928), new dirty pages: 110719, remaining dirty pages: 111478
> Iteration 4, duration: 3390 ms, transferred pages: 28068 (dup: 28, rd: 28040), new dirty pages: 185172, remaining dirty pages: 185172
> Iteration 5, duration: 2986 ms, transferred pages: 23780 (dup: 45, rd: 23735), new dirty pages: 64357, remaining dirty pages: 66305
> Iteration 6, duration: 2727 ms, transferred pages: 22800 (dup: 12, rd: 22788), new dirty pages: 61675, remaining dirty pages: 61675
> Iteration 7, duration: 2372 ms, transferred pages: 18943 (dup: 13, rd: 18930), new dirty pages: 55144, remaining dirty pages: 55265
> Iteration 8, duration: 2100 ms, transferred pages: 17189 (dup: 11, rd: 17178), new dirty pages: 55244, remaining dirty pages: 55668
> Iteration 9, duration: 2003 ms, transferred pages: 16371 (dup: 11, rd: 16360), new dirty pages: 107058, remaining dirty pages: 108014
> Iteration 10, duration: 2132 ms, transferred pages: 17825 (dup: 24, rd: 17801), new dirty pages: 126214, remaining dirty pages: 126214
> Iteration 11, duration: 2229 ms, transferred pages: 18156 (dup: 22, rd: 18134), new dirty pages: 65725, remaining dirty pages: 65725
> Iteration 12, duration: 2315 ms, transferred pages: 18651 (dup: 21, rd: 18630), new dirty pages: 52575, remaining dirty pages: 53903
> Iteration 13, duration: 2147 ms, transferred pages: 17435
(dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260 > Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266 > Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792 > Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639 > Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170 > Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 103547 > Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521 > Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054 > Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624 > Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265 > Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813 > Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608 > Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311 > Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294 > Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936 
> Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423 > Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292 > Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457 > Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304 > Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815 > Iteration 33, duration: 747 ms , transferred pages: 6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670 > Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422 > Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222 > Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337 > Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251 > Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251 > total time: 97919 milliseconds > > ------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)------------------- > > 5. 
idle > > Iteration 1, duration: 14702 ms , transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0) , new dirty pages: 14595 , remaining dirty pages: 14595 > Iteration 2, duration: 1592 ms , transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029) , new dirty pages: 218 , remaining dirty pages: 2401 > Iteration 3, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2401 > > 6. kernel compilation (can not converge) > > Iteration 1, duration: 20607 ms , transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0) , new dirty pages: 19293 , remaining dirty pages: 19293 > Iteration 2, duration: 2092 ms , transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954) , new dirty pages: 8318 , remaining dirty pages: 10435 > Iteration 3, duration: 1000 ms , transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924) , new dirty pages: 8736 , remaining dirty pages: 10687 > Iteration 4, duration: 1000 ms , transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091) , new dirty pages: 7627 , remaining dirty pages: 9879 > Iteration 5, duration: 900 ms , transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760) , new dirty pages: 7287 , remaining dirty pages: 9613 > Iteration 6, duration: 900 ms , transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601) , new dirty pages: 8958 , remaining dirty pages: 10951 > Iteration 7, duration: 1099 ms , transferred pages: 9309 (dup: 311, rd: 8051, fd: 947) , new dirty pages: 7189 , remaining dirty pages: 8831 > Iteration 8, duration: 800 ms , transferred pages: 6832 (dup: 288, rd: 5717, fd: 827) , new dirty pages: 5782 , remaining dirty pages: 7781 > Iteration 9, duration: 701 ms , transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721) , new dirty pages: 4587 , remaining dirty pages: 6493 > Iteration 10, duration: 500 ms , transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033) , new dirty pages: 7352 , remaining dirty pages: 9611 > Iteration 11, duration: 900 ms , transferred pages: 7759 (dup: 397, rd: 6563, 
fd: 799) , new dirty pages: 6686 , remaining dirty pages: 8538 > Iteration 12, duration: 800 ms , transferred pages: 6808 (dup: 264, rd: 6017, fd: 527) , new dirty pages: 6871 , remaining dirty pages: 8601 > Iteration 13, duration: 800 ms , transferred pages: 6775 (dup: 231, rd: 5722, fd: 822) , new dirty pages: 7540 , remaining dirty pages: 9366 > Iteration 14, duration: 900 ms , transferred pages: 7507 (dup: 145, rd: 5900, fd: 1462) , new dirty pages: 7581 , remaining dirty pages: 9440 > Iteration 15, duration: 900 ms , transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151) , new dirty pages: 7268 , remaining dirty pages: 9078 > Iteration 16, duration: 800 ms , transferred pages: 6759 (dup: 215, rd: 5763, fd: 781) , new dirty pages: 6861 , remaining dirty pages: 9180 > Iteration 17, duration: 800 ms , transferred pages: 6838 (dup: 294, rd: 6037, fd: 507) , new dirty pages: 6196 , remaining dirty pages: 8538 > Iteration 18, duration: 800 ms , transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639) , new dirty pages: 5947 , remaining dirty pages: 7633 > Iteration 19, duration: 700 ms , transferred pages: 5919 (dup: 193, rd: 4853, fd: 873) , new dirty pages: 5861 , remaining dirty pages: 7575 > Iteration 20, duration: 600 ms , transferred pages: 5284 (dup: 376, rd: 4408, fd: 500) , new dirty pages: 5206 , remaining dirty pages: 7497 > Iteration 21, duration: 600 ms , transferred pages: 5147 (dup: 239, rd: 4308, fd: 600) , new dirty pages: 5031 , remaining dirty pages: 7381 > Iteration 22, duration: 599 ms , transferred pages: 5064 (dup: 156, rd: 4026, fd: 882) , new dirty pages: 5601 , remaining dirty pages: 7918 > Iteration 23, duration: 702 ms , transferred pages: 5965 (dup: 239, rd: 5028, fd: 698) , new dirty pages: 6079 , remaining dirty pages: 8032 > Iteration 24, duration: 700 ms , transferred pages: 6175 (dup: 449, rd: 5146, fd: 580) , new dirty pages: 10932 , remaining dirty pages: 12789 > Iteration 25, duration: 1300 ms , transferred pages: 10936 (dup: 302, 
rd: 6205, fd: 4429) , new dirty pages: 8713 , remaining dirty pages: 10566 > Iteration 26, duration: 1000 ms , transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518) , new dirty pages: 5119 , remaining dirty pages: 7403 > Iteration 27, duration: 600 ms , transferred pages: 5007 (dup: 99, rd: 4099, fd: 809) , new dirty pages: 2226 , remaining dirty pages: 4622 > Iteration 28, duration: 300 ms , transferred pages: 2491 (dup: 37, rd: 1794, fd: 660) , new dirty pages: 6746 , remaining dirty pages: 8877 > Iteration 29, duration: 800 ms , transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012) , new dirty pages: 6070 , remaining dirty pages: 8190 > Iteration 30, duration: 700 ms , transferred pages: 6052 (dup: 326, rd: 5107, fd: 619) , new dirty pages: 5177 , remaining dirty pages: 7315 > > 7. web server > > Iteration 1, duration: 20902 ms , transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0) , new dirty pages: 8528 , remaining dirty pages: 8528 > Iteration 2, duration: 796 ms , transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456) , new dirty pages: 650 , remaining dirty pages: 2706 > Iteration 3, duration: 100 ms , transferred pages: 818 (dup: 0, rd: 383, fd: 435) , new dirty pages: 328 , remaining dirty pages: 2216 > Iteration 4, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2216 > > > 8. 
cpu2006.bwaves (can not converge) > > Iteration 1, duration: 31715 ms , transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0) , new dirty pages: 242702 , remaining dirty pages: 242702 > Iteration 2, duration: 29397 ms , transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515) , new dirty pages: 230889 , remaining dirty pages: 233083 > Iteration 3, duration: 28205 ms , transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080) , new dirty pages: 226998 , remaining dirty pages: 229223 > Iteration 4, duration: 27805 ms , transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359) , new dirty pages: 227360 , remaining dirty pages: 229009 > Iteration 5, duration: 27703 ms , transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456) , new dirty pages: 225885 , remaining dirty pages: 228108 > Iteration 6, duration: 27600 ms , transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265) , new dirty pages: 223555 , remaining dirty pages: 225740 > Iteration 7, duration: 27309 ms , transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673) , new dirty pages: 231975 , remaining dirty pages: 234141 > Iteration 8, duration: 28403 ms , transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226) , new dirty pages: 222170 , remaining dirty pages: 223914 > Iteration 9, duration: 27105 ms , transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690) , new dirty pages: 230065 , remaining dirty pages: 232170 > Iteration 10, duration: 28104 ms , transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327) , new dirty pages: 227590 , remaining dirty pages: 229559 > Iteration 11, duration: 27801 ms , transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996) , new dirty pages: 228457 , remaining dirty pages: 230299 > Iteration 12, duration: 27916 ms , transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562) , new dirty pages: 238326 , remaining dirty pages: 240065 > > 9. 
cpu2006.lbm (can not converge) > Iteration 1, duration: 31012 ms , transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0) , new dirty pages: 108960 , remaining dirty pages: 108960 > Iteration 2, duration: 13095 ms , transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474) , new dirty pages: 129292 , remaining dirty pages: 131730 > Iteration 3, duration: 15802 ms , transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384) , new dirty pages: 116682 , remaining dirty pages: 118724 > Iteration 4, duration: 14204 ms , transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205) , new dirty pages: 107246 , remaining dirty pages: 109654 > Iteration 5, duration: 13208 ms , transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142) , new dirty pages: 105371 , remaining dirty pages: 107048 > Iteration 6, duration: 12804 ms , transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075) , new dirty pages: 103841 , remaining dirty pages: 106184 > Iteration 7, duration: 12709 ms , transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674) , new dirty pages: 106692 , remaining dirty pages: 108985 > Iteration 8, duration: 13105 ms , transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033) , new dirty pages: 103132 , remaining dirty pages: 104948 > Iteration 9, duration: 12607 ms , transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608) , new dirty pages: 102511 , remaining dirty pages: 104391 > Iteration 10, duration: 12514 ms , transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156) , new dirty pages: 102888 , remaining dirty pages: 105029 > > 10. 
cpu2006.astar (can not converge) > > Iteration 1, duration: 28402 ms , transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0) , new dirty pages: 62078 , remaining dirty pages: 62078 > Iteration 2, duration: 7393 ms , transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375) , new dirty pages: 48854 , remaining dirty pages: 50825 > Iteration 3, duration: 6001 ms , transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540) , new dirty pages: 48137 , remaining dirty pages: 49868 > Iteration 4, duration: 5800 ms , transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055) , new dirty pages: 49147 , remaining dirty pages: 51571 > Iteration 5, duration: 6102 ms , transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682) , new dirty pages: 55606 , remaining dirty pages: 57265 > Iteration 6, duration: 6699 ms , transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061) , new dirty pages: 9166 , remaining dirty pages: 11482 > Iteration 7, duration: 1200 ms , transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805) , new dirty pages: 8294 , remaining dirty pages: 9946 > Iteration 8, duration: 1000 ms , transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002) , new dirty pages: 5475 , remaining dirty pages: 7227 > Iteration 9, duration: 600 ms , transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438) , new dirty pages: 4175 , remaining dirty pages: 6494 > Iteration 10, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3856, fd: 234) , new dirty pages: 4095 , remaining dirty pages: 6499 > Iteration 11, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3313, fd: 777) , new dirty pages: 3371 , remaining dirty pages: 5780 > Iteration 12, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3823, fd: 267) , new dirty pages: 7518 , remaining dirty pages: 9208 > Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763 > Iteration 14, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) 
, new dirty pages: 4346 , remaining dirty pages: 6019 > Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983 > Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585 > Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598 > Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775 > Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964 > Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703 > > 11. cpu2006.xalancbmk (can not converge) > > Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169 > Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771 > Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869 > Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589 > Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181 > Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022 > Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new dirty pages: 64880 , remaining dirty pages: 67179 > Iteration 
8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586 > Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320 > Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286 > > 12. cpu2006.milc (can not converge) > > Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860 > Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050 > Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723 > Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513 > Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008 > Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416 > Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577 > Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422 > Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720 > Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747 > > 13. 
cpu2006.cactusADM (can not converge) > > Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869 > Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235 > Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273 > Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416 > Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425 > Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664 > Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952 > Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553 > Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273 > Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928 > > 14. 
cpu2006.GemsFDTD (can not converge) > > Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802 > Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064 > Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783 > Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063 > Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576 > Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618 > Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672 > Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147 > > 15. 
cpu2006.wrf (can not converge) > > Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392 > Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222 > Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892 > Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946 > Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760 > Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119 > Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844 > Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996 > Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699 > Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905 > > > > > > > > So I think "booting" and "kernel compilation" should benefit a lot from this > > > improvement. "Kernel compilation" would benefit because some > > > iterations take around 600 ms; if they were halved to 300 ms, the precopy > > > might get the chance to enter the stop-and-copy phase. 
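The convergence reasoning above can be sketched numerically. Below is a rough back-of-the-envelope model (my own sketch, not QEMU code), assuming 4 KiB guest pages and the fixed 32 MB/s default bandwidth and 300 ms downtime limit used in these runs: precopy can enter stop-and-copy once flushing the remaining dirty set fits within the allowed downtime.

```python
# Illustrative sketch of the precopy convergence condition; the constants
# match the "1 vcpu, 1 GB ram, default bandwidth" runs below, and the
# ~8000-page working set is an assumption read off the kernel-compilation log.
PAGE_SIZE = 4096                # bytes per guest page (typical x86)
BANDWIDTH = 32 * 1024 * 1024    # bytes/s, the default 32 MB/s
DOWNTIME_LIMIT_MS = 300         # maximum tolerated downtime

def iteration_ms(pages):
    """Time to transfer `pages` at full bandwidth, in milliseconds."""
    return pages * PAGE_SIZE * 1000 / BANDWIDTH

def can_enter_stop_copy(remaining_dirty_pages):
    """Precopy may switch to stop-and-copy once the remaining dirty
    set can be flushed within the allowed downtime."""
    return iteration_ms(remaining_dirty_pages) <= DOWNTIME_LIMIT_MS

# Kernel compilation hovers around ~8000 remaining dirty pages per round:
print(round(iteration_ms(8000)))   # ~977 ms, far above the 300 ms budget
print(can_enter_stop_copy(8000))   # False: cannot converge
print(can_enter_stop_copy(2400))   # True: ~293 ms fits the budget
```

Under this model, halving the per-iteration dirty set (by not re-sending pages that were dirtied before they were transferred) is exactly what could push a workload under the downtime threshold.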
> > > > > > On the other hand, "idle" and "web server" would not benefit much, because > > > most of the time is spent on the first iteration and little on the others. > > > > > > As for "zeusmp" and "memcached", even if the time spent on every iteration > > > but the first were halved, they still could not converge to stop-and-copy > > > within the 300 ms downtime. > > > > > > --------------------1 vcpu, 1 GB ram, default bandwidth (32 MB/s):------------------ > > > > > > 1. booting : migration starts while the VM is booting > > > > > > Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414 > > > Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459 > > > Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356 > > > Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430 > > > Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430 > > > (note: When the workload does converge, the output of the last iteration is "fake"; it just indicates that the precopy is stepping into the stop-and-copy phase. > > > "n" means "normal pages" and "d" means "duplicate (zero) pages".) > > > > > > 2. 
idle
> > >
> > > Iteration 1, duration: 14496 ms, transferred pages: 266450 (n: 118980, d: 147470), new dirty pages: 17398, remaining dirty pages: 17398
> > > Iteration 2, duration: 1896 ms, transferred pages: 14953 (n: 14854, d: 99), new dirty pages: 1849, remaining dirty pages: 4294
> > > Iteration 3, duration: 300 ms, transferred pages: 2454 (n: 2454, d: 0), new dirty pages: 9, remaining dirty pages: 1849
> > > Iteration 4, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 1849
> > >
> > > 3. kernel compilation (can not converge)
> > >
> > > Iteration 1, duration: 20700 ms, transferred pages: 266450 (n: 169778, d: 96672), new dirty pages: 40067, remaining dirty pages: 40067
> > > Iteration 2, duration: 4696 ms, transferred pages: 38401 (n: 37787, d: 614), new dirty pages: 8852, remaining dirty pages: 10518
> > > Iteration 3, duration: 1000 ms, transferred pages: 8642 (n: 8180, d: 462), new dirty pages: 6331, remaining dirty pages: 8207
> > > Iteration 4, duration: 700 ms, transferred pages: 6110 (n: 5726, d: 384), new dirty pages: 5242, remaining dirty pages: 7339
> > > Iteration 5, duration: 600 ms, transferred pages: 5007 (n: 4908, d: 99), new dirty pages: 4868, remaining dirty pages: 7200
> > > Iteration 6, duration: 600 ms, transferred pages: 5226 (n: 4908, d: 318), new dirty pages: 6142, remaining dirty pages: 8116
> > > Iteration 7, duration: 700 ms, transferred pages: 5985 (n: 5726, d: 259), new dirty pages: 5902, remaining dirty pages: 8033
> > > Iteration 8, duration: 701 ms, transferred pages: 5893 (n: 5726, d: 167), new dirty pages: 7502, remaining dirty pages: 9642
> > > Iteration 9, duration: 900 ms, transferred pages: 7623 (n: 7362, d: 261), new dirty pages: 6408, remaining dirty pages: 8427
> > > Iteration 10, duration: 700 ms, transferred pages: 6008 (n: 5726, d: 282), new dirty pages: 8312, remaining dirty pages: 10731
> > > Iteration 11, duration: 1000 ms, transferred pages: 8353 (n: 8180, d: 173), new dirty pages: 6874, remaining dirty pages: 9252
> > > Iteration 12, duration: 899 ms, transferred pages: 7477 (n: 7362, d: 115), new dirty pages: 5573, remaining dirty pages: 7348
> > > Iteration 13, duration: 601 ms, transferred pages: 5099 (n: 4908, d: 191), new dirty pages: 7671, remaining dirty pages: 9920
> > > Iteration 14, duration: 900 ms, transferred pages: 7586 (n: 7362, d: 224), new dirty pages: 7359, remaining dirty pages: 9693
> > > Iteration 15, duration: 900 ms, transferred pages: 7682 (n: 7362, d: 320), new dirty pages: 7371, remaining dirty pages: 9382
> > >
> > > 4. cpu2006.zeusmp (can not converge)
> > >
> > > Iteration 1, duration: 21603 ms, transferred pages: 266450 (n: 176660, d: 89790), new dirty pages: 145625, remaining dirty pages: 145625
> > > Iteration 2, duration: 8696 ms, transferred pages: 144389 (n: 70862, d: 73527), new dirty pages: 125124, remaining dirty pages: 126360
> > > Iteration 3, duration: 6301 ms, transferred pages: 124057 (n: 51379, d: 72678), new dirty pages: 122528, remaining dirty pages: 124831
> > > Iteration 4, duration: 6400 ms, transferred pages: 124330 (n: 52196, d: 72134), new dirty pages: 124267, remaining dirty pages: 124768
> > > Iteration 5, duration: 6703 ms, transferred pages: 124034 (n: 54656, d: 69378), new dirty pages: 124151, remaining dirty pages: 124885
> > > Iteration 6, duration: 6703 ms, transferred pages: 124357 (n: 54658, d: 69699), new dirty pages: 124106, remaining dirty pages: 124634
> > > Iteration 7, duration: 6602 ms, transferred pages: 124568 (n: 53838, d: 70730), new dirty pages: 133828, remaining dirty pages: 133894
> > > Iteration 8, duration: 7600 ms, transferred pages: 133030 (n: 62021, d: 71009), new dirty pages: 126612, remaining dirty pages: 127476
> > > Iteration 9, duration: 7299 ms, transferred pages: 126511 (n: 59569, d: 66942), new dirty pages: 122727, remaining dirty pages: 123692
> > > Iteration 10, duration: 6609 ms, transferred pages: 123692 (n: 54539, d: 69153), new dirty pages: 122727, remaining dirty pages: 122727
> > > Iteration 11, duration: 6995 ms, transferred pages: 120347 (n: 56423, d: 63924), new dirty pages: 121430, remaining dirty pages: 123810
> > > Iteration 12, duration: 6703 ms, transferred pages: 123040 (n: 54657, d: 68383), new dirty pages: 122043, remaining dirty pages: 122813
> > > Iteration 13, duration: 7006 ms, transferred pages: 122353 (n: 57121, d: 65232), new dirty pages: 133869, remaining dirty pages: 134329
> > > Iteration 14, duration: 8209 ms, transferred pages: 132325 (n: 66932, d: 65393), new dirty pages: 126914, remaining dirty pages: 128918
> > > Iteration 15, duration: 7802 ms, transferred pages: 126931 (n: 63671, d: 63260), new dirty pages: 122351, remaining dirty pages: 124338
> > >
> > > 5. web server: an Apache web server. The client is configured with 50 concurrent connections.
> > >
> > > Iteration 1, duration: 30697 ms, transferred pages: 266450 (n: 251215, d: 15235), new dirty pages: 30628, remaining dirty pages: 30628
> > > Iteration 2, duration: 3496 ms, transferred pages: 28859 (n: 28513, d: 346), new dirty pages: 5805, remaining dirty pages: 7574
> > > Iteration 3, duration: 701 ms, transferred pages: 5746 (n: 5726, d: 20), new dirty pages: 3433, remaining dirty pages: 5261
> > > Iteration 4, duration: 400 ms, transferred pages: 3281 (n: 3272, d: 9), new dirty pages: 1539, remaining dirty pages: 3519
> > > Iteration 5, duration: 199 ms, transferred pages: 1653 (n: 1636, d: 17), new dirty pages: 301, remaining dirty pages: 2167
> > > Iteration 6, duration: 1 ms, transferred pages: 0 (n: 0, d: 0), new dirty pages: 0, remaining dirty pages: 2167
> > >
> > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > >
> > > 6. memcached: 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> > >
> > > Iteration 1, duration: 42486 ms, transferred pages: 1568087 (n: 1216079, d: 352008), new dirty pages: 571940, remaining dirty pages: 581023
> > > Iteration 2, duration: 19774 ms, transferred pages: 571700 (n: 567416, d: 4284), new dirty pages: 331690, remaining dirty pages: 341013
> > > Iteration 3, duration: 11589 ms, transferred pages: 332187 (n: 332095, d: 92), new dirty pages: 222725, remaining dirty pages: 231551
> > > Iteration 4, duration: 7790 ms, transferred pages: 223571 (n: 223499, d: 72), new dirty pages: 157658, remaining dirty pages: 165638
> > > Iteration 5, duration: 5518 ms, transferred pages: 158056 (n: 157998, d: 58), new dirty pages: 128130, remaining dirty pages: 135712
> > > Iteration 6, duration: 4442 ms, transferred pages: 127764 (n: 127701, d: 63), new dirty pages: 104839, remaining dirty pages: 112787
> > > Iteration 7, duration: 3649 ms, transferred pages: 104581 (n: 104523, d: 58), new dirty pages: 100736, remaining dirty pages: 108942
> > > Iteration 8, duration: 3532 ms, transferred pages: 101379 (n: 101315, d: 64), new dirty pages: 87869, remaining dirty pages: 95432
> > > Iteration 9, duration: 3030 ms, transferred pages: 86841 (n: 86786, d: 55), new dirty pages: 77505, remaining dirty pages: 86096
> > > Iteration 10, duration: 2709 ms, transferred pages: 77875 (n: 77814, d: 61), new dirty pages: 77197, remaining dirty pages: 85418
> > > Iteration 11, duration: 2696 ms, transferred pages: 77107 (n: 77044, d: 63), new dirty pages: 65010, remaining dirty pages: 73321
> > > Iteration 12, duration: 2308 ms, transferred pages: 66540 (n: 66484, d: 56), new dirty pages: 64388, remaining dirty pages: 71169
> > > Iteration 13, duration: 2198 ms, transferred pages: 62953 (n: 62897, d: 56), new dirty pages: 62773, remaining dirty pages: 70989
> > > Iteration 14, duration: 2214 ms, transferred pages: 63466 (n: 63411, d: 55), new dirty pages: 67538, remaining dirty pages: 75061
> > > Iteration 15, duration: 2329 ms, transferred pages: 66924 (n: 66875, d: 49), new dirty pages: 63580, remaining dirty pages: 71717
> > > Iteration 16, duration: 2252 ms, transferred pages: 64554 (n: 64539, d: 15), new dirty pages: 63094, remaining dirty pages: 70257
> > > Iteration 17, duration: 2188 ms, transferred pages: 62697 (n: 62641, d: 56), new dirty pages: 63016, remaining dirty pages: 70576
> > > Iteration 18, duration: 2171 ms, transferred pages: 62377 (n: 62322, d: 55), new dirty pages: 56764, remaining dirty pages: 64963
> > > Iteration 19, duration: 2003 ms, transferred pages: 57382 (n: 57324, d: 58), new dirty pages: 65307, remaining dirty pages: 72888
> > > Iteration 20, duration: 2240 ms, transferred pages: 64426 (n: 64364, d: 62), new dirty pages: 61585, remaining dirty pages: 70047
> > >
> > > --
> > > Chunguang Li, Ph.D. Candidate
> > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > Huazhong University of Science & Technology (HUST)
> > > Wuhan, Hubei Prov., China
> >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
  2016-11-08 11:05 ` Dr. David Alan Gilbert
@ 2016-11-08 13:40 ` Chunguang Li
  0 siblings, 0 replies; 21+ messages in thread
From: Chunguang Li @ 2016-11-08 13:40 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Amit Shah, qemu-devel, pbonzini, stefanha, quintela

> -----Original Messages-----
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Sent Time: Tuesday, November 8, 2016
> To: "Chunguang Li" <lichunguang@hust.edu.cn>
> Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
>
> * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> >
> > > -----Original Messages-----
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Sent Time: Friday, October 14, 2016
> > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > Cc: "Amit Shah" <amit.shah@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > >
> > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: "Amit Shah" <amit.shah@redhat.com>
> > > > > Sent Time: Friday, September 30, 2016
> > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, qemu-devel@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > Subject: Re: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > >
> > > > > On (Mon) 26 Sep 2016 [22:55:01], Chunguang Li wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > Sent Time: Monday, September 26, 2016
> > > > > > > To: "Chunguang Li" <lichunguang@hust.edu.cn>
> > > > > > > Cc: qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, quintela@redhat.com
> > > > > > > Subject: Re: [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent
> > > > > > >
> > > > > > > * Chunguang Li (lichunguang@hust.edu.cn) wrote:
> > > > > > > > Hi all!
> > > > > > > > I have some confusion about the dirty bitmap during migration. I have dug into the code and figured out that every now and then during migration, the dirty bitmap is grabbed from the kernel through ioctl(KVM_GET_DIRTY_LOG) and then used to update QEMU's dirty bitmap. However, I think this mechanism leads to resending some non-dirty pages.
> > > > > > > >
> > > > > > > > Take the first iteration of precopy for instance, during which all the pages are sent. Before that, during the migration setup, ioctl(KVM_GET_DIRTY_LOG) is called once, so the kernel begins to produce the dirty bitmap from that moment. When pages that have not yet been sent are written, the kernel marks them as dirty. However, I don't think this is correct, because these pages will be sent during this and the next iterations with the same content (if they are not written again after they are sent). It only makes sense to mark a page as dirty when it is written after it has already been sent within an iteration.
> > > > > > > >
> > > > > > > > Am I right about this consideration? If I am right, is there some advice to improve this?
> > > > > > >
> > > > > > > I think you're right that this can happen; to clarify, I think the case you're talking about is:
> > > > > > >
> > > > > > > Iteration 1
> > > > > > >     sync bitmap
> > > > > > >     start sending pages
> > > > > > >     page 'n' is modified - but hasn't been sent yet
> > > > > > >     page 'n' gets sent
> > > > > > > Iteration 2
> > > > > > >     sync bitmap
> > > > > > >     'page n is shown as modified'
> > > > > > >     send page 'n' again
> > > > > >
> > > > > > Yes, this is exactly the case I am talking about.
> > > > > >
> > > > > > > So you're right that it is wasteful; I guess it's more wasteful on big VMs with slow networks, where the length of each iteration is large.
> > > > > >
> > > > > > I think this is "very" wasteful. Assume the workload dirties pages randomly within the guest address space, and the transfer speed is constant. Intuitively, I think nearly half of the dirty pages produced in Iteration 1 are not really dirty. This means the time of Iteration 2 is double that needed to send only the really dirty pages.
> > > > >
> > > > > It makes sense; can you get some perf numbers to show what kinds of workloads get impacted the most? That would also help us to figure out what kinds of speed improvements we can expect.
> > > > >
> > > > > Amit
> > > >
> > > > I have picked up 6 workloads and got the following statistics for every iteration (except the last stop-copy one) during precopy. These numbers are obtained with basic precopy migration, without capabilities like xbzrle or compression. The network for the migration is exclusive, with a separate network for the workloads. Both are gigabit Ethernet. I use qemu-2.5.1.
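The race described in the quoted exchange above can be written as a tiny simulation. This is hypothetical Python for illustration only, not QEMU code; the page numbers and contents are invented. It shows how a page written *before* it is sent in iteration 1 lands in the kernel's dirty log, so the iteration-2 sync reports it dirty even though the copy already sent matches the current content.

```python
# Hypothetical sketch of the dirty-bitmap race discussed above (not QEMU code).
# A page written before it is sent in iteration 1 is recorded in the kernel's
# dirty log, so the iteration-2 sync flags it although the content already
# sent is current ("false dirty").

def run_iterations(num_pages, write_before_send):
    content = {p: 0 for p in range(num_pages)}   # guest memory
    kernel_dirty = set()                         # kernel's dirty log

    # Iteration 1: sync (clears the log), then send every page in order.
    kernel_dirty.clear()
    sent = {}
    for p in range(num_pages):
        if p == write_before_send:               # guest writes this page...
            content[p] += 1                      # ...before we reach it
            kernel_dirty.add(p)                  # kernel logs it as dirty
        sent[p] = content[p]                     # page goes out up to date

    # Iteration 2: sync again; the log still flags the page, but the
    # destination copy is already identical -> a wasted retransmission.
    false_dirty = {p for p in kernel_dirty if sent[p] == content[p]}
    return false_dirty

print(run_iterations(8, write_before_send=3))    # -> {3}
```

Marking the page dirty only when the write happens *after* the send, as proposed in the thread, would leave this set empty.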
> > > >
> > > > Three of them (booting, idle, web server) converged to the stop-copy phase with the given bandwidth and default downtime (300 ms), while the other three (kernel compilation, zeusmp, memcached) did not.
> > > >
> > > > A page is "not-really-dirty" if it is written first and sent later (and not written again after that) during one iteration. I guess this does not happen as often during the other iterations as during the 1st iteration, because all the pages of the VM are sent to the dest node during the 1st iteration, while during the others only part of the pages are sent. So I think the "not-really-dirty" pages are produced mainly during the 1st iteration, and perhaps very few during the other iterations.
> > > >
> > > > If we could avoid resending the "not-really-dirty" pages, intuitively, I think the time spent on Iteration 2 would be halved. This is a chain reaction, because the dirty pages produced during Iteration 2 would be halved, which means the time spent on Iteration 3 would be halved, then Iteration 4, 5...
> > >
> > > Yes; these numbers don't show how many of them are false dirty though.
> > >
> > > One problem is thinking about pages that have been redirtied: if the page is dirtied after the sync but before the network write, then it's the false dirty you're describing.
> > >
> > > However, if the page is being written a few times, so that it would also have been written after the network write, then it isn't a false dirty.
> > >
> > > You might be able to figure that out with some kernel tracing of when the dirtying happens, but it might be easier to write the fix!
> > >
> > > Dave
> >
> > Hi, I have made some new progress now.
> >
> > To tell exactly how many false dirty pages there are in each iteration, I malloc a buffer in memory as big as the whole VM memory. When a page is transferred to the dest node, it is copied to the buffer; during the next iteration, if a page is transferred, it is compared to the old copy in the buffer, and the old copy is replaced for the next comparison if the page is really dirty. Thus, we are now able to get the exact number of false dirty pages.
> >
> > This time, I use 15 workloads to get the statistics. They are:
> >
> > 1. 11 benchmarks picked from the cpu2006 benchmark suite. They are all scientific computing workloads like Quantum Chromodynamics, Fluid Dynamics, etc. I picked these 11 benchmarks because, compared to the others, they have bigger memory occupation and higher memory dirty rates; thus most of them cannot converge to stop-and-copy at the default migration speed (32MB/s).
> > 2. kernel compilation
> > 3. idle VM
> > 4. Apache web server which serves static content
> >
> > (The above workloads all run in a VM with 1 vcpu and 1GB memory, and the migration speed is the default 32MB/s.)
> >
> > 5. Memcached. The VM has 6 cpu cores and 6GB memory, and 4GB are used as the cache. After filling up the 4GB cache, a client writes the cache at a constant speed during migration. This time, the migration speed has no limit, and is up to the capability of 1Gbps Ethernet.
> >
> > To summarize the results first (the precise numbers are below):
> >
> > 1. 4 of these 15 workloads have a big proportion (>60%, even >80% during some iterations) of false dirty pages out of all the dirty pages since iteration 2 (and the big proportion lasts during the following iterations). They are cpu2006.zeusmp, cpu2006.bzip2, cpu2006.mcf, and memcached.
> > 2. 2 workloads (idle, web server) spend most of the migration time on iteration 1; even though the proportion of false dirty pages is big since iteration 2, the space to optimize is small.
> > 3. 1 workload (kernel compilation) only has a big proportion during iteration 2, not in the other iterations.
> > 4. 8 workloads (the other 8 benchmarks of cpu2006) have a small proportion of false dirty pages since iteration 2, so the space to optimize for them is small.
> >
> > Now I want to say a little more about the reasons why false dirty pages are produced. The first reason is what we discussed before---the mechanism used to track dirty pages. I have also come up with another reason. Here is the situation: a write operation to a memory page happens, but it doesn't change any content of the page. So it is a "write but not dirty", yet the kernel still marks the page as dirty. Someone in our lab has done some experiments to figure out the proportion of "write but not dirty" operations, using the cpu2006 benchmark suite. According to his results, general workloads have a small proportion (<10%) of "write but not dirty" out of all the write operations, while a few workloads have a higher proportion (one even as high as 50%). We are not sure why "write but not dirty" happens; it just does.
>
> I think there are a few different reasons I can think of:
> a) You have a flag or mutex that's set and cleared; so it gets set (marked
>    dirty) and cleared around some operation. By the time we come to migrate
>    it then it's back to cleared again.
>    Similarly with other temporary data structures.
> b) Some system operation causes the page to be moved - e.g. swap or the kernel
>    reorganising memory.

Sorry, I don't quite understand reason (b). Taking swap as an example, do you mean a page is swapped out and then swapped in to the old address again, so the content remains unchanged?

> However, it's a shame I don't think you can tell in your experiment which of the two cases we're hitting?
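The buffer-based accounting described above can be sketched as follows. This is hypothetical Python, not the actual instrumentation: it keeps the last transferred copy of each page and counts a byte-identical retransmission as false dirty, replacing the stored copy only when the page really changed.

```python
# Hypothetical sketch (not the actual QEMU instrumentation) of the
# buffer-comparison method described above: remember the content last
# sent for each page, and count a byte-identical retransmission as a
# "false dirty" page.

PAGE_SIZE = 4096

class FalseDirtyCounter:
    def __init__(self):
        self.last_sent = {}        # page number -> bytes last transferred
        self.false_dirty = 0
        self.really_dirty = 0

    def on_transfer(self, page_no, data):
        old = self.last_sent.get(page_no)
        if old == data:            # resent with identical content
            self.false_dirty += 1
        else:                      # first send, or content really changed
            self.really_dirty += 1
            self.last_sent[page_no] = data   # keep for the next comparison

c = FalseDirtyCounter()
page = b"\x00" * PAGE_SIZE
c.on_transfer(7, page)                 # iteration 1: first transfer
c.on_transfer(7, page)                 # iteration 2: unchanged -> false dirty
c.on_transfer(7, b"\x01" + page[1:])   # iteration 3: really modified
print(c.really_dirty, c.false_dirty)   # -> 2 1
```

Note the memory cost of this exact-comparison approach: a full shadow copy of guest RAM, which is what motivates the per-page hash discussed later in the thread.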
> I'd like to know if it's worth working on making the page sync mechanism better, or if it's more important to deal with the second reason you show.

Yes, you are right, it's hard to tell which case we're hitting (including the cases you think of). However, since I use the SHA1 method, I don't have to tell them apart, because it handles all the cases we have thought of.

> > So these two reasons contribute to the false dirty pages. To optimize, I compute and store the SHA1 hash before transferring each page. Next time, if a page needs retransmission, its SHA1 hash is computed again and compared to the old hash. If the hash is the same, it's a false dirty page, and we just skip it; otherwise, the page is transferred, and the new hash replaces the old one for the next comparison.
> >
> > The reason to use a SHA1 hash rather than byte-by-byte comparison is the memory overhead. One SHA1 hash is 20 bytes, so we need extra memory of only 20/4096 (<1/200) of the whole VM memory, which is relatively small.
> >
> > As far as I know, the SHA1 hash is widely used in deduplication for backup systems. It has been shown there that the probability of a hash collision is far smaller than that of a disk hardware fault, so it is treated as a secure hash: if the hashes of two chunks are the same, the content must be the same. So I think the SHA1 hash could replace byte-by-byte comparison in the VM memory scenario.
>
> There was a proposal ( https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05331.html ) to do a migration system where a copy of the migration RAM is stored on disc on the destination for cases where similar VMs are migrated, and it used a checksum for each page to find the matching page in the cache; that originally used a smaller hash, I think in the end they used a SHA-256. (Hash based checks still make me nervous for intentional collisions but that's probably me being paranoid?)

Em... I don't know if most people would accept the hash based checks. Maybe it needs some more mathematical proof, like has been done in the field of deduplication for backup systems.

> > Then I do the same migration experiments using the SHA1 hash. For the 4 workloads which have big proportions of false dirty pages, the improvement is remarkable. Without the optimization, they either cannot converge to stop-and-copy, or take a very long time to complete. With the SHA1 hash method, all of them now complete in a relatively short time.
> >
> > For the reason discussed above, the other workloads don't get notable improvements from the optimization. So below, I only show the exact numbers after optimization for the 4 workloads with remarkable improvements.
> >
> > Any comments or suggestions?
>
> You might be able to save some of the CPU time; we've got a test that checks if a page is all-zero; if you're doing the SHA calculation you could avoid doing the all-zero check and replace it by comparing the output of the SHA.

Yes, this is one way. However, now I'm doing the opposite: I first calculate the SHA1 of the all-zero page and remember it. Then, whenever I recognize an all-zero page after the check, I just store the SHA1 computed earlier, avoiding calculating the SHA1 of the all-zero page again. I think this is better, because the current implementation of the all-zero check is faster than calculating SHA1.

Thanks,
Chunguang

> > Below is the experiment data:
> > (
> > "dup" means zero page; these pages take very little migration time and network resources, so they are never regarded as dirty pages in my numbers;
> > "rd" means really dirty pages;
> > "fd" means false dirty pages;
> > The numbers refer to the quantities of pages.
> > )
> >
> > ------------------The 4 workloads with remarkable improvements (both the results of original precopy and with optimization are shown)-------------------
> >
> > 1. memcached
> >
> > ----- original pre-copy (can not converge): -----
> > Iteration 1, duration: 42111 ms, transferred pages: 1568788 (dup: 416239, rd: 1152549, fd: 0), new dirty pages: 499015, remaining dirty pages: 507397
> > Iteration 2, duration: 17208 ms, transferred pages: 498946 (dup: 5456, rd: 160206, fd: 333284), new dirty pages: 261237, remaining dirty pages: 269688
> > Iteration 3, duration: 9134 ms, transferred pages: 262377 (dup: 519, rd: 111900, fd: 149958), new dirty pages: 170281, remaining dirty pages: 177592
> > Iteration 4, duration: 5920 ms, transferred pages: 169966 (dup: 87, rd: 82487, fd: 87392), new dirty pages: 121154, remaining dirty pages: 128780
> > Iteration 5, duration: 4239 ms, transferred pages: 121551 (dup: 81, rd: 64120, fd: 57350), new dirty pages: 100976, remaining dirty pages: 108205
> > Iteration 6, duration: 3495 ms, transferred pages: 100353 (dup: 90, rd: 56021, fd: 44242), new dirty pages: 74547, remaining dirty pages: 82399
> > Iteration 7, duration: 2583 ms, transferred pages: 74160 (dup: 56, rd: 38016, fd: 36088), new dirty pages: 58209, remaining dirty pages: 66448
> > Iteration 8, duration: 2039 ms, transferred pages: 58534 (dup: 81, rd: 26885, fd: 31568), new dirty pages: 43511, remaining dirty pages: 51425
> > Iteration 9, duration: 1513 ms, transferred pages: 43484 (dup: 55, rd: 26641, fd: 16788), new dirty pages: 43722, remaining dirty pages: 51663
> > Iteration 10, duration: 1521 ms, transferred pages: 43676 (dup: 62, rd: 26463, fd: 17151), new dirty pages: 35347, remaining dirty pages: 43334
> > Iteration 11, duration: 1230 ms, transferred pages: 35287 (dup: 0, rd: 21293, fd: 13994), new dirty pages: 28851, remaining dirty pages: 36898
> > Iteration 12, duration: 1031 ms, transferred pages: 29651 (dup: 82, rd: 18143, fd: 11426), new dirty pages: 27062, remaining dirty pages: 34309
> > Iteration 13, duration: 917 ms, transferred pages: 26385 (dup: 56, rd: 14149, fd: 12180), new dirty pages: 22723, remaining dirty pages: 30647
> > Iteration 14, duration: 762 ms, transferred pages: 21902 (dup: 55, rd: 16355, fd: 5492), new dirty pages: 18208, remaining dirty pages: 26953
> > Iteration 15, duration: 650 ms, transferred pages: 18636 (dup: 0, rd: 11943, fd: 6693), new dirty pages: 16085, remaining dirty pages: 24402
> > Iteration 16, duration: 554 ms, transferred pages: 15946 (dup: 56, rd: 9527, fd: 6363), new dirty pages: 14766, remaining dirty pages: 23222
> > Iteration 17, duration: 538 ms, transferred pages: 15434 (dup: 0, rd: 9779, fd: 5655), new dirty pages: 13381, remaining dirty pages: 21169
> > Iteration 18, duration: 487 ms, transferred pages: 14089 (dup: 81, rd: 7737, fd: 6271), new dirty pages: 13325, remaining dirty pages: 20405
> > Iteration 19, duration: 428 ms, transferred pages: 12232 (dup: 0, rd: 8488, fd: 3744), new dirty pages: 10274, remaining dirty pages: 18447
> > Iteration 20, duration: 377 ms, transferred pages: 10887 (dup: 56, rd: 6362, fd: 4469), new dirty pages: 9708, remaining dirty pages: 17268
> > Iteration 21, duration: 320 ms, transferred pages: 9222 (dup: 0, rd: 5789, fd: 3433), new dirty pages: 8015, remaining dirty pages: 16061
> > Iteration 22, duration: 268 ms, transferred pages: 7621 (dup: 0, rd: 6204, fd: 1417), new dirty pages: 7227, remaining dirty pages: 15667
> > Iteration 23, duration: 269 ms, transferred pages: 7813 (dup: 56, rd: 4410, fd: 3347), new dirty pages: 7591, remaining dirty pages: 15445
> > Iteration 24, duration: 271 ms, transferred pages: 7749 (dup: 0, rd: 4565, fd: 3184), new dirty pages: 15126, remaining dirty pages: 22822
> > Iteration 25, duration: 549 ms, transferred pages: 15818 (dup: 60, rd: 10545, fd: 5213), new dirty pages: 14559, remaining dirty pages: 21563
> > Iteration 26, duration: 499 ms, transferred pages: 14281 (dup: 3, rd: 8760, fd: 5518), new dirty pages: 11947, remaining dirty pages: 19229
> > Iteration 27, duration: 376 ms, transferred pages: 10823 (dup: 25, rd: 6550, fd: 4248), new dirty pages: 8561, remaining dirty pages: 16967
> > Iteration 28, duration: 324 ms, transferred pages: 9350 (dup: 31, rd: 5292, fd: 4027), new dirty pages: 8655, remaining dirty pages: 16272
> > Iteration 29, duration: 274 ms, transferred pages: 7813 (dup: 0, rd: 6088, fd: 1725), new dirty pages: 6300, remaining dirty pages: 14759
> > Iteration 30, duration: 218 ms, transferred pages: 6340 (dup: 45, rd: 3196, fd: 3099), new dirty pages: 5143, remaining dirty pages: 13562
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 40664 ms, transferred pages: 1569037 (dup: 405940, rd: 1163097), new dirty pages: 506846, remaining dirty pages: 514979
> > Iteration 2, duration: 8032 ms, transferred pages: 161130 (dup: 4007, rd: 157123), new dirty pages: 153479, remaining dirty pages: 153479
>
> Big difference.
>
> > Iteration 3, duration: 2620 ms, transferred pages: 65260 (dup: 20, rd: 65240), new dirty pages: 64014, remaining dirty pages: 67100
> > Iteration 4, duration: 1160 ms, transferred pages: 30227 (dup: 60, rd: 30167), new dirty pages: 34031, remaining dirty pages: 41414
> > Iteration 5, duration: 648 ms, transferred pages: 18700 (dup: 56, rd: 18644), new dirty pages: 18375, remaining dirty pages: 25536
> > Iteration 6, duration: 389 ms, transferred pages: 11399 (dup: 55, rd: 11344), new dirty pages: 12536, remaining dirty pages: 17516
> > Iteration 7, duration: 292 ms, transferred pages: 8197 (dup: 0, rd: 8197), new dirty pages: 8387, remaining dirty pages: 16802
> > Iteration 8, duration: 171 ms, transferred pages: 4931 (dup: 39, rd: 4892), new dirty pages: 6182, remaining dirty pages: 14060
> > Iteration 9, duration: 163 ms, transferred pages: 4355 (dup: 16, rd: 4339), new dirty pages: 5530, remaining dirty pages: 11973
> > Iteration 10, duration: 104 ms, transferred pages: 3266 (dup: 0, rd: 3266), new dirty pages: 2893, remaining dirty pages: 11014
> > Iteration 11, duration: 52 ms, transferred pages: 1153 (dup: 0, rd: 1153), new dirty pages: 1586, remaining dirty pages: 10516
> > Iteration 12, duration: 52 ms, transferred pages: 1921 (dup: 39, rd: 1882), new dirty pages: 1619, remaining dirty pages: 8842
> > Iteration 13, duration: 62 ms, transferred pages: 1537 (dup: 0, rd: 1537), new dirty pages: 2052, remaining dirty pages: 8871
> > Iteration 14, duration: 58 ms, transferred pages: 1665 (dup: 0, rd: 1665), new dirty pages: 1947, remaining dirty pages: 7989
> > Iteration 15, duration: 2 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 0, remaining dirty pages: 7989
> > total time: 54693 milliseconds
>
> Very nice.
>
> Dave
>
> > 2. cpu2006.zeusmp
> >
> > ----- original pre-copy (can not converge): -----
> > Iteration 1, duration: 21112 ms, transferred pages: 266450 (dup: 93385, rd: 173065, fd: 0), new dirty pages: 127866, remaining dirty pages: 127866
> > Iteration 2, duration: 6192 ms, transferred pages: 125662 (dup: 75762, rd: 17389, fd: 32511), new dirty pages: 131655, remaining dirty pages: 133859
> > Iteration 3, duration: 6699 ms, transferred pages: 131937 (dup: 77298, rd: 20320, fd: 34319), new dirty pages: 121027, remaining dirty pages: 122949
> > Iteration 4, duration: 5999 ms, transferred pages: 122512 (dup: 73588, rd: 17236, fd: 31688), new dirty pages: 122759, remaining dirty pages: 123196
> > Iteration 5, duration: 5804 ms, transferred pages: 122717 (dup: 75436, rd: 19016, fd: 28265), new dirty pages: 123697, remaining dirty pages: 124176
> > Iteration 6, duration: 5698 ms, transferred pages: 123708 (dup: 77249, rd: 18022, fd: 28437), new dirty pages: 121838, remaining dirty pages: 122306
> > Iteration 7, duration: 5515 ms, transferred pages: 122306 (dup: 76727, rd: 14819, fd: 30760), new dirty pages: 122382, remaining dirty pages: 122382
> > Iteration 8, duration: 6086 ms, transferred pages: 120825 (dup: 71834, rd: 15987, fd: 33004), new dirty pages: 121587, remaining dirty pages: 123144
> > Iteration 9, duration: 5899 ms, transferred pages: 120964 (dup: 72860, rd: 18191, fd: 29913), new dirty pages: 120391, remaining dirty pages: 122571
> > Iteration 10, duration: 5801 ms, transferred pages: 121425 (dup: 74140, rd: 20722, fd: 26563), new dirty pages: 122302, remaining dirty pages: 123448
> > Iteration 11, duration: 5909 ms, transferred pages: 123448 (dup: 74735, rd: 19678, fd: 29035), new dirty pages: 123258, remaining dirty pages: 123258
> > Iteration 12, duration: 6293 ms, transferred pages: 121211 (dup: 70442, rd: 18128, fd: 32641), new dirty pages: 123623, remaining dirty pages: 125670
> > Iteration 13, duration: 6398 ms, transferred pages: 124897 (dup: 72701, rd: 21134, fd: 31062), new dirty pages: 122355, remaining dirty pages: 123128
> > Iteration 14, duration: 6301 ms, transferred pages: 121893 (dup: 70514, rd: 23470, fd: 27909), new dirty pages: 120980, remaining dirty pages: 122215
> > Iteration 15, duration: 6304 ms, transferred pages: 121389 (dup: 70005, rd: 21731, fd: 29653), new dirty pages: 121628, remaining dirty pages: 122454
> > Iteration 16, duration: 6398 ms, transferred pages: 122164 (dup: 69962, rd: 24376, fd: 27826), new dirty pages: 122246, remaining dirty pages: 122536
> > Iteration 17, duration: 6201 ms, transferred pages: 121548 (dup: 70984, rd: 23915, fd: 26649), new dirty pages: 121460, remaining dirty pages: 122448
> > Iteration 18, duration: 6401 ms, transferred pages: 122272 (dup: 70072, rd: 22261, fd: 29939), new dirty pages: 123518, remaining dirty pages: 123694
> > Iteration 19, duration: 7003 ms, transferred pages: 121873 (dup: 64754, rd: 27325, fd: 29794), new dirty pages: 120568, remaining dirty pages: 122389
> > Iteration 20, duration: 6400 ms, transferred pages: 121422 (dup: 69221, rd: 25300, fd: 26901), new dirty pages: 121229, remaining dirty pages: 122196
> > Iteration 21, duration: 6703 ms, transferred pages: 119895 (dup: 65232, rd: 25877, fd: 28786), new dirty pages: 123284, remaining dirty pages: 125585
> > Iteration 22, duration: 6902 ms, transferred pages: 123884 (dup: 67582, rd: 29020, fd: 27282), new dirty pages: 122057, remaining dirty pages: 123758
> > Iteration 23, duration: 6800 ms, transferred pages: 122010 (dup: 66529, rd: 30644, fd: 24837), new dirty pages: 120916, remaining dirty pages: 122664
> > Iteration 24, duration: 7202 ms, transferred pages: 121951 (dup: 63188, rd: 31105, fd: 27658), new dirty pages: 122715, remaining dirty pages: 123428
> > Iteration 25, duration: 7202 ms, transferred pages: 122919 (dup: 64161, rd: 32063, fd: 26695), new dirty pages: 123180, remaining dirty pages: 123689
> > Iteration 26, duration: 7404 ms, transferred pages: 123092 (dup: 62694, rd: 33459, fd: 26939), new dirty pages: 122149, remaining dirty pages: 122746
> > Iteration 27, duration: 7205 ms, transferred pages: 120427 (dup: 61664, rd: 34344, fd: 24419), new dirty pages: 120299, remaining dirty pages: 122618
> > Iteration 28, duration: 7100 ms, transferred pages: 121074 (dup: 63130, rd: 32403, fd: 25541), new dirty pages: 122984, remaining dirty pages: 124528
> > Iteration 29, duration: 7904 ms, transferred pages: 124060 (dup: 59564, rd: 35631, fd: 28865), new dirty pages: 127080, remaining dirty pages: 127548
> > Iteration 30, duration: 7906 ms, transferred pages: 127518 (dup: 63029, rd: 34416, fd: 30073), new dirty pages: 125028, remaining dirty pages: 125058
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 21601 ms, transferred pages: 266450 (dup: 89731, rd: 176719), new dirty pages: 139843, remaining dirty pages: 139843
> > Iteration 2, duration: 1747 ms, transferred pages: 92077 (dup: 78364, rd: 13713), new dirty pages: 90945, remaining dirty pages: 90945
> > Iteration 3, duration: 1592 ms, transferred pages: 62253 (dup: 49435, rd: 12818), new dirty pages: 76929, remaining dirty pages: 76929
> > Iteration 4, duration: 992 ms, transferred pages: 44837 (dup: 37886, rd: 6951), new dirty pages: 71331, remaining dirty pages: 72916
> > Iteration 5, duration: 998 ms, transferred pages: 55229 (dup: 47150, rd: 8079), new dirty pages: 21703, remaining dirty pages: 23302
> > Iteration 6, duration: 211 ms, transferred pages: 20337 (dup: 18516, rd: 1821), new dirty pages: 14500, remaining dirty pages: 14500
> > Iteration 7, duration: 31 ms, transferred pages: 12933 (dup: 12627, rd: 306), new dirty pages: 1520, remaining dirty pages: 1520
> > Iteration 8, duration: 30 ms, transferred pages: 0 (dup: 0, rd: 0), new dirty pages: 4, remaining dirty pages: 1524
> > total time: 27225 milliseconds
> >
> > 3. cpu2006.bzip2
> >
> > ----- original pre-copy: -----
> > Iteration 1, duration: 18306 ms, transferred pages: 266450 (dup: 116569, rd: 149881, fd: 0), new dirty pages: 106299, remaining dirty pages: 106299
> > Iteration 2, duration: 10694 ms, transferred pages: 104611 (dup: 17550, rd: 10536, fd: 76525), new dirty pages: 34394, remaining dirty pages: 36082
> > Iteration 3, duration: 2998 ms, transferred pages: 34442 (dup: 9924, rd: 12254, fd: 12264), new dirty pages: 6419, remaining dirty pages: 8059
> > Iteration 4, duration: 699 ms, transferred pages: 5748 (dup: 22, rd: 2583, fd: 3143), new dirty pages: 1226, remaining dirty pages: 3537
> > Iteration 5, duration: 200 ms, transferred pages: 1636 (dup: 0, rd: 1194, fd: 442), new dirty pages: 478, remaining dirty pages: 2379
> > Iteration 6, duration: 1 ms, transferred pages: 0 (dup: 0, rd: 0, fd: 0), new dirty pages: 0, remaining dirty pages: 2379
> >
> > ----- after optimization: -----
> > Iteration 1, duration: 13995 ms, transferred pages: 266314 (dup: 152118, rd: 114196), new dirty pages: 97009, remaining dirty pages: 97145
> > Iteration 2, duration: 1215 ms, transferred pages: 33400 (dup: 26745, rd: 6655), new dirty pages: 12866, remaining dirty pages: 14017
> > Iteration 3, duration: 701 ms, transferred
pages: 5774 (dup: 48, rd: 5726) , new dirty pages: 6342 , remaining dirty pages: 8761 > > Iteration 4, duration: 500 ms , transferred pages: 4111 (dup: 21, rd: 4090) , new dirty pages: 4311 , remaining dirty pages: 6485 > > Iteration 5, duration: 400 ms , transferred pages: 3273 (dup: 1, rd: 3272) , new dirty pages: 3034 , remaining dirty pages: 5431 > > Iteration 6, duration: 301 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2094 , remaining dirty pages: 4472 > > Iteration 7, duration: 299 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2066 , remaining dirty pages: 4082 > > Iteration 8, duration: 202 ms , transferred pages: 1636 (dup: 0, rd: 1636) , new dirty pages: 2881 , remaining dirty pages: 4648 > > Iteration 9, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 4775 , remaining dirty pages: 6778 > > Iteration 10, duration: 400 ms , transferred pages: 3281 (dup: 9, rd: 3272) , new dirty pages: 3757 , remaining dirty pages: 5576 > > Iteration 11, duration: 401 ms , transferred pages: 3279 (dup: 7, rd: 3272) , new dirty pages: 6980 , remaining dirty pages: 8906 > > Iteration 12, duration: 500 ms , transferred pages: 7118 (dup: 3035, rd: 4083) , new dirty pages: 10774 , remaining dirty pages: 11922 > > Iteration 13, duration: 116 ms , transferred pages: 11706 (dup: 10152, rd: 1554) , new dirty pages: 1326 , remaining dirty pages: 1326 > > Iteration 14, duration: 117 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 1326 > > total time: 19479 milliseconds > > > > 4. 
cpu2006.mcf > > > > ----- original pre-copy: ----- > > Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6925, rd: 259525, fd: 0) , new dirty pages: 244403 , remaining dirty pages: 244403 > > Iteration 2, duration: 29603 ms , transferred pages: 242275 (dup: 377, rd: 224001, fd: 17897) , new dirty pages: 227335 , remaining dirty pages: 229463 > > Iteration 3, duration: 27806 ms , transferred pages: 227573 (dup: 169, rd: 65681, fd: 161723) , new dirty pages: 195593 , remaining dirty pages: 197483 > > Iteration 4, duration: 23907 ms , transferred pages: 195543 (dup: 41, rd: 39838, fd: 155664) , new dirty pages: 215066 , remaining dirty pages: 217006 > > Iteration 5, duration: 26305 ms , transferred pages: 215289 (dup: 155, rd: 33082, fd: 182052) , new dirty pages: 111098 , remaining dirty pages: 112815 > > Iteration 6, duration: 13502 ms , transferred pages: 110452 (dup: 22, rd: 26793, fd: 83637) , new dirty pages: 161054 , remaining dirty pages: 163417 > > Iteration 7, duration: 19705 ms , transferred pages: 161266 (dup: 120, rd: 33818, fd: 127328) , new dirty pages: 220562 , remaining dirty pages: 222713 > > Iteration 8, duration: 27003 ms , transferred pages: 220881 (dup: 21, rd: 215721, fd: 5139) , new dirty pages: 219787 , remaining dirty pages: 221619 > > Iteration 9, duration: 26802 ms , transferred pages: 219248 (dup: 24, rd: 84648, fd: 134576) , new dirty pages: 207959 , remaining dirty pages: 210330 > > Iteration 10, duration: 25411 ms , transferred pages: 207916 (dup: 144, rd: 35842, fd: 171930) , new dirty pages: 144442 , remaining dirty pages: 146856 > > Iteration 11, duration: 17714 ms , transferred pages: 144804 (dup: 18, rd: 25414, fd: 119372) , new dirty pages: 205127 , remaining dirty pages: 207179 > > Iteration 12, duration: 25112 ms , transferred pages: 205446 (dup: 128, rd: 23197, fd: 182121) , new dirty pages: 167319 , remaining dirty pages: 169052 > > Iteration 13, duration: 20411 ms , transferred pages: 166886 (dup: 14, rd: 
21960, fd: 144912) , new dirty pages: 221592 , remaining dirty pages: 223758 > > Iteration 14, duration: 27126 ms , transferred pages: 221800 (dup: 122, rd: 42368, fd: 179310) , new dirty pages: 233630 , remaining dirty pages: 235588 > > Iteration 15, duration: 28517 ms , transferred pages: 233321 (dup: 191, rd: 222528, fd: 10602) , new dirty pages: 224282 , remaining dirty pages: 226549 > > Iteration 16, duration: 27422 ms , transferred pages: 224187 (dup: 55, rd: 45773, fd: 178359) , new dirty pages: 209815 , remaining dirty pages: 212177 > > Iteration 17, duration: 25723 ms , transferred pages: 210260 (dup: 34, rd: 79405, fd: 130821) , new dirty pages: 220297 , remaining dirty pages: 222214 > > Iteration 18, duration: 26920 ms , transferred pages: 220056 (dup: 14, rd: 214128, fd: 5914) , new dirty pages: 192015 , remaining dirty pages: 194173 > > Iteration 19, duration: 23520 ms , transferred pages: 192239 (dup: 9, rd: 25140, fd: 167090) , new dirty pages: 96450 , remaining dirty pages: 98384 > > Iteration 20, duration: 11805 ms , transferred pages: 96538 (dup: 14, rd: 7424, fd: 89100) , new dirty pages: 6978 , remaining dirty pages: 8824 > > Iteration 21, duration: 799 ms , transferred pages: 6545 (dup: 1, rd: 1802, fd: 4742) , new dirty pages: 138 , remaining dirty pages: 2417 > > Iteration 22, duration: 1 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2417 > > > > ----- after optimization: ----- > > Iteration 1, duration: 31711 ms , transferred pages: 266450 (dup: 6831, rd: 259619) , new dirty pages: 240209 , remaining dirty pages: 240209 > > Iteration 2, duration: 6250 ms , transferred pages: 51244 (dup: 211, rd: 51033) , new dirty pages: 226651 , remaining dirty pages: 228571 > > Iteration 3, duration: 4395 ms , transferred pages: 36008 (dup: 80, rd: 35928) , new dirty pages: 110719 , remaining dirty pages: 111478 > > Iteration 4, duration: 3390 ms , transferred pages: 28068 (dup: 28, rd: 28040) , new dirty 
pages: 185172 , remaining dirty pages: 185172 > > Iteration 5, duration: 2986 ms , transferred pages: 23780 (dup: 45, rd: 23735) , new dirty pages: 64357 , remaining dirty pages: 66305 > > Iteration 6, duration: 2727 ms , transferred pages: 22800 (dup: 12, rd: 22788) , new dirty pages: 61675 , remaining dirty pages: 61675 > > Iteration 7, duration: 2372 ms , transferred pages: 18943 (dup: 13, rd: 18930) , new dirty pages: 55144 , remaining dirty pages: 55265 > > Iteration 8, duration: 2100 ms , transferred pages: 17189 (dup: 11, rd: 17178) , new dirty pages: 55244 , remaining dirty pages: 55668 > > Iteration 9, duration: 2003 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 107058 , remaining dirty pages: 108014 > > Iteration 10, duration: 2132 ms , transferred pages: 17825 (dup: 24, rd: 17801) , new dirty pages: 126214 , remaining dirty pages: 126214 > > Iteration 11, duration: 2229 ms , transferred pages: 18156 (dup: 22, rd: 18134) , new dirty pages: 65725 , remaining dirty pages: 65725 > > Iteration 12, duration: 2315 ms , transferred pages: 18651 (dup: 21, rd: 18630) , new dirty pages: 52575 , remaining dirty pages: 53903 > > Iteration 13, duration: 2147 ms , transferred pages: 17435 (dup: 16, rd: 17419) , new dirty pages: 46652 , remaining dirty pages: 47260 > > Iteration 14, duration: 2000 ms , transferred pages: 16371 (dup: 11, rd: 16360) , new dirty pages: 42721 , remaining dirty pages: 43266 > > Iteration 15, duration: 1901 ms , transferred pages: 15552 (dup: 10, rd: 15542) , new dirty pages: 38593 , remaining dirty pages: 40792 > > Iteration 16, duration: 1801 ms , transferred pages: 14735 (dup: 11, rd: 14724) , new dirty pages: 54252 , remaining dirty pages: 55639 > > Iteration 17, duration: 1708 ms , transferred pages: 13860 (dup: 2, rd: 13858) , new dirty pages: 72379 , remaining dirty pages: 74170 > > Iteration 18, duration: 1923 ms , transferred pages: 15442 (dup: 12, rd: 15430) , new dirty pages: 101911 , remaining dirty pages: 
103547 > > Iteration 19, duration: 2311 ms , transferred pages: 18823 (dup: 9, rd: 18814) , new dirty pages: 80534 , remaining dirty pages: 82521 > > Iteration 20, duration: 2081 ms , transferred pages: 17156 (dup: 34, rd: 17122) , new dirty pages: 36054 , remaining dirty pages: 36054 > > Iteration 21, duration: 1665 ms , transferred pages: 13777 (dup: 10, rd: 13767) , new dirty pages: 29624 , remaining dirty pages: 29624 > > Iteration 22, duration: 1657 ms , transferred pages: 13290 (dup: 7, rd: 13283) , new dirty pages: 25949 , remaining dirty pages: 28265 > > Iteration 23, duration: 1599 ms , transferred pages: 13088 (dup: 0, rd: 13088) , new dirty pages: 22356 , remaining dirty pages: 24813 > > Iteration 24, duration: 1500 ms , transferred pages: 12280 (dup: 10, rd: 12270) , new dirty pages: 21181 , remaining dirty pages: 22608 > > Iteration 25, duration: 1400 ms , transferred pages: 11457 (dup: 5, rd: 11452) , new dirty pages: 18657 , remaining dirty pages: 20311 > > Iteration 26, duration: 1200 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 15690 , remaining dirty pages: 17294 > > Iteration 27, duration: 1201 ms , transferred pages: 9822 (dup: 6, rd: 9816) , new dirty pages: 14810 , remaining dirty pages: 15936 > > Iteration 28, duration: 1000 ms , transferred pages: 8183 (dup: 3, rd: 8180) , new dirty pages: 15387 , remaining dirty pages: 16423 > > Iteration 29, duration: 900 ms , transferred pages: 7372 (dup: 10, rd: 7362) , new dirty pages: 13303 , remaining dirty pages: 15292 > > Iteration 30, duration: 1000 ms , transferred pages: 8181 (dup: 1, rd: 8180) , new dirty pages: 17879 , remaining dirty pages: 18457 > > Iteration 31, duration: 951 ms , transferred pages: 8140 (dup: 9, rd: 8131) , new dirty pages: 21738 , remaining dirty pages: 23304 > > Iteration 32, duration: 946 ms , transferred pages: 6946 (dup: 1, rd: 6945) , new dirty pages: 15815 , remaining dirty pages: 15815 > > Iteration 33, duration: 747 ms , transferred pages: 
6192 (dup: 0, rd: 6192) , new dirty pages: 6249 , remaining dirty pages: 7670 > > Iteration 34, duration: 501 ms , transferred pages: 4090 (dup: 0, rd: 4090) , new dirty pages: 6163 , remaining dirty pages: 8422 > > Iteration 35, duration: 600 ms , transferred pages: 4910 (dup: 2, rd: 4908) , new dirty pages: 3673 , remaining dirty pages: 5222 > > Iteration 36, duration: 300 ms , transferred pages: 2454 (dup: 0, rd: 2454) , new dirty pages: 2132 , remaining dirty pages: 4337 > > Iteration 37, duration: 200 ms , transferred pages: 1637 (dup: 1, rd: 1636) , new dirty pages: 544 , remaining dirty pages: 2251 > > Iteration 38, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0) , new dirty pages: 0 , remaining dirty pages: 2251 > > total time: 97919 milliseconds > > > > ------------------The other 11 workloads without notable improvements (only the result of original precopy is shown)------------------- > > > > 5. idle > > > > Iteration 1, duration: 14702 ms , transferred pages: 266450 (dup: 146393, rd: 120057, fd: 0) , new dirty pages: 14595 , remaining dirty pages: 14595 > > Iteration 2, duration: 1592 ms , transferred pages: 12412 (dup: 103, rd: 3280, fd: 9029) , new dirty pages: 218 , remaining dirty pages: 2401 > > Iteration 3, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2401 > > > > 6. 
kernel compilation (can not converge) > > > > Iteration 1, duration: 20607 ms , transferred pages: 266450 (dup: 97552, rd: 168898, fd: 0) , new dirty pages: 19293 , remaining dirty pages: 19293 > > Iteration 2, duration: 2092 ms , transferred pages: 17176 (dup: 597, rd: 8625, fd: 7954) , new dirty pages: 8318 , remaining dirty pages: 10435 > > Iteration 3, duration: 1000 ms , transferred pages: 8484 (dup: 304, rd: 6256, fd: 1924) , new dirty pages: 8736 , remaining dirty pages: 10687 > > Iteration 4, duration: 1000 ms , transferred pages: 8435 (dup: 255, rd: 7089, fd: 1091) , new dirty pages: 7627 , remaining dirty pages: 9879 > > Iteration 5, duration: 900 ms , transferred pages: 7553 (dup: 191, rd: 5602, fd: 1760) , new dirty pages: 7287 , remaining dirty pages: 9613 > > Iteration 6, duration: 900 ms , transferred pages: 7620 (dup: 258, rd: 5761, fd: 1601) , new dirty pages: 8958 , remaining dirty pages: 10951 > > Iteration 7, duration: 1099 ms , transferred pages: 9309 (dup: 311, rd: 8051, fd: 947) , new dirty pages: 7189 , remaining dirty pages: 8831 > > Iteration 8, duration: 800 ms , transferred pages: 6832 (dup: 288, rd: 5717, fd: 827) , new dirty pages: 5782 , remaining dirty pages: 7781 > > Iteration 9, duration: 701 ms , transferred pages: 5875 (dup: 149, rd: 4005, fd: 1721) , new dirty pages: 4587 , remaining dirty pages: 6493 > > Iteration 10, duration: 500 ms , transferred pages: 4234 (dup: 144, rd: 3057, fd: 1033) , new dirty pages: 7352 , remaining dirty pages: 9611 > > Iteration 11, duration: 900 ms , transferred pages: 7759 (dup: 397, rd: 6563, fd: 799) , new dirty pages: 6686 , remaining dirty pages: 8538 > > Iteration 12, duration: 800 ms , transferred pages: 6808 (dup: 264, rd: 6017, fd: 527) , new dirty pages: 6871 , remaining dirty pages: 8601 > > Iteration 13, duration: 800 ms , transferred pages: 6775 (dup: 231, rd: 5722, fd: 822) , new dirty pages: 7540 , remaining dirty pages: 9366 > > Iteration 14, duration: 900 ms , transferred pages: 
7507 (dup: 145, rd: 5900, fd: 1462) , new dirty pages: 7581 , remaining dirty pages: 9440 > > Iteration 15, duration: 900 ms , transferred pages: 7630 (dup: 268, rd: 6211, fd: 1151) , new dirty pages: 7268 , remaining dirty pages: 9078 > > Iteration 16, duration: 800 ms , transferred pages: 6759 (dup: 215, rd: 5763, fd: 781) , new dirty pages: 6861 , remaining dirty pages: 9180 > > Iteration 17, duration: 800 ms , transferred pages: 6838 (dup: 294, rd: 6037, fd: 507) , new dirty pages: 6196 , remaining dirty pages: 8538 > > Iteration 18, duration: 800 ms , transferred pages: 6852 (dup: 308, rd: 4905, fd: 1639) , new dirty pages: 5947 , remaining dirty pages: 7633 > > Iteration 19, duration: 700 ms , transferred pages: 5919 (dup: 193, rd: 4853, fd: 873) , new dirty pages: 5861 , remaining dirty pages: 7575 > > Iteration 20, duration: 600 ms , transferred pages: 5284 (dup: 376, rd: 4408, fd: 500) , new dirty pages: 5206 , remaining dirty pages: 7497 > > Iteration 21, duration: 600 ms , transferred pages: 5147 (dup: 239, rd: 4308, fd: 600) , new dirty pages: 5031 , remaining dirty pages: 7381 > > Iteration 22, duration: 599 ms , transferred pages: 5064 (dup: 156, rd: 4026, fd: 882) , new dirty pages: 5601 , remaining dirty pages: 7918 > > Iteration 23, duration: 702 ms , transferred pages: 5965 (dup: 239, rd: 5028, fd: 698) , new dirty pages: 6079 , remaining dirty pages: 8032 > > Iteration 24, duration: 700 ms , transferred pages: 6175 (dup: 449, rd: 5146, fd: 580) , new dirty pages: 10932 , remaining dirty pages: 12789 > > Iteration 25, duration: 1300 ms , transferred pages: 10936 (dup: 302, rd: 6205, fd: 4429) , new dirty pages: 8713 , remaining dirty pages: 10566 > > Iteration 26, duration: 1000 ms , transferred pages: 8282 (dup: 102, rd: 5662, fd: 2518) , new dirty pages: 5119 , remaining dirty pages: 7403 > > Iteration 27, duration: 600 ms , transferred pages: 5007 (dup: 99, rd: 4099, fd: 809) , new dirty pages: 2226 , remaining dirty pages: 4622 > > Iteration 
28, duration: 300 ms , transferred pages: 2491 (dup: 37, rd: 1794, fd: 660) , new dirty pages: 6746 , remaining dirty pages: 8877 > > Iteration 29, duration: 800 ms , transferred pages: 6757 (dup: 213, rd: 5532, fd: 1012) , new dirty pages: 6070 , remaining dirty pages: 8190 > > Iteration 30, duration: 700 ms , transferred pages: 6052 (dup: 326, rd: 5107, fd: 619) , new dirty pages: 5177 , remaining dirty pages: 7315 > > > > 7. web server > > > > Iteration 1, duration: 20902 ms , transferred pages: 266450 (dup: 95497, rd: 170953, fd: 0) , new dirty pages: 8528 , remaining dirty pages: 8528 > > Iteration 2, duration: 796 ms , transferred pages: 6472 (dup: 131, rd: 1885, fd: 4456) , new dirty pages: 650 , remaining dirty pages: 2706 > > Iteration 3, duration: 100 ms , transferred pages: 818 (dup: 0, rd: 383, fd: 435) , new dirty pages: 328 , remaining dirty pages: 2216 > > Iteration 4, duration: 0 ms , transferred pages: 0 (dup: 0, rd: 0, fd: 0) , new dirty pages: 0 , remaining dirty pages: 2216 > > > > > > 8. 
cpu2006.bwaves (can not converge) > > > > Iteration 1, duration: 31715 ms , transferred pages: 266450 (dup: 6766, rd: 259684, fd: 0) , new dirty pages: 242702 , remaining dirty pages: 242702 > > Iteration 2, duration: 29397 ms , transferred pages: 240508 (dup: 405, rd: 225588, fd: 14515) , new dirty pages: 230889 , remaining dirty pages: 233083 > > Iteration 3, duration: 28205 ms , transferred pages: 230858 (dup: 182, rd: 214596, fd: 16080) , new dirty pages: 226998 , remaining dirty pages: 229223 > > Iteration 4, duration: 27805 ms , transferred pages: 227574 (dup: 170, rd: 217045, fd: 10359) , new dirty pages: 227360 , remaining dirty pages: 229009 > > Iteration 5, duration: 27703 ms , transferred pages: 226786 (dup: 200, rd: 212130, fd: 14456) , new dirty pages: 225885 , remaining dirty pages: 228108 > > Iteration 6, duration: 27600 ms , transferred pages: 225923 (dup: 155, rd: 215503, fd: 10265) , new dirty pages: 223555 , remaining dirty pages: 225740 > > Iteration 7, duration: 27309 ms , transferred pages: 223574 (dup: 260, rd: 215641, fd: 7673) , new dirty pages: 231975 , remaining dirty pages: 234141 > > Iteration 8, duration: 28403 ms , transferred pages: 232397 (dup: 85, rd: 214086, fd: 18226) , new dirty pages: 222170 , remaining dirty pages: 223914 > > Iteration 9, duration: 27105 ms , transferred pages: 221809 (dup: 131, rd: 214988, fd: 6690) , new dirty pages: 230065 , remaining dirty pages: 232170 > > Iteration 10, duration: 28104 ms , transferred pages: 230201 (dup: 343, rd: 213531, fd: 16327) , new dirty pages: 227590 , remaining dirty pages: 229559 > > Iteration 11, duration: 27801 ms , transferred pages: 227717 (dup: 313, rd: 221408, fd: 5996) , new dirty pages: 228457 , remaining dirty pages: 230299 > > Iteration 12, duration: 27916 ms , transferred pages: 228560 (dup: 338, rd: 219660, fd: 8562) , new dirty pages: 238326 , remaining dirty pages: 240065 > > > > 9. 
cpu2006.lbm (can not converge) > > Iteration 1, duration: 31012 ms , transferred pages: 266450 (dup: 12253, rd: 254197, fd: 0) , new dirty pages: 108960 , remaining dirty pages: 108960 > > Iteration 2, duration: 13095 ms , transferred pages: 106522 (dup: 3, rd: 102045, fd: 4474) , new dirty pages: 129292 , remaining dirty pages: 131730 > > Iteration 3, duration: 15802 ms , transferred pages: 129688 (dup: 444, rd: 110860, fd: 18384) , new dirty pages: 116682 , remaining dirty pages: 118724 > > Iteration 4, duration: 14204 ms , transferred pages: 116316 (dup: 160, rd: 104951, fd: 11205) , new dirty pages: 107246 , remaining dirty pages: 109654 > > Iteration 5, duration: 13208 ms , transferred pages: 107977 (dup: 1, rd: 101834, fd: 6142) , new dirty pages: 105371 , remaining dirty pages: 107048 > > Iteration 6, duration: 12804 ms , transferred pages: 104705 (dup: 1, rd: 99629, fd: 5075) , new dirty pages: 103841 , remaining dirty pages: 106184 > > Iteration 7, duration: 12709 ms , transferred pages: 103891 (dup: 5, rd: 99212, fd: 4674) , new dirty pages: 106692 , remaining dirty pages: 108985 > > Iteration 8, duration: 13105 ms , transferred pages: 107169 (dup: 11, rd: 100125, fd: 7033) , new dirty pages: 103132 , remaining dirty pages: 104948 > > Iteration 9, duration: 12607 ms , transferred pages: 103068 (dup: 0, rd: 99460, fd: 3608) , new dirty pages: 102511 , remaining dirty pages: 104391 > > Iteration 10, duration: 12514 ms , transferred pages: 102250 (dup: 0, rd: 99094, fd: 3156) , new dirty pages: 102888 , remaining dirty pages: 105029 > > > > 10. 
cpu2006.astar (can not converge) > > > > Iteration 1, duration: 28402 ms , transferred pages: 266450 (dup: 33770, rd: 232680, fd: 0) , new dirty pages: 62078 , remaining dirty pages: 62078 > > Iteration 2, duration: 7393 ms , transferred pages: 60107 (dup: 10, rd: 51722, fd: 8375) , new dirty pages: 48854 , remaining dirty pages: 50825 > > Iteration 3, duration: 6001 ms , transferred pages: 49094 (dup: 14, rd: 46540, fd: 2540) , new dirty pages: 48137 , remaining dirty pages: 49868 > > Iteration 4, duration: 5800 ms , transferred pages: 47444 (dup: 0, rd: 45389, fd: 2055) , new dirty pages: 49147 , remaining dirty pages: 51571 > > Iteration 5, duration: 6102 ms , transferred pages: 49912 (dup: 14, rd: 46216, fd: 3682) , new dirty pages: 55606 , remaining dirty pages: 57265 > > Iteration 6, duration: 6699 ms , transferred pages: 54949 (dup: 143, rd: 20745, fd: 34061) , new dirty pages: 9166 , remaining dirty pages: 11482 > > Iteration 7, duration: 1200 ms , transferred pages: 9830 (dup: 14, rd: 7011, fd: 2805) , new dirty pages: 8294 , remaining dirty pages: 9946 > > Iteration 8, duration: 1000 ms , transferred pages: 8194 (dup: 14, rd: 7178, fd: 1002) , new dirty pages: 5475 , remaining dirty pages: 7227 > > Iteration 9, duration: 600 ms , transferred pages: 4908 (dup: 0, rd: 3470, fd: 1438) , new dirty pages: 4175 , remaining dirty pages: 6494 > > Iteration 10, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3856, fd: 234) , new dirty pages: 4095 , remaining dirty pages: 6499 > > Iteration 11, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 3313, fd: 777) , new dirty pages: 3371 , remaining dirty pages: 5780 > > Iteration 12, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3823, fd: 267) , new dirty pages: 7518 , remaining dirty pages: 9208 > > Iteration 13, duration: 899 ms , transferred pages: 7376 (dup: 14, rd: 6028, fd: 1334) , new dirty pages: 3931 , remaining dirty pages: 5763 > > Iteration 14, duration: 500 ms , transferred pages: 
4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 4346 , remaining dirty pages: 6019 > > Iteration 15, duration: 502 ms , transferred pages: 4090 (dup: 0, rd: 3817, fd: 273) , new dirty pages: 3054 , remaining dirty pages: 4983 > > Iteration 16, duration: 400 ms , transferred pages: 3272 (dup: 0, rd: 3138, fd: 134) , new dirty pages: 3874 , remaining dirty pages: 5585 > > Iteration 17, duration: 399 ms , transferred pages: 3272 (dup: 0, rd: 3248, fd: 24) , new dirty pages: 5285 , remaining dirty pages: 7598 > > Iteration 18, duration: 701 ms , transferred pages: 5726 (dup: 0, rd: 4385, fd: 1341) , new dirty pages: 8903 , remaining dirty pages: 10775 > > Iteration 19, duration: 1101 ms , transferred pages: 9010 (dup: 12, rd: 5597, fd: 3401) , new dirty pages: 4199 , remaining dirty pages: 5964 > > Iteration 20, duration: 500 ms , transferred pages: 4090 (dup: 0, rd: 4078, fd: 12) , new dirty pages: 3829 , remaining dirty pages: 5703 > > > > 11. cpu2006.xalancbmk (can not converge) > > > > Iteration 1, duration: 30407 ms , transferred pages: 266450 (dup: 17700, rd: 248750, fd: 0) , new dirty pages: 96169 , remaining dirty pages: 96169 > > Iteration 2, duration: 11495 ms , transferred pages: 94164 (dup: 205, rd: 67068, fd: 26891) , new dirty pages: 61766 , remaining dirty pages: 63771 > > Iteration 3, duration: 7501 ms , transferred pages: 61471 (dup: 121, rd: 53587, fd: 7763) , new dirty pages: 56569 , remaining dirty pages: 58869 > > Iteration 4, duration: 6902 ms , transferred pages: 56461 (dup: 19, rd: 50553, fd: 5889) , new dirty pages: 52181 , remaining dirty pages: 54589 > > Iteration 5, duration: 6402 ms , transferred pages: 52459 (dup: 107, rd: 46986, fd: 5366) , new dirty pages: 54051 , remaining dirty pages: 56181 > > Iteration 6, duration: 6601 ms , transferred pages: 54003 (dup: 15, rd: 47566, fd: 6422) , new dirty pages: 50844 , remaining dirty pages: 53022 > > Iteration 7, duration: 6202 ms , transferred pages: 50723 (dup: 7, rd: 47143, fd: 3573) , new 
dirty pages: 64880 , remaining dirty pages: 67179 > > Iteration 8, duration: 8001 ms , transferred pages: 65447 (dup: 7, rd: 61159, fd: 4281) , new dirty pages: 67854 , remaining dirty pages: 69586 > > Iteration 9, duration: 8202 ms , transferred pages: 67444 (dup: 368, rd: 56357, fd: 10719) , new dirty pages: 65178 , remaining dirty pages: 67320 > > Iteration 10, duration: 8000 ms , transferred pages: 65455 (dup: 15, rd: 60581, fd: 4859) , new dirty pages: 52421 , remaining dirty pages: 54286 > > > > 12. cpu2006.milc (can not converge) > > > > Iteration 1, duration: 31410 ms , transferred pages: 266450 (dup: 9454, rd: 256996, fd: 0) , new dirty pages: 158860 , remaining dirty pages: 158860 > > Iteration 2, duration: 19193 ms , transferred pages: 157048 (dup: 150, rd: 96807, fd: 60091) , new dirty pages: 102238 , remaining dirty pages: 104050 > > Iteration 3, duration: 12504 ms , transferred pages: 102271 (dup: 21, rd: 95107, fd: 7143) , new dirty pages: 97944 , remaining dirty pages: 99723 > > Iteration 4, duration: 11905 ms , transferred pages: 97360 (dup: 18, rd: 93610, fd: 3732) , new dirty pages: 99150 , remaining dirty pages: 101513 > > Iteration 5, duration: 12105 ms , transferred pages: 99094 (dup: 116, rd: 94125, fd: 4853) , new dirty pages: 98589 , remaining dirty pages: 101008 > > Iteration 6, duration: 12101 ms , transferred pages: 98995 (dup: 17, rd: 94069, fd: 4909) , new dirty pages: 147403 , remaining dirty pages: 149416 > > Iteration 7, duration: 18001 ms , transferred pages: 147284 (dup: 44, rd: 135691, fd: 11549) , new dirty pages: 136445 , remaining dirty pages: 138577 > > Iteration 8, duration: 16702 ms , transferred pages: 136636 (dup: 30, rd: 130805, fd: 5801) , new dirty pages: 145481 , remaining dirty pages: 147422 > > Iteration 9, duration: 17800 ms , transferred pages: 145734 (dup: 130, rd: 133239, fd: 12365) , new dirty pages: 98032 , remaining dirty pages: 99720 > > Iteration 10, duration: 11902 ms , transferred pages: 97364 (dup: 22, 
rd: 93096, fd: 4246) , new dirty pages: 95391 , remaining dirty pages: 97747 > > > > 13. cpu2006.cactusADM (can not converge) > > > > Iteration 1, duration: 23508 ms , transferred pages: 266450 (dup: 73568, rd: 192882, fd: 0) , new dirty pages: 123869 , remaining dirty pages: 123869 > > Iteration 2, duration: 13989 ms , transferred pages: 121594 (dup: 7874, rd: 81653, fd: 32067) , new dirty pages: 112960 , remaining dirty pages: 115235 > > Iteration 3, duration: 13605 ms , transferred pages: 113276 (dup: 2028, rd: 83783, fd: 27465) , new dirty pages: 112314 , remaining dirty pages: 114273 > > Iteration 4, duration: 13509 ms , transferred pages: 111935 (dup: 1505, rd: 83535, fd: 26895) , new dirty pages: 114078 , remaining dirty pages: 116416 > > Iteration 5, duration: 13810 ms , transferred pages: 114262 (dup: 1378, rd: 84039, fd: 28845) , new dirty pages: 112271 , remaining dirty pages: 114425 > > Iteration 6, duration: 13604 ms , transferred pages: 112664 (dup: 1416, rd: 84300, fd: 26948) , new dirty pages: 112903 , remaining dirty pages: 114664 > > Iteration 7, duration: 13604 ms , transferred pages: 112655 (dup: 1407, rd: 84027, fd: 27221) , new dirty pages: 110943 , remaining dirty pages: 112952 > > Iteration 8, duration: 13406 ms , transferred pages: 110720 (dup: 1108, rd: 84075, fd: 25537) , new dirty pages: 109321 , remaining dirty pages: 111553 > > Iteration 9, duration: 13306 ms , transferred pages: 109726 (dup: 932, rd: 83652, fd: 25142) , new dirty pages: 113446 , remaining dirty pages: 115273 > > Iteration 10, duration: 13705 ms , transferred pages: 113121 (dup: 1055, rd: 84671, fd: 27395) , new dirty pages: 108776 , remaining dirty pages: 110928 > > > > 14. 
cpu2006.GemsFDTD (can not converge) > > > > Iteration 1, duration: 13303 ms , transferred pages: 266450 (dup: 157809, rd: 108641, fd: 0) , new dirty pages: 226802 , remaining dirty pages: 226802 > > Iteration 2, duration: 10797 ms , transferred pages: 226507 (dup: 138637, rd: 61818, fd: 26052) , new dirty pages: 200769 , remaining dirty pages: 201064 > > Iteration 3, duration: 8900 ms , transferred pages: 199717 (dup: 127187, rd: 69340, fd: 3190) , new dirty pages: 203436 , remaining dirty pages: 204783 > > Iteration 4, duration: 10904 ms , transferred pages: 204127 (dup: 115211, rd: 85767, fd: 3149) , new dirty pages: 198407 , remaining dirty pages: 199063 > > Iteration 5, duration: 12109 ms , transferred pages: 198206 (dup: 99435, rd: 96956, fd: 1815) , new dirty pages: 213719 , remaining dirty pages: 214576 > > Iteration 6, duration: 16307 ms , transferred pages: 213595 (dup: 80422, rd: 116885, fd: 16288) , new dirty pages: 199637 , remaining dirty pages: 200618 > > Iteration 7, duration: 16915 ms , transferred pages: 198289 (dup: 60169, rd: 134208, fd: 3912) , new dirty pages: 199343 , remaining dirty pages: 201672 > > Iteration 8, duration: 19518 ms , transferred pages: 200452 (dup: 41014, rd: 156083, fd: 3355) , new dirty pages: 222927 , remaining dirty pages: 224147 > > > > 15. 
cpu2006.wrf (can not converge) > > > > Iteration 1, duration: 18499 ms , transferred pages: 266380 (dup: 115285, rd: 151095, fd: 0) , new dirty pages: 112322 , remaining dirty pages: 112392 > > Iteration 2, duration: 9802 ms , transferred pages: 110025 (dup: 29917, rd: 65782, fd: 14326) , new dirty pages: 88855 , remaining dirty pages: 91222 > > Iteration 3, duration: 8199 ms , transferred pages: 89761 (dup: 22728, rd: 57262, fd: 9771) , new dirty pages: 58431 , remaining dirty pages: 59892 > > Iteration 4, duration: 5603 ms , transferred pages: 58502 (dup: 12716, rd: 41809, fd: 3977) , new dirty pages: 80556 , remaining dirty pages: 81946 > > Iteration 5, duration: 7101 ms , transferred pages: 79778 (dup: 21738, rd: 50896, fd: 7144) , new dirty pages: 62592 , remaining dirty pages: 64760 > > Iteration 6, duration: 5702 ms , transferred pages: 63388 (dup: 16793, rd: 42726, fd: 3869) , new dirty pages: 80747 , remaining dirty pages: 82119 > > Iteration 7, duration: 7000 ms , transferred pages: 80868 (dup: 23652, rd: 52194, fd: 5022) , new dirty pages: 84593 , remaining dirty pages: 85844 > > Iteration 8, duration: 7099 ms , transferred pages: 83799 (dup: 25769, rd: 51772, fd: 6258) , new dirty pages: 67951 , remaining dirty pages: 69996 > > Iteration 9, duration: 6303 ms , transferred pages: 68478 (dup: 16979, rd: 36490, fd: 15009) , new dirty pages: 81181 , remaining dirty pages: 82699 > > Iteration 10, duration: 7000 ms , transferred pages: 80724 (dup: 23503, rd: 52826, fd: 4395) , new dirty pages: 47930 , remaining dirty pages: 49905 > > > > > > > > > > > > > So I think "booting" and "kernel compilation" should benefit a lot from this > > > > improvement. The reason "kernel compilation" would benefit is that some > > > > iterations take around 600 ms, and if they are halved to 300 ms, the precopy > > > > may get the chance to step into the stop-and-copy phase. 
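The convergence reasoning above can be sketched with a toy model. This is not QEMU's actual admission logic, and the constant-dirty-rate assumption and all parameter names (`dirty_rate_pps`, `bandwidth_bps`, and so on) are hypothetical; it only illustrates why halving per-iteration transfer time can let a borderline workload fall under the downtime threshold:

```python
# Toy model (NOT QEMU's real code): does pre-copy converge, i.e. does some
# iteration leave a dirty set small enough to flush within the downtime?
# Assumes a constant page-dirtying rate while each iteration runs.

PAGE_SIZE = 4096  # bytes per guest page


def precopy_converges(initial_pages, dirty_rate_pps, bandwidth_bps,
                      downtime_ms, max_iterations=30):
    """Return (converges, iterations_used).

    dirty_rate_pps : pages dirtied per second during an iteration
    bandwidth_bps  : migration bandwidth in bytes per second
    downtime_ms    : maximum acceptable stop-and-copy pause
    """
    pages_to_send = initial_pages
    for it in range(1, max_iterations + 1):
        # Time to transfer this iteration's pages at the given bandwidth.
        duration_s = pages_to_send * PAGE_SIZE / bandwidth_bps
        # Pages dirtied while that transfer was in flight.
        newly_dirty = dirty_rate_pps * duration_s
        # Pause needed to flush the leftover dirty set in stop-and-copy.
        stop_copy_ms = newly_dirty * PAGE_SIZE / bandwidth_bps * 1000
        if stop_copy_ms <= downtime_ms:
            return True, it
        pages_to_send = newly_dirty
    return False, max_iterations


# At 32 MB/s the link drains 8192 pages/s; a guest dirtying pages at or
# above that rate never converges, while a slower one shrinks each round.
```

In this model, shortening an iteration (the effect of the proposed dirty-bitmap optimization) directly shrinks `newly_dirty`, which is what pushes `stop_copy_ms` below the downtime threshold a round or two earlier.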
> > > >
> > > > On the other hand, "idle" and "web server" would not benefit a lot, because
> > > > most of the time is spent on the 1st iteration and little on the others.
> > > >
> > > > As for "zeusmp" and "memcached", although the time spent on the iterations
> > > > other than the 1st may be halved, they still could not converge to
> > > > stop-and-copy with the 300 ms downtime.
> > > >
> > > > --------------------1 vcpu, 1 GB ram, default bandwidth (32 MB/s):------------------
> > > >
> > > > 1. booting : begin to migrate when the VM is booting
> > > >
> > > > Iteration 1, duration: 6997 ms , transferred pages: 266450 (n: 57269, d: 209181 ) , new dirty pages: 56414 , remaining dirty pages: 56414
> > > > Iteration 2, duration: 6497 ms , transferred pages: 54008 (n: 52701, d: 1307 ) , new dirty pages: 48053 , remaining dirty pages: 50459
> > > > Iteration 3, duration: 5800 ms , transferred pages: 48232 (n: 47444, d: 788 ) , new dirty pages: 9129 , remaining dirty pages: 11356
> > > > Iteration 4, duration: 1100 ms , transferred pages: 9091 (n: 8998, d: 93 ) , new dirty pages: 165 , remaining dirty pages: 2430
> > > > Iteration 5, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2430
> > > > (note: When the workload does converge, the output of the last iteration is "fake"; it just indicates that the precopy has stepped into the stop-and-copy phase.
> > > > "n" means "normal pages" and "d" means "duplicate (zero) pages".)
> > > >
> > > > 2. idle
> > > >
> > > > Iteration 1, duration: 14496 ms , transferred pages: 266450 (n: 118980, d: 147470 ) , new dirty pages: 17398 , remaining dirty pages: 17398
> > > > Iteration 2, duration: 1896 ms , transferred pages: 14953 (n: 14854, d: 99 ) , new dirty pages: 1849 , remaining dirty pages: 4294
> > > > Iteration 3, duration: 300 ms , transferred pages: 2454 (n: 2454, d: 0 ) , new dirty pages: 9 , remaining dirty pages: 1849
> > > > Iteration 4, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 1849
> > > >
> > > > 3. kernel compilation (can not converge)
> > > >
> > > > Iteration 1, duration: 20700 ms , transferred pages: 266450 (n: 169778, d: 96672 ) , new dirty pages: 40067 , remaining dirty pages: 40067
> > > > Iteration 2, duration: 4696 ms , transferred pages: 38401 (n: 37787, d: 614 ) , new dirty pages: 8852 , remaining dirty pages: 10518
> > > > Iteration 3, duration: 1000 ms , transferred pages: 8642 (n: 8180, d: 462 ) , new dirty pages: 6331 , remaining dirty pages: 8207
> > > > Iteration 4, duration: 700 ms , transferred pages: 6110 (n: 5726, d: 384 ) , new dirty pages: 5242 , remaining dirty pages: 7339
> > > > Iteration 5, duration: 600 ms , transferred pages: 5007 (n: 4908, d: 99 ) , new dirty pages: 4868 , remaining dirty pages: 7200
> > > > Iteration 6, duration: 600 ms , transferred pages: 5226 (n: 4908, d: 318 ) , new dirty pages: 6142 , remaining dirty pages: 8116
> > > > Iteration 7, duration: 700 ms , transferred pages: 5985 (n: 5726, d: 259 ) , new dirty pages: 5902 , remaining dirty pages: 8033
> > > > Iteration 8, duration: 701 ms , transferred pages: 5893 (n: 5726, d: 167 ) , new dirty pages: 7502 , remaining dirty pages: 9642
> > > > Iteration 9, duration: 900 ms , transferred pages: 7623 (n: 7362, d: 261 ) , new dirty pages: 6408 , remaining dirty pages: 8427
> > > > Iteration 10, duration: 700 ms , transferred pages: 6008 (n: 5726, d: 282 ) , new dirty pages: 8312 , remaining dirty pages: 10731
> > > > Iteration 11, duration: 1000 ms , transferred pages: 8353 (n: 8180, d: 173 ) , new dirty pages: 6874 , remaining dirty pages: 9252
> > > > Iteration 12, duration: 899 ms , transferred pages: 7477 (n: 7362, d: 115 ) , new dirty pages: 5573 , remaining dirty pages: 7348
> > > > Iteration 13, duration: 601 ms , transferred pages: 5099 (n: 4908, d: 191 ) , new dirty pages: 7671 , remaining dirty pages: 9920
> > > > Iteration 14, duration: 900 ms , transferred pages: 7586 (n: 7362, d: 224 ) , new dirty pages: 7359 , remaining dirty pages: 9693
> > > > Iteration 15, duration: 900 ms , transferred pages: 7682 (n: 7362, d: 320 ) , new dirty pages: 7371 , remaining dirty pages: 9382
> > > >
> > > > 4. cpu2006.zeusmp (can not converge)
> > > >
> > > > Iteration 1, duration: 21603 ms , transferred pages: 266450 (n: 176660, d: 89790 ) , new dirty pages: 145625 , remaining dirty pages: 145625
> > > > Iteration 2, duration: 8696 ms , transferred pages: 144389 (n: 70862, d: 73527 ) , new dirty pages: 125124 , remaining dirty pages: 126360
> > > > Iteration 3, duration: 6301 ms , transferred pages: 124057 (n: 51379, d: 72678 ) , new dirty pages: 122528 , remaining dirty pages: 124831
> > > > Iteration 4, duration: 6400 ms , transferred pages: 124330 (n: 52196, d: 72134 ) , new dirty pages: 124267 , remaining dirty pages: 124768
> > > > Iteration 5, duration: 6703 ms , transferred pages: 124034 (n: 54656, d: 69378 ) , new dirty pages: 124151 , remaining dirty pages: 124885
> > > > Iteration 6, duration: 6703 ms , transferred pages: 124357 (n: 54658, d: 69699 ) , new dirty pages: 124106 , remaining dirty pages: 124634
> > > > Iteration 7, duration: 6602 ms , transferred pages: 124568 (n: 53838, d: 70730 ) , new dirty pages: 133828 , remaining dirty pages: 133894
> > > > Iteration 8, duration: 7600 ms , transferred pages: 133030 (n: 62021, d: 71009 ) , new dirty pages: 126612 , remaining dirty pages: 127476
> > > > Iteration 9, duration: 7299 ms , transferred pages: 126511 (n: 59569, d: 66942 ) , new dirty pages: 122727 , remaining dirty pages: 123692
> > > > Iteration 10, duration: 6609 ms , transferred pages: 123692 (n: 54539, d: 69153 ) , new dirty pages: 122727 , remaining dirty pages: 122727
> > > > Iteration 11, duration: 6995 ms , transferred pages: 120347 (n: 56423, d: 63924 ) , new dirty pages: 121430 , remaining dirty pages: 123810
> > > > Iteration 12, duration: 6703 ms , transferred pages: 123040 (n: 54657, d: 68383 ) , new dirty pages: 122043 , remaining dirty pages: 122813
> > > > Iteration 13, duration: 7006 ms , transferred pages: 122353 (n: 57121, d: 65232 ) , new dirty pages: 133869 , remaining dirty pages: 134329
> > > > Iteration 14, duration: 8209 ms , transferred pages: 132325 (n: 66932, d: 65393 ) , new dirty pages: 126914 , remaining dirty pages: 128918
> > > > Iteration 15, duration: 7802 ms , transferred pages: 126931 (n: 63671, d: 63260 ) , new dirty pages: 122351 , remaining dirty pages: 124338
> > > >
> > > > 5. web server : An apache web server. The client is configured with 50 concurrent connections.
> > > >
> > > > Iteration 1, duration: 30697 ms , transferred pages: 266450 (n: 251215, d: 15235 ) , new dirty pages: 30628 , remaining dirty pages: 30628
> > > > Iteration 2, duration: 3496 ms , transferred pages: 28859 (n: 28513, d: 346 ) , new dirty pages: 5805 , remaining dirty pages: 7574
> > > > Iteration 3, duration: 701 ms , transferred pages: 5746 (n: 5726, d: 20 ) , new dirty pages: 3433 , remaining dirty pages: 5261
> > > > Iteration 4, duration: 400 ms , transferred pages: 3281 (n: 3272, d: 9 ) , new dirty pages: 1539 , remaining dirty pages: 3519
> > > > Iteration 5, duration: 199 ms , transferred pages: 1653 (n: 1636, d: 17 ) , new dirty pages: 301 , remaining dirty pages: 2167
> > > > Iteration 6, duration: 1 ms , transferred pages: 0 (n: 0, d: 0 ) , new dirty pages: 0 , remaining dirty pages: 2167
> > > >
> > > > --------------------6 vcpu, 6 GB ram, max bandwidth (941.08 mbps):------------------
> > > >
> > > > 6. memcached : 4 GB cache, memaslap: all write, concurrency = 5 (can not converge)
> > > >
> > > > Iteration 1, duration: 42486 ms , transferred pages: 1568087 (n: 1216079, d: 352008 ) , new dirty pages: 571940 , remaining dirty pages: 581023
> > > > Iteration 2, duration: 19774 ms , transferred pages: 571700 (n: 567416, d: 4284 ) , new dirty pages: 331690 , remaining dirty pages: 341013
> > > > Iteration 3, duration: 11589 ms , transferred pages: 332187 (n: 332095, d: 92 ) , new dirty pages: 222725 , remaining dirty pages: 231551
> > > > Iteration 4, duration: 7790 ms , transferred pages: 223571 (n: 223499, d: 72 ) , new dirty pages: 157658 , remaining dirty pages: 165638
> > > > Iteration 5, duration: 5518 ms , transferred pages: 158056 (n: 157998, d: 58 ) , new dirty pages: 128130 , remaining dirty pages: 135712
> > > > Iteration 6, duration: 4442 ms , transferred pages: 127764 (n: 127701, d: 63 ) , new dirty pages: 104839 , remaining dirty pages: 112787
> > > > Iteration 7, duration: 3649 ms , transferred pages: 104581 (n: 104523, d: 58 ) , new dirty pages: 100736 , remaining dirty pages: 108942
> > > > Iteration 8, duration: 3532 ms , transferred pages: 101379 (n: 101315, d: 64 ) , new dirty pages: 87869 , remaining dirty pages: 95432
> > > > Iteration 9, duration: 3030 ms , transferred pages: 86841 (n: 86786, d: 55 ) , new dirty pages: 77505 , remaining dirty pages: 86096
> > > > Iteration 10, duration: 2709 ms , transferred pages: 77875 (n: 77814, d: 61 ) , new dirty pages: 77197 , remaining dirty pages: 85418
> > > > Iteration 11, duration: 2696 ms , transferred pages: 77107 (n: 77044, d: 63 ) , new dirty pages: 65010 , remaining dirty pages: 73321
> > > > Iteration 12, duration: 2308 ms , transferred pages: 66540 (n: 66484, d: 56 ) , new dirty pages: 64388 , remaining dirty pages: 71169
> > > > Iteration 13, duration: 2198 ms , transferred pages: 62953 (n: 62897, d: 56 ) , new dirty pages: 62773 , remaining dirty pages: 70989
> > > > Iteration 14, duration: 2214 ms , transferred pages: 63466 (n: 63411, d: 55 ) , new dirty pages: 67538 , remaining dirty pages: 75061
> > > > Iteration 15, duration: 2329 ms , transferred pages: 66924 (n: 66875, d: 49 ) , new dirty pages: 63580 , remaining dirty pages: 71717
> > > > Iteration 16, duration: 2252 ms , transferred pages: 64554 (n: 64539, d: 15 ) , new dirty pages: 63094 , remaining dirty pages: 70257
> > > > Iteration 17, duration: 2188 ms , transferred pages: 62697 (n: 62641, d: 56 ) , new dirty pages: 63016 , remaining dirty pages: 70576
> > > > Iteration 18, duration: 2171 ms , transferred pages: 62377 (n: 62322, d: 55 ) , new dirty pages: 56764 , remaining dirty pages: 64963
> > > > Iteration 19, duration: 2003 ms , transferred pages: 57382 (n: 57324, d: 58 ) , new dirty pages: 65307 , remaining dirty pages: 72888
> > > > Iteration 20, duration: 2240 ms , transferred pages: 64426 (n: 64364, d: 62 ) , new dirty pages: 61585 , remaining dirty pages: 70047
> > > >
> > > > --
> > > > Chunguang Li, Ph.D. Candidate
> > > > Wuhan National Laboratory for Optoelectronics (WNLO)
> > > > Huazhong University of Science & Technology (HUST)
> > > > Wuhan, Hubei Prov., China
>
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China

^ permalink raw reply	[flat|nested] 21+ messages in thread
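The thread's core claim — that a page written *before* it has ever been sent is needlessly marked dirty, so its unchanged content gets transferred a second time — can be illustrated with a toy model. This is plain Python, not QEMU code; the function name and the (page, cursor) encoding of write events are invented for this sketch.

```python
# Toy illustration of the redundant first-pass resends discussed in this
# thread. Pages are sent in ascending order; `cursor` is the index of the
# page currently being sent when a guest write lands on `page`.
def wasted_resends(writes):
    """Return the pages that the kernel dirty bitmap marks for resending
    even though their first transmission already carries the new data
    (i.e. pages written before they were sent)."""
    wasted = set()
    for page, cursor in writes:
        if page >= cursor:       # not yet sent at write time: its first
            wasted.add(page)     # send will include the written content,
                                 # so the bitmap-forced resend is redundant
    return sorted(wasted)

# Page 7 is written while page 2 is on the wire: its later resend is wasted.
# Page 1 is written while page 5 is on the wire: it was already sent, so
# resending it is genuinely necessary and is not counted.
print(wasted_resends([(7, 2), (1, 5)]))   # → [7]
```

The sketch captures only the first iteration; in later iterations the same argument applies to pages dirtied after the bitmap sync but before their (re)send in that round.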
end of thread, other threads: [~2016-11-08 13:32 UTC | newest]

Thread overview: 21+ messages
2016-09-25  8:22 [Qemu-devel] Migration dirty bitmap: should only mark pages as dirty after they have been sent Chunguang Li
2016-09-26 11:23 ` Dr. David Alan Gilbert
2016-09-26 14:55 ` Chunguang Li
2016-09-26 18:52 ` Dr. David Alan Gilbert
2016-09-27 12:28 ` Chunguang Li
2016-09-30  5:46 ` Amit Shah
2016-09-30  8:18 ` Chunguang Li
2016-10-08  7:55 ` Chunguang Li
2016-10-14 11:15 ` Dr. David Alan Gilbert
2016-11-03  8:25 ` Chunguang Li
2016-11-03  9:59 ` Li, Liang Z
2016-11-03 10:13 ` Li, Liang Z
2016-11-04  3:07 ` Chunguang Li
2016-11-04  4:50 ` Li, Liang Z
2016-11-04  7:03 ` Chunguang Li
2016-11-07 13:52 ` Chunguang Li
2016-11-07 14:17 ` Li, Liang Z
2016-11-08  5:27 ` Chunguang Li
2016-11-07 14:44 ` Li, Liang Z
2016-11-08 11:05 ` Dr. David Alan Gilbert
2016-11-08 13:40 ` Chunguang Li