* [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: FENG, Jiasheng @ 2017-03-09 8:49 UTC
  To: qemu-devel
  Cc: Heming Cui, Wang Cheng, CHEN, XUSHENG, YE, Chen

Dear QEMU Development Team,

It is my honor to contact you.

I am a postgraduate student at the University of Hong Kong. I am currently
working on a project related to QEMU MicroCheckpointing, and I have
encountered a performance issue during checkpoint pause & resume.

Please refer to the function capture_checkpoint in migration/checkpoint.c.
I ran a test to measure the time spent between vm_stop_force_state and
vm_start. Even when the system is idle, I still recorded 12-20ms of latency
(mem=2G, vCPU=4). Moreover, the latency increases as more vCPUs are given to
the virtual machine. I did some research and concluded that it is related to
the memory barrier in the KVM kernel. Each CPU issues an smp_wmb() request
during pause & resume, and each request takes about 3-5ms to finish
(mem=2G, vCPU=4).

Therefore, I would like to ask three questions about this issue:

1. What is the reason for calling smp_wmb() during the checkpoint period?

2. Is there any other way to reduce the latency and improve performance
   during the checkpoint period?

3. Can smp_wmb() be safely disabled during the checkpoint period?

I really appreciate your help and hope to receive your feedback soon.

Thanks again for your contribution to QEMU; it is such a masterpiece.

Thanks and best regards,

Niko Jiasheng Feng
University of Hong Kong

--
Niko Jiasheng Feng
Computer Science (General Stream), Faculty of Engineering,
The University of Hong Kong
Contact: (852)97908620
Address: Pokfulam Road, The University of Hong Kong
Email: nikofeng@hku.hk / niko_jiasheng@163.com
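
The measurement described in this message (time spent between the stop call and the start call) can be sketched with a small monotonic-clock wrapper. This is a minimal illustration under stated assumptions, not code from QEMU: fake_vm_stop and fake_vm_start are hypothetical stand-ins for vm_stop_force_state and vm_start, and now_ns and time_pause_resume are names introduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; not affected by wall-clock adjustments. */
static int64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical stand-ins for the real stop and start calls. */
static void fake_vm_stop(void)  { /* the real code would pause the vCPUs */ }
static void fake_vm_start(void) { /* the real code would resume the vCPUs */ }

typedef void (*phase_fn)(void);

/*
 * Time one pause/resume window: returns the nanoseconds elapsed from just
 * before stop_fn is called until just after start_fn returns.
 */
static int64_t time_pause_resume(phase_fn stop_fn, phase_fn start_fn)
{
    int64_t t0 = now_ns();
    stop_fn();
    start_fn();
    return now_ns() - t0;
}
```

Calling time_pause_resume(fake_vm_stop, fake_vm_start) returns the length of the window in nanoseconds. Taking a further timestamp between the two calls would split the total into a stop part and a start part, which helps to see where the 12-20ms actually goes.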
* Re: [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: Dr. David Alan Gilbert @ 2017-03-09 15:19 UTC
  To: FENG, Jiasheng
  Cc: qemu-devel, Wang Cheng, YE, Chen, CHEN, XUSHENG, Heming Cui

* FENG, Jiasheng (nikofeng@connect.hku.hk) wrote:
> I am a postgraduate student at the University of Hong Kong. I am currently
> working on a project related to QEMU MicroCheckpointing, and I have
> encountered a performance issue during checkpoint pause & resume.

The microcheckpointing code hasn't been maintained for a long time;
most of the current checkpointing work is based on the COLO work, which is
still under development.

> Please refer to the function capture_checkpoint in migration/checkpoint.c.
> I ran a test to measure the time spent between vm_stop_force_state and
> vm_start. Even when the system is idle, I still recorded 12-20ms of latency
> (mem=2G, vCPU=4). Moreover, the latency increases as more vCPUs are given
> to the virtual machine. I did some research and concluded that it is
> related to the memory barrier in the KVM kernel. Each CPU issues an
> smp_wmb() request during pause & resume, and each request takes about
> 3-5ms to finish (mem=2G, vCPU=4).
>
> 1. What is the reason for calling smp_wmb() during the checkpoint period?
>
> 2. Is there any other way to reduce the latency and improve performance
>    during the checkpoint period?
>
> 3. Can smp_wmb() be safely disabled during the checkpoint period?

Well, you'd have to understand where it's used; but for example, when taking
a checkpoint you'd want to be sure that the checkpoint data contained
a consistent copy of the last write data from all of the vCPUs; so I think
a wmb would be needed to make sure it's consistent.

I'm surprised that the smp_wmb is such a big chunk of your total checkpoint
time, and that it's quite so long.
Are the vCPUs idle or are they busy - does it make a difference?

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
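
The consistency argument in this reply (a reader must see the last written data once it sees that the data is ready) is the usual publish pattern. The following is a minimal sketch in portable C11, not QEMU code: struct slot, publish and try_consume are hypothetical names, and the C11 release fence stands in for a write barrier such as smp_wmb(). It is somewhat stronger than a pure write barrier, since it also orders earlier loads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct slot {
    int payload;        /* plain data written by the producer */
    atomic_bool ready;  /* flag announcing that payload is valid */
};

/* Producer side: write the data first, then announce it. */
static void publish(struct slot *s, int value)
{
    assert(s != NULL);
    s->payload = value;
    /* Write barrier: the payload store may not move below the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->ready, true, memory_order_relaxed);
}

/* Consumer side: returns true and fills *out only if data was published. */
static bool try_consume(struct slot *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (!atomic_load_explicit(&s->ready, memory_order_relaxed)) {
        return false;
    }
    /* Read barrier pairing with the producer's fence. */
    atomic_thread_fence(memory_order_acquire);
    *out = s->payload;
    return true;
}
```

Without the fence in publish, the compiler or a weakly ordered CPU would be allowed to make the flag visible before the payload, so a reader could copy stale data. That is the kind of inconsistency a checkpoint has to rule out, which is why simply removing the barrier is not a safe optimization.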
* Re: [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: FENG, Jiasheng @ 2017-03-09 16:37 UTC
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Wang Cheng, YE, Chen, CHEN, XUSHENG, Heming Cui

Dear David,

I really appreciate your feedback.

I ran the experiments under both conditions. Whether the vCPUs are idle or
busy makes no difference: smp_wmb() still takes a long time to complete.

In your opinion, is there an alternative way to reduce the time spent in
smp_wmb(), or any system setting that could speed up smp_wmb()?

Thanks in advance for your assistance; I hope to receive your feedback soon.

Thanks and best regards,
Niko Jiasheng Feng
* Re: [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: Dr. David Alan Gilbert @ 2017-03-09 17:06 UTC
  To: FENG, Jiasheng
  Cc: qemu-devel, Wang Cheng, YE, Chen, CHEN, XUSHENG, Heming Cui, pbonzini

(cc'ing in Paolo since he knows our barrier code)

* FENG, Jiasheng (nikofeng@connect.hku.hk) wrote:
> I ran the experiments under both conditions. Whether the vCPUs are idle or
> busy makes no difference: smp_wmb() still takes a long time to complete.
>
> In your opinion, is there an alternative way to reduce the time spent in
> smp_wmb(), or any system setting that could speed up smp_wmb()?

Just checking, is this on a normal x86 PC?
Your numbers of 3-5ms just seem quite high to me but I've not tried timing
that code.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: nikofeng @ 2017-03-09 17:11 UTC
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Wang Cheng, YE, Chen, CHEN, XUSHENG, Heming Cui, pbonzini

Dear David,

Yes, it is a normal x86 PC server.

Thanks so much for your help; I hope to receive your further feedback.

Best Regards,
Niko Jiasheng Feng
* Re: [Qemu-devel] QEMU MicroCheckpointing Pause & Resume Latency
From: Paolo Bonzini @ 2017-03-09 17:15 UTC
  To: Dr. David Alan Gilbert, FENG, Jiasheng
  Cc: qemu-devel, Wang Cheng, YE, Chen, CHEN, XUSHENG, Heming Cui

On 09/03/2017 18:06, Dr. David Alan Gilbert wrote:
> (cc'ing in Paolo since he knows our barrier code)
>
> Just checking, is this on a normal x86 PC?
> Your numbers of 3-5ms just seem quite high to me but I've not tried timing
> that code.

smp_wmb does not produce a single machine instruction, so this is
probably a fluke in the profiling tool.

The most expensive part of vm_stop_force_state is going to be
bdrv_drain_all/bdrv_flush_all. bdrv_flush_all is definitely not needed
for checkpointing purposes.

Paolo
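
The claim that the barrier itself is essentially free can be checked independently of any profiler by timing a fence in a tight loop. The following is a minimal sketch, not QEMU code: now_ns and avg_fence_ns are hypothetical names, and the C11 release fence is used as a stand-in for the write barrier on the assumption that, on x86 with GCC or Clang, it constrains only the compiler and emits no instruction.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/*
 * Average cost in nanoseconds of one release fence over `iters` iterations.
 * Loop overhead is included, so the result is an upper bound on the fence.
 */
static double avg_fence_ns(int64_t iters)
{
    assert(iters > 0);
    int64_t t0 = now_ns();
    for (int64_t i = 0; i < iters; i++) {
        atomic_thread_fence(memory_order_release);
    }
    return (double)(now_ns() - t0) / (double)iters;
}
```

If avg_fence_ns(1000000) comes out many orders of magnitude below the reported 3-5ms, the time attributed to smp_wmb() most likely belongs to neighbouring code, and timing the individual steps inside the stop path (for example the block-layer drain and flush named in this reply) is the more promising next measurement.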