From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kenel.org" <linux-kernel@vger.kenel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rth@twiddle.net" <rth@twiddle.net>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"amit.shah@redhat.com" <amit.shah@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>,
	"mohan_parthasarathy@hpe.com" <mohan_parthasarathy@hpe.com>,
	"jitendra.kolhe@hpe.com" <jitendra.kolhe@hpe.com>,
	"simhan@hpe.com" <simhan@hpe.com>,
	"rkagan@virtuozzo.com" <rkagan@virtuozzo.com>,
	"riel@redhat.com" <riel@redhat.com>
Subject: Re: [RFC Design Doc]Speed up live migration by skipping free pages
Date: Thu, 24 Mar 2016 16:44:13 +0200
Message-ID: <20160324164006-mutt-send-email-mst@redhat.com>
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E0415BCDD@shsmsx102.ccr.corp.intel.com>

On Thu, Mar 24, 2016 at 02:33:15PM +0000, Li, Liang Z wrote:
> > > > > > > Agree. Current balloon just send 256 PFNs a time, that's too
> > > > > > > few and lead to too many times of virtio transmission, that's
> > > > > > > the main reason for the bad performance.
> > > > > > > Change the VIRTIO_BALLOON_ARRAY_PFNS_MAX to a large value can
> > > > > > > improve the performance significant. Maybe we should increase it
> > > > > > > before doing the further optimization, do you think so ?
> > > > > >
> > > > > > We could push it up a bit higher: 256 is 1kbyte in size, so we
> > > > > > can make it 3x bigger and still fit struct virtio_balloon is a
> > > > > > single page. But if we are going to add the bitmap variant
> > > > > > anyway, we probably shouldn't bother.
> > > > > >
> > > > > > > > > c. address translation and madvise() operation (24%,
> > > > > > > > > 1423ms)
> > > > > > > >
> > > > > > > > How is this split between translation and madvise? I
> > > > > > > > suspect it's mostly madvise since you need translation when
> > > > > > > > using bitmap as well.
> > > > > > > > Correct? Could you measure this please? Also, what if we
> > > > > > > > use the new MADV_FREE instead? By how much would this help?
> > > > > > > >
> > > > > > > For the current balloon, address translation is needed.
> > > > > > > But for live migration, there is no need to do address translation.
> > > > > >
> > > > > > Well you need ram address in order to clear the dirty bit.
> > > > > > How would you get it without translation?
> > > > > >
> > > > >
> > > > > If you means that kind of address translation, yes, it need.
> > > > > What I want to say is, filter out the free page can be done by
> > > > > bitmap operation.
> > > > >
> > > > > Liang
> > > >
> > > > OK so I see that your patches use block->offset in struct RAMBlock
> > > > to look up bits in guest-supplied bitmap.
> > > > I don't think that's guaranteed to work.
> > >
> > > It's part of the bitmap operation, because the latest change of the
> > > ram_list.dirty_memory.
> > > Why do you think so? Could you tell me the reason?
> > >
> > > Liang
> >
> > Sorry, why do I think what? That ram_addr_t is not guaranteed to equal GPA
> > of the block?
> >
>
> I mean why do you think that's can't guaranteed to work.
> Yes, ram_addr_t is not guaranteed to equal GPA of the block. But I didn't use them as
> GPA. The code in the filter_out_guest_free_pages() in my patch just follow the style of
> the latest change of ram_list.dirty_memory[].
>
> The free page bitmap got from the guest in my RFC patch has been filtered out the
> 'hole', so the bit N of the free page bitmap and the bit N in
> ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]->blocks are corresponding to
> the same guest page. Right?
> If it's true, I think I am doing the right thing?
>
>
> Liang

There's no guarantee that there's a single 'hole' even on the PC, and we
want the balloon to be portable.

So I'm not sure I understand what your patch is doing. Do you mean you
pass the GPA to ram_addr_t mapping from host to guest? That can be made
to work, but it's not a good idea, and I don't see why it would be faster
than doing the same translation on the host side.

> > E.g. HACKING says:
> > 	Use hwaddr for guest physical addresses except pcibus_t
> > 	for PCI addresses.  In addition, ram_addr_t is a QEMU internal address
> > 	space that maps guest RAM physical addresses into an intermediate
> > 	address space that can map to host virtual address spaces.
> >
> >
> > --
> > MST
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in the body of
> > a message to majordomo@vger.kernel.org More majordomo info at
> > http://vger.kernel.org/majordomo-info.html
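
To make the point about host-side translation concrete, here is a minimal sketch in C of why bit N of a guest-supplied free page bitmap need not line up with bit N of a dirty bitmap indexed by ram_addr_t. Everything in it is an assumption for illustration: struct ram_region, gpa_to_ram_offset() and skip_free_pages() are hypothetical names, not QEMU's RAMBlock or migration code, and the 4 KiB page size is assumed. The guest bitmap is indexed by guest PFN, and each free page is translated on the host before its dirty bit is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumed 4 KiB pages */

/* Hypothetical stand-in for one guest RAM region (not QEMU's RAMBlock). */
struct ram_region {
    uint64_t gpa_start;   /* guest physical address of the first byte */
    uint64_t ram_offset;  /* offset in the host-internal RAM address space */
    uint64_t size;        /* region size in bytes */
};

/* Translate a guest physical address to a host-internal RAM offset.
 * Returns false when the address falls into a hole between regions. */
static bool gpa_to_ram_offset(const struct ram_region *r, size_t n,
                              uint64_t gpa, uint64_t *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (gpa >= r[i].gpa_start && gpa - r[i].gpa_start < r[i].size) {
            *out = r[i].ram_offset + (gpa - r[i].gpa_start);
            return true;
        }
    }
    return false;
}

static bool test_bit64(const uint64_t *map, uint64_t nr)
{
    return (map[nr / 64] >> (nr % 64)) & 1;
}

static void clear_bit64(uint64_t *map, uint64_t nr)
{
    map[nr / 64] &= ~(UINT64_C(1) << (nr % 64));
}

/* For every page the guest reports as free, clear the matching dirty bit.
 * free_map is indexed by guest PFN; dirty_map is indexed by RAM offset page.
 * Bits that fall into a hole are ignored. Returns the number of bits cleared. */
static size_t skip_free_pages(const struct ram_region *r, size_t n,
                              const uint64_t *free_map, uint64_t guest_pfns,
                              uint64_t *dirty_map)
{
    size_t cleared = 0;
    for (uint64_t pfn = 0; pfn < guest_pfns; pfn++) {
        uint64_t off;
        if (!test_bit64(free_map, pfn)) {
            continue;
        }
        if (!gpa_to_ram_offset(r, n, pfn << PAGE_SHIFT, &off)) {
            continue;
        }
        if (test_bit64(dirty_map, off >> PAGE_SHIFT)) {
            clear_bit64(dirty_map, off >> PAGE_SHIFT);
            cleared++;
        }
    }
    return cleared;
}
```

With two regions whose order in the RAM offset space differs from their order in guest physical space, and a hole between them, a free guest PFN lands on a dirty bit with a different index. A plain bitwise operation between the two bitmaps would then clear the wrong pages, which is why the translation step is kept on the host in this sketch.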