From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kenel.org" <linux-kernel@vger.kenel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rth@twiddle.net" <rth@twiddle.net>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"amit.shah@redhat.com" <amit.shah@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>,
	"mohan_parthasarathy@hpe.com" <mohan_parthasarathy@hpe.com>,
	"jitendra.kolhe@hpe.com" <jitendra.kolhe@hpe.com>,
	"simhan@hpe.com" <simhan@hpe.com>,
	"rkagan@virtuozzo.com" <rkagan@virtuozzo.com>,
	"riel@redhat.com" <riel@redhat.com>
Subject: Re: [RFC Design Doc] Speed up live migration by skipping free pages
Date: Thu, 24 Mar 2016 16:44:13 +0200	[thread overview]
Message-ID: <20160324164006-mutt-send-email-mst@redhat.com> (raw)
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E0415BCDD@shsmsx102.ccr.corp.intel.com>

On Thu, Mar 24, 2016 at 02:33:15PM +0000, Li, Liang Z wrote:
> > > > > > > Agreed. The current balloon only sends 256 PFNs at a time;
> > > > > > > that's too few and leads to too many rounds of virtio
> > > > > > > transmission, which is the main reason for the bad performance.
> > > > > > > Changing VIRTIO_BALLOON_ARRAY_PFNS_MAX to a larger value can
> > > > > > > improve performance significantly. Maybe we should increase it
> > > > > > > before doing further optimization; what do you think?
> > > > > >
> > > > > > We could push it up a bit higher: 256 entries take 1 kbyte, so we
> > > > > > can make it 3x bigger and still fit struct virtio_balloon in a
> > > > > > single page. But if we are going to add the bitmap variant
> > > > > > anyway, we probably shouldn't bother.
> > > > > >
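For context, the sizing argument above works out as follows (assuming
4-byte PFN entries and a 4 KiB page; the constants are illustrative,
not taken from the virtio-balloon driver):

/* Sketch of the sizing math: 256 4-byte PFN entries occupy 1 KiB, so
 * roughly tripling the array still leaves the whole structure well
 * inside a single 4 KiB page. */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE        4096u
#define PFN_ENTRY_BYTES  ((unsigned)sizeof(uint32_t)) /* one balloon PFN entry */
#define PFNS_MAX         256u  /* assumed current VIRTIO_BALLOON_ARRAY_PFNS_MAX */

int main(void)
{
    unsigned now    = PFNS_MAX * PFN_ENTRY_BYTES;      /* 1024 bytes */
    unsigned bigger = 3 * PFNS_MAX * PFN_ENTRY_BYTES;  /* 3072 bytes */

    printf("current PFN array: %u bytes\n", now);
    printf("3x PFN array:      %u bytes (still below a %u-byte page)\n",
           bigger, PAGE_SIZE);
    return 0;
}
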
> > > > > > > > > c. address translation and madvise() operation (24%,
> > > > > > > > > 1423ms)
> > > > > > > >
> > > > > > > > How is this split between translation and madvise?  I
> > > > > > > > suspect it's mostly madvise, since you need translation when
> > > > > > > > using the bitmap as well.
> > > > > > > > Correct? Could you measure this please?  Also, what if we
> > > > > > > > use the new MADV_FREE instead?  By how much would this help?
> > > > > > > >
> > > > > > > For the current balloon, address translation is needed.
> > > > > > > But for live migration, there is no need to do address translation.
> > > > > >
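As a concrete reference for the MADV_FREE question above, here is a
minimal sketch of releasing a guest-freed range on the host side,
preferring the lazy MADV_FREE when the headers provide it. It
illustrates the trade-off being asked about, not the actual balloon
code path; release_host_range() is a made-up helper name.

/* Minimal sketch, not the balloon implementation: release a range of
 * host virtual memory that the guest has reported as free.  MADV_FREE
 * (Linux 4.5+) is lazy -- pages are only reclaimed under memory
 * pressure -- so the call is cheaper than MADV_DONTNEED, which drops
 * the backing pages immediately. */
#include <stddef.h>
#include <sys/mman.h>

static int release_host_range(void *hva, size_t len)
{
#ifdef MADV_FREE
    if (madvise(hva, len, MADV_FREE) == 0) {
        return 0;
    }
#endif
    return madvise(hva, len, MADV_DONTNEED);
}
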
> > > > > > Well you need ram address in order to clear the dirty bit.
> > > > > > How would you get it without translation?
> > > > > >
> > > > >
> > > > > If you mean that kind of address translation, yes, it's needed.
> > > > > What I want to say is that filtering out the free pages can be
> > > > > done by a bitmap operation.
> > > > >
> > > > > Liang
> > > >
> > > > OK so I see that your patches use block->offset in struct RAMBlock
> > > > to look up bits in guest-supplied bitmap.
> > > > I don't think that's guaranteed to work.
> > >
> > > It's part of the bitmap operation, following the latest change to
> > > ram_list.dirty_memory.
> > > Why do you think so? Could you tell me the reason?
> > >
> > > Liang
> > 
> > Sorry, why do I think what? That ram_addr_t is not guaranteed to equal
> > the GPA of the block?
> > 
> 
> I mean, why do you think that can't be guaranteed to work?
> Yes, ram_addr_t is not guaranteed to equal the GPA of the block, but I didn't use it as
> a GPA. The code in filter_out_guest_free_pages() in my patch just follows the style of
> the latest change to ram_list.dirty_memory[].
> 
> The free page bitmap obtained from the guest in my RFC patch has already had the
> 'hole' filtered out, so bit N of the free page bitmap and bit N in
> ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]->blocks correspond to
> the same guest page.  Right?
> If that's true, I think I am doing the right thing?
> 
> 
> Liang
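
The filtering described in the quoted message amounts to clearing
migration dirty bits wherever the guest marks a page free. A minimal
sketch of that operation, with illustrative names rather than the
actual filter_out_guest_free_pages() code, and assuming the bit-N
correspondence claimed above:

/* Illustrative sketch only: word-wise AND-NOT of the guest-supplied
 * free-page bitmap into the migration dirty bitmap.  It is only correct
 * if bit N of both bitmaps really refers to the same guest page. */
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

static void filter_free_pages(unsigned long *migration_dirty,
                              const unsigned long *guest_free,
                              size_t nr_pages)
{
    size_t nr_words = (nr_pages + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (size_t i = 0; i < nr_words; i++) {
        /* Pages the guest reports as free need not be sent. */
        migration_dirty[i] &= ~guest_free[i];
    }
}

Whether that bit-N correspondence can be relied on is exactly what is
questioned below.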

There's no guarantee that there's a single 'hole'
even on the PC, and we want the balloon to be portable.

So I'm not sure I understand what your patch is doing:
do you mean you pass the GPA-to-ram-addr mapping
from host to guest?

That can be made to work, but it's not a good idea,
and I don't see why it would be faster than doing
the same translation on the host side.
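
To make "the same translation on the host side" concrete: the host
already knows which guest-physical range each RAM block backs, so it
can resolve a guest address to a ram_addr_t itself. A hedged sketch
with made-up type and function names, not the QEMU memory API:

/* Sketch only: resolve a guest physical address (GPA) to a ram_addr_t
 * offset by scanning a host-side table of GPA ranges.  Holes (MMIO,
 * PCI windows, unpopulated space) simply fail the lookup. */
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef uint64_t hwaddr;      /* guest physical address */
typedef uint64_t ram_addr_t;  /* QEMU-internal RAM offset space */

struct gpa_range {
    hwaddr     gpa_start;     /* first guest-physical address of the range */
    uint64_t   length;        /* length of the range in bytes */
    ram_addr_t ram_offset;    /* offset of the backing block in ram_addr_t space */
};

static bool gpa_to_ram_addr(const struct gpa_range *map, size_t n,
                            hwaddr gpa, ram_addr_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= map[i].gpa_start &&
            gpa - map[i].gpa_start < map[i].length) {
            *out = map[i].ram_offset + (gpa - map[i].gpa_start);
            return true;
        }
    }
    return false;
}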


> > E.g. HACKING says:
> > 	Use hwaddr for guest physical addresses except pcibus_t
> > 	for PCI addresses.  In addition, ram_addr_t is a QEMU internal
> > 	address space that maps guest RAM physical addresses into an
> > 	intermediate address space that can map to host virtual
> > 	address spaces.
> > 
> > 
> > --
> > MST


Thread overview: 112+ messages
2016-03-22  7:43 [RFC Design Doc]Speed up live migration by skipping free pages Liang Li
2016-03-22  7:43 ` [Qemu-devel] " Liang Li
2016-03-22 10:11 ` Michael S. Tsirkin
2016-03-22 10:11   ` [Qemu-devel] " Michael S. Tsirkin
2016-03-23  6:05   ` Li, Liang Z
2016-03-23  6:05     ` [Qemu-devel] " Li, Liang Z
2016-03-23 14:08     ` Michael S. Tsirkin
2016-03-23 14:08       ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24  1:19       ` Li, Liang Z
2016-03-24  1:19         ` [Qemu-devel] " Li, Liang Z
2016-03-24  9:48         ` Michael S. Tsirkin
2016-03-24  9:48           ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 10:16           ` Li, Liang Z
2016-03-24 10:16             ` [Qemu-devel] " Li, Liang Z
2016-03-24 10:29             ` Michael S. Tsirkin
2016-03-24 10:29               ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 14:33               ` Li, Liang Z
2016-03-24 14:33                 ` [Qemu-devel] " Li, Liang Z
2016-03-24 14:44                 ` Michael S. Tsirkin [this message]
2016-03-24 14:44                   ` Michael S. Tsirkin
2016-03-24 15:16                   ` Li, Liang Z
2016-03-24 15:16                     ` [Qemu-devel] " Li, Liang Z
2016-03-24 15:18                     ` Paolo Bonzini
2016-03-24 15:18                       ` [Qemu-devel] " Paolo Bonzini
2016-03-24 15:25                       ` Li, Liang Z
2016-03-24 15:25                         ` [Qemu-devel] " Li, Liang Z
2016-03-24 15:27                     ` Michael S. Tsirkin
2016-03-24 15:27                       ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 15:39                       ` Li, Liang Z
2016-03-24 15:39                         ` [Qemu-devel] " Li, Liang Z
2016-03-24 15:47                         ` Paolo Bonzini
2016-03-24 15:47                           ` [Qemu-devel] " Paolo Bonzini
2016-03-24 15:59                           ` Li, Liang Z
2016-03-24 15:59                             ` [Qemu-devel] " Li, Liang Z
2016-03-22 19:05 ` Dr. David Alan Gilbert
2016-03-22 19:05   ` [Qemu-devel] " Dr. David Alan Gilbert
2016-03-23  6:48   ` Li, Liang Z
2016-03-23  6:48     ` [Qemu-devel] " Li, Liang Z
2016-03-24  1:24     ` Wei Yang
2016-03-24  1:24       ` [Qemu-devel] " Wei Yang
2016-03-24  9:00       ` Dr. David Alan Gilbert
2016-03-24  9:00         ` [Qemu-devel] " Dr. David Alan Gilbert
2016-03-24 10:09         ` Li, Liang Z
2016-03-24 10:09           ` [Qemu-devel] " Li, Liang Z
2016-03-24 10:23           ` Dr. David Alan Gilbert
2016-03-24 10:23             ` [Qemu-devel] " Dr. David Alan Gilbert
2016-03-24 14:50             ` Li, Liang Z
2016-03-24 14:50               ` [Qemu-devel] " Li, Liang Z
2016-03-24 15:11               ` Michael S. Tsirkin
2016-03-24 15:11                 ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 15:53                 ` Li, Liang Z
2016-03-24 15:53                   ` [Qemu-devel] " Li, Liang Z
2016-03-24 15:56                   ` Michael S. Tsirkin
2016-03-24 15:56                     ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 16:05                     ` Li, Liang Z
2016-03-24 16:05                       ` [Qemu-devel] " Li, Liang Z
2016-03-24 16:25                       ` Michael S. Tsirkin
2016-03-24 16:25                         ` [Qemu-devel] " Michael S. Tsirkin
2016-03-24 17:49                         ` Dr. David Alan Gilbert
2016-03-24 17:49                           ` [Qemu-devel] " Dr. David Alan Gilbert
2016-03-24 22:16                           ` Michael S. Tsirkin
2016-03-24 22:16                             ` [Qemu-devel] " Michael S. Tsirkin
2016-03-25  1:59                             ` Li, Liang Z
2016-03-25  1:59                               ` [Qemu-devel] " Li, Liang Z
2016-03-25  1:32                           ` Li, Liang Z
2016-03-25  1:32                             ` [Qemu-devel] " Li, Liang Z
2016-04-18 11:08                           ` Li, Liang Z
2016-04-18 11:08                             ` [Qemu-devel] " Li, Liang Z
2016-04-18 11:29                             ` Michael S. Tsirkin
2016-04-18 11:29                               ` [Qemu-devel] " Michael S. Tsirkin
2016-04-18 14:36                               ` Li, Liang Z
2016-04-18 14:36                                 ` [Qemu-devel] " Li, Liang Z
2016-04-18 15:38                                 ` Michael S. Tsirkin
2016-04-18 15:38                                   ` [Qemu-devel] " Michael S. Tsirkin
2016-04-19  2:20                                   ` Li, Liang Z
2016-04-19  2:20                                     ` [Qemu-devel] " Li, Liang Z
2016-04-19 19:12                               ` Dr. David Alan Gilbert
2016-04-19 19:12                                 ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-25 10:56                                 ` Michael S. Tsirkin
2016-04-25 10:56                                   ` [Qemu-devel] " Michael S. Tsirkin
2016-04-19 19:05                             ` Dr. David Alan Gilbert
2016-04-19 19:05                               ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-20  3:22                               ` Li, Liang Z
2016-04-20  3:22                                 ` [Qemu-devel] " Li, Liang Z
2016-04-20  8:10                                 ` Dr. David Alan Gilbert
2016-04-20  8:10                                   ` [Qemu-devel] " Dr. David Alan Gilbert
2016-03-25  1:32                         ` Li, Liang Z
2016-03-25  1:32                           ` [Qemu-devel] " Li, Liang Z
2016-04-01 10:54   ` Amit Shah
2016-04-01 10:54     ` [Qemu-devel] " Amit Shah
2016-04-05  1:49     ` Li, Liang Z
2016-04-05  1:49       ` [Qemu-devel] " Li, Liang Z
2016-03-23  1:37 ` Wei Yang
2016-03-23  1:37   ` [Qemu-devel] " Wei Yang
2016-03-23  7:18   ` Li, Liang Z
2016-03-23  7:18     ` [Qemu-devel] " Li, Liang Z
2016-03-23  9:46     ` Wei Yang
2016-03-23  9:46       ` [Qemu-devel] " Wei Yang
2016-03-23 14:35       ` Li, Liang Z
2016-03-23 14:35         ` [Qemu-devel] " Li, Liang Z
2016-03-24  0:52         ` Wei Yang
2016-03-24  0:52           ` [Qemu-devel] " Wei Yang
2016-03-24  1:32           ` Li, Liang Z
2016-03-24  1:32             ` [Qemu-devel] " Li, Liang Z
2016-03-24  1:56             ` Wei Yang
2016-03-24  1:56               ` [Qemu-devel] " Wei Yang
2016-03-23 16:53     ` Eric Blake
2016-03-23 16:53       ` Eric Blake
2016-03-23 21:41       ` Wei Yang
2016-03-23 21:41         ` Wei Yang
2016-03-24  1:23       ` Li, Liang Z
2016-03-24  1:23         ` Li, Liang Z
