* [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
@ 2019-07-11  9:44 Lin Ma
  2019-07-11 10:24 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 7+ messages in thread
From: Lin Ma @ 2019-07-11  9:44 UTC (permalink / raw)
  To: qemu-devel, dgilbert

Hi all,

When I live migrate a qemu/kvm guest that is using huge pages, I found that
the migrate_set_speed command has no effect during stage 2.
It was caused by commit 4c011c3 ("postcopy: Send whole huge pages").

I'm wondering: is this by design, or is it a bug waiting for a fix?


Thanks,
Lin


* Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
  2019-07-11  9:44 [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages Lin Ma
@ 2019-07-11 10:24 ` Dr. David Alan Gilbert
       [not found]   ` <BY5PR18MB331347C441DA068E32BFDE53C5F20@BY5PR18MB3313.namprd18.prod.outlook.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2019-07-11 10:24 UTC (permalink / raw)
  To: Lin Ma; +Cc: qemu-devel

* Lin Ma (LMa@suse.com) wrote:
> Hi all,

Hi Lin,

> When I live migrate a qemu/kvm guest that is using huge pages, I found that
> the migrate_set_speed command has no effect during stage 2.

Can you explain what you mean by 'stage 2'?

> It was caused by commit 4c011c3 ("postcopy: Send whole huge pages").
> 
> I'm wondering: is this by design, or is it a bug waiting for a fix?

This is the first report I've seen for it.  How did you conclude that
4c011c3 caused it?  While I can see it might have some effect on the
bandwidth management, I'm surprised it has this much effect.

What size huge pages are you using - 2MB or 1GB?
I can imagine we might have a problem: since we only do the sleep
between hugepages, if we were using 1GB hugepages then
we'd see <big chunk of data>[sleep]<big chunk of data>[sleep],
which isn't as smooth as it used to be.
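
(As a rough illustration: at migrate_set_speed 5, i.e. 5 MBytes/sec, a single
1GB hugepage is already around 200 seconds' worth of quota, so a limiter that
only runs between hugepages gets no chance to pace anything inside that
gigabyte - this is just back-of-the-envelope arithmetic, not a measurement.)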

Can you give me some more details of your test?

Dave

> 
> Thanks,
> Lin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
       [not found]   ` <BY5PR18MB331347C441DA068E32BFDE53C5F20@BY5PR18MB3313.namprd18.prod.outlook.com>
@ 2019-07-12 12:34     ` Dr. David Alan Gilbert
  2019-07-15  9:43       ` [Qemu-devel] Re: " Lin Ma
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2019-07-12 12:34 UTC (permalink / raw)
  To: Lin Ma; +Cc: qemu-devel

* Lin Ma (LMa@suse.com) wrote:
> 
> 
> > -----Original Message-----
> > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Sent: July 11, 2019 18:24
> > To: Lin Ma <LMa@suse.com>
> > Cc: qemu-devel@nongnu.org
> > Subject: Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using
> > hugepages.
> > 
> > * Lin Ma (LMa@suse.com) wrote:
> > > Hi all,
> > 
> > Hi Lin,
> 
> Hi Dave,
> > 
> > > When I live migrate a qemu/kvm guest that is using huge
> > > pages, I found that the migrate_set_speed command has no effect during
> > > stage 2.
> > 
> > Can you explain what you mean by 'stage 2'?
> We know that the live migration contains 3 stages:
> Stage 1: Mark all of RAM dirty.
> Stage 2: Keep sending dirty RAM pages since last iteration
> Stage 3: Stop guest, transfer remaining dirty RAM, device state
> (Please refer to https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines/#live-migration for further details)
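
(For reference, stage 2 can be watched from the source host with a stock virsh
command; the domain name here is the guest used in the tests further down:)

  # "Memory remaining" in the job info shrinks as each stage-2 pass resends
  # the pages dirtied since the previous iteration
  watch -n 1 virsh domjobinfo sles12sp4_i440fx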

OK, yeh the numbering is pretty arbitrary so it's not something I
normally think about like that.

> 
> > > It was caused by commit 4c011c3 ("postcopy: Send whole huge pages").
> > >
> > > I'm wondering: is this by design, or is it a bug waiting for a fix?
> > 
> > This is the first report I've seen for it.  How did you conclude that
> > 4c011c3 caused it?  While I can see it might have some effect on the
> > bandwidth management, I'm surprised it has this much effect.
> 
> While digging into the bandwidth issue, git bisect showed that this commit was the first bad commit.
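
(For reference, such a bisect run looks roughly like the following; the good
endpoint below is a placeholder, not the exact revision that was used:)

  git bisect start
  git bisect bad HEAD                        # build where the rate limit is ignored
  git bisect good <last-known-good-commit>   # placeholder for the last build that limited correctly
  # build and retest migration bandwidth at each step, then mark it:
  #   git bisect good    (or)    git bisect bad
  # repeat until git reports "... is the first bad commit", then:
  git bisect reset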

OK.

> > What size huge pages are you using - 2MB or 1GB?
> 
> When I hit this issue I was using a 1GB huge page size.
> I tested this issue with a 2MB page size today on a Gigabit LAN. Although the bandwidth control looks
> a little better than with 1GB, it is not much better. Please refer to the test results below.

OK, I can certainly see why this might happen with 1GB huge pages; I
need to have a think about a fix.

> > I can imagine we might have a problem that since we only do the sleep between
> > the hugepages, if we were using 1GB hugepages then we'd see <big chunk of
> > data>[sleep]<big chunk of data>[sleep] which isn't as smooth as it used to be.
> > 
> > Can you give me some more details of your test?
> 
> Live migration bandwidth management testing with a 2MB hugepage size:
> sles12sp4_i440fx is a qemu/kvm guest with 6GB of memory.
> Note: the throughput values are approximate.
> 
> Terminal 1:
> virsh migrate-setspeed sles12sp4_i440fx $bandwidth && virsh migrate --live sles12sp4_i440fx qemu+tcp://5810f/system
> 
> Terminal 2:
> virsh qemu-monitor-command sles12sp4_i440fx --hmp "info migrate"
> 
> bandwidth=5
> throughput: 160 mbps
> 
> bandwidth=10
> throughput: 167 mbps
> 
> bandwidth=15
> throughput: 168 mbps
> 
> bandwidth=20
> throughput: 168 mbps
> 
> bandwidth=21
> throughput: 336 mbps
> 
> bandwidth=22
> throughput: 336 mbps
> 
> bandwidth=25
> throughput: 335.87 mbps
> 
> bandwidth=30
> throughput: 335 mbps
> 
> bandwidth=35
> throughput: 335 mbps
> 
> bandwidth=40
> throughput: 335 mbps
> 
> bandwidth=45
> throughput: 504.00 mbps
> 
> bandwidth=50
> throughput: 500.00 mbps
> 
> bandwidth=55
> throughput: 500.00 mbps
> 
> bandwidth=60
> throughput: 500.00 mbps
> 
> bandwidth=65
> throughput: 650.00 mbps
> 
> bandwidth=70
> throughput: 660.00 mbps
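
(For what it's worth, the sweep above can be scripted; this is only a rough
sketch reusing the two commands and names already shown, not a turnkey test
harness - in particular the guest has to be brought back to the source host
between data points:)

  #!/bin/bash
  # For each bandwidth limit: set it, start a live migration, let stage 2
  # settle for a moment, then sample the throughput reported by the monitor.
  dom=sles12sp4_i440fx
  dest=qemu+tcp://5810f/system        # destination URI as used above
  for bw in 5 10 15 20 25 30 35 40 45 50 55 60 65 70; do
      virsh migrate-setspeed "$dom" "$bw"
      virsh migrate --live "$dom" "$dest" &
      sleep 30
      echo "bandwidth=$bw"
      virsh qemu-monitor-command "$dom" --hmp "info migrate" | grep throughput
      wait    # wait for the migration to finish before the next data point
  done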

OK, so migrate-setspeed takes a bandwidth in MBytes/sec and I guess
your throughput is in MBit/sec - so at the higher end it's about
right, and at the lower end it's way off.
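
(Spelling the units out: 65 MBytes/sec is 520 Mbit/sec, in the same ballpark as
the reported 650 mbps, whereas 5 MBytes/sec is only 40 Mbit/sec against a
reported 160 mbps - about four times over the requested limit at the low end.)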

Let me think about a fix for this.

What are you using to measure throughput?

Dave

> 
> Thanks,
> Lin
> 
> 
> > Dave
> > 
> > >
> > > Thanks,
> > > Lin
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* [Qemu-devel] Re: migrate_set_speed has no effect if the guest is using hugepages.
  2019-07-12 12:34     ` [Qemu-devel] Re: " Dr. David Alan Gilbert
@ 2019-07-15  9:43       ` Lin Ma
  2019-08-02 10:57         ` [Qemu-devel] Re: " Lin Ma
  2019-12-05 10:31         ` Re: [Qemu-devel] " Dr. David Alan Gilbert
  0 siblings, 2 replies; 7+ messages in thread
From: Lin Ma @ 2019-07-15  9:43 UTC (permalink / raw)
  To:  Dr. David Alan Gilbert ; +Cc: qemu-devel



> -----Original Message-----
> From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Sent: July 12, 2019 20:34
> To: Lin Ma <LMa@suse.com>
> Cc: qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] migrate_set_speed has no effect if the guest is
> using hugepages.
> 
> 
> OK, so migrate-setspeed takes a bandwidth in MBytes/sec and I guess your
> throughput is in MBit/sec - so at the higher end it's about right, and at the lower
> end it's way off.
> 
> Let me think about a fix for this.
> 
> What are you using to measure throughput?

I use the 'watch' command to observe the output of the QEMU HMP command 'info migrate' and calculate the
average value of the throughput field during stage 2 of live migration.
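
(In other words, something along these lines, using the same domain and monitor
command as in the test above:)

  # print the reported throughput once a second while stage 2 runs
  watch -n 1 'virsh qemu-monitor-command sles12sp4_i440fx --hmp "info migrate" | grep throughput'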

Thanks for taking time to dig into this issue,
Lin


* [Qemu-devel] Re: migrate_set_speed has no effect if the guest is using hugepages.
  2019-07-15  9:43       ` [Qemu-devel] Re: " Lin Ma
@ 2019-08-02 10:57         ` Lin Ma
  2019-12-05 10:31         ` Re: [Qemu-devel] " Dr. David Alan Gilbert
  1 sibling, 0 replies; 7+ messages in thread
From: Lin Ma @ 2019-08-02 10:57 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel

Hi Dave,

May I ask whether you have any update on the fix?

Thanks,
Lin



* Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
  2019-07-15  9:43       ` [Qemu-devel] Re: " Lin Ma
  2019-08-02 10:57         ` [Qemu-devel] Re: " Lin Ma
@ 2019-12-05 10:31         ` Dr. David Alan Gilbert
       [not found]           ` <BY5PR18MB331333470C356200DDF0A8DAC55B0@BY5PR18MB3313.namprd18.prod.outlook.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2019-12-05 10:31 UTC (permalink / raw)
  To: Lin Ma; +Cc: qemu-devel

Hi Lin,
  I've just posted 'migration: Rate limit inside host pages'; please let
me know if it helps for you.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
       [not found]           ` <BY5PR18MB331333470C356200DDF0A8DAC55B0@BY5PR18MB3313.namprd18.prod.outlook.com>
@ 2019-12-10 14:01             ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2019-12-10 14:01 UTC (permalink / raw)
  To: Lin Ma; +Cc: qemu-devel

* Lin Ma (LMa@suse.com) wrote:
> Hi Dave,
> 
> The patch fixed the issue; the rate limit with hugepages works well.
> Thanks for your help!

No problem; thank you for reporting and testing it.

Dave

> Lin
> 
> > -----Original Message-----
> > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Sent: December 5, 2019 18:32
> > To: Lin Ma <LMa@suse.com>
> > Cc: qemu-devel@nongnu.org
> > Subject: Re: [Qemu-devel] migrate_set_speed has no effect if the
> > guest is using hugepages.
> > 
> > Hi Lin,
> >   I've just posted 'migration: Rate limit inside host pages'; please let me know if
> > it helps for you.
> > 
> > Dave
> > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



