* [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Dr. David Alan Gilbert (git) @ 2019-09-23 17:49 UTC
To: qemu-devel, quintela, peterx; +Cc: thuth, alex.bennee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Various parts of the migration code do different things when they're
in postcopy mode; prior to this patch this has been 'postcopy-active'.
This patch extends 'in_postcopy' to include 'postcopy-paused' and
'postcopy-recover'.

In particular, when you set the max-postcopy-bandwidth parameter, this
only affects the current migration fd if we're 'in_postcopy';
this leads to a race in the postcopy recovery test where it increases
the speed from 4k/sec to unlimited, but that increase can get ignored
if the change is made between the point at which the reconnection
happens and it transitions back to active.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/migration.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 01863a95f5..5f7e4d15e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
 {
     MigrationState *s = migrate_get_current();
 
-    return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
+    switch (s->state) {
+    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+    case MIGRATION_STATUS_POSTCOPY_PAUSED:
+    case MIGRATION_STATUS_POSTCOPY_RECOVER:
+        return true;
+    default:
+        return false;
+    }
 }
 
 bool migration_in_postcopy_after_devices(MigrationState *s)
--
2.21.0
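
The race described in the commit message can be illustrated with a small
standalone model. This is only a sketch, not QEMU code: the Sketch* types,
in_postcopy_old(), in_postcopy_new() and set_postcopy_bw() are hypothetical
names invented here, and the assumption that the parameter value is always
stored but only pushed to the live channel when the predicate is true is
taken from the commit message, not from the QEMU source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the migration states. */
typedef enum {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_POSTCOPY_ACTIVE,
    SKETCH_STATUS_POSTCOPY_PAUSED,
    SKETCH_STATUS_POSTCOPY_RECOVER,
    SKETCH_STATUS_COMPLETED,
} SketchStatus;

typedef struct {
    SketchStatus state;
    int64_t param_bw;  /* stored parameter value; 0 means unlimited */
    int64_t fd_limit;  /* limit currently applied to the live channel */
} SketchMigration;

/* Pre-patch behaviour: only 'postcopy-active' counts as in postcopy. */
static bool in_postcopy_old(const SketchMigration *m)
{
    return m->state == SKETCH_STATUS_POSTCOPY_ACTIVE;
}

/* Post-patch behaviour: the paused and recover states count as well. */
static bool in_postcopy_new(const SketchMigration *m)
{
    switch (m->state) {
    case SKETCH_STATUS_POSTCOPY_ACTIVE:
    case SKETCH_STATUS_POSTCOPY_PAUSED:
    case SKETCH_STATUS_POSTCOPY_RECOVER:
        return true;
    default:
        return false;
    }
}

/*
 * Assumed model of setting the bandwidth parameter: the value is always
 * recorded, but it only reaches the live channel if the supplied
 * predicate says we are in postcopy.
 */
static void set_postcopy_bw(SketchMigration *m, int64_t bw,
                            bool (*in_postcopy)(const SketchMigration *))
{
    assert(bw >= 0);
    m->param_bw = bw;
    if (in_postcopy(m)) {
        m->fd_limit = bw;
    }
}
```

In this model, a migration that sits in the recover state with a 4096
byte/sec limit keeps that limit on the channel when the bandwidth is raised
through the old predicate, whereas the new predicate lets the change through.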
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Alex Bennée @ 2019-09-23 18:23 UTC
To: Dr. David Alan Gilbert (git); +Cc: thuth, qemu-devel, peterx, quintela
Dr. David Alan Gilbert (git) <dgilbert@redhat.com> writes:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
I'm stress testing it now.
--
Alex Bennée
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Peter Xu @ 2019-09-24 0:15 UTC
To: Dr. David Alan Gilbert (git); +Cc: thuth, alex.bennee, qemu-devel, quintela
On Mon, Sep 23, 2019 at 06:49:42PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Yeah, this makes quite a lot of sense to me...
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Juan Quintela @ 2019-09-24 7:29 UTC
To: Dr. David Alan Gilbert (git); +Cc: thuth, alex.bennee, qemu-devel, peterx
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Alex Bennée @ 2019-09-24 15:39 UTC
To: Dr. David Alan Gilbert (git); +Cc: thuth, qemu-devel, peterx, quintela
Dr. David Alan Gilbert (git) <dgilbert@redhat.com> writes:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In my xenial stress test I ran it 100 times and it never triggered the 180s
timeout I set on my retry.py script:
Tested-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Markus Armbruster @ 2019-09-25 9:21 UTC
To: Dr. David Alan Gilbert (git)
Cc: thuth, alex.bennee, qemu-devel, peterx, quintela
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This seems to fix the intermittent hangs I observed and bisected to
commit 8504ddeca0 "migration: Fix postcopy bw for recovery".
Tested-by: Markus Armbruster <armbru@redhat.com>
* Re: [PATCH] migration/postcopy: Recognise the recovery states as 'in_postcopy'
From: Dr. David Alan Gilbert @ 2019-09-25 10:37 UTC
To: qemu-devel, quintela, peterx; +Cc: thuth, alex.bennee
* Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Queued
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK