* [Qemu-devel] [PATCH v3] migration: always initial ram_counters for a new migration
From: Ivan Ren @ 2019-08-02 10:18 UTC
To: quintela, dgilbert; +Cc: richardw.yang, qemu-devel
From: Ivan Ren <ivanren@tencent.com>
This patch fixes a bug in the migration speed calculation for multifd
migration. The problem can be reproduced as follows:
1. Start a VM and apply a heavy memory write load so that the VM cannot be
   migrated to the destination successfully.
2. Begin a migration with multifd.
3. Let the migration run for a long time [what actually matters is the number
   of bytes transferred].
4. Cancel the migration (migrate cancel).
5. Begin a new migration with multifd; the migration goes straight into the
   migration_completion phase.

The reason is as follows:

migration_update_counters updates the bandwidth and s->threshold_size after
each BUFFER_DELAY interval:

    current_bytes = migration_total_bytes(s);
    transferred = current_bytes - s->iteration_initial_bytes;
    time_spent = current_time - s->iteration_start_time;
    bandwidth = (double)transferred / time_spent;
    s->threshold_size = bandwidth * s->parameters.downtime_limit;

With multifd, migration_total_bytes returns
qemu_ftell(s->to_dst_file) + ram_counters.multifd_bytes.
s->iteration_initial_bytes is initialized to 0 at the start of every new
migration, but ram_counters is a global variable, so it keeps accumulating the
data of earlier migrations. If ram_counters.multifd_bytes is large enough, the
first call to migration_update_counters computes an inflated bandwidth and
s->threshold_size, and the check pending_size >= s->threshold_size in
migration_iteration_run can then become false.
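
To make the arithmetic concrete, here is a minimal stand-alone sketch of the
calculation above. It is not QEMU code: the Toy* types, the toy_* functions,
the 300 ms downtime limit and the byte counts are all assumptions chosen for
illustration. It only mirrors the five assignments quoted above and shows how
bytes left over from a cancelled run inflate the first threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global ram_counters (not the QEMU type). */
typedef struct {
    uint64_t multifd_bytes;
} ToyRamCounters;

/* Hypothetical stand-in for the relevant MigrationState fields. */
typedef struct {
    uint64_t file_bytes;              /* plays the role of qemu_ftell() */
    uint64_t iteration_initial_bytes;
    int64_t  iteration_start_time;    /* milliseconds */
    int64_t  downtime_limit;          /* milliseconds */
    double   threshold_size;          /* bytes */
} ToyMigrationState;

static ToyRamCounters toy_counters;   /* global, like ram_counters */

static uint64_t toy_total_bytes(const ToyMigrationState *s)
{
    return s->file_bytes + toy_counters.multifd_bytes;
}

/* Mirrors the bandwidth/threshold update quoted in the commit message. */
static void toy_update_counters(ToyMigrationState *s, int64_t now_ms)
{
    uint64_t current_bytes = toy_total_bytes(s);
    uint64_t transferred = current_bytes - s->iteration_initial_bytes;
    int64_t time_spent = now_ms - s->iteration_start_time;

    assert(time_spent > 0);
    double bandwidth = (double)transferred / time_spent;
    s->threshold_size = bandwidth * s->downtime_limit;

    s->iteration_start_time = now_ms;
    s->iteration_initial_bytes = current_bytes;
}

/*
 * Threshold after the first update of a new migration that starts with
 * iteration_initial_bytes = 0 while the global counter still holds
 * stale_bytes from an earlier, cancelled migration.
 */
static double toy_first_threshold(uint64_t stale_bytes)
{
    ToyMigrationState s = { .downtime_limit = 300 };

    toy_counters.multifd_bytes = stale_bytes;       /* never reset */
    toy_counters.multifd_bytes += (uint64_t)10 << 20; /* 10 MiB sent now */
    toy_update_counters(&s, 100);                   /* 100 ms elapsed */
    return s.threshold_size;
}
```

With no stale bytes the threshold comes out at roughly 30 MiB; with 100 GiB
left over it is several hundred GiB, so any realistic pending_size falls below
it and the toy migration would complete immediately.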
Signed-off-by: Ivan Ren <ivanren@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Wei Yang <richardw.yang@linux.intel.com>
---
v2->v3:
- Fix the update_iteration_initial_status function prototype

v1->v2:
- Add helper update_iteration_initial_status to update the statistics fields
  together, so that mismatched values cannot lead to a wrong calculation.

 migration/migration.c | 25 +++++++++++++++++++------
 migration/savevm.c    |  1 +
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8a607fe1e2..bea9b1d796 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1908,6 +1908,11 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
     }

     migrate_init(s);
+    /*
+     * set ram_counters memory to zero for a
+     * new migration
+     */
+    memset(&ram_counters, 0, sizeof(ram_counters));

     return true;
 }
@@ -3025,6 +3030,17 @@ static void migration_calculate_complete(MigrationState *s)
     }
 }

+static void update_iteration_initial_status(MigrationState *s)
+{
+    /*
+     * Update these three fields at the same time to avoid mismatch info lead
+     * wrong speed calculation.
+     */
+    s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    s->iteration_initial_bytes = migration_total_bytes(s);
+    s->iteration_initial_pages = ram_get_total_transferred_pages();
+}
+
 static void migration_update_counters(MigrationState *s,
                                       int64_t current_time)
 {
@@ -3060,9 +3076,7 @@ static void migration_update_counters(MigrationState *s,

     qemu_file_reset_rate_limit(s->to_dst_file);

-    s->iteration_start_time = current_time;
-    s->iteration_initial_bytes = current_bytes;
-    s->iteration_initial_pages = ram_get_total_transferred_pages();
+    update_iteration_initial_status(s);

     trace_migrate_transferred(transferred, time_spent,
                               bandwidth, s->threshold_size);
@@ -3186,7 +3200,7 @@ static void *migration_thread(void *opaque)
     rcu_register_thread();

     object_ref(OBJECT(s));
-    s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    update_iteration_initial_status(s);

     qemu_savevm_state_header(s->to_dst_file);

@@ -3251,8 +3265,7 @@ static void *migration_thread(void *opaque)
              * the local variables. This is important to avoid
              * breaking transferred_bytes and bandwidth calculation
              */
-            s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-            s->iteration_initial_bytes = 0;
+            update_iteration_initial_status(s);
         }

         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
diff --git a/migration/savevm.c b/migration/savevm.c
index 79ed44d475..480c511b19 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1424,6 +1424,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     }

     migrate_init(ms);
+    memset(&ram_counters, 0, sizeof(ram_counters));
     ms->to_dst_file = f;

     qemu_mutex_unlock_iothread();
--
2.17.2 (Apple Git-113)
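
The two ideas in the patch (zero the global counters when a migration starts,
and take the per-iteration baseline from the current totals in one place) can
be sketched outside QEMU as follows. This is only an illustration under
assumed names: ToyRamCounters, ToyIterationBaseline and the toy_* functions
are hypothetical and do not exist in the QEMU tree.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the global ram_counters. */
typedef struct {
    uint64_t multifd_bytes;
    uint64_t normal_pages;
} ToyRamCounters;

/* The three values the patch's helper refreshes together. */
typedef struct {
    int64_t  iteration_start_time;    /* milliseconds */
    uint64_t iteration_initial_bytes;
    uint64_t iteration_initial_pages;
} ToyIterationBaseline;

static ToyRamCounters toy_counters;

/* Analogue of the memset() added after migrate_init(). */
static void toy_begin_migration(void)
{
    memset(&toy_counters, 0, sizeof(toy_counters));
}

/*
 * Analogue of update_iteration_initial_status(): time, bytes and pages
 * are sampled in one call, so they always describe the same instant.
 */
static void toy_snapshot_baseline(ToyIterationBaseline *b, int64_t now_ms)
{
    b->iteration_start_time = now_ms;
    b->iteration_initial_bytes = toy_counters.multifd_bytes;
    b->iteration_initial_pages = toy_counters.normal_pages;
}

/* Bytes sent since the last snapshot; stale totals cancel out. */
static uint64_t toy_bytes_since_baseline(const ToyIterationBaseline *b)
{
    return toy_counters.multifd_bytes - b->iteration_initial_bytes;
}
```

Because the baseline is read from the running totals rather than hard-coded to
0, the per-iteration delta stays correct even if the counters are non-zero when
the snapshot is taken; the reset additionally keeps the reported totals
specific to the current migration.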
* Re: [Qemu-devel] [PATCH v3] migration: always initial ram_counters for a new migration
From: Wei Yang @ 2019-08-05 0:33 UTC
To: Ivan Ren; +Cc: qemu-devel, richardw.yang, dgilbert, quintela
On Fri, Aug 02, 2019 at 06:18:41PM +0800, Ivan Ren wrote:
>v2->v3:
>- fix the bug of update_iteration_initial_status function prototype
>
Code looks good. Have you verified on this version?
BTW, you didn't address the multifd count in this patch, right?
--
Wei Yang
Help you, Help me
* Re: [Qemu-devel] [PATCH v3] migration: always initial ram_counters for a new migration
From: Ivan Ren @ 2019-08-05 1:16 UTC
To: Wei Yang; +Cc: qemu-devel, dgilbert, quintela
On Mon, Aug 5, 2019 at 8:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
> Code looks good. Have you verified on this version?
Yes
> BTW, you didn't address the multifd count in this patch, right?
Yes.
Currently the multifd page count does no harm, so I think it's better to
handle it in a separate patch to keep things clear.
Thanks.
* Re: [Qemu-devel] [PATCH v3] migration: always initial ram_counters for a new migration
From: Wei Yang @ 2019-08-05 1:26 UTC
To: Ivan Ren; +Cc: qemu-devel, Wei Yang, dgilbert, quintela
On Mon, Aug 05, 2019 at 09:16:24AM +0800, Ivan Ren wrote:
>On Mon, Aug 5, 2019 at 8:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>> Code looks good. Have you verified on this version?
>
>Yes
>
>> BTW, you didn't address the multifd count in this patch, right?
>
>Yes.
>Currently multifd page count has no harm, so I think it's better to
>optimize it in a new patch to make things clearer.
Fine.
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
>
>Thanks.
--
Wei Yang
Help you, Help me
* Re: [Qemu-devel] [PATCH v3] migration: always initial ram_counters for a new migration
From: Dr. David Alan Gilbert @ 2019-08-07 18:48 UTC
To: Ivan Ren; +Cc: qemu-devel, richardw.yang, quintela
* Ivan Ren (renyime@gmail.com) wrote:
> Signed-off-by: Ivan Ren <ivanren@tencent.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> Suggested-by: Wei Yang <richardw.yang@linux.intel.com>
Thank you,
Queued
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK