* [PATCH] tests/migration: Allow longer timeouts
@ 2020-10-08 16:03 Dr. David Alan Gilbert (git)
  2020-10-12 13:13 ` Thomas Huth
  0 siblings, 1 reply; 3+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-10-08 16:03 UTC
  To: qemu-devel, thuth, lvivier, alex.bennee; +Cc: quintela

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

In travis, with gcov and gprof we're seeing timeouts; hopefully fix
this by increasing the test timeouts a bit, but for xbzrle ensure it
really does get a couple of cycles through to test the cache.

I think the problem in travis is we have about 2 host CPU threads,
in the test we have at least 3:
   a) The vCPU thread (100% flat out)
   b) The source migration thread
   c) The destination migration thread

if (b) & (c) are slow for any reason - gcov+gprof or a slow host -
then they're sharing one host CPU thread so limit the migration
bandwidth.

Tested on my laptop with:
   taskset -c 0,1 ./tests/qtest/migration-test -p /x86_64/migration

Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 tests/qtest/migration-test.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 00a233cd8c..481db4e929 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -44,6 +44,9 @@ static bool uffd_feature_thread_id;
 #include <sys/ioctl.h>
 #include <linux/userfaultfd.h>
 
+/* A downtime where the test really should converge */
+#define CONVERGE_DOWNTIME 1000
+
 static bool ufd_version_check(void)
 {
     struct uffdio_api api_struct;
@@ -864,8 +867,7 @@ static void test_precopy_unix(void)
 
     wait_for_migration_pass(from);
 
-    /* 300 ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -946,10 +948,12 @@ static void test_xbzrle(const char *uri)
 
     migrate_qmp(from, uri, "{}");
 
+    wait_for_migration_pass(from);
+    /* Make sure we have 2 passes, so the xbzrle cache gets a workout */
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    /* 1000ms should converge */
+    migrate_set_parameter_int(from, "downtime-limit", 1000);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -999,8 +1003,7 @@ static void test_precopy_tcp(void)
 
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -1068,8 +1071,7 @@ static void test_migrate_fd_proto(void)
 
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -1304,8 +1306,7 @@ static void test_multifd_tcp(const char *method)
 
     wait_for_migration_pass(from);
 
-    /* 300ms it should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
-- 
2.28.0
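
The link between a slow host and the larger downtime limit can be sketched with a simplified model. It assumes precopy is allowed to finish once the remaining dirty data can be sent within "downtime-limit" at the currently measured bandwidth. The helper name and all numbers below are hypothetical and are not taken from QEMU or from the test.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU code: returns true when the remaining
 * dirty data could be transferred within the downtime budget at the
 * given bandwidth. Units are bytes and milliseconds.
 */
static bool fits_downtime(uint64_t dirty_bytes,
                          uint64_t bandwidth_bytes_per_ms,
                          uint64_t downtime_limit_ms)
{
    /* Largest amount of data that can be sent during the allowed pause */
    uint64_t budget_bytes = bandwidth_bytes_per_ms * downtime_limit_ms;

    return dirty_bytes <= budget_bytes;
}
```

With an assumed 50 MB of dirty data and 100 KB/ms of bandwidth, a 300 ms limit gives a 30 MB budget and the check fails, while a 1000 ms limit gives a 100 MB budget and the check passes. When the migration threads share one host CPU thread the bandwidth drops, so a larger limit is one way to let the test converge.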




* Re: [PATCH] tests/migration: Allow longer timeouts
  2020-10-08 16:03 [PATCH] tests/migration: Allow longer timeouts Dr. David Alan Gilbert (git)
@ 2020-10-12 13:13 ` Thomas Huth
  2020-10-13  6:06   ` Thomas Huth
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Huth @ 2020-10-12 13:13 UTC
  To: Dr. David Alan Gilbert (git), qemu-devel, lvivier, alex.bennee; +Cc: quintela

On 08/10/2020 18.03, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> In travis, with gcov and gprof we're seeing timeouts; hopefully fix
> this by increasing the test timeouts a bit, but for xbzrle ensure it
> really does get a couple of cycles through to test the cache.
> 
> I think the problem in travis is we have about 2 host CPU threads,
> in the test we have at least 3:
>    a) The vCPU thread (100% flat out)
>    b) The source migration thread
>    c) The destination migration thread
> 
> if (b) & (c) are slow for any reason - gcov+gprof or a slow host -
> then they're sharing one host CPU thread so limit the migration
> bandwidth.
> 
> Tested on my laptop with:
>    taskset -c 0,1 ./tests/qtest/migration-test -p /x86_64/migration
> 
> Reported-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  tests/qtest/migration-test.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)

This seems to fix the gcov/gprof test indeed:

 https://travis-ci.com/github/huth/qemu/jobs/398270396

Thus:

Tested-by: Thomas Huth <thuth@redhat.com>

I'm also queuing this to my qtest-next branch (in case you don't plan a
migration pull request within the next days):

 https://gitlab.com/huth/qemu/-/commits/qtest-next/

 Thomas




* Re: [PATCH] tests/migration: Allow longer timeouts
  2020-10-12 13:13 ` Thomas Huth
@ 2020-10-13  6:06   ` Thomas Huth
  0 siblings, 0 replies; 3+ messages in thread
From: Thomas Huth @ 2020-10-13  6:06 UTC
  To: Dr. David Alan Gilbert (git), qemu-devel, lvivier, alex.bennee; +Cc: quintela

On 12/10/2020 15.13, Thomas Huth wrote:
> On 08/10/2020 18.03, Dr. David Alan Gilbert (git) wrote:
>> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>
>> In travis, with gcov and gprof we're seeing timeouts; hopefully fix
>> this by increasing the test timeouts a bit, but for xbzrle ensure it
>> really does get a couple of cycles through to test the cache.
>>
>> I think the problem in travis is we have about 2 host CPU threads,
>> in the test we have at least 3:
>>    a) The vCPU thread (100% flat out)
>>    b) The source migration thread
>>    c) The destination migration thread
>>
>> if (b) & (c) are slow for any reason - gcov+gprof or a slow host -
>> then they're sharing one host CPU thread so limit the migration
>> bandwidth.
>>
>> Tested on my laptop with:
>>    taskset -c 0,1 ./tests/qtest/migration-test -p /x86_64/migration
>>
>> Reported-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>  tests/qtest/migration-test.c | 21 +++++++++++----------
>>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> This seems to fix the gcov/gprof test indeed:
> 
>  https://travis-ci.com/github/huth/qemu/jobs/398270396
> 
> Thus:
> 
> Tested-by: Thomas Huth <thuth@redhat.com>
> 
> I'm also queuing this to my qtest-next branch (in case you don't plan a
> migration pull request within the next days):
> 
>  https://gitlab.com/huth/qemu/-/commits/qtest-next/

FYI, this patch fails to build on non-Linux systems:

https://cirrus-ci.com/task/5951706225704960?command=main#L6076

The #define needs to be moved out of the #if defined(__linux__) block. I can
fix up the patch locally, but if you would rather include it in your next
migration pull request, you should apply the same fix there.

 Cheers,
  Thomas
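
The build failure reported above can be illustrated with a minimal sketch of the corrected layout. It assumes the constant is used by test code that is compiled on every host, so its definition must sit outside the Linux-only guard. The function name below is a hypothetical stand-in, and the guarded section is reduced to a comment.

```c
/*
 * Sketch only: the constant is defined before the platform guard, so
 * it is visible on Linux and non-Linux hosts alike.
 */
#define CONVERGE_DOWNTIME 1000

#if defined(__linux__)
/* Linux-only includes and helpers (e.g. the userfaultfd check) go here. */
#endif

/* Hypothetical stand-in for test code that is built on all hosts */
static int converge_downtime_ms(void)
{
    return CONVERGE_DOWNTIME;
}
```

If the #define were placed inside the guarded section instead, any host where __linux__ is not defined would hit an undeclared identifier at the first use of CONVERGE_DOWNTIME.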



