* [PATCH 0/2] migration: failover: continue to wait card unplug on error
@ 2021-06-29 15:50 Laurent Vivier
  2021-06-29 15:50 ` [PATCH 1/2] migration: move wait-unplug loop to its own function Laurent Vivier
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Laurent Vivier @ 2021-06-29 15:50 UTC
  To: qemu-devel; +Cc: Jens Freimann, Dr. David Alan Gilbert, Juan Quintela

We need to wait for the end of the unplug phase before we can
plug the card back in.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852

Laurent Vivier (2):
  migration: move wait-unplug loop to its own function
  migration: failover: continue to wait card unplug on error

 migration/migration.c | 65 ++++++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 28 deletions(-)

-- 
2.31.1





* [PATCH 1/2] migration: move wait-unplug loop to its own function
  2021-06-29 15:50 [PATCH 0/2] migration: failover: continue to wait card unplug on error Laurent Vivier
@ 2021-06-29 15:50 ` Laurent Vivier
  2021-06-29 17:30   ` Dr. David Alan Gilbert
  2021-06-29 17:47   ` Juan Quintela
  2021-06-29 15:50 ` [PATCH 2/2] migration: failover: continue to wait card unplug on error Laurent Vivier
  2021-06-30 17:54 ` [PATCH 0/2] " Dr. David Alan Gilbert
  2 siblings, 2 replies; 11+ messages in thread
From: Laurent Vivier @ 2021-06-29 15:50 UTC
  To: qemu-devel; +Cc: Jens Freimann, Dr. David Alan Gilbert, Juan Quintela

The loop is used in migration_thread() and bg_migration_thread(),
so we can move it to its own function and call it from both places.

Moreover, in migration_thread() we have a wrong state transition from
SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly
managed in bg_migration_thread() so use this code instead.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 migration/migration.c | 54 +++++++++++++++++++++----------------------
 1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4228635d1880..3e92c405a2b6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3664,6 +3664,28 @@ bool migration_rate_limit(void)
     return urgent;
 }
 
+/*
+ * if failover devices are present, wait they are completely
+ * unplugged
+ */
+
+static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
+                                    int new_state)
+{
+    if (qemu_savevm_state_guest_unplug_pending()) {
+        migrate_set_state(&s->state, old_state, MIGRATION_STATUS_WAIT_UNPLUG);
+
+        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
+               qemu_savevm_state_guest_unplug_pending()) {
+            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
+        }
+
+        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
+    } else {
+        migrate_set_state(&s->state, old_state, new_state);
+    }
+}
+
 /*
  * Master migration thread on the source VM.
  * It drives the migration and pumps the data down the outgoing channel.
@@ -3710,22 +3732,10 @@ static void *migration_thread(void *opaque)
 
     qemu_savevm_state_setup(s->to_dst_file);
 
-    if (qemu_savevm_state_guest_unplug_pending()) {
-        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                          MIGRATION_STATUS_WAIT_UNPLUG);
-
-        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
-               qemu_savevm_state_guest_unplug_pending()) {
-            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
-        }
-
-        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
-                MIGRATION_STATUS_ACTIVE);
-    }
+    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
+                               MIGRATION_STATUS_ACTIVE);
 
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
-    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                      MIGRATION_STATUS_ACTIVE);
 
     trace_migration_thread_setup_complete();
 
@@ -3833,21 +3843,9 @@ static void *bg_migration_thread(void *opaque)
     qemu_savevm_state_header(s->to_dst_file);
     qemu_savevm_state_setup(s->to_dst_file);
 
-    if (qemu_savevm_state_guest_unplug_pending()) {
-        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                          MIGRATION_STATUS_WAIT_UNPLUG);
-
-        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
-               qemu_savevm_state_guest_unplug_pending()) {
-            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
-        }
+    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
+                               MIGRATION_STATUS_ACTIVE);
 
-        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
-                          MIGRATION_STATUS_ACTIVE);
-    } else {
-        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                MIGRATION_STATUS_ACTIVE);
-    }
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
 
     trace_migration_thread_setup_complete();
-- 
2.31.1




* [PATCH 2/2] migration: failover: continue to wait card unplug on error
  2021-06-29 15:50 [PATCH 0/2] migration: failover: continue to wait card unplug on error Laurent Vivier
  2021-06-29 15:50 ` [PATCH 1/2] migration: move wait-unplug loop to its own function Laurent Vivier
@ 2021-06-29 15:50 ` Laurent Vivier
  2021-06-29 17:50   ` Juan Quintela
  2021-06-30 17:48   ` Dr. David Alan Gilbert
  2021-06-30 17:54 ` [PATCH 0/2] " Dr. David Alan Gilbert
  2 siblings, 2 replies; 11+ messages in thread
From: Laurent Vivier @ 2021-06-29 15:50 UTC
  To: qemu-devel; +Cc: Jens Freimann, Dr. David Alan Gilbert, Juan Quintela

If the user cancels the migration in the unplug-wait state,
QEMU will try to plug back the card and this fails because the card
is partially unplugged.
To avoid the problem, keep waiting for the card to be unplugged, but use
a timeout so that the migration can still be canceled if the card never
finishes unplugging.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 migration/migration.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 3e92c405a2b6..3b06d43a7f42 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
                qemu_savevm_state_guest_unplug_pending()) {
             qemu_sem_timedwait(&s->wait_unplug_sem, 250);
         }
+        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
+            int timeout = 120; /* 30 seconds */
+            /*
+             * migration has been canceled
+             * but as we have started an unplug we must wait the end
+             * to be able to plug back the card
+             */
+            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
+                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
+            }
+        }
 
         migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
     } else {
-- 
2.31.1




* Re: [PATCH 1/2] migration: move wait-unplug loop to its own function
  2021-06-29 15:50 ` [PATCH 1/2] migration: move wait-unplug loop to its own function Laurent Vivier
@ 2021-06-29 17:30   ` Dr. David Alan Gilbert
  2021-06-29 17:47   ` Juan Quintela
  1 sibling, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-06-29 17:30 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Juan Quintela

* Laurent Vivier (lvivier@redhat.com) wrote:
> The loop is used in migration_thread() and bg_migration_thread(),
> so we can move it to its own function and call it from both places.
> 
> Moreover, in migration_thread() we have a wrong state transition from
> SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly
> managed in bg_migration_thread() so use this code instead.
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 54 +++++++++++++++++++++----------------------
>  1 file changed, 26 insertions(+), 28 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 4228635d1880..3e92c405a2b6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3664,6 +3664,28 @@ bool migration_rate_limit(void)
>      return urgent;
>  }
>  
> +/*
> + * if failover devices are present, wait they are completely
> + * unplugged
> + */
> +
> +static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
> +                                    int new_state)
> +{
> +    if (qemu_savevm_state_guest_unplug_pending()) {
> +        migrate_set_state(&s->state, old_state, MIGRATION_STATUS_WAIT_UNPLUG);
> +
> +        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> +               qemu_savevm_state_guest_unplug_pending()) {
> +            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> +        }
> +
> +        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
> +    } else {
> +        migrate_set_state(&s->state, old_state, new_state);
> +    }
> +}
> +
>  /*
>   * Master migration thread on the source VM.
>   * It drives the migration and pumps the data down the outgoing channel.
> @@ -3710,22 +3732,10 @@ static void *migration_thread(void *opaque)
>  
>      qemu_savevm_state_setup(s->to_dst_file);
>  
> -    if (qemu_savevm_state_guest_unplug_pending()) {
> -        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                          MIGRATION_STATUS_WAIT_UNPLUG);
> -
> -        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> -               qemu_savevm_state_guest_unplug_pending()) {
> -            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> -        }
> -
> -        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
> -                MIGRATION_STATUS_ACTIVE);
> -    }
> +    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> +                               MIGRATION_STATUS_ACTIVE);
>  
>      s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
> -    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                      MIGRATION_STATUS_ACTIVE);
>  
>      trace_migration_thread_setup_complete();
>  
> @@ -3833,21 +3843,9 @@ static void *bg_migration_thread(void *opaque)
>      qemu_savevm_state_header(s->to_dst_file);
>      qemu_savevm_state_setup(s->to_dst_file);
>  
> -    if (qemu_savevm_state_guest_unplug_pending()) {
> -        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                          MIGRATION_STATUS_WAIT_UNPLUG);
> -
> -        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> -               qemu_savevm_state_guest_unplug_pending()) {
> -            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> -        }
> +    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> +                               MIGRATION_STATUS_ACTIVE);
>  
> -        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
> -                          MIGRATION_STATUS_ACTIVE);
> -    } else {
> -        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                MIGRATION_STATUS_ACTIVE);
> -    }
>      s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>  
>      trace_migration_thread_setup_complete();
> -- 
> 2.31.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 1/2] migration: move wait-unplug loop to its own function
  2021-06-29 15:50 ` [PATCH 1/2] migration: move wait-unplug loop to its own function Laurent Vivier
  2021-06-29 17:30   ` Dr. David Alan Gilbert
@ 2021-06-29 17:47   ` Juan Quintela
  2021-06-29 17:48     ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 11+ messages in thread
From: Juan Quintela @ 2021-06-29 17:47 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Dr. David Alan Gilbert

Laurent Vivier <lvivier@redhat.com> wrote:
> The loop is used in migration_thread() and bg_migration_thread(),
> so we can move it to its own function and call it from both places.
>
> Moreover, in migration_thread() we have a wrong state transition from
> SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly
> managed in bg_migration_thread() so use this code instead.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

If you have to repost:


> +/*
> + * if failover devices are present, wait they are completely
> + * unplugged
> + */
> +
> +static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
> +                                    int new_state)

old_state and new_state are always the same: SETUP -> ACTIVE.  I think
we can hardcode them.


> +{
> +    if (qemu_savevm_state_guest_unplug_pending()) {
> +        migrate_set_state(&s->state, old_state, MIGRATION_STATUS_WAIT_UNPLUG);
> +
> +        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> +               qemu_savevm_state_guest_unplug_pending()) {
> +            qemu_sem_timedwait(&s->wait_unplug_sem, 250);

I still don't understand why we are using a semaphore when we just want
a timer :-(

Yes, this is independent of this patch.




* Re: [PATCH 1/2] migration: move wait-unplug loop to its own function
  2021-06-29 17:47   ` Juan Quintela
@ 2021-06-29 17:48     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-06-29 17:48 UTC
  To: Juan Quintela; +Cc: Laurent Vivier, Jens Freimann, qemu-devel

* Juan Quintela (quintela@redhat.com) wrote:
> Laurent Vivier <lvivier@redhat.com> wrote:
> > The loop is used in migration_thread() and bg_migration_thread(),
> > so we can move it to its own function and call it from both places.
> >
> > Moreover, in migration_thread() we have a wrong state transition from
> > SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly
> > managed in bg_migration_thread() so use this code instead.
> >
> > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> 
> If you have to repost:
> 
> 
> > +/*
> > + * if failover devices are present, wait they are completely
> > + * unplugged
> > + */
> > +
> > +static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
> > +                                    int new_state)
> 
> old_state and new_state are always the same: SETUP -> ACTIVE.  I think
> we can hardcode them.
> 
> 
> > +{
> > +    if (qemu_savevm_state_guest_unplug_pending()) {
> > +        migrate_set_state(&s->state, old_state, MIGRATION_STATUS_WAIT_UNPLUG);
> > +
> > +        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> > +               qemu_savevm_state_guest_unplug_pending()) {
> > +            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> 
> I still don't understand why we are using a semaphore when we just want
> a timer :-(
> 
> Yes, this is independent of this patch.

So yes, I was going to ask about this on the 2nd patch; no one anywhere
seems to set that semaphore?

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 2/2] migration: failover: continue to wait card unplug on error
  2021-06-29 15:50 ` [PATCH 2/2] migration: failover: continue to wait card unplug on error Laurent Vivier
@ 2021-06-29 17:50   ` Juan Quintela
  2021-06-30  9:04     ` Laurent Vivier
  2021-06-30 17:48   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 11+ messages in thread
From: Juan Quintela @ 2021-06-29 17:50 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Dr. David Alan Gilbert

Laurent Vivier <lvivier@redhat.com> wrote:
> If the user cancels the migration in the unplug-wait state,
> QEMU will try to plug back the card and this fails because the card
> is partially unplugged.
> To avoid the problem, keep waiting for the card to be unplugged, but use
> a timeout so that the migration can still be canceled if the card never
> finishes unplugging.
>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>  migration/migration.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 3e92c405a2b6..3b06d43a7f42 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>                 qemu_savevm_state_guest_unplug_pending()) {
>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>          }
> +        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
> +            int timeout = 120; /* 30 seconds */
> +            /*
> +             * migration has been canceled
> +             * but as we have started an unplug we must wait the end
> +             * to be able to plug back the card
> +             */
> +            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
> +                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> +            }
> +        }
>  
>          migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
>      } else {

I agree with the idea.  But if we are getting out due to timeout == 0,
shouldn't we return some error, warning, whatever?

Later, Juan.




* Re: [PATCH 2/2] migration: failover: continue to wait card unplug on error
  2021-06-29 17:50   ` Juan Quintela
@ 2021-06-30  9:04     ` Laurent Vivier
  2021-06-30  9:13       ` Juan Quintela
  0 siblings, 1 reply; 11+ messages in thread
From: Laurent Vivier @ 2021-06-30  9:04 UTC
  To: quintela; +Cc: Jens Freimann, qemu-devel, Dr. David Alan Gilbert

On 29/06/2021 19:50, Juan Quintela wrote:
> Laurent Vivier <lvivier@redhat.com> wrote:
>> If the user cancels the migration in the unplug-wait state,
>> QEMU will try to plug back the card and this fails because the card
>> is partially unplugged.
>> To avoid the problem, keep waiting for the card to be unplugged, but use
>> a timeout so that the migration can still be canceled if the card never
>> finishes unplugging.
>>
>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>  migration/migration.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 3e92c405a2b6..3b06d43a7f42 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>>                 qemu_savevm_state_guest_unplug_pending()) {
>>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>>          }
>> +        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
>> +            int timeout = 120; /* 30 seconds */
>> +            /*
>> +             * migration has been canceled
>> +             * but as we have started an unplug we must wait the end
>> +             * to be able to plug back the card
>> +             */
>> +            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
>> +                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>> +            }
>> +        }
>>  
>>          migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
>>      } else {
> I agree with the idea.  But if we are getting out due to timeout == 0,
> shouldn't we return some error, warning, whatever?

In that case, we keep the current behaviour: the guest kernel will report an error when it
tries to plug back the card that has not been unplugged. This is a corner case: if it
happens, something is really wrong with the machine. Perhaps we could remove the
timeout, but I don't like blocking the user; or we could increase it to be sure.

Thanks,

Laurent





* Re: [PATCH 2/2] migration: failover: continue to wait card unplug on error
  2021-06-30  9:04     ` Laurent Vivier
@ 2021-06-30  9:13       ` Juan Quintela
  0 siblings, 0 replies; 11+ messages in thread
From: Juan Quintela @ 2021-06-30  9:13 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Dr. David Alan Gilbert

Laurent Vivier <lvivier@redhat.com> wrote:
> On 29/06/2021 19:50, Juan Quintela wrote:
>> Laurent Vivier <lvivier@redhat.com> wrote:
>>> If the user cancels the migration in the unplug-wait state,
>>> QEMU will try to plug back the card and this fails because the card
>>> is partially unplugged.
>>> To avoid the problem, keep waiting for the card to be unplugged, but use
>>> a timeout so that the migration can still be canceled if the card never
>>> finishes unplugging.
>>>
>>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>> ---
>>>  migration/migration.c | 11 +++++++++++
>>>  1 file changed, 11 insertions(+)
>>>
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 3e92c405a2b6..3b06d43a7f42 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>>>                 qemu_savevm_state_guest_unplug_pending()) {
>>>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>>>          }
>>> +        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
>>> +            int timeout = 120; /* 30 seconds */
>>> +            /*
>>> +             * migration has been canceled
>>> +             * but as we have started an unplug we must wait the end
>>> +             * to be able to plug back the card
>>> +             */
>>> +            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
>>> +                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>>> +            }
>>> +        }
>>>  
>>>          migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
>>>      } else {
>> I agree with the idea.  But if we are getting out due to timeout == 0,
>> shouldn't we return some error, warning, whatever?
>
> In that case, we keep the current behaviour: the guest kernel will
> report an error when it tries to plug back the card that has not been
> unplugged. This is a corner case: if it happens, something is really
> wrong with the machine. Perhaps we could remove the timeout, but I
> don't like blocking the user; or we could increase it to be sure.

Oh, I wholly agree that it is a corner case, and that it shouldn't
happen.

But if it happens, we don't log it anywhere.  That was my complaint.

Later, Juan.




* Re: [PATCH 2/2] migration: failover: continue to wait card unplug on error
  2021-06-29 15:50 ` [PATCH 2/2] migration: failover: continue to wait card unplug on error Laurent Vivier
  2021-06-29 17:50   ` Juan Quintela
@ 2021-06-30 17:48   ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-06-30 17:48 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Juan Quintela

* Laurent Vivier (lvivier@redhat.com) wrote:
> If the user cancels the migration in the unplug-wait state,
> QEMU will try to plug back the card and this fails because the card
> is partially unplugged.
> To avoid the problem, keep waiting for the card to be unplugged, but use
> a timeout so that the migration can still be canceled if the card never
> finishes unplugging.
> 
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

I'll take this for now, but as Juan says, we could really do with some
diags when this happens, so when someone comes and tells us that
the hotplug has failed we can see.  Please send something to add it.
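
One possible shape for such a diagnostic, as a sketch rather than the
eventual QEMU change: `wait_unplug_bounded()`, `struct fake_guest` and its
helpers are hypothetical, and `fprintf(stderr, ...)` stands in for whatever
reporting helper the real code would use. The sketch re-checks the pending
condition after the loop instead of testing the counter, since the counter
alone cannot distinguish an unplug that completed during the last poll from
one that never completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a guest that finishes the unplug after done_at polls. */
struct fake_guest {
    int polls;    /* number of 250 ms waits performed so far */
    int done_at;  /* poll count at which the unplug completes */
};

/* Stand-in for qemu_savevm_state_guest_unplug_pending(). */
static bool fake_unplug_pending(const struct fake_guest *g)
{
    return g->polls < g->done_at;
}

/* Stand-in for qemu_sem_timedwait(&s->wait_unplug_sem, 250). */
static void fake_wait_250ms(struct fake_guest *g)
{
    g->polls++;
}

/*
 * Bounded wait used after a cancel. Returns true if the unplug completed,
 * false (after printing a warning) if it was still pending when the poll
 * budget ran out.
 */
static bool wait_unplug_bounded(struct fake_guest *g, int max_polls)
{
    int timeout = max_polls;

    assert(g != NULL);
    while (timeout-- && fake_unplug_pending(g)) {
        fake_wait_250ms(g);
    }

    /* Re-check the condition itself; the counter is ambiguous on the last poll. */
    if (fake_unplug_pending(g)) {
        fprintf(stderr, "warning: unplug still pending after %d ms\n",
                max_polls * 250);
        return false;
    }
    return true;
}
```

With a budget of 120 polls, a guest that completes the unplug exactly on
the 120th poll is reported as a success, while one that needs longer
produces the warning and a false return value that a caller could act on.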


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 3e92c405a2b6..3b06d43a7f42 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>                 qemu_savevm_state_guest_unplug_pending()) {
>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>          }
> +        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
> +            int timeout = 120; /* 30 seconds */
> +            /*
> +             * migration has been canceled
> +             * but as we have started an unplug we must wait the end
> +             * to be able to plug back the card
> +             */
> +            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
> +                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> +            }
> +        }
>  
>          migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
>      } else {
> -- 
> 2.31.1
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 0/2] migration: failover: continue to wait card unplug on error
  2021-06-29 15:50 [PATCH 0/2] migration: failover: continue to wait card unplug on error Laurent Vivier
  2021-06-29 15:50 ` [PATCH 1/2] migration: move wait-unplug loop to its own function Laurent Vivier
  2021-06-29 15:50 ` [PATCH 2/2] migration: failover: continue to wait card unplug on error Laurent Vivier
@ 2021-06-30 17:54 ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-06-30 17:54 UTC
  To: Laurent Vivier; +Cc: Jens Freimann, qemu-devel, Juan Quintela

* Laurent Vivier (lvivier@redhat.com) wrote:
> We need to wait for the end of the unplug phase before we can
> plug the card back in.
> 
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852

Queued

> 
> Laurent Vivier (2):
>   migration: move wait-unplug loop to its own function
>   migration: failover: continue to wait card unplug on error
> 
>  migration/migration.c | 65 ++++++++++++++++++++++++-------------------
>  1 file changed, 37 insertions(+), 28 deletions(-)
> 
> -- 
> 2.31.1
> 
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


