qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] apic: Fix migration breakage of >255 vcpus
@ 2019-10-15  7:54 Peter Xu
  2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
  2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Xu @ 2019-10-15  7:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Dr . David Alan Gilbert, peterx,
	Igor Mammedov, Paolo Bonzini

I'm not very certain, but... it seems to be broken starting from when
x2apic was introduced in QEMU, until now.

Please review, thanks.

Peter Xu (2):
  migration: Boost SaveStateEntry.instance_id to 64 bits
  apic: Use 32bit APIC ID for migration instance ID

 hw/intc/apic_common.c        |  2 +-
 include/migration/register.h |  2 +-
 include/migration/vmstate.h  |  4 ++--
 migration/savevm.c           | 10 +++++-----
 stubs/vmstate.c              |  2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  7:54 [PATCH 0/2] apic: Fix migration breakage of >255 vcpus Peter Xu
@ 2019-10-15  7:54 ` Peter Xu
  2019-10-15  8:34   ` Juan Quintela
  2019-10-15  8:45   ` Juan Quintela
  2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Xu @ 2019-10-15  7:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Dr . David Alan Gilbert, peterx,
	Igor Mammedov, Paolo Bonzini

It was "int" and used as 32bits fields (see save_section_header()).
It's unsafe already because sizeof(int) could be 2 on i386, I think.
So at least uint32_t would suite more.  While it also uses "-1" as a
placeholder of "we want to generate the instance ID automatically".
Hence a more proper value should be int64_t.

This will start to be useful after next patch in which we can start to
convert a real uint32_t value as instance ID.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/migration/register.h |  2 +-
 include/migration/vmstate.h  |  4 ++--
 migration/savevm.c           | 10 +++++-----
 stubs/vmstate.c              |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index a13359a08d..54f42c7413 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -69,7 +69,7 @@ typedef struct SaveVMHandlers {
 } SaveVMHandlers;
 
 int register_savevm_live(const char *idstr,
-                         int instance_id,
+                         int64_t instance_id,
                          int version_id,
                          const SaveVMHandlers *ops,
                          void *opaque);
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1fbfd099dd..6a7498463c 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -1114,14 +1114,14 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
 bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
 
 /* Returns: 0 on success, -1 on failure */
-int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
+int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
                                    const VMStateDescription *vmsd,
                                    void *base, int alias_id,
                                    int required_for_version,
                                    Error **errp);
 
 /* Returns: 0 on success, -1 on failure */
-static inline int vmstate_register(DeviceState *dev, int instance_id,
+static inline int vmstate_register(DeviceState *dev, int64_t instance_id,
                                    const VMStateDescription *vmsd,
                                    void *opaque)
 {
diff --git a/migration/savevm.c b/migration/savevm.c
index bb9462a54d..dc9281c897 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -233,7 +233,7 @@ typedef struct CompatEntry {
 typedef struct SaveStateEntry {
     QTAILQ_ENTRY(SaveStateEntry) entry;
     char idstr[256];
-    int instance_id;
+    int64_t instance_id;
     int alias_id;
     int version_id;
     /* version id read from the stream */
@@ -668,7 +668,7 @@ void dump_vmstate_json_to_file(FILE *out_file)
 static int calculate_new_instance_id(const char *idstr)
 {
     SaveStateEntry *se;
-    int instance_id = 0;
+    int64_t instance_id = 0;
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
         if (strcmp(idstr, se->idstr) == 0
@@ -730,7 +730,7 @@ static void savevm_state_handler_insert(SaveStateEntry *nse)
    Meanwhile pass -1 as instance_id if you do not already have a clearly
    distinguishing id for all instances of your device class. */
 int register_savevm_live(const char *idstr,
-                         int instance_id,
+                         int64_t instance_id,
                          int version_id,
                          const SaveVMHandlers *ops,
                          void *opaque)
@@ -784,7 +784,7 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque)
     }
 }
 
-int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
+int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
                                    const VMStateDescription *vmsd,
                                    void *opaque, int alias_id,
                                    int required_for_version,
@@ -1566,7 +1566,7 @@ int qemu_save_device_state(QEMUFile *f)
     return qemu_file_get_error(f);
 }
 
-static SaveStateEntry *find_se(const char *idstr, int instance_id)
+static SaveStateEntry *find_se(const char *idstr, int64_t instance_id)
 {
     SaveStateEntry *se;
 
diff --git a/stubs/vmstate.c b/stubs/vmstate.c
index e1e89b87f0..699003f3b0 100644
--- a/stubs/vmstate.c
+++ b/stubs/vmstate.c
@@ -4,7 +4,7 @@
 const VMStateDescription vmstate_dummy = {};
 
 int vmstate_register_with_alias_id(DeviceState *dev,
-                                   int instance_id,
+                                   int64_t instance_id,
                                    const VMStateDescription *vmsd,
                                    void *base, int alias_id,
                                    int required_for_version,
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15  7:54 [PATCH 0/2] apic: Fix migration breakage of >255 vcpus Peter Xu
  2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
@ 2019-10-15  7:54 ` Peter Xu
  2019-10-15  8:30   ` Juan Quintela
  2019-10-15  9:22   ` Dr. David Alan Gilbert
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Xu @ 2019-10-15  7:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Dr . David Alan Gilbert, peterx,
	Igor Mammedov, Paolo Bonzini

Migration is silently broken now with x2apic config like this:

     -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
     -device intel-iommu,intremap=on,eim=on

After migration, the guest kernel could hang at anything, due to
x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
any operations related to x2apic could be broken then (e.g., RDMSR on
x2apic MSRs could fail because KVM would think that the vcpu hasn't
enabled x2apic at all).

The issue is that the x2apic bit was never applied correctly for vcpus
whose ID > 255 when migrate completes, and that's because when we
migrate APIC we use the APICCommonState.id as instance ID of the
migration stream, while that's too short for x2apic.

Let's use the newly introduced initial_apic_id for that.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/intc/apic_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index aafd8e0e33..6024a3e06a 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -315,7 +315,7 @@ static void apic_common_realize(DeviceState *dev, Error **errp)
     APICCommonState *s = APIC_COMMON(dev);
     APICCommonClass *info;
     static DeviceState *vapic;
-    int instance_id = s->id;
+    int64_t instance_id = s->initial_apic_id;
 
     info = APIC_COMMON_GET_CLASS(s);
     info->realize(dev, errp);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
@ 2019-10-15  8:30   ` Juan Quintela
  2019-10-15  9:22   ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 14+ messages in thread
From: Juan Quintela @ 2019-10-15  8:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Dr . David Alan Gilbert, qemu-devel,
	Eduardo Habkost, Igor Mammedov

Peter Xu <peterx@redhat.com> wrote:
> Migration is silently broken now with x2apic config like this:
>
>      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
>      -device intel-iommu,intremap=on,eim=on
>
> After migration, the guest kernel could hang at anything, due to
> x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> any operations related to x2apic could be broken then (e.g., RDMSR on
> x2apic MSRs could fail because KVM would think that the vcpu hasn't
> enabled x2apic at all).
>
> The issue is that the x2apic bit was never applied correctly for vcpus
> whose ID > 255 when migrate completes, and that's because when we
> migrate APIC we use the APICCommonState.id as instance ID of the
> migration stream, while that's too short for x2apic.
>
> Let's use the newly introduced initial_apic_id for that.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/intc/apic_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index aafd8e0e33..6024a3e06a 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -315,7 +315,7 @@ static void apic_common_realize(DeviceState *dev, Error **errp)
>      APICCommonState *s = APIC_COMMON(dev);
>      APICCommonClass *info;
>      static DeviceState *vapic;
> -    int instance_id = s->id;
> +    int64_t instance_id = s->initial_apic_id;

int is ok here.

But damn thing, initial_apic_id is uint32_t.  Sniff.

Later, Juan.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
@ 2019-10-15  8:34   ` Juan Quintela
  2019-10-15 10:28     ` Peter Xu
  2019-10-15  8:45   ` Juan Quintela
  1 sibling, 1 reply; 14+ messages in thread
From: Juan Quintela @ 2019-10-15  8:34 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Dr . David Alan Gilbert, qemu-devel,
	Eduardo Habkost, Igor Mammedov

Peter Xu <peterx@redhat.com> wrote:
> It was "int" and used as 32bits fields (see save_section_header()).
> It's unsafe already because sizeof(int) could be 2 on i386,

i386 is 32bits, so int is 32bits O:-)
I really hope that we would never, ever, need a 64bits instance id.
It would mean that we have more than 2.000.000.000 objects of the same
type, no?

I am pretty sure than in 16bits platforms we have other problems than
insntance_id (namely that we don't have enough memory).

>I think.
> So at least uint32_t would suite more.  While it also uses "-1" as a
> placeholder of "we want to generate the instance ID automatically".
> Hence a more proper value should be int64_t.
>
> This will start to be useful after next patch in which we can start to
> convert a real uint32_t value as instance ID.

Later, Juan.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
  2019-10-15  8:34   ` Juan Quintela
@ 2019-10-15  8:45   ` Juan Quintela
  2019-10-15  8:57     ` Dr. David Alan Gilbert
  2019-10-15 10:23     ` Peter Xu
  1 sibling, 2 replies; 14+ messages in thread
From: Juan Quintela @ 2019-10-15  8:45 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Dr . David Alan Gilbert, qemu-devel,
	Eduardo Habkost, Igor Mammedov

Peter Xu <peterx@redhat.com> wrote:
> It was "int" and used as 32bits fields (see save_section_header()).
> It's unsafe already because sizeof(int) could be 2 on i386, I think.
> So at least uint32_t would suite more.  While it also uses "-1" as a
> placeholder of "we want to generate the instance ID automatically".
> Hence a more proper value should be int64_t.
>
> This will start to be useful after next patch in which we can start to
> convert a real uint32_t value as instance ID.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Hi

Being more helpful,  I think that it is better to just:

* change instance_id to be an uint32_t (notice that for all architectures
  that we support, it is actually int32_t).

* export calculate_new_instance_id() and adjust callers that use -1.

or

* export a new function that just use the calculate_new_instance_id()

A fast search shows:

10 callers of vmstate_register() with -1
1 caller of vmstate_register_with_alias_id with -1 (but it is the one
  that sets all qdev devices).
1 caller of vmstate_register_with_alias_id in apic, where it can be -1.
1 caller of register_savevm_live() with -1 (spapr)

And call it a day?

What do you think, Juan.

> ---
>  include/migration/register.h |  2 +-
>  include/migration/vmstate.h  |  4 ++--
>  migration/savevm.c           | 10 +++++-----
>  stubs/vmstate.c              |  2 +-
>  4 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/include/migration/register.h b/include/migration/register.h
> index a13359a08d..54f42c7413 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -69,7 +69,7 @@ typedef struct SaveVMHandlers {
>  } SaveVMHandlers;
>  
>  int register_savevm_live(const char *idstr,
> -                         int instance_id,
> +                         int64_t instance_id,
>                           int version_id,
>                           const SaveVMHandlers *ops,
>                           void *opaque);
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1fbfd099dd..6a7498463c 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -1114,14 +1114,14 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
>  bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
>  
>  /* Returns: 0 on success, -1 on failure */
> -int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
> +int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
>                                     const VMStateDescription *vmsd,
>                                     void *base, int alias_id,
>                                     int required_for_version,
>                                     Error **errp);
>  
>  /* Returns: 0 on success, -1 on failure */
> -static inline int vmstate_register(DeviceState *dev, int instance_id,
> +static inline int vmstate_register(DeviceState *dev, int64_t instance_id,
>                                     const VMStateDescription *vmsd,
>                                     void *opaque)
>  {
> diff --git a/migration/savevm.c b/migration/savevm.c
> index bb9462a54d..dc9281c897 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -233,7 +233,7 @@ typedef struct CompatEntry {
>  typedef struct SaveStateEntry {
>      QTAILQ_ENTRY(SaveStateEntry) entry;
>      char idstr[256];
> -    int instance_id;
> +    int64_t instance_id;
>      int alias_id;
>      int version_id;
>      /* version id read from the stream */
> @@ -668,7 +668,7 @@ void dump_vmstate_json_to_file(FILE *out_file)
>  static int calculate_new_instance_id(const char *idstr)
>  {
>      SaveStateEntry *se;
> -    int instance_id = 0;
> +    int64_t instance_id = 0;
>  
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>          if (strcmp(idstr, se->idstr) == 0
> @@ -730,7 +730,7 @@ static void savevm_state_handler_insert(SaveStateEntry *nse)
>     Meanwhile pass -1 as instance_id if you do not already have a clearly
>     distinguishing id for all instances of your device class. */
>  int register_savevm_live(const char *idstr,
> -                         int instance_id,
> +                         int64_t instance_id,
>                           int version_id,
>                           const SaveVMHandlers *ops,
>                           void *opaque)
> @@ -784,7 +784,7 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque)
>      }
>  }
>  
> -int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
> +int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
>                                     const VMStateDescription *vmsd,
>                                     void *opaque, int alias_id,
>                                     int required_for_version,
> @@ -1566,7 +1566,7 @@ int qemu_save_device_state(QEMUFile *f)
>      return qemu_file_get_error(f);
>  }
>  
> -static SaveStateEntry *find_se(const char *idstr, int instance_id)
> +static SaveStateEntry *find_se(const char *idstr, int64_t instance_id)
>  {
>      SaveStateEntry *se;
>  
> diff --git a/stubs/vmstate.c b/stubs/vmstate.c
> index e1e89b87f0..699003f3b0 100644
> --- a/stubs/vmstate.c
> +++ b/stubs/vmstate.c
> @@ -4,7 +4,7 @@
>  const VMStateDescription vmstate_dummy = {};
>  
>  int vmstate_register_with_alias_id(DeviceState *dev,
> -                                   int instance_id,
> +                                   int64_t instance_id,
>                                     const VMStateDescription *vmsd,
>                                     void *base, int alias_id,
>                                     int required_for_version,


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  8:45   ` Juan Quintela
@ 2019-10-15  8:57     ` Dr. David Alan Gilbert
  2019-10-15 12:57       ` Juan Quintela
  2019-10-15 10:23     ` Peter Xu
  1 sibling, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2019-10-15  8:57 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Paolo Bonzini, Eduardo Habkost, qemu-devel, Peter Xu, Igor Mammedov

* Juan Quintela (quintela@redhat.com) wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > It was "int" and used as 32bits fields (see save_section_header()).
> > It's unsafe already because sizeof(int) could be 2 on i386, I think.
> > So at least uint32_t would suite more.  While it also uses "-1" as a
> > placeholder of "we want to generate the instance ID automatically".
> > Hence a more proper value should be int64_t.
> >
> > This will start to be useful after next patch in which we can start to
> > convert a real uint32_t value as instance ID.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Hi
> 
> Being more helpful,  I think that it is better to just:
> 
> * change instance_id to be an uint32_t (notice that for all architectures
>   that we support, it is actually int32_t).
> 
> * export calculate_new_instance_id() and adjust callers that use -1.
> 
> or
> 
> * export a new function that just use the calculate_new_instance_id()

Do you mean that we end up with two functions, one that does it
automatically, and one that takes an ID?

Dave

> A fast search shows:
> 
> 10 callers of vmstate_register() with -1
> 1 caller of vmstate_register_with_alias_id with -1 (but it is the one
>   that sets all qdev devices).
> 1 caller of vmstate_register_with_alias_id in apic, where it can be -1.
> 1 caller of register_savevm_live() with -1 (spapr)
> 
> And call it a day?
> 
> What do you think, Juan.

> 
> > ---
> >  include/migration/register.h |  2 +-
> >  include/migration/vmstate.h  |  4 ++--
> >  migration/savevm.c           | 10 +++++-----
> >  stubs/vmstate.c              |  2 +-
> >  4 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/migration/register.h b/include/migration/register.h
> > index a13359a08d..54f42c7413 100644
> > --- a/include/migration/register.h
> > +++ b/include/migration/register.h
> > @@ -69,7 +69,7 @@ typedef struct SaveVMHandlers {
> >  } SaveVMHandlers;
> >  
> >  int register_savevm_live(const char *idstr,
> > -                         int instance_id,
> > +                         int64_t instance_id,
> >                           int version_id,
> >                           const SaveVMHandlers *ops,
> >                           void *opaque);
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 1fbfd099dd..6a7498463c 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -1114,14 +1114,14 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
> >  bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
> >  
> >  /* Returns: 0 on success, -1 on failure */
> > -int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
> > +int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
> >                                     const VMStateDescription *vmsd,
> >                                     void *base, int alias_id,
> >                                     int required_for_version,
> >                                     Error **errp);
> >  
> >  /* Returns: 0 on success, -1 on failure */
> > -static inline int vmstate_register(DeviceState *dev, int instance_id,
> > +static inline int vmstate_register(DeviceState *dev, int64_t instance_id,
> >                                     const VMStateDescription *vmsd,
> >                                     void *opaque)
> >  {
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index bb9462a54d..dc9281c897 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -233,7 +233,7 @@ typedef struct CompatEntry {
> >  typedef struct SaveStateEntry {
> >      QTAILQ_ENTRY(SaveStateEntry) entry;
> >      char idstr[256];
> > -    int instance_id;
> > +    int64_t instance_id;
> >      int alias_id;
> >      int version_id;
> >      /* version id read from the stream */
> > @@ -668,7 +668,7 @@ void dump_vmstate_json_to_file(FILE *out_file)
> >  static int calculate_new_instance_id(const char *idstr)
> >  {
> >      SaveStateEntry *se;
> > -    int instance_id = 0;
> > +    int64_t instance_id = 0;
> >  
> >      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> >          if (strcmp(idstr, se->idstr) == 0
> > @@ -730,7 +730,7 @@ static void savevm_state_handler_insert(SaveStateEntry *nse)
> >     Meanwhile pass -1 as instance_id if you do not already have a clearly
> >     distinguishing id for all instances of your device class. */
> >  int register_savevm_live(const char *idstr,
> > -                         int instance_id,
> > +                         int64_t instance_id,
> >                           int version_id,
> >                           const SaveVMHandlers *ops,
> >                           void *opaque)
> > @@ -784,7 +784,7 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque)
> >      }
> >  }
> >  
> > -int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
> > +int vmstate_register_with_alias_id(DeviceState *dev, int64_t instance_id,
> >                                     const VMStateDescription *vmsd,
> >                                     void *opaque, int alias_id,
> >                                     int required_for_version,
> > @@ -1566,7 +1566,7 @@ int qemu_save_device_state(QEMUFile *f)
> >      return qemu_file_get_error(f);
> >  }
> >  
> > -static SaveStateEntry *find_se(const char *idstr, int instance_id)
> > +static SaveStateEntry *find_se(const char *idstr, int64_t instance_id)
> >  {
> >      SaveStateEntry *se;
> >  
> > diff --git a/stubs/vmstate.c b/stubs/vmstate.c
> > index e1e89b87f0..699003f3b0 100644
> > --- a/stubs/vmstate.c
> > +++ b/stubs/vmstate.c
> > @@ -4,7 +4,7 @@
> >  const VMStateDescription vmstate_dummy = {};
> >  
> >  int vmstate_register_with_alias_id(DeviceState *dev,
> > -                                   int instance_id,
> > +                                   int64_t instance_id,
> >                                     const VMStateDescription *vmsd,
> >                                     void *base, int alias_id,
> >                                     int required_for_version,
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
  2019-10-15  8:30   ` Juan Quintela
@ 2019-10-15  9:22   ` Dr. David Alan Gilbert
  2019-10-15 10:16     ` Peter Xu
  1 sibling, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2019-10-15  9:22 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Juan Quintela, qemu-devel, Eduardo Habkost, Igor Mammedov

* Peter Xu (peterx@redhat.com) wrote:
> Migration is silently broken now with x2apic config like this:
> 
>      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
>      -device intel-iommu,intremap=on,eim=on
> 
> After migration, the guest kernel could hang at anything, due to
> x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> any operations related to x2apic could be broken then (e.g., RDMSR on
> x2apic MSRs could fail because KVM would think that the vcpu hasn't
> enabled x2apic at all).
> 
> The issue is that the x2apic bit was never applied correctly for vcpus
> whose ID > 255 when migrate completes, and that's because when we
> migrate APIC we use the APICCommonState.id as instance ID of the
> migration stream, while that's too short for x2apic.
> 
> Let's use the newly introduced initial_apic_id for that.

I'd like to understand a few things:
   a) Does this change the instance ID of existing APICs on the
migration stream? 
     a1) Ever for <256 CPUs?
     a2) For >=256 CPUs?

    [Because changing the ID breaks migration]

  b) Is the instance ID constant - I can see it's a property on the
     APIC, but I cna't see who sets it

  c) In the case where it fails, did we end up registering two
     devices with the same name and instance ID?  If so, is it worth
     adding a check that would error if we tried?

Dave

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/intc/apic_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index aafd8e0e33..6024a3e06a 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -315,7 +315,7 @@ static void apic_common_realize(DeviceState *dev, Error **errp)
>      APICCommonState *s = APIC_COMMON(dev);
>      APICCommonClass *info;
>      static DeviceState *vapic;
> -    int instance_id = s->id;
> +    int64_t instance_id = s->initial_apic_id;
>  
>      info = APIC_COMMON_GET_CLASS(s);
>      info->realize(dev, errp);
> -- 
> 2.21.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15  9:22   ` Dr. David Alan Gilbert
@ 2019-10-15 10:16     ` Peter Xu
  2019-10-15 11:02       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2019-10-15 10:16 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Paolo Bonzini, Juan Quintela, qemu-devel, Eduardo Habkost, Igor Mammedov

On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > Migration is silently broken now with x2apic config like this:
> > 
> >      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> >      -device intel-iommu,intremap=on,eim=on
> > 
> > After migration, the guest kernel could hang at anything, due to
> > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > any operations related to x2apic could be broken then (e.g., RDMSR on
> > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > enabled x2apic at all).
> > 
> > The issue is that the x2apic bit was never applied correctly for vcpus
> > whose ID > 255 when migrate completes, and that's because when we
> > migrate APIC we use the APICCommonState.id as instance ID of the
> > migration stream, while that's too short for x2apic.
> > 
> > Let's use the newly introduced initial_apic_id for that.
> 
> I'd like to understand a few things:
>    a) Does this change the instance ID of existing APICs on the
> migration stream? 
>      a1) Ever for <256 CPUs?

No.

>      a2) For >=256 CPUs?

Yes.

> 
>     [Because changing the ID breaks migration]

But if we don't change it, the stream is broken too. :)

Then the destination VM will receive e.g. two apic_id==0 instances (I
think the apic_id==256 instance will wrongly overwrite the apic_id==0
one), while the vcpu with apic_id==256 will use the initial apic
values.

So IMHO we should still fix this, even if it changes the migration
stream.  At least we start to make it right.

> 
>   b) Is the instance ID constant - I can see it's a property on the
>      APIC, but I cna't see who sets it

For each vcpu, I think yes it should be a constant as long as the
topology is the same.  This is how I understand it to be set:

(1) In pc_cpus_init(), we init these:

    possible_cpus = mc->possible_cpu_arch_ids(ms);
    for (i = 0; i < ms->smp.cpus; i++) {
        pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
    }

(2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:

    qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);

> 
>   c) In the case where it fails, did we end up registering two
>      devices with the same name and instance ID?  If so, is it worth
>      adding a check that would error if we tried?

Sounds doable.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  8:45   ` Juan Quintela
  2019-10-15  8:57     ` Dr. David Alan Gilbert
@ 2019-10-15 10:23     ` Peter Xu
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Xu @ 2019-10-15 10:23 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Paolo Bonzini, Dr . David Alan Gilbert, qemu-devel,
	Eduardo Habkost, Igor Mammedov

On Tue, Oct 15, 2019 at 10:45:53AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > It was "int" and used as 32bits fields (see save_section_header()).
> > It's unsafe already because sizeof(int) could be 2 on i386, I think.
> > So at least uint32_t would suite more.  While it also uses "-1" as a
> > placeholder of "we want to generate the instance ID automatically".
> > Hence a more proper value should be int64_t.
> >
> > This will start to be useful after next patch in which we can start to
> > convert a real uint32_t value as instance ID.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Hi
> 
> Being more helpful,  I think that it is better to just:
> 
> * change instance_id to be an uint32_t (notice that for all architectures
>   that we support, it is actually int32_t).
> 
> * export calculate_new_instance_id() and adjust callers that use -1.
> 
> or
> 
> * export a new function that just use the calculate_new_instance_id()
> 
> A fast search shows:
> 
> 10 callers of vmstate_register() with -1
> 1 caller of vmstate_register_with_alias_id with -1 (but it is the one
>   that sets all qdev devices).
> 1 caller of vmstate_register_with_alias_id in apic, where it can be -1.
> 1 caller of register_savevm_live() with -1 (spapr)
> 
> And call it a day?
> 
> What do you think, Juan.

Sure, I can switch instance_id to uint32_t and add a new flag to both
functions (register_savevm_live, vmstate_register_with_alias_id)

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  8:34   ` Juan Quintela
@ 2019-10-15 10:28     ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2019-10-15 10:28 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Paolo Bonzini, Dr . David Alan Gilbert, qemu-devel,
	Eduardo Habkost, Igor Mammedov

On Tue, Oct 15, 2019 at 10:34:57AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > It was "int" and used as 32bits fields (see save_section_header()).
> > It's unsafe already because sizeof(int) could be 2 on i386,
> 
> i386 is 32bits, so int is 32bits O:-)

Right it should be 16 bits systems.  And yes I don't think we need to
consider that! :)

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15 10:16     ` Peter Xu
@ 2019-10-15 11:02       ` Dr. David Alan Gilbert
  2019-10-15 19:49         ` Eduardo Habkost
  0 siblings, 1 reply; 14+ messages in thread
From: Dr. David Alan Gilbert @ 2019-10-15 11:02 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Juan Quintela, qemu-devel, Eduardo Habkost, Igor Mammedov

* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > Migration is silently broken now with x2apic config like this:
> > > 
> > >      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> > >      -device intel-iommu,intremap=on,eim=on
> > > 
> > > After migration, the guest kernel could hang at anything, due to
> > > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > > any operations related to x2apic could be broken then (e.g., RDMSR on
> > > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > > enabled x2apic at all).
> > > 
> > > The issue is that the x2apic bit was never applied correctly for vcpus
> > > whose ID > 255 when migrate completes, and that's because when we
> > > migrate APIC we use the APICCommonState.id as instance ID of the
> > > migration stream, while that's too short for x2apic.
> > > 
> > > Let's use the newly introduced initial_apic_id for that.
> > 
> > I'd like to understand a few things:
> >    a) Does this change the instance ID of existing APICs on the
> > migration stream? 
> >      a1) Ever for <256 CPUs?
> 
> No.
> 
> >      a2) For >=256 CPUs?
> 
> Yes.
> 
> > 
> >     [Because changing the ID breaks migration]
> 
> But if we don't change it, the stream is broken too. :)
> 
> Then the destination VM will receive e.g. two apic_id==0 instances (I
> think the apic_id==256 instance will wrongly overwrite the apic_id==0
> one), while the vcpu with apic_id==256 will use the initial apic
> values.
> 
> So IMHO we should still fix this, even if it changes the migration
> stream.  At least we start to make it right.

Yes, that makes sense.
It deserves a doc mention somewhere.

> > 
> >   b) Is the instance ID constant - I can see it's a property on the
> >      APIC, but I cna't see who sets it
> 
> For each vcpu, I think yes it should be a constant as long as the
> topology is the same.  This is how I understand it to be set:
> 
> (1) In pc_cpus_init(), we init these:
> 
>     possible_cpus = mc->possible_cpu_arch_ids(ms);
>     for (i = 0; i < ms->smp.cpus; i++) {
>         pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
>     }
> 
> (2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:
> 
>     qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);

OK, that's fine - as long as it's constaatn and not guest influenced.

> > 
> >   c) In the case where it fails, did we end up registering two
> >      devices with the same name and instance ID?  If so, is it worth
> >      adding a check that would error if we tried?
> 
> Sounds doable.
> 

Great,

Dave

> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits
  2019-10-15  8:57     ` Dr. David Alan Gilbert
@ 2019-10-15 12:57       ` Juan Quintela
  0 siblings, 0 replies; 14+ messages in thread
From: Juan Quintela @ 2019-10-15 12:57 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Paolo Bonzini, Eduardo Habkost, qemu-devel, Peter Xu, Igor Mammedov

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Peter Xu <peterx@redhat.com> wrote:
>> > It was "int" and used as 32bits fields (see save_section_header()).
>> > It's unsafe already because sizeof(int) could be 2 on i386, I think.
>> > So at least uint32_t would suite more.  While it also uses "-1" as a
>> > placeholder of "we want to generate the instance ID automatically".
>> > Hence a more proper value should be int64_t.
>> >
>> > This will start to be useful after next patch in which we can start to
>> > convert a real uint32_t value as instance ID.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> 
>> Hi
>> 
>> Being more helpful,  I think that it is better to just:
>> 
>> * change instance_id to be an uint32_t (notice that for all architectures
>>   that we support, it is actually int32_t).
>> 
>> * export calculate_new_instance_id() and adjust callers that use -1.
>> 
>> or
>> 
>> * export a new function that just use the calculate_new_instance_id()
>
> Do you mean that we end up with two functions, one that does it
> automatically, and one that takes an ID?

That is one option.

The other is that we export calculate_new_instance_id(), and we use that
instead of -1.

Later, Juan.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
  2019-10-15 11:02       ` Dr. David Alan Gilbert
@ 2019-10-15 19:49         ` Eduardo Habkost
  0 siblings, 0 replies; 14+ messages in thread
From: Eduardo Habkost @ 2019-10-15 19:49 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Paolo Bonzini, Juan Quintela, qemu-devel, Peter Xu, Igor Mammedov

On Tue, Oct 15, 2019 at 12:02:53PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (peterx@redhat.com) wrote:
> > > > Migration is silently broken now with x2apic config like this:
> > > > 
> > > >      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> > > >      -device intel-iommu,intremap=on,eim=on
> > > > 
> > > > After migration, the guest kernel could hang at anything, due to
> > > > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > > > any operations related to x2apic could be broken then (e.g., RDMSR on
> > > > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > > > enabled x2apic at all).
> > > > 
> > > > The issue is that the x2apic bit was never applied correctly for vcpus
> > > > whose ID > 255 when migrate completes, and that's because when we
> > > > migrate APIC we use the APICCommonState.id as instance ID of the
> > > > migration stream, while that's too short for x2apic.
> > > > 
> > > > Let's use the newly introduced initial_apic_id for that.
> > > 
> > > I'd like to understand a few things:
> > >    a) Does this change the instance ID of existing APICs on the
> > > migration stream? 
> > >      a1) Ever for <256 CPUs?
> > 
> > No.
> > 
> > >      a2) For >=256 CPUs?
> > 
> > Yes.
> > 
> > > 
> > >     [Because changing the ID breaks migration]
> > 
> > But if we don't change it, the stream is broken too. :)
> > 
> > Then the destination VM will receive e.g. two apic_id==0 instances (I
> > think the apic_id==256 instance will wrongly overwrite the apic_id==0
> > one), while the vcpu with apic_id==256 will use the initial apic
> > values.
> > 
> > So IMHO we should still fix this, even if it changes the migration
> > stream.  At least we start to make it right.
> 
> Yes, that makes sense.
> It deserves a doc mention somewhere.
> 
> > > 
> > >   b) Is the instance ID constant - I can see it's a property on the
> > >      APIC, but I cna't see who sets it
> > 
> > For each vcpu, I think yes it should be a constant as long as the
> > topology is the same.  This is how I understand it to be set:
> > 
> > (1) In pc_cpus_init(), we init these:
> > 
> >     possible_cpus = mc->possible_cpu_arch_ids(ms);
> >     for (i = 0; i < ms->smp.cpus; i++) {
> >         pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> >     }
> > 
> > (2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:
> > 
> >     qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);
> 
> OK, that's fine - as long as it's constaatn and not guest influenced.

The guest may change the CPU APIC ID (although they rarely do),
but I believe X86CPU::apic_id is always going to be the initial
APIC ID.  I'll double check (and maybe send a patch to rename it
to initial_apic_id).

-- 
Eduardo


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-10-15 19:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-15  7:54 [PATCH 0/2] apic: Fix migration breakage of >255 vcpus Peter Xu
2019-10-15  7:54 ` [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits Peter Xu
2019-10-15  8:34   ` Juan Quintela
2019-10-15 10:28     ` Peter Xu
2019-10-15  8:45   ` Juan Quintela
2019-10-15  8:57     ` Dr. David Alan Gilbert
2019-10-15 12:57       ` Juan Quintela
2019-10-15 10:23     ` Peter Xu
2019-10-15  7:54 ` [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID Peter Xu
2019-10-15  8:30   ` Juan Quintela
2019-10-15  9:22   ` Dr. David Alan Gilbert
2019-10-15 10:16     ` Peter Xu
2019-10-15 11:02       ` Dr. David Alan Gilbert
2019-10-15 19:49         ` Eduardo Habkost

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).