* [Qemu-devel] [PATCH 0/2] pcie migration fixes @ 2012-08-30 17:51 Jason Baron 2012-08-30 17:51 ` [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration Jason Baron 2012-08-30 17:51 ` [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number Jason Baron 0 siblings, 2 replies; 13+ messages in thread From: Jason Baron @ 2012-08-30 17:51 UTC (permalink / raw) To: mst, quintela; +Cc: yamahata, alex.williamson, qemu-devel, jan.kiszka Hi, A couple of pcie related migration fixes that I found while testing q35 migration. Thanks, -Jason Jason Baron (2): pcie: drop version_id field for live migration pcie_aer: clear cmask for Advanced Error Interrupt Message Number hw/pci.c | 2 +- hw/pcie.h | 1 - hw/pcie_aer.c | 6 ++++++ 3 files changed, 7 insertions(+), 2 deletions(-) ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration 2012-08-30 17:51 [Qemu-devel] [PATCH 0/2] pcie migration fixes Jason Baron @ 2012-08-30 17:51 ` Jason Baron 2012-08-31 8:44 ` Michael S. Tsirkin 2012-08-31 8:51 ` Juan Quintela 2012-08-30 17:51 ` [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number Jason Baron 1 sibling, 2 replies; 13+ messages in thread From: Jason Baron @ 2012-08-30 17:51 UTC (permalink / raw) To: mst, quintela; +Cc: yamahata, alex.williamson, qemu-devel, jan.kiszka While testing q35 live migration, I found that the migration would abort with the following error: "Unknown savevm section type 76". The error is due to this check failing in 'vmstate_load_state()': while(field->name) { if ((field->field_exists && field->field_exists(opaque, version_id)) || (!field->field_exists && field->version_id <= version_id)) { The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, 'version_id' in the above check is 1. And thus we fail to load the pcie device field. Further the code returns to 'qemu_loadvm_state()' which produces the error that I saw. I'm proposing to fix this by simply dropping the 'version_id' field from VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already versioned. Thus, any versioning issues could be detected at the vmsd level. Taking a step back, I think that the 'field->version_id' should be compared against a saved version number for the field not the 'version_id'. Futhermore, once vmstate_load_state() is called recursively on another vmsd, the check of: if (version_id > vmsd->version_id) { return -EINVAL; } Will never fail since version_id is always equal to vmsd->version_id. So I'm wondering why we aren't storing the vmsd version id of the source in the migration stream? This patch also renames the 'name' field of vmstate_pcie_device from: PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. Signed-off-by: Jason Baron <jbaron@redhat.com> --- hw/pci.c | 2 +- hw/pcie.h | 1 - 2 files changed, 1 insertions(+), 2 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index 3727afa..5386a4f 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -439,7 +439,7 @@ const VMStateDescription vmstate_pci_device = { }; const VMStateDescription vmstate_pcie_device = { - .name = "PCIDevice", + .name = "PCIEDevice", .version_id = 2, .minimum_version_id = 1, .minimum_version_id_old = 1, diff --git a/hw/pcie.h b/hw/pcie.h index b8ab0c7..4889194 100644 --- a/hw/pcie.h +++ b/hw/pcie.h @@ -133,7 +133,6 @@ extern const VMStateDescription vmstate_pcie_device; #define VMSTATE_PCIE_DEVICE(_field, _state) { \ .name = (stringify(_field)), \ - .version_id = 2, \ .size = sizeof(PCIDevice), \ .vmsd = &vmstate_pcie_device, \ .flags = VMS_STRUCT, \ -- 1.7.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration 2012-08-30 17:51 ` [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration Jason Baron @ 2012-08-31 8:44 ` Michael S. Tsirkin 2012-08-31 14:46 ` Jason Baron 2012-08-31 8:51 ` Juan Quintela 1 sibling, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2012-08-31 8:44 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > While testing q35 live migration, I found that the migration would abort with > the following error: "Unknown savevm section type 76". > > The error is due to this check failing in 'vmstate_load_state()': > > while(field->name) { > if ((field->field_exists && > field->field_exists(opaque, version_id)) || > (!field->field_exists && > field->version_id <= version_id)) { > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > 'version_id' in the above check is 1. And thus we fail to load the pcie device > field. Further the code returns to 'qemu_loadvm_state()' which produces the > error that I saw. > > I'm proposing to fix this by simply dropping the 'version_id' field from > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > versioned. Thus, any versioning issues could be detected at the vmsd level. > > Taking a step back, I think that the 'field->version_id' should be compared > against a saved version number for the field not the 'version_id'. Futhermore, > once vmstate_load_state() is called recursively on another vmsd, the check of: > > if (version_id > vmsd->version_id) { > return -EINVAL; > } > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > wondering why we aren't storing the vmsd version id of the source in the > migration stream? > > This patch also renames the 'name' field of vmstate_pcie_device from: > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > Signed-off-by: Jason Baron <jbaron@redhat.com> I have this queued to apply already - this is just a repost or did I miss something? I have no idea about the larger issues - Juan? > --- > hw/pci.c | 2 +- > hw/pcie.h | 1 - > 2 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/hw/pci.c b/hw/pci.c > index 3727afa..5386a4f 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -439,7 +439,7 @@ const VMStateDescription vmstate_pci_device = { > }; > > const VMStateDescription vmstate_pcie_device = { > - .name = "PCIDevice", > + .name = "PCIEDevice", > .version_id = 2, > .minimum_version_id = 1, > .minimum_version_id_old = 1, > diff --git a/hw/pcie.h b/hw/pcie.h > index b8ab0c7..4889194 100644 > --- a/hw/pcie.h > +++ b/hw/pcie.h > @@ -133,7 +133,6 @@ extern const VMStateDescription vmstate_pcie_device; > > #define VMSTATE_PCIE_DEVICE(_field, _state) { \ > .name = (stringify(_field)), \ > - .version_id = 2, \ > .size = sizeof(PCIDevice), \ > .vmsd = &vmstate_pcie_device, \ > .flags = VMS_STRUCT, \ > -- > 1.7.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration 2012-08-31 8:44 ` Michael S. Tsirkin @ 2012-08-31 14:46 ` Jason Baron 2012-08-31 15:42 ` Michael S. Tsirkin 0 siblings, 1 reply; 13+ messages in thread From: Jason Baron @ 2012-08-31 14:46 UTC (permalink / raw) To: Michael S. Tsirkin Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Fri, Aug 31, 2012 at 11:44:54AM +0300, Michael S. Tsirkin wrote: > On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > > While testing q35 live migration, I found that the migration would abort with > > the following error: "Unknown savevm section type 76". > > > > The error is due to this check failing in 'vmstate_load_state()': > > > > while(field->name) { > > if ((field->field_exists && > > field->field_exists(opaque, version_id)) || > > (!field->field_exists && > > field->version_id <= version_id)) { > > > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > > 'version_id' in the above check is 1. And thus we fail to load the pcie device > > field. Further the code returns to 'qemu_loadvm_state()' which produces the > > error that I saw. > > > > I'm proposing to fix this by simply dropping the 'version_id' field from > > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > > versioned. Thus, any versioning issues could be detected at the vmsd level. > > > > Taking a step back, I think that the 'field->version_id' should be compared > > against a saved version number for the field not the 'version_id'. Futhermore, > > once vmstate_load_state() is called recursively on another vmsd, the check of: > > > > if (version_id > vmsd->version_id) { > > return -EINVAL; > > } > > > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > > wondering why we aren't storing the vmsd version id of the source in the > > migration stream? > > > > This patch also renames the 'name' field of vmstate_pcie_device from: > > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > I have this queued to apply already - this is just a repost > or did I miss something? > just a repost...but now we have Juan's ack :) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration 2012-08-31 14:46 ` Jason Baron @ 2012-08-31 15:42 ` Michael S. Tsirkin 0 siblings, 0 replies; 13+ messages in thread From: Michael S. Tsirkin @ 2012-08-31 15:42 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Fri, Aug 31, 2012 at 10:46:51AM -0400, Jason Baron wrote: > On Fri, Aug 31, 2012 at 11:44:54AM +0300, Michael S. Tsirkin wrote: > > On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > > > While testing q35 live migration, I found that the migration would abort with > > > the following error: "Unknown savevm section type 76". > > > > > > The error is due to this check failing in 'vmstate_load_state()': > > > > > > while(field->name) { > > > if ((field->field_exists && > > > field->field_exists(opaque, version_id)) || > > > (!field->field_exists && > > > field->version_id <= version_id)) { > > > > > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > > > 'version_id' in the above check is 1. And thus we fail to load the pcie device > > > field. Further the code returns to 'qemu_loadvm_state()' which produces the > > > error that I saw. > > > > > > I'm proposing to fix this by simply dropping the 'version_id' field from > > > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > > > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > > > versioned. Thus, any versioning issues could be detected at the vmsd level. > > > > > > Taking a step back, I think that the 'field->version_id' should be compared > > > against a saved version number for the field not the 'version_id'. Futhermore, > > > once vmstate_load_state() is called recursively on another vmsd, the check of: > > > > > > if (version_id > vmsd->version_id) { > > > return -EINVAL; > > > } > > > > > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > > > wondering why we aren't storing the vmsd version id of the source in the > > > migration stream? > > > > > > This patch also renames the 'name' field of vmstate_pcie_device from: > > > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > > > I have this queued to apply already - this is just a repost > > or did I miss something? > > > > just a repost...but now we have Juan's ack :) Right. It is preferable to tag reposts explicitly but no big deal. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration 2012-08-30 17:51 ` [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration Jason Baron 2012-08-31 8:44 ` Michael S. Tsirkin @ 2012-08-31 8:51 ` Juan Quintela 1 sibling, 0 replies; 13+ messages in thread From: Juan Quintela @ 2012-08-31 8:51 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, mst Jason Baron <jbaron@redhat.com> wrote: > While testing q35 live migration, I found that the migration would abort with > the following error: "Unknown savevm section type 76". Before we start, migration of PCI is ugly due to backwards compatibility. > The error is due to this check failing in 'vmstate_load_state()': > > while(field->name) { > if ((field->field_exists && > field->field_exists(opaque, version_id)) || > (!field->field_exists && > field->version_id <= version_id)) { This is a long history, but to make things short, when we started with vmstate, there were no cose where this were not true. It could potentially be true, but there were no examples. > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > 'version_id' in the above check is 1. And thus we fail to load the pcie device > field. Further the code returns to 'qemu_loadvm_state()' which produces the > error that I saw. Agreed about removing the set of the value. > I'm proposing to fix this by simply dropping the 'version_id' field from > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > versioned. Thus, any versioning issues could be detected at the vmsd level. > > Taking a step back, I think that the 'field->version_id' should be compared > against a saved version number for the field not the 'version_id'. Futhermore, > once vmstate_load_state() is called recursively on another vmsd, the check of: > > if (version_id > vmsd->version_id) { > return -EINVAL; > } This is a can of worms you don't want to open. Basically it is imposible to get it _sane_ and backward compatible. We choose on its time to maintain it backward compatible, and that is why it is not sane. I would have to dig the details, but the problems where when we were loading things that were not backward compatible. The problem was that PCI is the other "section inside a section", where we store the version_id. This is wrong for a number of reasons, between them that if we "increase" the version_id of pci, the "outermost" section id, has changed but it would not receive a new version number. Remember the part about _sane_ and _backward compatible_ :-() > Will never fail since version_id is always equal to vmsd->version_id. So I'm > wondering why we aren't storing the vmsd version id of the source in the > migration stream? > > This patch also renames the 'name' field of vmstate_pcie_device from: > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com> I agree with the migration bits of this series. I am not ack'ing the other patch because I don't understand pci well enough to know what the code does (before or after). Later, Juan. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-30 17:51 [Qemu-devel] [PATCH 0/2] pcie migration fixes Jason Baron 2012-08-30 17:51 ` [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration Jason Baron @ 2012-08-30 17:51 ` Jason Baron 2012-08-31 8:42 ` Michael S. Tsirkin 1 sibling, 1 reply; 13+ messages in thread From: Jason Baron @ 2012-08-30 17:51 UTC (permalink / raw) To: mst, quintela; +Cc: yamahata, alex.williamson, qemu-devel, jan.kiszka The Advanced Error Interrupt Message Number (bits 31:27 of the Root Error Status Register) is updated when the number of msi messages assigned to a device changes. Migration of windows 7 on q35 chipset failed because the check in get_pci_config_device() fails due to wmask being set on these bits. Its valid to update these bits and we must restore this state across migration. Signed-off-by: Jason Baron <jbaron@redhat.com> --- hw/pcie_aer.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c index 3b6981c..6edcd79 100644 --- a/hw/pcie_aer.c +++ b/hw/pcie_aer.c @@ -738,6 +738,12 @@ void pcie_aer_root_init(PCIDevice *dev) PCI_ERR_ROOT_CMD_EN_MASK); pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, PCI_ERR_ROOT_STATUS_REPORT_MASK); + /* Bits 31:27 - Advanced Error Interrupt Message Number + * These bits are updated when the number of MSI messages changes. + * By clearing the cmask, pcie devices can be migrated. + */ + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, + (1 << PCI_ERR_ROOT_IRQ_SHIFT) - 1); } void pcie_aer_root_reset(PCIDevice *dev) -- 1.7.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-30 17:51 ` [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number Jason Baron @ 2012-08-31 8:42 ` Michael S. Tsirkin 2012-08-31 14:45 ` Jason Baron 0 siblings, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2012-08-31 8:42 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela Some minor nits below. If you dont get to it I will tweak this patch when I apply it early next week. On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > Error Status Register) is updated when the number of msi messages assigned to a > device changes. Migration of windows 7 on q35 chipset failed because the check > in get_pci_config_device() fails due to wmask being set on these bits. I think you actually mean 'not being set on these bits'? > Its valid > to update these bits and we must restore this state across migration. > > Signed-off-by: Jason Baron <jbaron@redhat.com> > --- > hw/pcie_aer.c | 6 ++++++ > 1 files changed, 6 insertions(+), 0 deletions(-) > > diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c > index 3b6981c..6edcd79 100644 > --- a/hw/pcie_aer.c > +++ b/hw/pcie_aer.c > @@ -738,6 +738,12 @@ void pcie_aer_root_init(PCIDevice *dev) > PCI_ERR_ROOT_CMD_EN_MASK); > pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, > PCI_ERR_ROOT_STATUS_REPORT_MASK); > + /* Bits 31:27 - Advanced Error Interrupt Message Number This line is better moved to near definition of PCI_ERR_ROOT_IRQ. Then here we can say 'PCI_ERR_ROOT_IRQ is RO but devices change it using a device-specific method.' > + * These bits are updated when the number of MSI messages changes. > + * By clearing the cmask, pcie devices can be migrated. > + */ > + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, > + (1 << PCI_ERR_ROOT_IRQ_SHIFT) - 1); ~PCI_ERR_ROOT_IRQ would be clearer I think. > } > > void pcie_aer_root_reset(PCIDevice *dev) > -- > 1.7.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-31 8:42 ` Michael S. Tsirkin @ 2012-08-31 14:45 ` Jason Baron 2012-08-31 15:35 ` Michael S. Tsirkin 0 siblings, 1 reply; 13+ messages in thread From: Jason Baron @ 2012-08-31 14:45 UTC (permalink / raw) To: Michael S. Tsirkin Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Fri, Aug 31, 2012 at 11:42:27AM +0300, Michael S. Tsirkin wrote: > Some minor nits below. If you dont get to it I will tweak this patch > when I apply it early next week. > > On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > > Error Status Register) is updated when the number of msi messages assigned to a > > device changes. Migration of windows 7 on q35 chipset failed because the check > > in get_pci_config_device() fails due to wmask being set on these bits. > > I think you actually mean 'not being set on these bits'? > No, the check is: static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) { PCIDevice *s = container_of(pv, PCIDevice, config); uint8_t *config; int i; assert(size == pci_config_size(s)); config = g_malloc(size); qemu_get_buffer(f, config, size); for (i = 0; i < size; ++i) { if ((config[i] ^ s->config[i]) & s->cmask[i] & ~s->wmask[i] & ~s->w1cmask[i]) { g_free(config); return -EINVAL; } } ..... So because cmask is set and these bits differ in config space (due to them being updated before migration), the migration aborts. > > Its valid > > to update these bits and we must restore this state across migration. > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > --- > > hw/pcie_aer.c | 6 ++++++ > > 1 files changed, 6 insertions(+), 0 deletions(-) > > > > diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c > > index 3b6981c..6edcd79 100644 > > --- a/hw/pcie_aer.c > > +++ b/hw/pcie_aer.c > > @@ -738,6 +738,12 @@ void pcie_aer_root_init(PCIDevice *dev) > > PCI_ERR_ROOT_CMD_EN_MASK); > > pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, > > PCI_ERR_ROOT_STATUS_REPORT_MASK); > > + /* Bits 31:27 - Advanced Error Interrupt Message Number > > This line is better moved to near definition of PCI_ERR_ROOT_IRQ. > Then here we can say 'PCI_ERR_ROOT_IRQ is RO but devices > change it using a device-specific method.' ok. > > > + * These bits are updated when the number of MSI messages changes. > > + * By clearing the cmask, pcie devices can be migrated. > > + */ > > + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, > > + (1 << PCI_ERR_ROOT_IRQ_SHIFT) - 1); > > ~PCI_ERR_ROOT_IRQ would be clearer I think. > agreed. I will re-spin and post a v2. Thanks, -Jason ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-31 14:45 ` Jason Baron @ 2012-08-31 15:35 ` Michael S. Tsirkin 2012-08-31 15:43 ` Jason Baron 0 siblings, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2012-08-31 15:35 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Fri, Aug 31, 2012 at 10:45:52AM -0400, Jason Baron wrote: > On Fri, Aug 31, 2012 at 11:42:27AM +0300, Michael S. Tsirkin wrote: > > Some minor nits below. If you dont get to it I will tweak this patch > > when I apply it early next week. > > > > On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > > > Error Status Register) is updated when the number of msi messages assigned to a > > > device changes. Migration of windows 7 on q35 chipset failed because the check > > > in get_pci_config_device() fails due to wmask being set on these bits. > > > > I think you actually mean 'not being set on these bits'? > > > > No, the check is: > > static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) > { > PCIDevice *s = container_of(pv, PCIDevice, config); > uint8_t *config; > int i; > > assert(size == pci_config_size(s)); > config = g_malloc(size); > > qemu_get_buffer(f, config, size); > for (i = 0; i < size; ++i) { > if ((config[i] ^ s->config[i]) & > s->cmask[i] & ~s->wmask[i] & ~s->w1cmask[i]) { > g_free(config); > return -EINVAL; > } > } > ..... > > So because cmask is set and these bits differ in config space (due to > them being updated before migration), the migration aborts. Ah so you mean 'cmask being set' - not wmask as you wrote. > > > > Its valid > > > to update these bits and we must restore this state across migration. > > > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > > --- > > > hw/pcie_aer.c | 6 ++++++ > > > 1 files changed, 6 insertions(+), 0 deletions(-) > > > > > > diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c > > > index 3b6981c..6edcd79 100644 > > > --- a/hw/pcie_aer.c > > > +++ b/hw/pcie_aer.c > > > @@ -738,6 +738,12 @@ void pcie_aer_root_init(PCIDevice *dev) > > > PCI_ERR_ROOT_CMD_EN_MASK); > > > pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, > > > PCI_ERR_ROOT_STATUS_REPORT_MASK); > > > + /* Bits 31:27 - Advanced Error Interrupt Message Number > > > > This line is better moved to near definition of PCI_ERR_ROOT_IRQ. > > Then here we can say 'PCI_ERR_ROOT_IRQ is RO but devices > > change it using a device-specific method.' > > ok. > > > > > > + * These bits are updated when the number of MSI messages changes. > > > + * By clearing the cmask, pcie devices can be migrated. > > > + */ > > > + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, > > > + (1 << PCI_ERR_ROOT_IRQ_SHIFT) - 1); > > > > ~PCI_ERR_ROOT_IRQ would be clearer I think. > > > > agreed. > > I will re-spin and post a v2. > > Thanks, > > -Jason ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-31 15:35 ` Michael S. Tsirkin @ 2012-08-31 15:43 ` Jason Baron 2012-09-04 20:22 ` [Qemu-devel] [PATCH 2/2 v2] " Jason Baron 0 siblings, 1 reply; 13+ messages in thread From: Jason Baron @ 2012-08-31 15:43 UTC (permalink / raw) To: Michael S. Tsirkin Cc: yamahata, alex.williamson, quintela, qemu-devel, jan.kiszka On Fri, Aug 31, 2012 at 06:35:13PM +0300, Michael S. Tsirkin wrote: > On Fri, Aug 31, 2012 at 10:45:52AM -0400, Jason Baron wrote: > > On Fri, Aug 31, 2012 at 11:42:27AM +0300, Michael S. Tsirkin wrote: > > > Some minor nits below. If you dont get to it I will tweak this patch > > > when I apply it early next week. > > > > > > On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > > > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > > > > Error Status Register) is updated when the number of msi messages assigned to a > > > > device changes. Migration of windows 7 on q35 chipset failed because the check > > > > in get_pci_config_device() fails due to wmask being set on these bits. > > > > > > I think you actually mean 'not being set on these bits'? > > > > > > > No, the check is: > > > > static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) > > { > > PCIDevice *s = container_of(pv, PCIDevice, config); > > uint8_t *config; > > int i; > > > > assert(size == pci_config_size(s)); > > config = g_malloc(size); > > > > qemu_get_buffer(f, config, size); > > for (i = 0; i < size; ++i) { > > if ((config[i] ^ s->config[i]) & > > s->cmask[i] & ~s->wmask[i] & ~s->w1cmask[i]) { > > g_free(config); > > return -EINVAL; > > } > > } > > ..... > > > > So because cmask is set and these bits differ in config space (due to > > them being updated before migration), the migration aborts. > > Ah so you mean 'cmask being set' - not wmask as you wrote. > Sorry - my mistake, yes I meant 'cmask'. I will fix the changelog in the re-post. Thanks, -Jason ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2 v2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-08-31 15:43 ` Jason Baron @ 2012-09-04 20:22 ` Jason Baron 2012-09-07 6:01 ` Michael S. Tsirkin 0 siblings, 1 reply; 13+ messages in thread From: Jason Baron @ 2012-09-04 20:22 UTC (permalink / raw) To: Michael S. Tsirkin Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Fri, Aug 31, 2012 at 11:43:31AM -0400, Jason Baron wrote: > On Fri, Aug 31, 2012 at 06:35:13PM +0300, Michael S. Tsirkin wrote: > > On Fri, Aug 31, 2012 at 10:45:52AM -0400, Jason Baron wrote: > > > On Fri, Aug 31, 2012 at 11:42:27AM +0300, Michael S. Tsirkin wrote: > > > > Some minor nits below. If you dont get to it I will tweak this patch > > > > when I apply it early next week. > > > > > > > > On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > > > > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > > > > > Error Status Register) is updated when the number of msi messages assigned to a > > > > > device changes. Migration of windows 7 on q35 chipset failed because the check > > > > > in get_pci_config_device() fails due to wmask being set on these bits. > > > > > > > > I think you actually mean 'not being set on these bits'? > > > > > > > > > > No, the check is: > > > > > > static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) > > > { > > > PCIDevice *s = container_of(pv, PCIDevice, config); > > > uint8_t *config; > > > int i; > > > > > > assert(size == pci_config_size(s)); > > > config = g_malloc(size); > > > > > > qemu_get_buffer(f, config, size); > > > for (i = 0; i < size; ++i) { > > > if ((config[i] ^ s->config[i]) & > > > s->cmask[i] & ~s->wmask[i] & ~s->w1cmask[i]) { > > > g_free(config); > > > return -EINVAL; > > > } > > > } > > > ..... > > > > > > So because cmask is set and these bits differ in config space (due to > > > them being updated before migration), the migration aborts. > > > > Ah so you mean 'cmask being set' - not wmask as you wrote. > > > > Sorry - my mistake, yes I meant 'cmask'. I will fix the changelog in the > re-post. > Ok, here's a v2: --- pcie_aer: clear cmask for Advanced Error Interrupt Message Number The Advanced Error Interrupt Message Number (bits 31:27 of the Root Error Status Register) is updated when the number of msi messages assigned to a device changes. Migration of windows 7 on q35 chipset failed because the check in get_pci_config_device() fails due to cmask being set on these bits. Its valid to update these bits and we must restore this state across migration. Signed-off-by: Jason Baron <jbaron@redhat.com> --- v2: - Based on Michael Tsirkin's feedback: -updated changelog 'wmask' -> 'cmask' -Cleaned up comments -Make cmask set more readable hw/pcie_aer.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c index 3b6981c..b04c164 100644 --- a/hw/pcie_aer.c +++ b/hw/pcie_aer.c @@ -738,6 +738,11 @@ void pcie_aer_root_init(PCIDevice *dev) PCI_ERR_ROOT_CMD_EN_MASK); pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, PCI_ERR_ROOT_STATUS_REPORT_MASK); + /* PCI_ERR_ROOT_IRQ is RO but devices change it using a + * device-specific method. + */ + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, + ~PCI_ERR_ROOT_IRQ); } void pcie_aer_root_reset(PCIDevice *dev) -- 1.7.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2 v2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number 2012-09-04 20:22 ` [Qemu-devel] [PATCH 2/2 v2] " Jason Baron @ 2012-09-07 6:01 ` Michael S. Tsirkin 0 siblings, 0 replies; 13+ messages in thread From: Michael S. Tsirkin @ 2012-09-07 6:01 UTC (permalink / raw) To: Jason Baron; +Cc: yamahata, alex.williamson, jan.kiszka, qemu-devel, quintela On Tue, Sep 04, 2012 at 04:22:46PM -0400, Jason Baron wrote: > On Fri, Aug 31, 2012 at 11:43:31AM -0400, Jason Baron wrote: > > On Fri, Aug 31, 2012 at 06:35:13PM +0300, Michael S. Tsirkin wrote: > > > On Fri, Aug 31, 2012 at 10:45:52AM -0400, Jason Baron wrote: > > > > On Fri, Aug 31, 2012 at 11:42:27AM +0300, Michael S. Tsirkin wrote: > > > > > Some minor nits below. If you dont get to it I will tweak this patch > > > > > when I apply it early next week. > > > > > > > > > > On Thu, Aug 30, 2012 at 01:51:15PM -0400, Jason Baron wrote: > > > > > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > > > > > > Error Status Register) is updated when the number of msi messages assigned to a > > > > > > device changes. Migration of windows 7 on q35 chipset failed because the check > > > > > > in get_pci_config_device() fails due to wmask being set on these bits. > > > > > > > > > > I think you actually mean 'not being set on these bits'? > > > > > > > > > > > > > No, the check is: > > > > > > > > static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) > > > > { > > > > PCIDevice *s = container_of(pv, PCIDevice, config); > > > > uint8_t *config; > > > > int i; > > > > > > > > assert(size == pci_config_size(s)); > > > > config = g_malloc(size); > > > > > > > > qemu_get_buffer(f, config, size); > > > > for (i = 0; i < size; ++i) { > > > > if ((config[i] ^ s->config[i]) & > > > > s->cmask[i] & ~s->wmask[i] & ~s->w1cmask[i]) { > > > > g_free(config); > > > > return -EINVAL; > > > > } > > > > } > > > > ..... > > > > > > > > So because cmask is set and these bits differ in config space (due to > > > > them being updated before migration), the migration aborts. > > > > > > Ah so you mean 'cmask being set' - not wmask as you wrote. > > > > > > > Sorry - my mistake, yes I meant 'cmask'. I will fix the changelog in the > > re-post. > > > > Ok, here's a v2: > > --- > Just a small request: in the future please do not put --- before patch please: git am ignores everything after ---. I fixed this up manually. > pcie_aer: clear cmask for Advanced Error Interrupt Message Number > > The Advanced Error Interrupt Message Number (bits 31:27 of the Root > Error Status Register) is updated when the number of msi messages assigned to a > device changes. Migration of windows 7 on q35 chipset failed because the check > in get_pci_config_device() fails due to cmask being set on these bits. Its valid > to update these bits and we must restore this state across migration. > > Signed-off-by: Jason Baron <jbaron@redhat.com> > --- > > v2: > - Based on Michael Tsirkin's feedback: > -updated changelog 'wmask' -> 'cmask' > -Cleaned up comments > -Make cmask set more readable > > hw/pcie_aer.c | 5 +++++ > 1 files changed, 5 insertions(+), 0 deletions(-) > > diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c > index 3b6981c..b04c164 100644 > --- a/hw/pcie_aer.c > +++ b/hw/pcie_aer.c > @@ -738,6 +738,11 @@ void pcie_aer_root_init(PCIDevice *dev) > PCI_ERR_ROOT_CMD_EN_MASK); > pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS, > PCI_ERR_ROOT_STATUS_REPORT_MASK); > + /* PCI_ERR_ROOT_IRQ is RO but devices change it using a > + * device-specific method. > + */ > + pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS, > + ~PCI_ERR_ROOT_IRQ); > } > > void pcie_aer_root_reset(PCIDevice *dev) > -- > 1.7.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2012-09-07 5:59 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-08-30 17:51 [Qemu-devel] [PATCH 0/2] pcie migration fixes Jason Baron 2012-08-30 17:51 ` [Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration Jason Baron 2012-08-31 8:44 ` Michael S. Tsirkin 2012-08-31 14:46 ` Jason Baron 2012-08-31 15:42 ` Michael S. Tsirkin 2012-08-31 8:51 ` Juan Quintela 2012-08-30 17:51 ` [Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number Jason Baron 2012-08-31 8:42 ` Michael S. Tsirkin 2012-08-31 14:45 ` Jason Baron 2012-08-31 15:35 ` Michael S. Tsirkin 2012-08-31 15:43 ` Jason Baron 2012-09-04 20:22 ` [Qemu-devel] [PATCH 2/2 v2] " Jason Baron 2012-09-07 6:01 ` Michael S. Tsirkin
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.