[PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments
@ 2015-03-20 15:52 Jan Beulich
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jan Beulich @ 2015-03-20 15:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser

Due to the (late) point in time when qemu requests the hypervisor to
set up MSI-X interrupts (which is where the MMIO intercept gets put
in place), the hypervisor doesn't see all guest writes, and hence
shouldn't make assumptions on the state the virtual MSI-X resources
are in.

1: honor all mask requests
2: add valid bits for read acceleration

Signed-off-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-20 15:52 [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
@ 2015-03-20 16:27 ` Jan Beulich
  2015-03-23 19:15   ` Konrad Rzeszutek Wilk
                     ` (2 more replies)
  2015-03-20 16:27 ` [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration Jan Beulich
  2015-04-14 10:11 ` Ping: [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
  2 siblings, 3 replies; 12+ messages in thread
From: Jan Beulich @ 2015-03-20 16:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 1627 bytes --]

Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
guests") didn't go far enough: it fixed an issue with unmasking, but
left an issue with masking in place: Due to the (late) point in time
when qemu requests the hypervisor to set up MSI-X interrupts (which is
where the MMIO intercept gets put in place), the hypervisor doesn't
see all guest writes, and hence shouldn't make assumptions on the state
the virtual MSI-X resources are in. Bypassing the rest of the logic on
a guest mask operation leads to

[00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.

which surprisingly enough doesn't lead to the device not working
anymore (I didn't dig in deep enough to figure out why that is). But it
does prevent the IRQ to be migrated inside the guest, i.e. all
interrupts will always arrive in vCPU 0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
         goto out;
     }

-    /* exit to device model if address/data has been modified */
-    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
+    /* Exit to device model when unmasking and address/data got modified. */
+    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
+         test_and_clear_bit(nr_entry, &entry->table_flags) )
     {
-        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
-            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
+        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
         goto out;
     }

[-- Attachment #2: x86-vMSI-X-honor-masking.patch --]
[-- Type: text/plain, Size: 1660 bytes --]

x86/vMSI-X: honor all mask requests

Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
guests") didn't go far enough: it fixed an issue with unmasking, but
left an issue with masking in place: Due to the (late) point in time
when qemu requests the hypervisor to set up MSI-X interrupts (which is
where the MMIO intercept gets put in place), the hypervisor doesn't
see all guest writes, and hence shouldn't make assumptions on the state
the virtual MSI-X resources are in. Bypassing the rest of the logic on
a guest mask operation leads to

[00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.

which surprisingly enough doesn't lead to the device not working
anymore (I didn't dig in deep enough to figure out why that is). But it
does prevent the IRQ to be migrated inside the guest, i.e. all
interrupts will always arrive in vCPU 0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
         goto out;
     }

-    /* exit to device model if address/data has been modified */
-    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
+    /* Exit to device model when unmasking and address/data got modified. */
+    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
+         test_and_clear_bit(nr_entry, &entry->table_flags) )
     {
-        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
-            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
+        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
         goto out;
     }

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration
  2015-03-20 15:52 [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
@ 2015-03-20 16:27 ` Jan Beulich
  2015-03-23 19:11   ` Konrad Rzeszutek Wilk
  2015-04-14 14:37   ` Andrew Cooper
  2015-04-14 10:11 ` Ping: [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
  2 siblings, 2 replies; 12+ messages in thread
From: Jan Beulich @ 2015-03-20 16:27 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 1817 bytes --]

Again because Xen doesn't get to see all guest writes, it shouldn't
serve reads from its cache before having seen a write to the respective
address.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -153,12 +153,15 @@ struct msixtbl_entry
     /* TODO: resolve the potential race by destruction of pdev */
     struct pci_dev *pdev;
     unsigned long gtable;       /* gpa of msix table */
-    unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
+    DECLARE_BITMAP(table_flags, MAX_MSIX_TABLE_ENTRIES);
 #define MAX_MSIX_ACC_ENTRIES 3
     unsigned int table_len;
     struct { 
         uint32_t msi_ad[3];	/* Shadow of address low, high and data */
     } gentries[MAX_MSIX_ACC_ENTRIES];
+    DECLARE_BITMAP(acc_valid, 3 * MAX_MSIX_ACC_ENTRIES);
+#define acc_bit(what, ent, slot, idx) \
+        what##_bit((slot) * 3 + (idx), (ent)->acc_valid)
     struct rcu_head rcu;
 };
 
@@ -233,9 +236,10 @@ static int msixtbl_read(
     if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
     {
         nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
-        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
-            goto out;
         index = offset / sizeof(uint32_t);
+        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ||
+             !acc_bit(test, entry, nr_entry, index) )
+            goto out;
         *pval = entry->gentries[nr_entry].msi_ad[index];
     }
     else 
@@ -281,6 +285,7 @@ static int msixtbl_write(struct vcpu *v,
         {
             index = offset / sizeof(uint32_t);
             entry->gentries[nr_entry].msi_ad[index] = val;
+            acc_bit(set, entry, nr_entry, index);
         }
         set_bit(nr_entry, &entry->table_flags);
         goto out;




[-- Attachment #2: x86-vMSI-X-read-valid.patch --]
[-- Type: text/plain, Size: 1863 bytes --]

x86/vMSI-X: add valid bits for read acceleration

Again because Xen doesn't get to see all guest writes, it shouldn't
serve reads from its cache before having seen a write to the respective
address.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -153,12 +153,15 @@ struct msixtbl_entry
     /* TODO: resolve the potential race by destruction of pdev */
     struct pci_dev *pdev;
     unsigned long gtable;       /* gpa of msix table */
-    unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
+    DECLARE_BITMAP(table_flags, MAX_MSIX_TABLE_ENTRIES);
 #define MAX_MSIX_ACC_ENTRIES 3
     unsigned int table_len;
     struct { 
         uint32_t msi_ad[3];	/* Shadow of address low, high and data */
     } gentries[MAX_MSIX_ACC_ENTRIES];
+    DECLARE_BITMAP(acc_valid, 3 * MAX_MSIX_ACC_ENTRIES);
+#define acc_bit(what, ent, slot, idx) \
+        what##_bit((slot) * 3 + (idx), (ent)->acc_valid)
     struct rcu_head rcu;
 };
 
@@ -233,9 +236,10 @@ static int msixtbl_read(
     if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
     {
         nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
-        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
-            goto out;
         index = offset / sizeof(uint32_t);
+        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ||
+             !acc_bit(test, entry, nr_entry, index) )
+            goto out;
         *pval = entry->gentries[nr_entry].msi_ad[index];
     }
     else 
@@ -281,6 +285,7 @@ static int msixtbl_write(struct vcpu *v,
         {
             index = offset / sizeof(uint32_t);
             entry->gentries[nr_entry].msi_ad[index] = val;
+            acc_bit(set, entry, nr_entry, index);
         }
         set_bit(nr_entry, &entry->table_flags);
         goto out;

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration
  2015-03-20 16:27 ` [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration Jan Beulich
@ 2015-03-23 19:11   ` Konrad Rzeszutek Wilk
  2015-03-24  7:49     ` Jan Beulich
  2015-04-14 14:37   ` Andrew Cooper
  1 sibling, 1 reply; 12+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-23 19:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Andrew Cooper

On Fri, Mar 20, 2015 at 04:27:57PM +0000, Jan Beulich wrote:
> Again because Xen doesn't get to see all guest writes, it shouldn't
> serve reads from its cache before having seen a write to the respective
> address.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> 
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -153,12 +153,15 @@ struct msixtbl_entry
>      /* TODO: resolve the potential race by destruction of pdev */
>      struct pci_dev *pdev;
>      unsigned long gtable;       /* gpa of msix table */
> -    unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
> +    DECLARE_BITMAP(table_flags, MAX_MSIX_TABLE_ENTRIES);

That seems unrelated to this patch? Perhaps mention the cleanup
part in the commit.

>  #define MAX_MSIX_ACC_ENTRIES 3
>      unsigned int table_len;
>      struct { 
>          uint32_t msi_ad[3];	/* Shadow of address low, high and data */
>      } gentries[MAX_MSIX_ACC_ENTRIES];
> +    DECLARE_BITMAP(acc_valid, 3 * MAX_MSIX_ACC_ENTRIES);
> +#define acc_bit(what, ent, slot, idx) \
> +        what##_bit((slot) * 3 + (idx), (ent)->acc_valid)
>      struct rcu_head rcu;
>  };
>  
> @@ -233,9 +236,10 @@ static int msixtbl_read(
>      if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
>      {
>          nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
> -        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
> -            goto out;
>          index = offset / sizeof(uint32_t);
> +        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ||
> +             !acc_bit(test, entry, nr_entry, index) )
> +            goto out;
>          *pval = entry->gentries[nr_entry].msi_ad[index];
>      }
>      else 
> @@ -281,6 +285,7 @@ static int msixtbl_write(struct vcpu *v,
>          {
>              index = offset / sizeof(uint32_t);
>              entry->gentries[nr_entry].msi_ad[index] = val;
> +            acc_bit(set, entry, nr_entry, index);
>          }
>          set_bit(nr_entry, &entry->table_flags);
>          goto out;
> 
> 
> 

> x86/vMSI-X: add valid bits for read acceleration
> 
> Again because Xen doesn't get to see all guest writes, it shouldn't
> serve reads from its cache before having seen a write to the respective
> address.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -153,12 +153,15 @@ struct msixtbl_entry
>      /* TODO: resolve the potential race by destruction of pdev */
>      struct pci_dev *pdev;
>      unsigned long gtable;       /* gpa of msix table */
> -    unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
> +    DECLARE_BITMAP(table_flags, MAX_MSIX_TABLE_ENTRIES);
>  #define MAX_MSIX_ACC_ENTRIES 3
>      unsigned int table_len;
>      struct { 
>          uint32_t msi_ad[3];	/* Shadow of address low, high and data */
>      } gentries[MAX_MSIX_ACC_ENTRIES];
> +    DECLARE_BITMAP(acc_valid, 3 * MAX_MSIX_ACC_ENTRIES);
> +#define acc_bit(what, ent, slot, idx) \
> +        what##_bit((slot) * 3 + (idx), (ent)->acc_valid)
>      struct rcu_head rcu;
>  };
>  
> @@ -233,9 +236,10 @@ static int msixtbl_read(
>      if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
>      {
>          nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
> -        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
> -            goto out;
>          index = offset / sizeof(uint32_t);
> +        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ||
> +             !acc_bit(test, entry, nr_entry, index) )
> +            goto out;
>          *pval = entry->gentries[nr_entry].msi_ad[index];
>      }
>      else 
> @@ -281,6 +285,7 @@ static int msixtbl_write(struct vcpu *v,
>          {
>              index = offset / sizeof(uint32_t);
>              entry->gentries[nr_entry].msi_ad[index] = val;
> +            acc_bit(set, entry, nr_entry, index);
>          }
>          set_bit(nr_entry, &entry->table_flags);
>          goto out;

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
@ 2015-03-23 19:15   ` Konrad Rzeszutek Wilk
  2015-03-25  5:40   ` Wu, Feng
  2015-04-14 14:17   ` Andrew Cooper
  2 siblings, 0 replies; 12+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-23 19:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Andrew Cooper

On Fri, Mar 20, 2015 at 04:27:29PM +0000, Jan Beulich wrote:
> Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
> guests") didn't go far enough: it fixed an issue with unmasking, but
> left an issue with masking in place: Due to the (late) point in time
> when qemu requests the hypervisor to set up MSI-X interrupts (which is
> where the MMIO intercept gets put in place), the hypervisor doesn't
> see all guest writes, and hence shouldn't make assumptions on the state
> the virtual MSI-X resources are in. Bypassing the rest of the logic on
> a guest mask operation leads to
> 
> [00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.
> 
> which surprisingly enough doesn't lead to the device not working
> anymore (I didn't dig in deep enough to figure out why that is). But it
> does prevent the IRQ to be migrated inside the guest, i.e. all
> interrupts will always arrive in vCPU 0.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
>          goto out;
>      }
>  
> -    /* exit to device model if address/data has been modified */
> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> +    /* Exit to device model when unmasking and address/data got modified. */
> +    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> +         test_and_clear_bit(nr_entry, &entry->table_flags) )
>      {
> -        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
> -            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> +        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
>          goto out;
>      }
>  
> 
> 
> 

> x86/vMSI-X: honor all mask requests
> 
> Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
> guests") didn't go far enough: it fixed an issue with unmasking, but
> left an issue with masking in place: Due to the (late) point in time
> when qemu requests the hypervisor to set up MSI-X interrupts (which is
> where the MMIO intercept gets put in place), the hypervisor doesn't
> see all guest writes, and hence shouldn't make assumptions on the state
> the virtual MSI-X resources are in. Bypassing the rest of the logic on
> a guest mask operation leads to
> 
> [00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.
> 
> which surprisingly enough doesn't lead to the device not working
> anymore (I didn't dig in deep enough to figure out why that is). But it
> does prevent the IRQ to be migrated inside the guest, i.e. all
> interrupts will always arrive in vCPU 0.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
>          goto out;
>      }
>  
> -    /* exit to device model if address/data has been modified */
> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> +    /* Exit to device model when unmasking and address/data got modified. */
> +    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> +         test_and_clear_bit(nr_entry, &entry->table_flags) )
>      {
> -        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
> -            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> +        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
>          goto out;
>      }
>  

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration
  2015-03-23 19:11   ` Konrad Rzeszutek Wilk
@ 2015-03-24  7:49     ` Jan Beulich
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2015-03-24  7:49 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, Keir Fraser, xen-devel

>>> On 23.03.15 at 20:11, <konrad.wilk@oracle.com> wrote:
> On Fri, Mar 20, 2015 at 04:27:57PM +0000, Jan Beulich wrote:
>> Again because Xen doesn't get to see all guest writes, it shouldn't
>> serve reads from its cache before having seen a write to the respective
>> address.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
>> 
>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -153,12 +153,15 @@ struct msixtbl_entry
>>      /* TODO: resolve the potential race by destruction of pdev */
>>      struct pci_dev *pdev;
>>      unsigned long gtable;       /* gpa of msix table */
>> -    unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
>> +    DECLARE_BITMAP(table_flags, MAX_MSIX_TABLE_ENTRIES);
> 
> That seems unrelated to this patch? Perhaps mention the cleanup
> part in the commit.

Ah, yes, I meant to but forgot.

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
  2015-03-23 19:15   ` Konrad Rzeszutek Wilk
@ 2015-03-25  5:40   ` Wu, Feng
  2015-03-25  5:49     ` Wu, Feng
  2015-03-25 10:36     ` Jan Beulich
  2015-04-14 14:17   ` Andrew Cooper
  2 siblings, 2 replies; 12+ messages in thread
From: Wu, Feng @ 2015-03-25  5:40 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Keir Fraser, Wu, Feng



> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich
> Sent: Saturday, March 21, 2015 12:27 AM
> To: xen-devel
> Cc: Andrew Cooper; Keir Fraser
> Subject: [Xen-devel] [PATCH 1/2] x86/vMSI-X: honor all mask requests
> 
> Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
> guests") didn't go far enough: it fixed an issue with unmasking, but
> left an issue with masking in place: Due to the (late) point in time
> when qemu requests the hypervisor to set up MSI-X interrupts (which is
> where the MMIO intercept gets put in place), the hypervisor doesn't
> see all guest writes, and hence shouldn't make assumptions on the state
> the virtual MSI-X resources are in. Bypassing the rest of the logic on
> a guest mask operation leads to
> 
> [00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is
> already enabled.
> 
> which surprisingly enough doesn't lead to the device not working
> anymore (I didn't dig in deep enough to figure out why that is). But it
> does prevent the IRQ to be migrated inside the guest, i.e. all
> interrupts will always arrive in vCPU 0.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
>          goto out;
>      }
> 
> -    /* exit to device model if address/data has been modified */
> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> +    /* Exit to device model when unmasking and address/data got modified.
> */
> +    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> +         test_and_clear_bit(nr_entry, &entry->table_flags) )

Is it more accurate as the follows?

If ( (offset == PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) &&
   !(val & PCI_MSIX_VECTOR_BITMASK) &&
   test_and_clear_bit(nr_entry, &entry->table_flags) )

BTW, how can I reproduce this issue?

Thanks,
Feng

>      {
> -        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
> -            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> +        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
>          goto out;
>      }
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-25  5:40   ` Wu, Feng
@ 2015-03-25  5:49     ` Wu, Feng
  2015-03-25 10:36     ` Jan Beulich
  1 sibling, 0 replies; 12+ messages in thread
From: Wu, Feng @ 2015-03-25  5:49 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Keir Fraser, Wu, Feng



> -----Original Message-----
> From: Wu, Feng
> Sent: Wednesday, March 25, 2015 1:41 PM
> To: Jan Beulich; xen-devel
> Cc: Andrew Cooper; Keir Fraser; Wu, Feng
> Subject: RE: [Xen-devel] [PATCH 1/2] x86/vMSI-X: honor all mask requests
> 
> 
> 
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org
> > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich
> > Sent: Saturday, March 21, 2015 12:27 AM
> > To: xen-devel
> > Cc: Andrew Cooper; Keir Fraser
> > Subject: [Xen-devel] [PATCH 1/2] x86/vMSI-X: honor all mask requests
> >
> > Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
> > guests") didn't go far enough: it fixed an issue with unmasking, but
> > left an issue with masking in place: Due to the (late) point in time
> > when qemu requests the hypervisor to set up MSI-X interrupts (which is
> > where the MMIO intercept gets put in place), the hypervisor doesn't
> > see all guest writes, and hence shouldn't make assumptions on the state
> > the virtual MSI-X resources are in. Bypassing the rest of the logic on
> > a guest mask operation leads to
> >
> > [00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is
> > already enabled.
> >
> > which surprisingly enough doesn't lead to the device not working
> > anymore (I didn't dig in deep enough to figure out why that is). But it
> > does prevent the IRQ to be migrated inside the guest, i.e. all
> > interrupts will always arrive in vCPU 0.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >
> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
> >          goto out;
> >      }
> >
> > -    /* exit to device model if address/data has been modified */
> > -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> > +    /* Exit to device model when unmasking and address/data got
> modified.
> > */
> > +    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> > +         test_and_clear_bit(nr_entry, &entry->table_flags) )
> 
> Is it more accurate as the follows?
> 
> If ( (offset == PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) &&
>    !(val & PCI_MSIX_VECTOR_BITMASK) &&
>    test_and_clear_bit(nr_entry, &entry->table_flags) )

Oh, just found "offset == PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET" is
not needed here, it guarantees this when we get here.

Thanks,
Feng

> 
> BTW, how can I reproduce this issue?
> 
> Thanks,
> Feng
> 
> >      {
> > -        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
> > -            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> > +        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> >          goto out;
> >      }
> >
> >
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-25  5:40   ` Wu, Feng
  2015-03-25  5:49     ` Wu, Feng
@ 2015-03-25 10:36     ` Jan Beulich
  1 sibling, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2015-03-25 10:36 UTC (permalink / raw)
  To: Feng Wu; +Cc: Andrew Cooper, Keir Fraser, xen-devel

>>> On 25.03.15 at 06:40, <feng.wu@intel.com> wrote:
> BTW, how can I reproduce this issue?

Just change affinity of an MSI-X IRQ inside the guest - at least that's
all it took for me to see it.

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Ping: [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments
  2015-03-20 15:52 [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
  2015-03-20 16:27 ` [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration Jan Beulich
@ 2015-04-14 10:11 ` Jan Beulich
  2 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2015-04-14 10:11 UTC (permalink / raw)
  To: Andrew Cooper, Keir Fraser; +Cc: xen-devel

>>> On 20.03.15 at 16:52, <JBeulich@suse.com> wrote:

This has been pending for almost 4 weeks now...

Jan

> Due to the (late) point in time when qemu requests the hypervisor to
> set up MSI-X interrupts (which is where the MMIO intercept gets put
> in place), the hypervisor doesn't see all guest writes, and hence
> shouldn't make assumptions on the state the virtual MSI-X resources
> are in.
> 
> 1: honor all mask requests
> 2: add valid bits for read acceleration
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/vMSI-X: honor all mask requests
  2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
  2015-03-23 19:15   ` Konrad Rzeszutek Wilk
  2015-03-25  5:40   ` Wu, Feng
@ 2015-04-14 14:17   ` Andrew Cooper
  2 siblings, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2015-04-14 14:17 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Keir Fraser

On 20/03/15 16:27, Jan Beulich wrote:
> Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
> guests") didn't go far enough: it fixed an issue with unmasking, but
> left an issue with masking in place: Due to the (late) point in time
> when qemu requests the hypervisor to set up MSI-X interrupts (which is
> where the MMIO intercept gets put in place), the hypervisor doesn't
> see all guest writes, and hence shouldn't make assumptions on the state
> the virtual MSI-X resources are in. Bypassing the rest of the logic on
> a guest mask operation leads to
>
> [00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.
>
> which surprisingly enough doesn't lead to the device not working
> anymore (I didn't dig in deep enough to figure out why that is). But it
> does prevent the IRQ to be migrated inside the guest, i.e. all
> interrupts will always arrive in vCPU 0.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

>
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -286,11 +286,11 @@ static int msixtbl_write(struct vcpu *v,
>          goto out;
>      }
>  
> -    /* exit to device model if address/data has been modified */
> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> +    /* Exit to device model when unmasking and address/data got modified. */
> +    if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> +         test_and_clear_bit(nr_entry, &entry->table_flags) )
>      {
> -        if ( !(val & PCI_MSIX_VECTOR_BITMASK) )
> -            v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> +        v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
>          goto out;
>      }
>  
>
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration
  2015-03-20 16:27 ` [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration Jan Beulich
  2015-03-23 19:11   ` Konrad Rzeszutek Wilk
@ 2015-04-14 14:37   ` Andrew Cooper
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2015-04-14 14:37 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Keir Fraser

On 20/03/15 16:27, Jan Beulich wrote:
> Again because Xen doesn't get to see all guest writes, it shouldn't
> serve reads from its cache before having seen a write to the respective
> address.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-04-14 14:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-20 15:52 [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich
2015-03-20 16:27 ` [PATCH 1/2] x86/vMSI-X: honor all mask requests Jan Beulich
2015-03-23 19:15   ` Konrad Rzeszutek Wilk
2015-03-25  5:40   ` Wu, Feng
2015-03-25  5:49     ` Wu, Feng
2015-03-25 10:36     ` Jan Beulich
2015-04-14 14:17   ` Andrew Cooper
2015-03-20 16:27 ` [PATCH 2/2] x86/vMSI-X: add valid bits for read acceleration Jan Beulich
2015-03-23 19:11   ` Konrad Rzeszutek Wilk
2015-03-24  7:49     ` Jan Beulich
2015-04-14 14:37   ` Andrew Cooper
2015-04-14 10:11 ` Ping: [PATCH 0/2] x86/vMSI-X: table read/write emulation adjustments Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.