* [PATCH v6] interrupts: allow guest to set/clear MSI-X mask bit
@ 2013-08-15 15:47 Joby Poriyath
  2013-08-16  9:55 ` Jan Beulich
From: Joby Poriyath @ 2013-08-15 15:47 UTC
  To: xen-devel; +Cc: andrew.cooper3, malcolm.crossley, keir, JBeulich

A guest needs to be able to enable and disable MSI-X interrupts
for a passed-through device by setting or clearing the MSI-X mask
bit. The guest is allowed to write the mask bit only if Xen
*thinks* that the mask is clear (interrupts enabled). If the mask
was set by Xen (interrupts disabled), guest writes to the mask bit
are ignored.

Currently, every write to the MSI-X mask bit by the guest is
silently ignored.
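
The write policy described above can be sketched as a small standalone
helper. This is only an illustration under stated assumptions, not code
from the patch: merge_vector_ctrl() and VECTOR_CTRL_MASKBIT are
hypothetical names (the patch itself uses PCI_MSIX_VECTOR_BITMASK), and
xen_masked stands in for Xen's own record of the mask state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the MSI-X vector control word is the per-vector mask bit. */
#define VECTOR_CTRL_MASKBIT 0x1u   /* hypothetical name for this sketch */

/*
 * Hypothetical helper: compute the word to write back to the vector
 * control register.
 *   xen_masked - Xen's own view of the mask (true = masked by Xen)
 *   orig       - value currently read from the hardware register
 *   guest_val  - value the guest tried to write
 */
static uint32_t merge_vector_ctrl(bool xen_masked, uint32_t orig,
                                  uint32_t guest_val)
{
    /* Xen masked the vector itself: the guest write is ignored. */
    if (xen_masked)
        return orig;

    /* Take only the mask bit from the guest, keep the reserved bits. */
    return (guest_val & VECTOR_CTRL_MASKBIT) |
           (orig & ~VECTOR_CTRL_MASKBIT);
}
```

With this shape, a guest that clears the mask after the hardware set it
behind Xen's back gets its write through, while the reserved bits read
from the device are written back unchanged.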

A likely scenario is an 82599 SR-IOV NIC passed through to a
guest. If you run the following in the guest,

  ifconfig <ETH_DEV> down
  ifconfig <ETH_DEV> up

the interrupts remain masked. On VF reset, the mask bit is set
by the controller, and Xen is not aware of this. The VF driver then
tries to enable interrupts by clearing the mask bit, writing directly
to the BAR3 region that contains the MSI-X table, but Xen discards
that write.

From dom0, we can verify that interrupts are being masked using
'xl debug-keys M'.

Initially, the guest was allowed to modify the MSI-X mask bit.
This behaviour was changed later; see changeset
74c213c506afcd74a8556dd092995fd4dc38b225.

Patch revision history
----------------------
v1: Initial patch to allow guest writes to MSI-X control bit
v2: Retained the reserved bits while updating the MSI-X vector control
    word (only 1 bit is defined)
v3: Allow guest writes only when Xen's view of MSI-X control bit is 0
v4: Added a warning if Xen thinks the MSI-X control bit is masked
    when in reality it is not
v5 & v6: Added const-correctness

Signed-off-by: Joby Poriyath <joby.poriyath@citrix.com>
---
 xen/arch/x86/hvm/vmsi.c |   60 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 46 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 36de312..60f010d 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -169,6 +169,7 @@ struct msixtbl_entry
         uint32_t msi_ad[3];	/* Shadow of address low, high and data */
     } gentries[MAX_MSIX_ACC_ENTRIES];
     struct rcu_head rcu;
+    const struct pirq *pirq;
 };
 
 static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
@@ -254,6 +255,8 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
     void *virt;
     unsigned int nr_entry, index;
     int r = X86EMUL_UNHANDLEABLE;
+    unsigned long flags, orig;
+    struct irq_desc *desc;
 
     if ( len != 4 || (address & 3) )
         return r;
@@ -283,22 +286,49 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
     if ( !virt )
         goto out;
 
-    /* Do not allow the mask bit to be changed. */
-#if 0 /* XXX
-       * As the mask bit is the only defined bit in the word, and as the
-       * host MSI-X code doesn't preserve the other bits anyway, doing
-       * this is pointless. So for now just discard the write (also
-       * saving us from having to determine the matching irq_desc).
-       */
-    spin_lock_irqsave(&desc->lock, flags);
+    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
+    if ( !desc )
+        goto out;
+
+    if ( !desc->msi_desc )
+        goto unlock;
+
     orig = readl(virt);
-    val &= ~PCI_MSIX_VECTOR_BITMASK;
-    val |= orig & PCI_MSIX_VECTOR_BITMASK;
+
+    /*
+     * Do not allow guest to modify MSI-X control bit if it is masked 
+     * by Xen. We'll only handle the case where Xen thinks that
+     * bit is unmasked, but hardware has silently masked the bit
+     * (in case of SR-IOV VF reset, etc). On the other hand, if Xen 
+     * thinks that the bit is masked, but it's really not, 
+     * we log a warning.
+     */
+    if ( desc->msi_desc->msi_attrib.masked )
+    {
+        if ( !(orig & PCI_MSIX_VECTOR_BITMASK) )
+            printk(XENLOG_WARNING "MSI-X control bit is unmasked when"
+                   " it is expected to be masked [%04x:%02x:%02x.%01x]\n", 
+                   entry->pdev->seg, entry->pdev->bus,
+                   PCI_SLOT(entry->pdev->devfn), 
+                   PCI_FUNC(entry->pdev->devfn));
+
+        goto unlock;
+    }
+
+    /*
+     * The mask bit is the only defined bit in the word. But we 
+     * ought to preserve the reserved bits. Clearing the reserved 
+     * bits can result in undefined behaviour (see PCI Local Bus
+     * Specification revision 2.3).
+     */
+    val &= PCI_MSIX_VECTOR_BITMASK;
+    val |= (orig & ~PCI_MSIX_VECTOR_BITMASK);
     writel(val, virt);
-    spin_unlock_irqrestore(&desc->lock, flags);
-#endif
 
+unlock:
+    spin_unlock_irqrestore(&desc->lock, flags);
     r = X86EMUL_OKAY;
+
 out:
     rcu_read_unlock(&msixtbl_rcu_lock);
     return r;
@@ -328,7 +358,8 @@ const struct hvm_mmio_handler msixtbl_mmio_handler = {
 static void add_msixtbl_entry(struct domain *d,
                               struct pci_dev *pdev,
                               uint64_t gtable,
-                              struct msixtbl_entry *entry)
+                              struct msixtbl_entry *entry,
+                              const struct pirq *pirq)
 {
     u32 len;
 
@@ -342,6 +373,7 @@ static void add_msixtbl_entry(struct domain *d,
     entry->table_len = len;
     entry->pdev = pdev;
     entry->gtable = (unsigned long) gtable;
+    entry->pirq = pirq;
 
     list_add_rcu(&entry->list, &d->arch.hvm_domain.msixtbl_list);
 }
@@ -404,7 +436,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
 
     entry = new_entry;
     new_entry = NULL;
-    add_msixtbl_entry(d, pdev, gtable, entry);
+    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
 
 found:
     atomic_inc(&entry->refcnt);
-- 
1.7.10.4


* Re: [PATCH v6] interrupts: allow guest to set/clear MSI-X mask bit
  2013-08-15 15:47 [PATCH v6] interrupts: allow guest to set/clear MSI-X mask bit Joby Poriyath
@ 2013-08-16  9:55 ` Jan Beulich
  2013-08-16 11:16   ` Joby Poriyath
From: Jan Beulich @ 2013-08-16  9:55 UTC
  To: Joby Poriyath; +Cc: andrew.cooper3, malcolm.crossley, keir, xen-devel

>>> On 15.08.13 at 17:47, Joby Poriyath <joby.poriyath@citrix.com> wrote:
> @@ -404,7 +436,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
>  
>      entry = new_entry;
>      new_entry = NULL;
> -    add_msixtbl_entry(d, pdev, gtable, entry);
> +    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
>  
>  found:
>      atomic_inc(&entry->refcnt);

Just noticed this "found" label here, which made me go back and
look at the whole function: Did you consider the case of there
already being an entry, and hence add_msixtbl_entry() not
getting called, and thus entry->pirq not getting set to what got
passed in here? I'm assuming that this is only ever the case if
for the entry found entry->pirq == pirq, but if I'm right with
this, adding a respective ASSERT() here would seem desirable.
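
The concern about the "found" path can be illustrated with a small
standalone model. This is a sketch under assumptions, not the Xen code:
struct entry, register_entry() and the integer key are hypothetical
stand-ins (how msixtbl_pt_register() actually matches an existing entry
is not shown in this thread), and the C library assert() stands in for
Xen's ASSERT().

```c
#include <assert.h>
#include <stddef.h>

struct pirq { int nr; };            /* stand-in for Xen's struct pirq */

struct entry {                      /* stand-in for struct msixtbl_entry */
    int in_use;
    unsigned long key;              /* hypothetical lookup key */
    const struct pirq *pirq;
    int refcnt;
};

#define MAX_ENTRIES 4
static struct entry table[MAX_ENTRIES];

/* Hypothetical model of a lookup-or-add registration flow. */
static struct entry *register_entry(unsigned long key,
                                    const struct pirq *pirq)
{
    struct entry *free_slot = NULL;
    size_t i;

    for (i = 0; i < MAX_ENTRIES; i++) {
        if (table[i].in_use && table[i].key == key) {
            /*
             * "found" path: nothing stores the pirq passed in here, so
             * check that the stored one is the same pirq, not merely a
             * valid one.
             */
            assert(table[i].pirq == pirq);
            table[i].refcnt++;
            return &table[i];
        }
        if (!table[i].in_use && free_slot == NULL)
            free_slot = &table[i];
    }

    if (free_slot == NULL)
        return NULL;                /* table full */

    /* "add" path: the only place where entry->pirq gets set. */
    free_slot->in_use = 1;
    free_slot->key = key;
    free_slot->pirq = pirq;
    free_slot->refcnt = 1;
    return free_slot;
}
```

Registering the same key twice with the same pirq bumps the reference
count; registering it with a different pirq trips the assertion, which
is the case the question above is about.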

Jan


* Re: [PATCH v6] interrupts: allow guest to set/clear MSI-X mask bit
  2013-08-16  9:55 ` Jan Beulich
@ 2013-08-16 11:16   ` Joby Poriyath
  2013-08-16 13:10     ` Jan Beulich
From: Joby Poriyath @ 2013-08-16 11:16 UTC
  To: Jan Beulich; +Cc: andrew.cooper3, malcolm.crossley, keir, xen-devel

On Fri, Aug 16, 2013 at 10:55:39AM +0100, Jan Beulich wrote:
> >>> On 15.08.13 at 17:47, Joby Poriyath <joby.poriyath@citrix.com> wrote:
> > @@ -404,7 +436,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
> >  
> >      entry = new_entry;
> >      new_entry = NULL;
> > -    add_msixtbl_entry(d, pdev, gtable, entry);
> > +    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
> >  
> >  found:
> >      atomic_inc(&entry->refcnt);
> 
> Just noticed this "found" label here, which made me go back and
> look at the whole function: Did you consider the case of there
> already being an entry, and hence add_msixtbl_entry() not
> getting called, and thus entry->pirq not getting set to what got
> passed in here? I'm assuming that this is only ever the case if
> for the entry found entry->pirq == pirq, but if I'm right with
> this, adding a respective ASSERT() here would seem desirable.

If there was an entry, that was there only because it was added using 
"add_msixtbl_entry" function, and hence entry->pirq would have been 
initialized with a valid pirq. And modification of msixtbl_list is 
protected with a spinlock on msixtbl_list_lock. So in any case (adding
for the first time, or finding an entry), "entry" would have been 
correctly initialized. So looks like it's safe.

Or, am I getting this wrong? 

Thanks,
Joby


* Re: [PATCH v6] interrupts: allow guest to set/clear MSI-X mask bit
  2013-08-16 11:16   ` Joby Poriyath
@ 2013-08-16 13:10     ` Jan Beulich
From: Jan Beulich @ 2013-08-16 13:10 UTC
  To: Joby Poriyath; +Cc: andrew.cooper3, malcolm.crossley, keir, xen-devel

>>> On 16.08.13 at 13:16, Joby Poriyath <joby.poriyath@citrix.com> wrote:
> On Fri, Aug 16, 2013 at 10:55:39AM +0100, Jan Beulich wrote:
>> >>> On 15.08.13 at 17:47, Joby Poriyath <joby.poriyath@citrix.com> wrote:
>> > @@ -404,7 +436,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
>> >  
>> >      entry = new_entry;
>> >      new_entry = NULL;
>> > -    add_msixtbl_entry(d, pdev, gtable, entry);
>> > +    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
>> >  
>> >  found:
>> >      atomic_inc(&entry->refcnt);
>> 
>> Just noticed this "found" label here, which made me go back and
>> look at the whole function: Did you consider the case of there
>> already being an entry, and hence add_msixtbl_entry() not
>> getting called, and thus entry->pirq not getting set to what got
>> passed in here? I'm assuming that this is only ever the case if
>> for the entry found entry->pirq == pirq, but if I'm right with
>> this, adding a respective ASSERT() here would seem desirable.
> 
> If there was an entry, that was there only because it was added using 
> "add_msixtbl_entry" function, and hence entry->pirq would have been 
> initialized with a valid pirq. And modification of msixtbl_list is 
> protected with a spinlock on msixtbl_list_lock. So in any case (adding
> for the first time, or finding an entry), "entry" would have been 
> correctly initialized. So looks like it's safe.
> 
> Or, am I getting this wrong? 

"a valid pirq" != "the same pirq"

I went through the code there and at the call site, and as said I
think the pirq should be the same, but I couldn't see proof of that.

Jan
