All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] vpci: add support for SR-IOV capability
@ 2018-06-20 14:42 Roger Pau Monne
  2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
                   ` (9 more replies)
  0 siblings, 10 replies; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

Hello,

The following series enables the usage of the SR-IOV capability by a PVH
Dom0. This allows Dom0 to enable virtual functions and access them as it
would do on bare metal.

No changes are needed in the Dom0 kernel in order to manage the PCIe
SR-IOV capability.

The first 9 patches are preparatory changes in order to support SR-IOV.
Patch 10 actually adds support for the capability.

The series has been tested with a Linux PVH Dom0 and an Intel I350 nic.

Thanks, Roger.

Roger Pau Monne (10):
  vpci: move lock
  vpci/msix: add lock to protect the list of MSIX regions
  vpci: add tear down functions
  vpci/msix: add teardown cleanup
  vpci/msi: add teardown cleanup
  vpci/header: add teardown cleanup
  rangeset: introduce rangeset_merge
  vpci/header: allow multiple map operations
  pci: add vpci hooks for device addition/removal
  vpci/sriov: add support for SR-IOV capability

 tools/tests/vpci/emul.h          |   5 +-
 tools/tests/vpci/main.c          |   4 +-
 xen/arch/arm/xen.lds.S           |   9 +-
 xen/arch/x86/hvm/hvm.c           |   1 +
 xen/arch/x86/hvm/vmsi.c          |   8 +-
 xen/arch/x86/xen.lds.S           |   9 +-
 xen/common/rangeset.c            |  12 ++
 xen/drivers/passthrough/pci.c    |   9 +
 xen/drivers/vpci/Makefile        |   2 +-
 xen/drivers/vpci/header.c        | 108 ++++++++++--
 xen/drivers/vpci/msi.c           |  34 +++-
 xen/drivers/vpci/msix.c          |  68 ++++++--
 xen/drivers/vpci/sriov.c         | 273 +++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c          |  73 ++++++---
 xen/include/asm-x86/hvm/domain.h |   1 +
 xen/include/xen/pci.h            |   1 +
 xen/include/xen/rangeset.h       |   3 +
 xen/include/xen/vpci.h           |  27 ++-
 18 files changed, 563 insertions(+), 84 deletions(-)
 create mode 100644 xen/drivers/vpci/sriov.c

-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/10] vpci: move lock
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 10:19   ` Wei Liu
  2021-11-23 12:20   ` Oleksandr Andrushchenko
  2018-06-20 14:42 ` [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions Roger Pau Monne
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

To the outside of the vpci struct. This way the lock can be used to
check whether vpci is present, and removal can be performed while
holding the lock, in order to make sure there are no accesses to the
contents of the vpci struct. Previously removal could race with
vpci_read for example, since the log was dropped prior to freeing
pdev->vpci.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 tools/tests/vpci/emul.h       |  5 ++--
 tools/tests/vpci/main.c       |  4 +--
 xen/arch/x86/hvm/vmsi.c       |  8 +++---
 xen/drivers/passthrough/pci.c |  1 +
 xen/drivers/vpci/header.c     | 19 +++++++++----
 xen/drivers/vpci/msi.c        | 11 +++++---
 xen/drivers/vpci/msix.c       |  8 +++---
 xen/drivers/vpci/vpci.c       | 51 ++++++++++++++++++++++-------------
 xen/include/xen/pci.h         |  1 +
 xen/include/xen/vpci.h        |  3 +--
 10 files changed, 70 insertions(+), 41 deletions(-)

diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
index 5d47544bf7..d344ef71c9 100644
--- a/tools/tests/vpci/emul.h
+++ b/tools/tests/vpci/emul.h
@@ -44,6 +44,7 @@ struct domain {
 };
 
 struct pci_dev {
+    bool vpci_lock;
     struct vpci *vpci;
 };
 
@@ -53,10 +54,8 @@ struct vcpu
 };
 
 extern const struct vcpu *current;
-extern const struct pci_dev test_pdev;
+extern struct pci_dev test_pdev;
 
-typedef bool spinlock_t;
-#define spin_lock_init(l) (*(l) = false)
 #define spin_lock(l) (*(l) = true)
 #define spin_unlock(l) (*(l) = false)
 
diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
index b9a0a6006b..26c95b08b6 100644
--- a/tools/tests/vpci/main.c
+++ b/tools/tests/vpci/main.c
@@ -23,7 +23,8 @@ static struct vpci vpci;
 
 const static struct domain d;
 
-const struct pci_dev test_pdev = {
+struct pci_dev test_pdev = {
+    .vpci_lock = false,
     .vpci = &vpci,
 };
 
@@ -158,7 +159,6 @@ main(int argc, char **argv)
     int rc;
 
     INIT_LIST_HEAD(&vpci.handlers);
-    spin_lock_init(&vpci.lock);
 
     VPCI_ADD_REG(vpci_read32, vpci_write32, 0, 4, r0);
     VPCI_READ_CHECK(0, 4, r0);
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 3001d5c488..94550cb8c4 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -893,14 +893,14 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
         {
             struct pci_dev *pdev = msix->pdev;
 
-            spin_unlock(&msix->pdev->vpci->lock);
+            spin_unlock(&msix->pdev->vpci_lock);
             process_pending_softirqs();
             /* NB: we assume that pdev cannot go away for an alive domain. */
-            if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
+            if ( !spin_trylock(&pdev->vpci_lock) )
                 return -EBUSY;
-            if ( pdev->vpci->msix != msix )
+            if ( !pdev->vpci || pdev->vpci->msix != msix )
             {
-                spin_unlock(&pdev->vpci->lock);
+                spin_unlock(&pdev->vpci_lock);
                 return -EAGAIN;
             }
         }
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4890a4295..a5d59b83b7 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -315,6 +315,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
     *((u8*) &pdev->devfn) = devfn;
     pdev->domain = NULL;
     INIT_LIST_HEAD(&pdev->msi_list);
+    spin_lock_init(&pdev->vpci_lock);
 
     if ( pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
                              PCI_CAP_ID_MSIX) )
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 0ec4c082a6..9d5607d5f8 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -131,11 +131,12 @@ bool vpci_process_pending(struct vcpu *v)
         if ( rc == -ERESTART )
             return true;
 
-        spin_lock(&v->vpci.pdev->vpci->lock);
-        /* Disable memory decoding unconditionally on failure. */
-        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
-                        !rc && v->vpci.rom_only);
-        spin_unlock(&v->vpci.pdev->vpci->lock);
+        spin_lock(&v->vpci.pdev->vpci_lock);
+        if ( v->vpci.pdev->vpci )
+            /* Disable memory decoding unconditionally on failure. */
+            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
+                            !rc && v->vpci.rom_only);
+        spin_unlock(&v->vpci.pdev->vpci_lock);
 
         rangeset_destroy(v->vpci.mem);
         v->vpci.mem = NULL;
@@ -267,6 +268,12 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
                 continue;
         }
 
+        spin_lock(&tmp->vpci_lock);
+        if ( !tmp->vpci )
+        {
+            spin_unlock(&tmp->vpci_lock);
+            continue;
+        }
         for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
         {
             const struct vpci_bar *bar = &tmp->vpci->header.bars[i];
@@ -285,12 +292,14 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
             rc = rangeset_remove_range(mem, start, end);
             if ( rc )
             {
+                spin_unlock(&tmp->vpci_lock);
                 printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
                        start, end, rc);
                 rangeset_destroy(mem);
                 return rc;
             }
         }
+        spin_unlock(&tmp->vpci_lock);
     }
 
     ASSERT(dev);
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 8f15ad7bf2..108e871d1c 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -275,7 +275,7 @@ void vpci_dump_msi(void)
     rcu_read_lock(&domlist_read_lock);
     for_each_domain ( d )
     {
-        const struct pci_dev *pdev;
+        struct pci_dev *pdev;
 
         if ( !has_vpci(d) )
             continue;
@@ -287,8 +287,13 @@ void vpci_dump_msi(void)
             const struct vpci_msi *msi;
             const struct vpci_msix *msix;
 
-            if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
+            if ( !spin_trylock(&pdev->vpci_lock) )
                 continue;
+            if ( !pdev->vpci )
+            {
+                spin_unlock(&pdev->vpci_lock);
+                continue;
+            }
 
             msi = pdev->vpci->msi;
             if ( msi && msi->enabled )
@@ -330,7 +335,7 @@ void vpci_dump_msi(void)
                 }
             }
 
-            spin_unlock(&pdev->vpci->lock);
+            spin_unlock(&pdev->vpci_lock);
             process_pending_softirqs();
         }
     }
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index bcf63256f6..e28096329d 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -236,7 +236,7 @@ static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
         return X86EMUL_OKAY;
     }
 
-    spin_lock(&msix->pdev->vpci->lock);
+    spin_lock(&msix->pdev->vpci_lock);
     entry = get_entry(msix, addr);
     offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
 
@@ -265,7 +265,7 @@ static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
         ASSERT_UNREACHABLE();
         break;
     }
-    spin_unlock(&msix->pdev->vpci->lock);
+    spin_unlock(&msix->pdev->vpci_lock);
 
     return X86EMUL_OKAY;
 }
@@ -308,7 +308,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
         return X86EMUL_OKAY;
     }
 
-    spin_lock(&msix->pdev->vpci->lock);
+    spin_lock(&msix->pdev->vpci_lock);
     entry = get_entry(msix, addr);
     offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
 
@@ -384,7 +384,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
         ASSERT_UNREACHABLE();
         break;
     }
-    spin_unlock(&msix->pdev->vpci->lock);
+    spin_unlock(&msix->pdev->vpci_lock);
 
     return X86EMUL_OKAY;
 }
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 82607bdb9a..7d52bcf8d0 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -35,9 +35,8 @@ extern vpci_register_init_t *const __start_vpci_array[];
 extern vpci_register_init_t *const __end_vpci_array[];
 #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
 
-void vpci_remove_device(struct pci_dev *pdev)
+static void vpci_remove_device_locked(struct pci_dev *pdev)
 {
-    spin_lock(&pdev->vpci->lock);
     while ( !list_empty(&pdev->vpci->handlers) )
     {
         struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
@@ -47,13 +46,20 @@ void vpci_remove_device(struct pci_dev *pdev)
         list_del(&r->node);
         xfree(r);
     }
-    spin_unlock(&pdev->vpci->lock);
     xfree(pdev->vpci->msix);
     xfree(pdev->vpci->msi);
     xfree(pdev->vpci);
     pdev->vpci = NULL;
 }
 
+void vpci_remove_device(struct pci_dev *pdev)
+{
+    spin_lock(&pdev->vpci_lock);
+    vpci_remove_device_locked(pdev);
+    spin_unlock(&pdev->vpci_lock);
+}
+
+
 int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
 {
     unsigned int i;
@@ -62,12 +68,15 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
     if ( !has_vpci(pdev->domain) )
         return 0;
 
+    spin_lock(&pdev->vpci_lock);
     pdev->vpci = xzalloc(struct vpci);
     if ( !pdev->vpci )
+    {
+        spin_unlock(&pdev->vpci_lock);
         return -ENOMEM;
+    }
 
     INIT_LIST_HEAD(&pdev->vpci->handlers);
-    spin_lock_init(&pdev->vpci->lock);
 
     for ( i = 0; i < NUM_VPCI_INIT; i++ )
     {
@@ -77,7 +86,8 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
     }
 
     if ( rc )
-        vpci_remove_device(pdev);
+        vpci_remove_device_locked(pdev);
+    spin_unlock(&pdev->vpci_lock);
 
     return rc;
 }
@@ -148,8 +158,6 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
     r->offset = offset;
     r->private = data;
 
-    spin_lock(&vpci->lock);
-
     /* The list of handlers must be kept sorted at all times. */
     list_for_each ( prev, &vpci->handlers )
     {
@@ -161,14 +169,12 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
             break;
         if ( cmp == 0 )
         {
-            spin_unlock(&vpci->lock);
             xfree(r);
             return -EEXIST;
         }
     }
 
     list_add_tail(&r->node, prev);
-    spin_unlock(&vpci->lock);
 
     return 0;
 }
@@ -179,7 +185,6 @@ int vpci_remove_register(struct vpci *vpci, unsigned int offset,
     const struct vpci_register r = { .offset = offset, .size = size };
     struct vpci_register *rm;
 
-    spin_lock(&vpci->lock);
     list_for_each_entry ( rm, &vpci->handlers, node )
     {
         int cmp = vpci_register_cmp(&r, rm);
@@ -191,14 +196,12 @@ int vpci_remove_register(struct vpci *vpci, unsigned int offset,
         if ( !cmp && rm->offset == offset && rm->size == size )
         {
             list_del(&rm->node);
-            spin_unlock(&vpci->lock);
             xfree(rm);
             return 0;
         }
         if ( cmp <= 0 )
             break;
     }
-    spin_unlock(&vpci->lock);
 
     return -ENOENT;
 }
@@ -315,7 +318,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
 uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
 {
     const struct domain *d = current->domain;
-    const struct pci_dev *pdev;
+    struct pci_dev *pdev;
     const struct vpci_register *r;
     unsigned int data_offset = 0;
     uint32_t data = ~(uint32_t)0;
@@ -331,7 +334,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
     if ( !pdev )
         return vpci_read_hw(sbdf, reg, size);
 
-    spin_lock(&pdev->vpci->lock);
+    spin_lock(&pdev->vpci_lock);
+    if ( !pdev->vpci )
+    {
+        spin_unlock(&pdev->vpci_lock);
+        return vpci_read_hw(sbdf, reg, size);
+    }
 
     /* Read from the hardware or the emulated register handlers. */
     list_for_each_entry ( r, &pdev->vpci->handlers, node )
@@ -383,7 +391,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
 
         data = merge_result(data, tmp_data, size - data_offset, data_offset);
     }
-    spin_unlock(&pdev->vpci->lock);
+    spin_unlock(&pdev->vpci_lock);
 
     return data & (0xffffffff >> (32 - 8 * size));
 }
@@ -418,7 +426,7 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
                 uint32_t data)
 {
     const struct domain *d = current->domain;
-    const struct pci_dev *pdev;
+    struct pci_dev *pdev;
     const struct vpci_register *r;
     unsigned int data_offset = 0;
 
@@ -439,7 +447,14 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
         return;
     }
 
-    spin_lock(&pdev->vpci->lock);
+    spin_lock(&pdev->vpci_lock);
+    if ( !pdev->vpci )
+    {
+        spin_unlock(&pdev->vpci_lock);
+        vpci_write_hw(sbdf, reg, size, data);
+        return;
+    }
+
 
     /* Write the value to the hardware or emulated registers. */
     list_for_each_entry ( r, &pdev->vpci->handlers, node )
@@ -480,7 +495,7 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
         vpci_write_hw(sbdf, reg + data_offset, size - data_offset,
                       data >> (data_offset * 8));
 
-    spin_unlock(&pdev->vpci->lock);
+    spin_unlock(&pdev->vpci_lock);
 }
 
 /*
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 4cfa774615..e554c14b2f 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -114,6 +114,7 @@ struct pci_dev {
     u64 vf_rlen[6];
 
     /* Data for vPCI. */
+    spinlock_t vpci_lock;
     struct vpci *vpci;
 };
 
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index af2b8580ee..98556d31ed 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -29,7 +29,7 @@ int __must_check vpci_add_handlers(struct pci_dev *dev);
 /* Remove all handlers and free vpci related structures. */
 void vpci_remove_device(struct pci_dev *pdev);
 
-/* Add/remove a register handler. */
+/* Add/remove a register handler. Must be called holding the vpci_lock. */
 int __must_check vpci_add_register(struct vpci *vpci,
                                    vpci_read_t *read_handler,
                                    vpci_write_t *write_handler,
@@ -58,7 +58,6 @@ bool __must_check vpci_process_pending(struct vcpu *v);
 struct vpci {
     /* List of vPCI handlers for a device. */
     struct list_head handlers;
-    spinlock_t lock;
 
 #ifdef __XEN__
     /* Hide the rest of the vpci struct from the user-space test harness. */
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
  2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 10:29   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 03/10] vpci: add tear down functions Roger Pau Monne
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

This is required in order to allow run-time removal of MSI-X regions.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/hvm/hvm.c           |  1 +
 xen/drivers/vpci/msix.c          | 17 +++++++++++------
 xen/include/asm-x86/hvm/domain.h |  1 +
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 93092d2bb8..d29c824353 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -587,6 +587,7 @@ int hvm_domain_initialise(struct domain *d)
     spin_lock_init(&d->arch.hvm_domain.uc_lock);
     spin_lock_init(&d->arch.hvm_domain.write_map.lock);
     rwlock_init(&d->arch.hvm_domain.mmcfg_lock);
+    rwlock_init(&d->arch.hvm_domain.msix_lock);
     INIT_LIST_HEAD(&d->arch.hvm_domain.write_map.list);
     INIT_LIST_HEAD(&d->arch.hvm_domain.g2m_ioport_list);
     INIT_LIST_HEAD(&d->arch.hvm_domain.mmcfg_regions);
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index e28096329d..0b43f5eab9 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -148,10 +148,11 @@ static void control_write(const struct pci_dev *pdev, unsigned int reg,
         pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, val);
 }
 
-static struct vpci_msix *msix_find(const struct domain *d, unsigned long addr)
+static struct vpci_msix *msix_find(struct domain *d, unsigned long addr)
 {
     struct vpci_msix *msix;
 
+    read_lock(&d->arch.hvm_domain.msix_lock);
     list_for_each_entry ( msix, &d->arch.hvm_domain.msix_tables, next )
     {
         const struct vpci_bar *bars = msix->pdev->vpci->header.bars;
@@ -160,8 +161,12 @@ static struct vpci_msix *msix_find(const struct domain *d, unsigned long addr)
         for ( i = 0; i < ARRAY_SIZE(msix->tables); i++ )
             if ( bars[msix->tables[i] & PCI_MSIX_BIRMASK].enabled &&
                  VMSIX_ADDR_IN_RANGE(addr, msix->pdev->vpci, i) )
+            {
+                read_unlock(&d->arch.hvm_domain.msix_lock);
                 return msix;
+            }
     }
+    read_unlock(&d->arch.hvm_domain.msix_lock);
 
     return NULL;
 }
@@ -196,8 +201,7 @@ static struct vpci_msix_entry *get_entry(struct vpci_msix *msix,
 static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
                      unsigned long *data)
 {
-    const struct domain *d = v->domain;
-    struct vpci_msix *msix = msix_find(d, addr);
+    struct vpci_msix *msix = msix_find(v->domain, addr);
     const struct vpci_msix_entry *entry;
     unsigned int offset;
 
@@ -273,8 +277,7 @@ static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
 static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
                       unsigned long data)
 {
-    const struct domain *d = v->domain;
-    struct vpci_msix *msix = msix_find(d, addr);
+    struct vpci_msix *msix = msix_find(v->domain, addr);
     struct vpci_msix_entry *entry;
     unsigned int offset;
 
@@ -287,7 +290,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
     if ( VMSIX_ADDR_IN_RANGE(addr, msix->pdev->vpci, VPCI_MSIX_PBA) )
     {
         /* Ignore writes to PBA for DomUs, it's behavior is undefined. */
-        if ( is_hardware_domain(d) )
+        if ( is_hardware_domain(v->domain) )
         {
             switch ( len )
             {
@@ -441,7 +444,9 @@ static int init_msix(struct pci_dev *pdev)
     if ( list_empty(&d->arch.hvm_domain.msix_tables) )
         register_mmio_handler(d, &vpci_msix_table_ops);
 
+    write_lock(&d->arch.hvm_domain.msix_lock);
     list_add(&pdev->vpci->msix->next, &d->arch.hvm_domain.msix_tables);
+    write_unlock(&d->arch.hvm_domain.msix_lock);
 
     return 0;
 }
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 588595059d..181f6a2704 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -187,6 +187,7 @@ struct hvm_domain {
 
     /* List of MSI-X tables. */
     struct list_head msix_tables;
+    rwlock_t msix_lock;
 
     /* List of permanently write-mapped pages. */
     struct {
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/10] vpci: add tear down functions
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
  2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
  2018-06-20 14:42 ` [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 10:37   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 04/10] vpci/msix: add teardown cleanup Roger Pau Monne
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

Tear down functions are not mandatory. Note that this patch just
implements the framework, but doesn't implement any tear down function
yet.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/arm/xen.lds.S    |  9 +--------
 xen/arch/x86/xen.lds.S    |  9 +--------
 xen/drivers/vpci/header.c |  2 +-
 xen/drivers/vpci/msi.c    |  2 +-
 xen/drivers/vpci/msix.c   |  2 +-
 xen/drivers/vpci/vpci.c   | 20 +++++++++++++++++---
 xen/include/xen/vpci.h    | 15 +++++++++++----
 7 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 245a0e0e85..2c6a09c59d 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -66,7 +66,7 @@ SECTIONS
        *(.data.param)
        __param_end = .;
 
-#if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
+#if defined(CONFIG_HAS_VPCI)
        . = ALIGN(POINTER_ALIGN);
        __start_vpci_array = .;
        *(SORT(.data.vpci.*))
@@ -178,13 +178,6 @@ SECTIONS
        *(.init_array)
        *(SORT(.init_array.*))
        __ctors_end = .;
-
-#if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
-       . = ALIGN(POINTER_ALIGN);
-       __start_vpci_array = .;
-       *(SORT(.data.vpci.*))
-       __end_vpci_array = .;
-#endif
   } :text
   __init_end_efi = .;
   . = ALIGN(STACK_SIZE);
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 70afedd31d..6b81ae9ce4 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -136,7 +136,7 @@ SECTIONS
        *(.data.param)
        __param_end = .;
 
-#if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
+#if defined(CONFIG_HAS_VPCI)
        . = ALIGN(POINTER_ALIGN);
        __start_vpci_array = .;
        *(SORT(.data.vpci.*))
@@ -242,13 +242,6 @@ SECTIONS
        *(.init_array)
        *(SORT(.init_array.*))
        __ctors_end = .;
-
-#if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
-       . = ALIGN(POINTER_ALIGN);
-       __start_vpci_array = .;
-       *(SORT(.data.vpci.*))
-       __end_vpci_array = .;
-#endif
   } :text
 
   . = ALIGN(SECTION_ALIGN);
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9d5607d5f8..4363270a55 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -560,7 +560,7 @@ static int init_bars(struct pci_dev *pdev)
 
     return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
 }
-REGISTER_VPCI_INIT(init_bars, VPCI_PRIORITY_MIDDLE);
+REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 108e871d1c..5bb505c864 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -266,7 +266,7 @@ static int init_msi(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
+REGISTER_VPCI_INIT(init_msi, NULL, VPCI_PRIORITY_LOW);
 
 void vpci_dump_msi(void)
 {
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 0b43f5eab9..6132f576b6 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -450,7 +450,7 @@ static int init_msix(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
+REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 7d52bcf8d0..b3968d6523 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -31,12 +31,26 @@ struct vpci_register {
 };
 
 #ifdef __XEN__
-extern vpci_register_init_t *const __start_vpci_array[];
-extern vpci_register_init_t *const __end_vpci_array[];
+extern const struct vpci_handler __start_vpci_array[];
+extern const struct vpci_handler __end_vpci_array[];
 #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
 
 static void vpci_remove_device_locked(struct pci_dev *pdev)
 {
+    unsigned int i;
+
+    if ( !pdev->vpci )
+        return;
+
+    for ( i = 0; i < NUM_VPCI_INIT; i++ )
+    {
+        vpci_teardown_t *teardown =
+            __start_vpci_array[NUM_VPCI_INIT - i - 1].teardown;
+
+        if ( !teardown )
+            continue;
+        teardown(pdev);
+    }
     while ( !list_empty(&pdev->vpci->handlers) )
     {
         struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
@@ -80,7 +94,7 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
 
     for ( i = 0; i < NUM_VPCI_INIT; i++ )
     {
-        rc = __start_vpci_array[i](pdev);
+        rc = __start_vpci_array[i].init(pdev);
         if ( rc )
             break;
     }
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 98556d31ed..208667227e 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -13,15 +13,22 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
 typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
                           uint32_t val, void *data);
 
-typedef int vpci_register_init_t(struct pci_dev *dev);
+typedef int vpci_init_t(struct pci_dev *dev);
+typedef void vpci_teardown_t(struct pci_dev *dev);
+
+struct vpci_handler {
+    vpci_init_t *init;
+    vpci_teardown_t *teardown;
+};
 
 #define VPCI_PRIORITY_HIGH      "1"
 #define VPCI_PRIORITY_MIDDLE    "5"
 #define VPCI_PRIORITY_LOW       "9"
 
-#define REGISTER_VPCI_INIT(x, p)                \
-  static vpci_register_init_t *const x##_entry  \
-               __used_section(".data.vpci." p) = x
+#define REGISTER_VPCI_INIT(i, t, p)                                     \
+  const static struct vpci_handler i ## t ## _entry                     \
+               __used_section(".data.vpci." p) = { .init = (i),         \
+                                                   .teardown = (t), }
 
 /* Add vPCI handlers to device. */
 int __must_check vpci_add_handlers(struct pci_dev *dev);
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (2 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 03/10] vpci: add tear down functions Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 10:52   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 05/10] vpci/msi: " Roger Pau Monne
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

So that interrupts are properly freed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 6132f576b6..cfca1cd43a 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
+
+static void teardown_msix(struct pci_dev *pdev)
+{
+    struct vpci_msix *msix = pdev->vpci->msix;
+    unsigned int i;
+
+    if ( !msix )
+        return;
+
+    if ( msix->enabled )
+    {
+        /* Disable MSIX. */
+        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
+                                               PCI_SLOT(pdev->devfn),
+                                               PCI_FUNC(pdev->devfn),
+                                               PCI_CAP_ID_MSIX);
+        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
+                                           PCI_SLOT(pdev->devfn),
+                                           PCI_FUNC(pdev->devfn),
+                                           msix_control_reg(pos));
+
+        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
+                         (control & ~PCI_MSIX_FLAGS_ENABLE));
+    }
+
+    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
+    list_del(&pdev->vpci->msix->next);
+    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
+
+    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )
+    {
+        int rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
+
+        if ( rc && rc != -ENOENT )
+            gprintk(XENLOG_WARNING,
+                    "%04x:%02x:%02x.%u: unable to disable MSIX entry %u: %d\n",
+                    pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                    PCI_FUNC(pdev->devfn), i, rc);
+    }
+}
+REGISTER_VPCI_INIT(init_msix, teardown_msix, VPCI_PRIORITY_HIGH);
 
 /*
  * Local variables:
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/10] vpci/msi: add teardown cleanup
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (3 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 04/10] vpci/msix: add teardown cleanup Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 10:52   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 06/10] vpci/header: " Roger Pau Monne
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

So that interrupts are properly freed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/msi.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 5bb505c864..e8cd1238df 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -266,7 +266,28 @@ static int init_msi(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_msi, NULL, VPCI_PRIORITY_LOW);
+
+static void teardown_msi(struct pci_dev *pdev)
+{
+    unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
+                                           PCI_SLOT(pdev->devfn),
+                                           PCI_FUNC(pdev->devfn),
+                                           PCI_CAP_ID_MSI);
+    struct vpci_msi *msi = pdev->vpci->msi;
+    uint16_t control;
+
+    if ( !msi || !msi->enabled )
+        return;
+
+    control = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                              PCI_FUNC(pdev->devfn), msi_control_reg(pos));
+    pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                     PCI_FUNC(pdev->devfn), msi_control_reg(pos),
+                     (control & ~PCI_MSI_FLAGS_ENABLE));
+
+    vpci_msi_arch_disable(msi, pdev);
+}
+REGISTER_VPCI_INIT(init_msi, teardown_msi, VPCI_PRIORITY_LOW);
 
 void vpci_dump_msi(void)
 {
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/10] vpci/header: add teardown cleanup
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (4 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 05/10] vpci/msi: " Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 11:15   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 07/10] rangeset: introduce rangeset_merge Roger Pau Monne
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

In order to unmap the BARs

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/header.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 4363270a55..686e04e35a 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -131,12 +131,15 @@ bool vpci_process_pending(struct vcpu *v)
         if ( rc == -ERESTART )
             return true;
 
-        spin_lock(&v->vpci.pdev->vpci_lock);
-        if ( v->vpci.pdev->vpci )
-            /* Disable memory decoding unconditionally on failure. */
-            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
-                            !rc && v->vpci.rom_only);
-        spin_unlock(&v->vpci.pdev->vpci_lock);
+        if ( v->vpci.pdev )
+        {
+            spin_lock(&v->vpci.pdev->vpci_lock);
+            if ( v->vpci.pdev->vpci )
+                /* Disable memory decoding unconditionally on failure. */
+                modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
+                                !rc && v->vpci.rom_only);
+            spin_unlock(&v->vpci.pdev->vpci_lock);
+        }
 
         rangeset_destroy(v->vpci.mem);
         v->vpci.mem = NULL;
@@ -560,7 +563,20 @@ static int init_bars(struct pci_dev *pdev)
 
     return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
 }
-REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
+
+static void teardown_bars(struct pci_dev *pdev)
+{
+    uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                                   PCI_FUNC(pdev->devfn), PCI_COMMAND);
+
+    if ( cmd & PCI_COMMAND_MEMORY )
+    {
+        /* Unmap all BARs from guest p2m. */
+        modify_bars(pdev, false, false);
+        current->vpci.pdev = NULL;
+    }
+}
+REGISTER_VPCI_INIT(init_bars, teardown_bars, VPCI_PRIORITY_MIDDLE);
 
 /*
  * Local variables:
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/10] rangeset: introduce rangeset_merge
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (5 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 06/10] vpci/header: " Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 11:23   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 08/10] vpci/header: allow multiple map operations Roger Pau Monne
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

This new helper will merge two rangesets.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/rangeset.c      | 12 ++++++++++++
 xen/include/xen/rangeset.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/xen/common/rangeset.c b/xen/common/rangeset.c
index bb68ce62e4..195347669e 100644
--- a/xen/common/rangeset.c
+++ b/xen/common/rangeset.c
@@ -378,6 +378,18 @@ int rangeset_consume_ranges(struct rangeset *r,
     return rc;
 }
 
+static int merge(unsigned long s, unsigned long e, void *data)
+{
+    struct rangeset *r = data;
+
+    return rangeset_add_range(r, s, e);
+}
+
+int rangeset_merge(struct rangeset *r1, struct rangeset *r2)
+{
+    return rangeset_report_ranges(r2, 0, ~0ul, merge, r1);
+}
+
 int rangeset_add_singleton(
     struct rangeset *r, unsigned long s)
 {
diff --git a/xen/include/xen/rangeset.h b/xen/include/xen/rangeset.h
index 583b72bb0c..0c05c2fd4e 100644
--- a/xen/include/xen/rangeset.h
+++ b/xen/include/xen/rangeset.h
@@ -80,6 +80,9 @@ int rangeset_consume_ranges(struct rangeset *r,
                                       void *, unsigned long *c),
                             void *ctxt);
 
+/* Merge rangeset r2 into rangeset r1. */
+int __must_check rangeset_merge(struct rangeset *r1, struct rangeset *r2);
+
 /* Add/remove/query a single number. */
 int __must_check rangeset_add_singleton(
     struct rangeset *r, unsigned long s);
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/10] vpci/header: allow multiple map operations
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (6 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 07/10] rangeset: introduce rangeset_merge Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 14:44   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 09/10] pci: add vpci hooks for device addition/removal Roger Pau Monne
  2018-06-20 14:42 ` [PATCH 10/10] vpci/sriov: add support for SR-IOV capability Roger Pau Monne
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

To be queued in vpci_vcpu. This will be required for SR-IOV support,
which uses a single control register bit to toggle memory decoding for
all the virtual functions.

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/header.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 686e04e35a..d2188da7f0 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -184,7 +184,19 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
      * started for the same device if the domain is not well-behaved.
      */
     curr->vpci.pdev = pdev;
-    curr->vpci.mem = mem;
+    if ( !curr->vpci.mem )
+        curr->vpci.mem = mem;
+    else
+    {
+        int rc = rangeset_merge(curr->vpci.mem, mem);
+
+        if ( rc )
+            gprintk(XENLOG_WARNING,
+                    "%04x:%02x:%02x.%u: unable to %smap memory region: %d\n",
+                    pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                    PCI_FUNC(pdev->devfn), map ? "" : "un", rc);
+        rangeset_destroy(mem);
+    }
     curr->vpci.map = map;
     curr->vpci.rom_only = rom_only;
 }
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/10] pci: add vpci hooks for device addition/removal
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (7 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 08/10] vpci/header: allow multiple map operations Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 14:50   ` Wei Liu
  2018-06-20 14:42 ` [PATCH 10/10] vpci/sriov: add support for SR-IOV capability Roger Pau Monne
  9 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

So that pci_{add/remove}_device work correctly with vpci. Note that
this requires moving vpci_add_handlers out of the init section.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/passthrough/pci.c | 8 ++++++++
 xen/drivers/vpci/vpci.c       | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index a5d59b83b7..a712db0294 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -768,6 +768,13 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
             goto out;
         }
 
+        ret = vpci_add_handlers(pdev);
+        if ( ret )
+        {
+            pdev->domain = NULL;
+            goto out;
+        }
+
         list_add(&pdev->domain_list, &hardware_domain->arch.pdev_list);
     }
     else
@@ -812,6 +819,7 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
     list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
         if ( pdev->bus == bus && pdev->devfn == devfn )
         {
+            vpci_remove_device(pdev);
             ret = iommu_remove_device(pdev);
             if ( pdev->domain )
                 list_del(&pdev->domain_list);
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index b3968d6523..9939702e50 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -74,7 +74,7 @@ void vpci_remove_device(struct pci_dev *pdev)
 }
 
 
-int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
+int vpci_add_handlers(struct pci_dev *pdev)
 {
     unsigned int i;
     int rc = 0;
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
                   ` (8 preceding siblings ...)
  2018-06-20 14:42 ` [PATCH 09/10] pci: add vpci hooks for device addition/removal Roger Pau Monne
@ 2018-06-20 14:42 ` Roger Pau Monne
  2018-06-29 15:27   ` Wei Liu
  2018-07-03  9:27   ` Wei Liu
  9 siblings, 2 replies; 43+ messages in thread
From: Roger Pau Monne @ 2018-06-20 14:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich,
	Roger Pau Monne

So that a PCI device that supports SR-IOV (PF) can enable the capability
and use the virtual functions.

This code is expected to only be used by privileged domains,
unprivileged domains should not get access to the SR-IOV capability.

The current code detects enabling of the virtual functions feature and
automatically adds the VFs to the domain. It also detects enabling of
memory space and maps the VFs BARs into the domain p2m. Disabling of
the VF enable bit removes the devices and the BAR memory map from the
domain p2m.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Tested with a Linux PVH Dom0 with Intel I350 nic.
---
 xen/drivers/vpci/Makefile |   2 +-
 xen/drivers/vpci/header.c |  67 ++++++++--
 xen/drivers/vpci/sriov.c  | 273 ++++++++++++++++++++++++++++++++++++++
 xen/include/xen/vpci.h    |   9 +-
 4 files changed, 337 insertions(+), 14 deletions(-)
 create mode 100644 xen/drivers/vpci/sriov.c

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 55d1bdfda0..6274f60e74 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1 +1 @@
-obj-y += vpci.o header.o msi.o msix.o
+obj-y += vpci.o header.o msi.o msix.o sriov.o
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index d2188da7f0..24fb469df5 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -183,7 +183,19 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
      * is mapped. This can lead to parallel mapping operations being
      * started for the same device if the domain is not well-behaved.
      */
-    curr->vpci.pdev = pdev;
+    if ( !pdev->info.is_virtfn )
+        curr->vpci.pdev = pdev;
+    else
+    {
+        unsigned int i;
+        /*
+         * Set the BARs as enabled now, for VF the memory decoding is not
+         * controlled by the VF command register.
+         */
+        for ( i = 0; i < ARRAY_SIZE(pdev->vpci->header.bars); i++ )
+            if ( MAPPABLE_BAR(&pdev->vpci->header.bars[i]) )
+                pdev->vpci->header.bars[i].enabled = map;
+    }
     if ( !curr->vpci.mem )
         curr->vpci.mem = mem;
     else
@@ -201,11 +213,11 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
     curr->vpci.rom_only = rom_only;
 }
 
-static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
+int vpci_modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
 {
     struct vpci_header *header = &pdev->vpci->header;
     struct rangeset *mem = rangeset_new(NULL, NULL, 0);
-    struct pci_dev *tmp, *dev = NULL;
+    struct pci_dev *tmp, *dev = NULL, *parent = NULL;
     const struct vpci_msix *msix = pdev->vpci->msix;
     unsigned int i;
     int rc;
@@ -261,12 +273,29 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
         }
     }
 
+    /* Get the parent dev if it's a VF. */
+    if ( pdev->info.is_virtfn )
+    {
+        pcidevs_lock();
+        parent = pci_get_pdev(pdev->seg, pdev->info.physfn.bus,
+                              pdev->info.physfn.devfn);
+        pcidevs_unlock();
+    }
+
     /*
      * Check for overlaps with other BARs. Note that only BARs that are
      * currently mapped (enabled) are checked for overlaps.
      */
     list_for_each_entry(tmp, &pdev->domain->arch.pdev_list, domain_list)
     {
+        /*
+         * When mapping the BARs of a VF the parent PF is already locked,
+         * trying to lock it will result in a deadlock. This is because
+         * vpci_modify_bars is called from the parent PF control_write register
+         * handler.
+         */
+        bool lock = parent != tmp;
+
         if ( tmp == pdev )
         {
             /*
@@ -283,10 +312,12 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
                 continue;
         }
 
-        spin_lock(&tmp->vpci_lock);
+        if ( lock )
+            spin_lock(&tmp->vpci_lock);
         if ( !tmp->vpci )
         {
-            spin_unlock(&tmp->vpci_lock);
+            if ( lock )
+                spin_unlock(&tmp->vpci_lock);
             continue;
         }
         for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
@@ -307,14 +338,16 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
             rc = rangeset_remove_range(mem, start, end);
             if ( rc )
             {
-                spin_unlock(&tmp->vpci_lock);
+                if ( lock )
+                    spin_unlock(&tmp->vpci_lock);
                 printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
                        start, end, rc);
                 rangeset_destroy(mem);
                 return rc;
             }
         }
-        spin_unlock(&tmp->vpci_lock);
+        if ( lock )
+            spin_unlock(&tmp->vpci_lock);
     }
 
     ASSERT(dev);
@@ -356,7 +389,7 @@ static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
          * memory decoding bit has not been changed, so leave everything as-is,
          * hoping the guest will realize and try again.
          */
-        modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
+        vpci_modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
     else
         pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, cmd);
 }
@@ -437,13 +470,13 @@ static void rom_write(const struct pci_dev *pdev, unsigned int reg,
         header->rom_enabled = new_enabled;
         pci_conf_write32(pdev->seg, pdev->bus, slot, func, reg, val);
     }
-    else if ( modify_bars(pdev, new_enabled, true) )
+    else if ( vpci_modify_bars(pdev, new_enabled, true) )
         /*
          * No memory has been added or removed from the p2m (because the actual
          * p2m changes are deferred in defer_map) and the ROM enable bit has
          * not been changed, so leave everything as-is, hoping the guest will
          * realize and try again. It's important to not update rom->addr in the
-         * unmap case if modify_bars has failed, or future attempts would
+         * unmap case if vpci_modify_bars has failed, or future attempts would
          * attempt to unmap the wrong address.
          */
         return;
@@ -465,6 +498,16 @@ static int init_bars(struct pci_dev *pdev)
     };
     int rc;
 
+    if ( pdev->info.is_virtfn )
+        /*
+         * No need to set traps for the command register or the BAR registers
+         * because those are not used by VFs. Memory decoding and position of
+         * the VF BARs is controlled from the PF.
+         *
+         * TODO: add DomU support for VFs.
+         */
+        return 0;
+
     switch ( pci_conf_read8(pdev->seg, pdev->bus, slot, func, PCI_HEADER_TYPE)
              & 0x7f )
     {
@@ -573,7 +616,7 @@ static int init_bars(struct pci_dev *pdev)
             rom->type = VPCI_BAR_EMPTY;
     }
 
-    return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
+    return (cmd & PCI_COMMAND_MEMORY) ? vpci_modify_bars(pdev, true, false) : 0;
 }
 
 static void teardown_bars(struct pci_dev *pdev)
@@ -584,7 +627,7 @@ static void teardown_bars(struct pci_dev *pdev)
     if ( cmd & PCI_COMMAND_MEMORY )
     {
         /* Unmap all BARs from guest p2m. */
-        modify_bars(pdev, false, false);
+        vpci_modify_bars(pdev, false, false);
         current->vpci.pdev = NULL;
     }
 }
diff --git a/xen/drivers/vpci/sriov.c b/xen/drivers/vpci/sriov.c
new file mode 100644
index 0000000000..7e8ae70188
--- /dev/null
+++ b/xen/drivers/vpci/sriov.c
@@ -0,0 +1,273 @@
+/*
+ * Handlers for accesses to the SR-IOV capability structure.
+ *
+ * Copyright (C) 2018 Citrix Systems R&D
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/delay.h>
+#include <xen/sched.h>
+#include <xen/vpci.h>
+
+#define SRIOV_SIZE(num) offsetof(struct vpci_sriov, vf[num])
+
+static void modify_memory_mapping(const struct pci_dev *pdev, unsigned int pos,
+                                  bool enable)
+{
+    const struct vpci_sriov *sriov = pdev->vpci->sriov;
+    unsigned int i;
+    int rc;
+
+    if ( enable )
+    {
+        struct pci_dev *pf_dev;
+
+        pcidevs_lock();
+        /*
+         * NB: a non-const pci_dev of the PF is needed in order to update
+         * vf_rlen.
+         */
+        pf_dev = pci_get_pdev(pdev->seg, pdev->bus, pdev->devfn);
+        pcidevs_unlock();
+        ASSERT(pf_dev);
+
+        /* Set the BARs addresses and size. */
+        for ( i = 0; i < PCI_SRIOV_NUM_BARS; i += rc )
+        {
+            unsigned int j, idx = pos + PCI_SRIOV_BAR + i * 4;
+            const pci_sbdf_t sbdf = {
+                .sbdf = PCI_SBDF3(pdev->seg, pdev->bus, pdev->devfn),
+            };
+            uint32_t bar = pci_conf_read32(pdev->seg, pdev->bus,
+                                           PCI_SLOT(pdev->devfn),
+                                           PCI_FUNC(pdev->devfn), idx);
+            uint64_t addr, size;
+
+            rc = pci_size_mem_bar(sbdf, idx, &addr, &size,
+                                  PCI_BAR_VF |
+                                  ((i == PCI_SRIOV_NUM_BARS - 1) ?
+                                   PCI_BAR_LAST : 0));
+            if ( rc <= 0 )
+            {
+                gprintk(XENLOG_ERR,
+                        "%04x:%02x:%02x.%u: failed to size VF BAR\n",
+                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                        PCI_FUNC(pdev->devfn));
+                domain_crash(pdev->domain);
+                return;
+            }
+
+            /*
+             * Update vf_rlen on the PF. According to the spec the size of
+             * the BARs can change if the system page size register is
+             * modified, so always update rlen when enabling VFs.
+             */
+            pf_dev->vf_rlen[i] = size;
+
+            for ( j = 0; j < sriov->num_vfs; j++ )
+            {
+                struct vpci_header *header;
+
+                if ( !sriov->vf[j] )
+                    /* Can happen if pci_add_device fails. */
+                    continue;
+
+                spin_lock(&sriov->vf[j]->vpci_lock);
+                header = &sriov->vf[j]->vpci->header;
+
+                if ( !size )
+                {
+                    header->bars[i].type = VPCI_BAR_EMPTY;
+                    spin_unlock(&sriov->vf[j]->vpci_lock);
+                    continue;
+                }
+
+                header->bars[i].addr = addr + size * j;
+                header->bars[i].size = size;
+                header->bars[i].prefetchable =
+                    bar & PCI_BASE_ADDRESS_MEM_PREFETCH;
+
+                switch ( rc )
+                {
+                case 1:
+                    header->bars[i].type = VPCI_BAR_MEM32;
+                    break;
+
+                case 2:
+                    header->bars[i].type = VPCI_BAR_MEM64_LO;
+                    header->bars[i + 1].type = VPCI_BAR_MEM64_HI;
+                    break;
+
+                default:
+                    ASSERT_UNREACHABLE();
+                    spin_unlock(&sriov->vf[j]->vpci_lock);
+                    domain_crash(pdev->domain);
+                    return;
+                }
+                spin_unlock(&sriov->vf[j]->vpci_lock);
+            }
+        }
+    }
+
+    /* Add/remove mappings for the VFs BARs into the p2m. */
+    for ( i = 0; i < sriov->num_vfs; i++ )
+    {
+        struct pci_dev *vf_pdev = sriov->vf[i];
+
+        spin_lock(&vf_pdev->vpci_lock);
+        rc = vpci_modify_bars(vf_pdev, enable, false);
+        spin_unlock(&vf_pdev->vpci_lock);
+        if ( rc )
+            gprintk(XENLOG_ERR,
+                    "failed to %smap BARs of VF %04x:%02x:%02x.%u: %d\n",
+                    enable ? "" : "un", vf_pdev->seg, vf_pdev->bus,
+                    PCI_SLOT(vf_pdev->devfn), PCI_FUNC(vf_pdev->devfn), rc);
+    }
+}
+
+static void control_write(const struct pci_dev *pdev, unsigned int reg,
+                          uint32_t val, void *data)
+{
+    struct vpci_sriov *sriov = data;
+    unsigned int i, pos = reg - PCI_SRIOV_CTRL;
+    uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
+                                       PCI_SLOT(pdev->devfn),
+                                       PCI_FUNC(pdev->devfn),
+                                       pos + PCI_SRIOV_CTRL);
+    bool enabled = control & PCI_SRIOV_CTRL_VFE;
+    bool mem_enabled = control & PCI_SRIOV_CTRL_MSE;
+    bool new_enabled = val & PCI_SRIOV_CTRL_VFE;
+    bool new_mem_enabled = val & PCI_SRIOV_CTRL_MSE;
+    int rc;
+
+    if ( new_enabled != enabled )
+    {
+        uint16_t offset = pci_conf_read16(pdev->seg, pdev->bus,
+                                          PCI_SLOT(pdev->devfn),
+                                          PCI_FUNC(pdev->devfn),
+                                          pos + PCI_SRIOV_VF_OFFSET);
+        uint16_t stride = pci_conf_read16(pdev->seg, pdev->bus,
+                                          PCI_SLOT(pdev->devfn),
+                                          PCI_FUNC(pdev->devfn),
+                                          pos + PCI_SRIOV_VF_STRIDE);
+
+        if ( new_enabled )
+        {
+            /*
+             * Only update the number of active VFs when enabling, when
+             * disabling use the cached value in order to always remove the
+             * same number of VFs that where active.
+             */
+            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
+                                             PCI_SLOT(pdev->devfn),
+                                             PCI_FUNC(pdev->devfn),
+                                             pos + PCI_SRIOV_NUM_VF);
+
+            /*
+             * NB: VFE needs to be enabled before calling pci_add_device so Xen
+             * can access the config space of VFs.
+             */
+            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                             PCI_FUNC(pdev->devfn), reg,
+                             control | PCI_SRIOV_CTRL_VFE);
+
+            /*
+             * The spec states that the software must wait at least 100ms
+             * before attempting to access VF registers when enabling virtual
+             * functions on the PF.
+             */
+            mdelay(100);
+        }
+
+        for ( i = 0; i < sriov->num_vfs; i++ )
+        {
+            const pci_sbdf_t bdf = {
+                .bdf = PCI_BDF2(pdev->bus, pdev->devfn) + offset + stride * i,
+            };
+
+            if ( new_enabled )
+            {
+                const struct pci_dev_info info = {
+                    .is_virtfn = true,
+                    .physfn.bus = pdev->bus,
+                    .physfn.devfn = pdev->devfn,
+                };
+
+                rc = pci_add_device(pdev->seg, bdf.bus, bdf.extfunc, &info,
+                                    pdev->node);
+            }
+            else
+                rc = pci_remove_device(pdev->seg, bdf.bus, bdf.extfunc);
+            if ( rc )
+                gprintk(XENLOG_ERR, "failed to %s VF %04x:%02x:%02x.%u: %d\n",
+                        new_enabled ? "add" : "remove", pdev->seg, bdf.bus,
+                        bdf.dev, bdf.func, rc);
+
+            pcidevs_lock();
+            sriov->vf[i] = pci_get_pdev(pdev->seg, bdf.bus, bdf.extfunc);
+            pcidevs_unlock();
+        }
+
+        if ( new_mem_enabled )
+            modify_memory_mapping(pdev, pos, true);
+    }
+    else if ( new_mem_enabled != mem_enabled && new_enabled )
+        modify_memory_mapping(pdev, pos, new_mem_enabled);
+
+    pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                     PCI_FUNC(pdev->devfn), reg, val);
+}
+
+static int init_sriov(struct pci_dev *pdev)
+{
+    unsigned int pos = pci_find_ext_capability(pdev->seg, pdev->bus,
+                                               pdev->devfn,
+                                               PCI_EXT_CAP_ID_SRIOV);
+    uint16_t total_vfs;
+
+    if ( !pos )
+        return 0;
+
+    total_vfs = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                                PCI_FUNC(pdev->devfn),
+                                pos + PCI_SRIOV_TOTAL_VF);
+
+    pdev->vpci->sriov = xzalloc_bytes(SRIOV_SIZE(total_vfs));
+    if ( !pdev->vpci->sriov )
+        return -ENOMEM;
+
+    return vpci_add_register(pdev->vpci, vpci_hw_read16, control_write,
+                             pos + PCI_SRIOV_CTRL, 2, pdev->vpci->sriov);
+}
+
+static void teardown_sriov(struct pci_dev *pdev)
+{
+    if ( pdev->vpci->sriov )
+    {
+        /* TODO: removing PFs is not currently supported. */
+        ASSERT_UNREACHABLE();
+        domain_crash(pdev->domain);
+    }
+}
+REGISTER_VPCI_INIT(init_sriov, teardown_sriov, VPCI_PRIORITY_LOW);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 208667227e..bd92c7cd12 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -94,7 +94,6 @@ struct vpci {
          * is mapped into guest p2m) if there's a ROM BAR on the device.
          */
         bool rom_enabled      : 1;
-        /* FIXME: currently there's no support for SR-IOV. */
     } header;
 
     /* MSI data. */
@@ -144,6 +143,11 @@ struct vpci {
             struct vpci_arch_msix_entry arch;
         } entries[];
     } *msix;
+
+    struct vpci_sriov {
+        uint16_t num_vfs;
+        struct pci_dev *vf[];
+    } *sriov;
 #endif
 };
 
@@ -214,6 +218,9 @@ static inline unsigned int vmsix_entry_nr(const struct vpci_msix *msix,
 {
     return entry - msix->entries;
 }
+
+/* Map/unmap the BARs of a vPCI device. */
+int vpci_modify_bars(const struct pci_dev *pdev, bool map, bool rom_only);
 #endif /* __XEN__ */
 
 #else /* !CONFIG_HAS_VPCI */
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] vpci: move lock
  2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
@ 2018-06-29 10:19   ` Wei Liu
  2018-06-29 13:27     ` Roger Pau Monné
  2021-11-23 12:20   ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 10:19 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:25PM +0200, Roger Pau Monne wrote:
> To the outside of the vpci struct. This way the lock can be used to
> check whether vpci is present, and removal can be performed while
> holding the lock, in order to make sure there are no accesses to the
> contents of the vpci struct. Previously removal could race with
> vpci_read for example, since the log was dropped prior to freeing

log -> lock.

> pdev->vpci.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
[...]
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 0ec4c082a6..9d5607d5f8 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -131,11 +131,12 @@ bool vpci_process_pending(struct vcpu *v)
>          if ( rc == -ERESTART )
>              return true;
>  
> -        spin_lock(&v->vpci.pdev->vpci->lock);
> -        /* Disable memory decoding unconditionally on failure. */
> -        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> -                        !rc && v->vpci.rom_only);
> -        spin_unlock(&v->vpci.pdev->vpci->lock);
> +        spin_lock(&v->vpci.pdev->vpci_lock);
> +        if ( v->vpci.pdev->vpci )

The purpose of this check is to fix a latent bug in the original code?

> +            /* Disable memory decoding unconditionally on failure. */
> +            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> +                            !rc && v->vpci.rom_only);
> +        spin_unlock(&v->vpci.pdev->vpci_lock);
>  
[...]
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 82607bdb9a..7d52bcf8d0 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -35,9 +35,8 @@ extern vpci_register_init_t *const __start_vpci_array[];
>  extern vpci_register_init_t *const __end_vpci_array[];
>  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>  
> -void vpci_remove_device(struct pci_dev *pdev)
> +static void vpci_remove_device_locked(struct pci_dev *pdev)
>  {
> -    spin_lock(&pdev->vpci->lock);

ASSERT(spin_is_locked(&pdev->vpci_lock));

>      while ( !list_empty(&pdev->vpci->handlers) )
>      {
>          struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
> @@ -47,13 +46,20 @@ void vpci_remove_device(struct pci_dev *pdev)
>          list_del(&r->node);
>          xfree(r);
>      }
> -    spin_unlock(&pdev->vpci->lock);
>      xfree(pdev->vpci->msix);
>      xfree(pdev->vpci->msi);
>      xfree(pdev->vpci);
>      pdev->vpci = NULL;
>  }
>  
> +void vpci_remove_device(struct pci_dev *pdev)
> +{
> +    spin_lock(&pdev->vpci_lock);
> +    vpci_remove_device_locked(pdev);
> +    spin_unlock(&pdev->vpci_lock);
> +}
> +
> +

Too many blank lines.

>  int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>  {
>      unsigned int i;
> @@ -62,12 +68,15 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>      if ( !has_vpci(pdev->domain) )
>          return 0;
>  
> +    spin_lock(&pdev->vpci_lock);
>      pdev->vpci = xzalloc(struct vpci);
>      if ( !pdev->vpci )
> +    {
> +        spin_unlock(&pdev->vpci_lock);
>          return -ENOMEM;
> +    }
>  
>      INIT_LIST_HEAD(&pdev->vpci->handlers);
> -    spin_lock_init(&pdev->vpci->lock);
>  
>      for ( i = 0; i < NUM_VPCI_INIT; i++ )
>      {
[...]
> @@ -77,7 +86,8 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
> @@ -315,7 +318,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
>  uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>  {
>      const struct domain *d = current->domain;
> -    const struct pci_dev *pdev;
> +    struct pci_dev *pdev;
>      const struct vpci_register *r;
>      unsigned int data_offset = 0;
>      uint32_t data = ~(uint32_t)0;
> @@ -331,7 +334,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>      if ( !pdev )
>          return vpci_read_hw(sbdf, reg, size);
>  
> -    spin_lock(&pdev->vpci->lock);
> +    spin_lock(&pdev->vpci_lock);
> +    if ( !pdev->vpci )
> +    {
> +        spin_unlock(&pdev->vpci_lock);
> +        return vpci_read_hw(sbdf, reg, size);
> +    }
>  
>      /* Read from the hardware or the emulated register handlers. */
>      list_for_each_entry ( r, &pdev->vpci->handlers, node )
> @@ -383,7 +391,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>  
>          data = merge_result(data, tmp_data, size - data_offset, data_offset);
>      }
> -    spin_unlock(&pdev->vpci->lock);
> +    spin_unlock(&pdev->vpci_lock);

I think the critical section in this function and the write function can
shrink a bit. Reading from / writing to hardware shouldn't need to be
protected by vpci_lock.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions
  2018-06-20 14:42 ` [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions Roger Pau Monne
@ 2018-06-29 10:29   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2018-06-29 10:29 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:26PM +0200, Roger Pau Monne wrote:
> This is required in order to allow run-time removal of MSI-X regions.
> 
> No functional change.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/10] vpci: add tear down functions
  2018-06-20 14:42 ` [PATCH 03/10] vpci: add tear down functions Roger Pau Monne
@ 2018-06-29 10:37   ` Wei Liu
  2018-07-02  7:52     ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 10:37 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:27PM +0200, Roger Pau Monne wrote:
> Tear down functions are not mandatory. Note that this patch just
> implements the framework, but doesn't implement any tear down function
> yet.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/arm/xen.lds.S    |  9 +--------
>  xen/arch/x86/xen.lds.S    |  9 +--------
>  xen/drivers/vpci/header.c |  2 +-
>  xen/drivers/vpci/msi.c    |  2 +-
>  xen/drivers/vpci/msix.c   |  2 +-
>  xen/drivers/vpci/vpci.c   | 20 +++++++++++++++++---
>  xen/include/xen/vpci.h    | 15 +++++++++++----
>  7 files changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index 245a0e0e85..2c6a09c59d 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -66,7 +66,7 @@ SECTIONS
>         *(.data.param)
>         __param_end = .;
>  
> -#if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
> +#if defined(CONFIG_HAS_VPCI)
>         . = ALIGN(POINTER_ALIGN);
>         __start_vpci_array = .;
>         *(SORT(.data.vpci.*))
> @@ -178,13 +178,6 @@ SECTIONS
>         *(.init_array)
>         *(SORT(.init_array.*))
>         __ctors_end = .;
> -
> -#if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
> -       . = ALIGN(POINTER_ALIGN);
> -       __start_vpci_array = .;
> -       *(SORT(.data.vpci.*))
> -       __end_vpci_array = .;
> -#endif

It is worth mentioning in the commit message that why this is deleted. I
think it is because now they should be unconditionally put into rodata
since they can't be discarded after boot anymore.

The code looks OK.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-06-20 14:42 ` [PATCH 04/10] vpci/msix: add teardown cleanup Roger Pau Monne
@ 2018-06-29 10:52   ` Wei Liu
  2018-07-02  7:54     ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 10:52 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:28PM +0200, Roger Pau Monne wrote:
> So that interrupts are properly freed.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 6132f576b6..cfca1cd43a 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
>  
>      return 0;
>  }
> -REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
> +
> +static void teardown_msix(struct pci_dev *pdev)
> +{
> +    struct vpci_msix *msix = pdev->vpci->msix;
> +    unsigned int i;
> +
> +    if ( !msix )
> +        return;
> +
> +    if ( msix->enabled )
> +    {
> +        /* Disable MSIX. */
> +        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> +                                               PCI_SLOT(pdev->devfn),
> +                                               PCI_FUNC(pdev->devfn),
> +                                               PCI_CAP_ID_MSIX);
> +        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> +                                           PCI_SLOT(pdev->devfn),
> +                                           PCI_FUNC(pdev->devfn),
> +                                           msix_control_reg(pos));
> +
> +        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
> +                         (control & ~PCI_MSIX_FLAGS_ENABLE));
> +    }
> +
> +    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
> +    list_del(&pdev->vpci->msix->next);
> +    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
> +
> +    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )

Maybe lift checking msix->enabled outside of the loop? Afaict nothing
in the loop manipulates that flag.

> +    {
> +        int rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
> +
> +        if ( rc && rc != -ENOENT )
> +            gprintk(XENLOG_WARNING,
> +                    "%04x:%02x:%02x.%u: unable to disable MSIX entry %u: %d\n",
> +                    pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                    PCI_FUNC(pdev->devfn), i, rc);
> +    }

No freeing msix here?

Wei.

> +}
> +REGISTER_VPCI_INIT(init_msix, teardown_msix, VPCI_PRIORITY_HIGH);
>  
>  /*
>   * Local variables:
> -- 
> 2.17.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/10] vpci/msi: add teardown cleanup
  2018-06-20 14:42 ` [PATCH 05/10] vpci/msi: " Roger Pau Monne
@ 2018-06-29 10:52   ` Wei Liu
  2018-07-02  7:55     ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 10:52 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:29PM +0200, Roger Pau Monne wrote:
> So that interrupts are properly freed.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/drivers/vpci/msi.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 5bb505c864..e8cd1238df 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -266,7 +266,28 @@ static int init_msi(struct pci_dev *pdev)
>  
>      return 0;
>  }
> -REGISTER_VPCI_INIT(init_msi, NULL, VPCI_PRIORITY_LOW);
> +
> +static void teardown_msi(struct pci_dev *pdev)
> +{
> +    unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> +                                           PCI_SLOT(pdev->devfn),
> +                                           PCI_FUNC(pdev->devfn),
> +                                           PCI_CAP_ID_MSI);
> +    struct vpci_msi *msi = pdev->vpci->msi;
> +    uint16_t control;
> +
> +    if ( !msi || !msi->enabled )
> +        return;
> +
> +    control = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                              PCI_FUNC(pdev->devfn), msi_control_reg(pos));
> +    pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                     PCI_FUNC(pdev->devfn), msi_control_reg(pos),
> +                     (control & ~PCI_MSI_FLAGS_ENABLE));
> +
> +    vpci_msi_arch_disable(msi, pdev);

Missing xfree(msi)?

Wei.

> +}
> +REGISTER_VPCI_INIT(init_msi, teardown_msi, VPCI_PRIORITY_LOW);
>  
>  void vpci_dump_msi(void)
>  {
> -- 
> 2.17.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/10] vpci/header: add teardown cleanup
  2018-06-20 14:42 ` [PATCH 06/10] vpci/header: " Roger Pau Monne
@ 2018-06-29 11:15   ` Wei Liu
  2018-07-02  8:04     ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 11:15 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:30PM +0200, Roger Pau Monne wrote:
> In order to unmap the BARs
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/drivers/vpci/header.c | 30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 4363270a55..686e04e35a 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -131,12 +131,15 @@ bool vpci_process_pending(struct vcpu *v)
>          if ( rc == -ERESTART )
>              return true;
>  
> -        spin_lock(&v->vpci.pdev->vpci_lock);
> -        if ( v->vpci.pdev->vpci )
> -            /* Disable memory decoding unconditionally on failure. */
> -            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> -                            !rc && v->vpci.rom_only);
> -        spin_unlock(&v->vpci.pdev->vpci_lock);
> +        if ( v->vpci.pdev )
> +        {
> +            spin_lock(&v->vpci.pdev->vpci_lock);
> +            if ( v->vpci.pdev->vpci )
> +                /* Disable memory decoding unconditionally on failure. */
> +                modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> +                                !rc && v->vpci.rom_only);
> +            spin_unlock(&v->vpci.pdev->vpci_lock);
> +        }
>  
>          rangeset_destroy(v->vpci.mem);
>          v->vpci.mem = NULL;
> @@ -560,7 +563,20 @@ static int init_bars(struct pci_dev *pdev)
>  
>      return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
>  }
> -REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
> +
> +static void teardown_bars(struct pci_dev *pdev)
> +{
> +    uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                                   PCI_FUNC(pdev->devfn), PCI_COMMAND);
> +
> +    if ( cmd & PCI_COMMAND_MEMORY )
> +    {
> +        /* Unmap all BARs from guest p2m. */
> +        modify_bars(pdev, false, false);

So modify_bars will eventually call defer_map in most cases (which I
believe are the ones your care about here).

But then the following line sets vpci.pdev to NULL, which means the
check in vpci_process_pending is false and modify_decoding is skipped.
If that what you want?

Wei.

> +        current->vpci.pdev = NULL;


> +    }
> +}
> +REGISTER_VPCI_INIT(init_bars, teardown_bars, VPCI_PRIORITY_MIDDLE);



>  
>  /*
>   * Local variables:
> -- 
> 2.17.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/10] rangeset: introduce rangeset_merge
  2018-06-20 14:42 ` [PATCH 07/10] rangeset: introduce rangeset_merge Roger Pau Monne
@ 2018-06-29 11:23   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2018-06-29 11:23 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:31PM +0200, Roger Pau Monne wrote:
> This new helper will merge two rangesets.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] vpci: move lock
  2018-06-29 10:19   ` Wei Liu
@ 2018-06-29 13:27     ` Roger Pau Monné
  2018-06-29 13:37       ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-06-29 13:27 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 11:19:57AM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:25PM +0200, Roger Pau Monne wrote:
> > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> > index 0ec4c082a6..9d5607d5f8 100644
> > --- a/xen/drivers/vpci/header.c
> > +++ b/xen/drivers/vpci/header.c
> > @@ -131,11 +131,12 @@ bool vpci_process_pending(struct vcpu *v)
> >          if ( rc == -ERESTART )
> >              return true;
> >  
> > -        spin_lock(&v->vpci.pdev->vpci->lock);
> > -        /* Disable memory decoding unconditionally on failure. */
> > -        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > -                        !rc && v->vpci.rom_only);
> > -        spin_unlock(&v->vpci.pdev->vpci->lock);
> > +        spin_lock(&v->vpci.pdev->vpci_lock);
> > +        if ( v->vpci.pdev->vpci )
> 
> The purpose of this check is to fix a latent bug in the original code?

Previous code didn't support removing devices, so it's more about
making it capable of supporting vpci device removal.

> > +            /* Disable memory decoding unconditionally on failure. */
> > +            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > +                            !rc && v->vpci.rom_only);
> > +        spin_unlock(&v->vpci.pdev->vpci_lock);
> >  
> [...]
> > diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> > index 82607bdb9a..7d52bcf8d0 100644
> > --- a/xen/drivers/vpci/vpci.c
> > +++ b/xen/drivers/vpci/vpci.c
> > @@ -35,9 +35,8 @@ extern vpci_register_init_t *const __start_vpci_array[];
> >  extern vpci_register_init_t *const __end_vpci_array[];
> >  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
> >  
> > -void vpci_remove_device(struct pci_dev *pdev)
> > +static void vpci_remove_device_locked(struct pci_dev *pdev)
> >  {
> > -    spin_lock(&pdev->vpci->lock);
> 
> ASSERT(spin_is_locked(&pdev->vpci_lock));

Er, yes. But keep in mind that this is going to return `true` even if
vpci_lock is locked by another CPU. Asserting lock ownership only
works correctly with recursive locks.

> > @@ -383,7 +391,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >  
> >          data = merge_result(data, tmp_data, size - data_offset, data_offset);
> >      }
> > -    spin_unlock(&pdev->vpci->lock);
> > +    spin_unlock(&pdev->vpci_lock);
> 
> I think the critical section in this function and the write function can
> shrink a bit. Reading from / writing to hardware shouldn't need to be
> protected by vpci_lock.

There's no further usage of contents of the vpci struct, so I guess I
can move the unlock a little bit up. The same applies to the write
counterpart.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] vpci: move lock
  2018-06-29 13:27     ` Roger Pau Monné
@ 2018-06-29 13:37       ` Julien Grall
  0 siblings, 0 replies; 43+ messages in thread
From: Julien Grall @ 2018-06-29 13:37 UTC (permalink / raw)
  To: Roger Pau Monné, Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Jan Beulich, xen-devel

Hi Roger,

On 29/06/18 14:27, Roger Pau Monné wrote:
> On Fri, Jun 29, 2018 at 11:19:57AM +0100, Wei Liu wrote:
>> On Wed, Jun 20, 2018 at 04:42:25PM +0200, Roger Pau Monne wrote:
>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>> index 0ec4c082a6..9d5607d5f8 100644
>>> --- a/xen/drivers/vpci/header.c
>>> +++ b/xen/drivers/vpci/header.c
>>> @@ -131,11 +131,12 @@ bool vpci_process_pending(struct vcpu *v)
>>>           if ( rc == -ERESTART )
>>>               return true;
>>>   
>>> -        spin_lock(&v->vpci.pdev->vpci->lock);
>>> -        /* Disable memory decoding unconditionally on failure. */
>>> -        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
>>> -                        !rc && v->vpci.rom_only);
>>> -        spin_unlock(&v->vpci.pdev->vpci->lock);
>>> +        spin_lock(&v->vpci.pdev->vpci_lock);
>>> +        if ( v->vpci.pdev->vpci )
>>
>> The purpose of this check is to fix a latent bug in the original code?
> 
> Previous code didn't support removing devices, so it's more about
> making it capable of supporting vpci device removal.
> 
>>> +            /* Disable memory decoding unconditionally on failure. */
>>> +            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
>>> +                            !rc && v->vpci.rom_only);
>>> +        spin_unlock(&v->vpci.pdev->vpci_lock);
>>>   
>> [...]
>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>>> index 82607bdb9a..7d52bcf8d0 100644
>>> --- a/xen/drivers/vpci/vpci.c
>>> +++ b/xen/drivers/vpci/vpci.c
>>> @@ -35,9 +35,8 @@ extern vpci_register_init_t *const __start_vpci_array[];
>>>   extern vpci_register_init_t *const __end_vpci_array[];
>>>   #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>>>   
>>> -void vpci_remove_device(struct pci_dev *pdev)
>>> +static void vpci_remove_device_locked(struct pci_dev *pdev)
>>>   {
>>> -    spin_lock(&pdev->vpci->lock);
>>
>> ASSERT(spin_is_locked(&pdev->vpci_lock));
> 
> Er, yes. But keep in mind that this is going to return `true` even if
> vpci_lock is locked by another CPU. Asserting lock ownership only
> works correctly with recursive locks.

While I agree with your statement, the point of the ASSERT is to catch 
misuse, there are a fair amount of chance to have no contention on the 
lock (something would need to be done if it was the case anyway).

So in general, I still recommend developer to use 
ASSERT(spin_is_lock(...)) in any function relying on a lock taken. And 
who knows, maybe some day we would have a spin lock helper checking the 
CPU making the ASSERT more reliable.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/10] vpci/header: allow multiple map operations
  2018-06-20 14:42 ` [PATCH 08/10] vpci/header: allow multiple map operations Roger Pau Monne
@ 2018-06-29 14:44   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2018-06-29 14:44 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:32PM +0200, Roger Pau Monne wrote:
> To be queued in vpci_vcpu. This will be required for SR-IOV support,
> which uses a single control register bit to toggle memory decoding for
> all the virtual functions.
> 
> No functional change expected.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/10] pci: add vpci hooks for device addition/removal
  2018-06-20 14:42 ` [PATCH 09/10] pci: add vpci hooks for device addition/removal Roger Pau Monne
@ 2018-06-29 14:50   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2018-06-29 14:50 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:33PM +0200, Roger Pau Monne wrote:
> So that pci_{add/remove}_device work correctly with vpci. Note that
> this requires moving vpci_add_handlers out of the init section.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-06-20 14:42 ` [PATCH 10/10] vpci/sriov: add support for SR-IOV capability Roger Pau Monne
@ 2018-06-29 15:27   ` Wei Liu
  2018-07-02  8:12     ` Roger Pau Monné
  2018-07-03  9:27   ` Wei Liu
  1 sibling, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-06-29 15:27 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> So that a PCI device that supports SR-IOV (PF) can enable the capability
> and use the virtual functions.
> 
> This code is expected to only be used by privileged domains,
> unprivileged domains should not get access to the SR-IOV capability.
> 
> The current code detects enabling of the virtual functions feature and
> automatically adds the VFs to the domain. It also detects enabling of
> memory space and maps the VFs BARs into the domain p2m. Disabling of
> the VF enable bit removes the devices and the BAR memory map from the
> domain p2m.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
[...]
>      int rc;
> @@ -261,12 +273,29 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
>          }
>      }
>  
> +    /* Get the parent dev if it's a VF. */
> +    if ( pdev->info.is_virtfn )
> +    {
> +        pcidevs_lock();
> +        parent = pci_get_pdev(pdev->seg, pdev->info.physfn.bus,
> +                              pdev->info.physfn.devfn);
> +        pcidevs_unlock();
> +    }
> +
>      /*
>       * Check for overlaps with other BARs. Note that only BARs that are
>       * currently mapped (enabled) are checked for overlaps.
>       */
>      list_for_each_entry(tmp, &pdev->domain->arch.pdev_list, domain_list)
>      {
> +        /*
> +         * When mapping the BARs of a VF the parent PF is already locked,
> +         * trying to lock it will result in a deadlock. This is because
> +         * vpci_modify_bars is called from the parent PF control_write register
> +         * handler.
> +         */
> +        bool lock = parent != tmp;

There is spin_lock_recursive. Would that work?

> +
>          if ( tmp == pdev )
>          {
>              /*
> @@ -283,10 +312,12 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
>                  continue;
>          }
>  
> -        spin_lock(&tmp->vpci_lock);
> +        if ( lock )
> +            spin_lock(&tmp->vpci_lock);
>          if ( !tmp->vpci )
>          {
> -            spin_unlock(&tmp->vpci_lock);
> +            if ( lock )
> +                spin_unlock(&tmp->vpci_lock);
>              continue;
>          }
>          for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
> @@ -307,14 +338,16 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
>              rc = rangeset_remove_range(mem, start, end);
>              if ( rc )
>              {
> -                spin_unlock(&tmp->vpci_lock);
> +                if ( lock )
> +                    spin_unlock(&tmp->vpci_lock);
>                  printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
>                         start, end, rc);
>                  rangeset_destroy(mem);
>                  return rc;
>              }
>          }
> -        spin_unlock(&tmp->vpci_lock);
> +        if ( lock )
> +            spin_unlock(&tmp->vpci_lock);
>      }
>  
>      ASSERT(dev);
[...]
> +static int init_sriov(struct pci_dev *pdev)
> +{
> +    unsigned int pos = pci_find_ext_capability(pdev->seg, pdev->bus,
> +                                               pdev->devfn,
> +                                               PCI_EXT_CAP_ID_SRIOV);
> +    uint16_t total_vfs;
> +
> +    if ( !pos )
> +        return 0;
> +
> +    total_vfs = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                                PCI_FUNC(pdev->devfn),
> +                                pos + PCI_SRIOV_TOTAL_VF);
> +
> +    pdev->vpci->sriov = xzalloc_bytes(SRIOV_SIZE(total_vfs));
> +    if ( !pdev->vpci->sriov )
> +        return -ENOMEM;
> +
> +    return vpci_add_register(pdev->vpci, vpci_hw_read16, control_write,
> +                             pos + PCI_SRIOV_CTRL, 2, pdev->vpci->sriov);

If vpci_add_register fails sriov is going to be leaked? Or eventually
that will cause the domain to crash in teardown.

I think it would make more sense if init_sriov is able to clean up after
itself. MSI and MSI-X code has similar issue.

I guess this is fine at the moment because it is Dom0-only. But if we
want to expose vpci to DomU eventually we might as well tighten up
things now.

I will need to re-read SR-IOV spec before I can comment on the rest.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/10] vpci: add tear down functions
  2018-06-29 10:37   ` Wei Liu
@ 2018-07-02  7:52     ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02  7:52 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 11:37:56AM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:27PM +0200, Roger Pau Monne wrote:
> > Tear down functions are not mandatory. Note that this patch just
> > implements the framework, but doesn't implement any tear down function
> > yet.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/arch/arm/xen.lds.S    |  9 +--------
> >  xen/arch/x86/xen.lds.S    |  9 +--------
> >  xen/drivers/vpci/header.c |  2 +-
> >  xen/drivers/vpci/msi.c    |  2 +-
> >  xen/drivers/vpci/msix.c   |  2 +-
> >  xen/drivers/vpci/vpci.c   | 20 +++++++++++++++++---
> >  xen/include/xen/vpci.h    | 15 +++++++++++----
> >  7 files changed, 33 insertions(+), 26 deletions(-)
> > 
> > diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> > index 245a0e0e85..2c6a09c59d 100644
> > --- a/xen/arch/arm/xen.lds.S
> > +++ b/xen/arch/arm/xen.lds.S
> > @@ -66,7 +66,7 @@ SECTIONS
> >         *(.data.param)
> >         __param_end = .;
> >  
> > -#if defined(CONFIG_HAS_VPCI) && defined(CONFIG_LATE_HWDOM)
> > +#if defined(CONFIG_HAS_VPCI)
> >         . = ALIGN(POINTER_ALIGN);
> >         __start_vpci_array = .;
> >         *(SORT(.data.vpci.*))
> > @@ -178,13 +178,6 @@ SECTIONS
> >         *(.init_array)
> >         *(SORT(.init_array.*))
> >         __ctors_end = .;
> > -
> > -#if defined(CONFIG_HAS_VPCI) && !defined(CONFIG_LATE_HWDOM)
> > -       . = ALIGN(POINTER_ALIGN);
> > -       __start_vpci_array = .;
> > -       *(SORT(.data.vpci.*))
> > -       __end_vpci_array = .;
> > -#endif
> 
> It is worth mentioning in the commit message that why this is deleted. I
> think it is because now they should be unconditionally put into rodata
> since they can't be discarded after boot anymore.

Right, since removal and addition (once SR-IOV is implemented) can
happen at any point during the lifetime of Xen this data cannot be
discarded after boot.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-06-29 10:52   ` Wei Liu
@ 2018-07-02  7:54     ` Roger Pau Monné
  2018-07-02 13:55       ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02  7:54 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 11:52:07AM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:28PM +0200, Roger Pau Monne wrote:
> > So that interrupts are properly freed.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 42 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> > index 6132f576b6..cfca1cd43a 100644
> > --- a/xen/drivers/vpci/msix.c
> > +++ b/xen/drivers/vpci/msix.c
> > @@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
> >  
> >      return 0;
> >  }
> > -REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
> > +
> > +static void teardown_msix(struct pci_dev *pdev)
> > +{
> > +    struct vpci_msix *msix = pdev->vpci->msix;
> > +    unsigned int i;
> > +
> > +    if ( !msix )
> > +        return;
> > +
> > +    if ( msix->enabled )
> > +    {
> > +        /* Disable MSIX. */
> > +        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> > +                                               PCI_SLOT(pdev->devfn),
> > +                                               PCI_FUNC(pdev->devfn),
> > +                                               PCI_CAP_ID_MSIX);
> > +        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> > +                                           PCI_SLOT(pdev->devfn),
> > +                                           PCI_FUNC(pdev->devfn),
> > +                                           msix_control_reg(pos));
> > +
> > +        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
> > +                         (control & ~PCI_MSIX_FLAGS_ENABLE));
> > +    }
> > +
> > +    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
> > +    list_del(&pdev->vpci->msix->next);
> > +    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
> > +
> > +    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )
> 
> Maybe lift checking msix->enabled outside of the loop? Afaict nothing
> in the loop manipulates that flag.

I've done it that way to skip one indentation level, but I can put it
outside if that's preferred.

> > +    {
> > +        int rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
> > +
> > +        if ( rc && rc != -ENOENT )
> > +            gprintk(XENLOG_WARNING,
> > +                    "%04x:%02x:%02x.%u: unable to disable MSIX entry %u: %d\n",
> > +                    pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                    PCI_FUNC(pdev->devfn), i, rc);
> > +    }
> 
> No freeing msix here?

msix is freed by vpci_remove_device. I can move the freeing inside of
the teardown function now, since it makes it clearer.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/10] vpci/msi: add teardown cleanup
  2018-06-29 10:52   ` Wei Liu
@ 2018-07-02  7:55     ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02  7:55 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 11:52:49AM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:29PM +0200, Roger Pau Monne wrote:
> > So that interrupts are properly freed.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/drivers/vpci/msi.c | 23 ++++++++++++++++++++++-
> >  1 file changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> > index 5bb505c864..e8cd1238df 100644
> > --- a/xen/drivers/vpci/msi.c
> > +++ b/xen/drivers/vpci/msi.c
> > @@ -266,7 +266,28 @@ static int init_msi(struct pci_dev *pdev)
> >  
> >      return 0;
> >  }
> > -REGISTER_VPCI_INIT(init_msi, NULL, VPCI_PRIORITY_LOW);
> > +
> > +static void teardown_msi(struct pci_dev *pdev)
> > +{
> > +    unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> > +                                           PCI_SLOT(pdev->devfn),
> > +                                           PCI_FUNC(pdev->devfn),
> > +                                           PCI_CAP_ID_MSI);
> > +    struct vpci_msi *msi = pdev->vpci->msi;
> > +    uint16_t control;
> > +
> > +    if ( !msi || !msi->enabled )
> > +        return;
> > +
> > +    control = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                              PCI_FUNC(pdev->devfn), msi_control_reg(pos));
> > +    pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                     PCI_FUNC(pdev->devfn), msi_control_reg(pos),
> > +                     (control & ~PCI_MSI_FLAGS_ENABLE));
> > +
> > +    vpci_msi_arch_disable(msi, pdev);
> 
> Missing xfree(msi)?

It's done in vpci_remove_device, but as said in the previous patch I
can move it here since it will be clearer.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/10] vpci/header: add teardown cleanup
  2018-06-29 11:15   ` Wei Liu
@ 2018-07-02  8:04     ` Roger Pau Monné
  2018-07-02 14:30       ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02  8:04 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 12:15:34PM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:30PM +0200, Roger Pau Monne wrote:
> > In order to unmap the BARs
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/drivers/vpci/header.c | 30 +++++++++++++++++++++++-------
> >  1 file changed, 23 insertions(+), 7 deletions(-)
> > 
> > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> > index 4363270a55..686e04e35a 100644
> > --- a/xen/drivers/vpci/header.c
> > +++ b/xen/drivers/vpci/header.c
> > @@ -131,12 +131,15 @@ bool vpci_process_pending(struct vcpu *v)
> >          if ( rc == -ERESTART )
> >              return true;
> >  
> > -        spin_lock(&v->vpci.pdev->vpci_lock);
> > -        if ( v->vpci.pdev->vpci )
> > -            /* Disable memory decoding unconditionally on failure. */
> > -            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > -                            !rc && v->vpci.rom_only);
> > -        spin_unlock(&v->vpci.pdev->vpci_lock);
> > +        if ( v->vpci.pdev )
> > +        {
> > +            spin_lock(&v->vpci.pdev->vpci_lock);
> > +            if ( v->vpci.pdev->vpci )
> > +                /* Disable memory decoding unconditionally on failure. */
> > +                modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > +                                !rc && v->vpci.rom_only);
> > +            spin_unlock(&v->vpci.pdev->vpci_lock);
> > +        }
> >  
> >          rangeset_destroy(v->vpci.mem);
> >          v->vpci.mem = NULL;
> > @@ -560,7 +563,20 @@ static int init_bars(struct pci_dev *pdev)
> >  
> >      return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
> >  }
> > -REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
> > +
> > +static void teardown_bars(struct pci_dev *pdev)
> > +{
> > +    uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                                   PCI_FUNC(pdev->devfn), PCI_COMMAND);
> > +
> > +    if ( cmd & PCI_COMMAND_MEMORY )
> > +    {
> > +        /* Unmap all BARs from guest p2m. */
> > +        modify_bars(pdev, false, false);
> 
> So modify_bars will eventually call defer_map in most cases (which I
> believe are the ones your care about here).
> 
> But then the following line sets vpci.pdev to NULL, which means the
> check in vpci_process_pending is false and modify_decoding is skipped.
> If that what you want?

Yes, after the call to teardown_bars the pdev can be removed, so the
deferred part of the unmap shouldn't rely on pdev being available.
That means that the memory decoding bit is not switched off. I would
expect that the device will get reset into a proper state (or in the
SR-IOV just disappear).

Trying to get the state of the device to the same state it was when
vpci was activated doesn't seem very feasible.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-06-29 15:27   ` Wei Liu
@ 2018-07-02  8:12     ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02  8:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Fri, Jun 29, 2018 at 04:27:08PM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> > So that a PCI device that supports SR-IOV (PF) can enable the capability
> > and use the virtual functions.
> > 
> > This code is expected to only be used by privileged domains,
> > unprivileged domains should not get access to the SR-IOV capability.
> > 
> > The current code detects enabling of the virtual functions feature and
> > automatically adds the VFs to the domain. It also detects enabling of
> > memory space and maps the VFs BARs into the domain p2m. Disabling of
> > the VF enable bit removes the devices and the BAR memory map from the
> > domain p2m.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> [...]
> >      int rc;
> > @@ -261,12 +273,29 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
> >          }
> >      }
> >  
> > +    /* Get the parent dev if it's a VF. */
> > +    if ( pdev->info.is_virtfn )
> > +    {
> > +        pcidevs_lock();
> > +        parent = pci_get_pdev(pdev->seg, pdev->info.physfn.bus,
> > +                              pdev->info.physfn.devfn);
> > +        pcidevs_unlock();
> > +    }
> > +
> >      /*
> >       * Check for overlaps with other BARs. Note that only BARs that are
> >       * currently mapped (enabled) are checked for overlaps.
> >       */
> >      list_for_each_entry(tmp, &pdev->domain->arch.pdev_list, domain_list)
> >      {
> > +        /*
> > +         * When mapping the BARs of a VF the parent PF is already locked,
> > +         * trying to lock it will result in a deadlock. This is because
> > +         * vpci_modify_bars is called from the parent PF control_write register
> > +         * handler.
> > +         */
> > +        bool lock = parent != tmp;
> 
> There is spin_lock_recursive. Would that work?

Yes, but I wasn't sure whether this single usage of recursiveness
would be enough to switch the lock to a recursive one. I can switch it
if there's consensus.

> [...]
> > +static int init_sriov(struct pci_dev *pdev)
> > +{
> > +    unsigned int pos = pci_find_ext_capability(pdev->seg, pdev->bus,
> > +                                               pdev->devfn,
> > +                                               PCI_EXT_CAP_ID_SRIOV);
> > +    uint16_t total_vfs;
> > +
> > +    if ( !pos )
> > +        return 0;
> > +
> > +    total_vfs = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                                PCI_FUNC(pdev->devfn),
> > +                                pos + PCI_SRIOV_TOTAL_VF);
> > +
> > +    pdev->vpci->sriov = xzalloc_bytes(SRIOV_SIZE(total_vfs));
> > +    if ( !pdev->vpci->sriov )
> > +        return -ENOMEM;
> > +
> > +    return vpci_add_register(pdev->vpci, vpci_hw_read16, control_write,
> > +                             pos + PCI_SRIOV_CTRL, 2, pdev->vpci->sriov);
> 
> If vpci_add_register fails sriov is going to be leaked? Or eventually
> that will cause the domain to crash in teardown.

sriov should be freed by vpci_remove_device, but I think I've missed
to add the free in there...

> I think it would make more sense if init_sriov is able to clean up after
> itself. MSI and MSI-X code has similar issue.

Yes, as said in previous patches I don't mind moving the freeing into
the teardown helpers (and also add frees in the init helpers in the
error cases).

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-07-02  7:54     ` Roger Pau Monné
@ 2018-07-02 13:55       ` Wei Liu
  2018-07-02 14:02         ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-07-02 13:55 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 09:54:20AM +0200, Roger Pau Monné wrote:
> On Fri, Jun 29, 2018 at 11:52:07AM +0100, Wei Liu wrote:
> > On Wed, Jun 20, 2018 at 04:42:28PM +0200, Roger Pau Monne wrote:
> > > So that interrupts are properly freed.
> > > 
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > ---
> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > Cc: Jan Beulich <jbeulich@suse.com>
> > > Cc: Julien Grall <julien.grall@arm.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > Cc: Tim Deegan <tim@xen.org>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > ---
> > >  xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 42 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> > > index 6132f576b6..cfca1cd43a 100644
> > > --- a/xen/drivers/vpci/msix.c
> > > +++ b/xen/drivers/vpci/msix.c
> > > @@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
> > >  
> > >      return 0;
> > >  }
> > > -REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
> > > +
> > > +static void teardown_msix(struct pci_dev *pdev)
> > > +{
> > > +    struct vpci_msix *msix = pdev->vpci->msix;
> > > +    unsigned int i;
> > > +
> > > +    if ( !msix )
> > > +        return;
> > > +
> > > +    if ( msix->enabled )
> > > +    {
> > > +        /* Disable MSIX. */
> > > +        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> > > +                                               PCI_SLOT(pdev->devfn),
> > > +                                               PCI_FUNC(pdev->devfn),
> > > +                                               PCI_CAP_ID_MSIX);
> > > +        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> > > +                                           PCI_SLOT(pdev->devfn),
> > > +                                           PCI_FUNC(pdev->devfn),
> > > +                                           msix_control_reg(pos));
> > > +
> > > +        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > +                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
> > > +                         (control & ~PCI_MSIX_FLAGS_ENABLE));
> > > +    }
> > > +
> > > +    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > +    list_del(&pdev->vpci->msix->next);
> > > +    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > +
> > > +    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )
> > 
> > Maybe lift checking msix->enabled outside of the loop? Afaict nothing
> > in the loop manipulates that flag.
> 
> I've done it that way to skip one indentation level, but I can put it
> outside if that's preferred.

I would say if you really don't want to indent one more level.

    if ( !msix->enabled )
        return;

    for ( ... ) { }

A possibly dumb question: why are the entries not freed as your disable
hardware MSIX? Can you reshuffle code to make that happen? Say, first
take the structure of the list then disable MSI-X and free all entries.

If that's not possible then I think better explanation is needed why the
code is structured like this, because it appears there is some very
intricate sequence required.

> 
> > > +    {
> > > +        int rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
> > > +
> > > +        if ( rc && rc != -ENOENT )
> > > +            gprintk(XENLOG_WARNING,
> > > +                    "%04x:%02x:%02x.%u: unable to disable MSIX entry %u: %d\n",
> > > +                    pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > +                    PCI_FUNC(pdev->devfn), i, rc);
> > > +    }
> > 
> > No freeing msix here?
> 
> msix is freed by vpci_remove_device. I can move the freeing inside of
> the teardown function now, since it makes it clearer.

Yeah, I think making it symmetric is much better. Memory allocation
happens in init_msix while deallocation happens in teardown_msix.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-07-02 13:55       ` Wei Liu
@ 2018-07-02 14:02         ` Roger Pau Monné
  2018-07-02 14:07           ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02 14:02 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 02:55:37PM +0100, Wei Liu wrote:
> On Mon, Jul 02, 2018 at 09:54:20AM +0200, Roger Pau Monné wrote:
> > On Fri, Jun 29, 2018 at 11:52:07AM +0100, Wei Liu wrote:
> > > On Wed, Jun 20, 2018 at 04:42:28PM +0200, Roger Pau Monne wrote:
> > > > So that interrupts are properly freed.
> > > > 
> > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > > ---
> > > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > > Cc: Jan Beulich <jbeulich@suse.com>
> > > > Cc: Julien Grall <julien.grall@arm.com>
> > > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > > Cc: Tim Deegan <tim@xen.org>
> > > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > > ---
> > > >  xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
> > > >  1 file changed, 42 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> > > > index 6132f576b6..cfca1cd43a 100644
> > > > --- a/xen/drivers/vpci/msix.c
> > > > +++ b/xen/drivers/vpci/msix.c
> > > > @@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
> > > >  
> > > >      return 0;
> > > >  }
> > > > -REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
> > > > +
> > > > +static void teardown_msix(struct pci_dev *pdev)
> > > > +{
> > > > +    struct vpci_msix *msix = pdev->vpci->msix;
> > > > +    unsigned int i;
> > > > +
> > > > +    if ( !msix )
> > > > +        return;
> > > > +
> > > > +    if ( msix->enabled )
> > > > +    {
> > > > +        /* Disable MSIX. */
> > > > +        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> > > > +                                               PCI_SLOT(pdev->devfn),
> > > > +                                               PCI_FUNC(pdev->devfn),
> > > > +                                               PCI_CAP_ID_MSIX);
> > > > +        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> > > > +                                           PCI_SLOT(pdev->devfn),
> > > > +                                           PCI_FUNC(pdev->devfn),
> > > > +                                           msix_control_reg(pos));
> > > > +
> > > > +        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > > +                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
> > > > +                         (control & ~PCI_MSIX_FLAGS_ENABLE));
> > > > +    }
> > > > +
> > > > +    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > > +    list_del(&pdev->vpci->msix->next);
> > > > +    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > > +
> > > > +    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )
> > > 
> > > Maybe lift checking msix->enabled outside of the loop? Afaict nothing
> > > in the loop manipulates that flag.
> > 
> > I've done it that way to skip one indentation level, but I can put it
> > outside if that's preferred.
> 
> I would say if you really don't want to indent one more level.
> 
>     if ( !msix->enabled )
>         return;
> 
>     for ( ... ) { }
> 
> A possibly dumb question: why are the entries not freed as your disable
> hardware MSIX?

I'm not sure I understand the question. You cannot free specific
entries, they are part of the msix struct (you have to free the whole
struct, not specific entries).

> Can you reshuffle code to make that happen? Say, first
> take the structure of the list then disable MSI-X and free all entries.
> 
> If that's not possible then I think better explanation is needed why the
> code is structured like this, because it appears there is some very
> intricate sequence required.

I can reshuffle the code so it's easier to parse, let me do that.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-07-02 14:02         ` Roger Pau Monné
@ 2018-07-02 14:07           ` Wei Liu
  2018-07-02 14:12             ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-07-02 14:07 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 04:02:44PM +0200, Roger Pau Monné wrote:
> On Mon, Jul 02, 2018 at 02:55:37PM +0100, Wei Liu wrote:
> > On Mon, Jul 02, 2018 at 09:54:20AM +0200, Roger Pau Monné wrote:
> > > On Fri, Jun 29, 2018 at 11:52:07AM +0100, Wei Liu wrote:
> > > > On Wed, Jun 20, 2018 at 04:42:28PM +0200, Roger Pau Monne wrote:
> > > > > So that interrupts are properly freed.
> > > > > 
> > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > > > ---
> > > > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > > > Cc: Jan Beulich <jbeulich@suse.com>
> > > > > Cc: Julien Grall <julien.grall@arm.com>
> > > > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Cc: Tim Deegan <tim@xen.org>
> > > > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > > > ---
> > > > >  xen/drivers/vpci/msix.c | 43 ++++++++++++++++++++++++++++++++++++++++-
> > > > >  1 file changed, 42 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> > > > > index 6132f576b6..cfca1cd43a 100644
> > > > > --- a/xen/drivers/vpci/msix.c
> > > > > +++ b/xen/drivers/vpci/msix.c
> > > > > @@ -450,7 +450,48 @@ static int init_msix(struct pci_dev *pdev)
> > > > >  
> > > > >      return 0;
> > > > >  }
> > > > > -REGISTER_VPCI_INIT(init_msix, NULL, VPCI_PRIORITY_HIGH);
> > > > > +
> > > > > +static void teardown_msix(struct pci_dev *pdev)
> > > > > +{
> > > > > +    struct vpci_msix *msix = pdev->vpci->msix;
> > > > > +    unsigned int i;
> > > > > +
> > > > > +    if ( !msix )
> > > > > +        return;
> > > > > +
> > > > > +    if ( msix->enabled )
> > > > > +    {
> > > > > +        /* Disable MSIX. */
> > > > > +        unsigned int pos = pci_find_cap_offset(pdev->seg, pdev->bus,
> > > > > +                                               PCI_SLOT(pdev->devfn),
> > > > > +                                               PCI_FUNC(pdev->devfn),
> > > > > +                                               PCI_CAP_ID_MSIX);
> > > > > +        uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> > > > > +                                           PCI_SLOT(pdev->devfn),
> > > > > +                                           PCI_FUNC(pdev->devfn),
> > > > > +                                           msix_control_reg(pos));
> > > > > +
> > > > > +        pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > > > +                         PCI_FUNC(pdev->devfn), msix_control_reg(pos),
> > > > > +                         (control & ~PCI_MSIX_FLAGS_ENABLE));
> > > > > +    }
> > > > > +
> > > > > +    write_lock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > > > +    list_del(&pdev->vpci->msix->next);
> > > > > +    write_unlock(&pdev->domain->arch.hvm_domain.msix_lock);
> > > > > +
> > > > > +    for ( i = 0; i < msix->max_entries && msix->enabled; i++ )
> > > > 
> > > > Maybe lift checking msix->enabled outside of the loop? Afaict nothing
> > > > in the loop manipulates that flag.
> > > 
> > > I've done it that way to skip one indentation level, but I can put it
> > > outside if that's preferred.
> > 
> > I would say if you really don't want to indent one more level.
> > 
> >     if ( !msix->enabled )
> >         return;
> > 
> >     for ( ... ) { }
> > 
> > A possibly dumb question: why are the entries not freed as your disable
> > hardware MSIX?
> 
> I'm not sure I understand the question. You cannot free specific
> entries, they are part of the msix struct (you have to free the whole
> struct, not specific entries).

I meant why isn't the code structured like:

    lock()
    take entry off list
    unlock()

    if ( msix->enabled )
    {
        /* Disable MSIX. */
        unsigned int pos = ...

	pci_confi_write16(...);

	for ( i = 0; i < msix->max_entries; i++ )
	{
	     ...
	}
    }

I guess that's what you're going to do anyway.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/10] vpci/msix: add teardown cleanup
  2018-07-02 14:07           ` Wei Liu
@ 2018-07-02 14:12             ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02 14:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 03:07:56PM +0100, Wei Liu wrote:
> On Mon, Jul 02, 2018 at 04:02:44PM +0200, Roger Pau Monné wrote:
> > I'm not sure I understand the question. You cannot free specific
> > entries, they are part of the msix struct (you have to free the whole
> > struct, not specific entries).
> 
> I meant why isn't the code structured like:
> 
>     lock()
>     take entry off list
>     unlock()
> 
>     if ( msix->enabled )
>     {
>         /* Disable MSIX. */
>         unsigned int pos = ...
> 
> 	pci_confi_write16(...);
> 
> 	for ( i = 0; i < msix->max_entries; i++ )
> 	{
> 	     ...
> 	}
>     }
> 
> I guess that's what you're going to do anyway.

Exactly, that's my plan, modulo:

lock()
take entry off list
unlock()

if ( !msix->enabled )
{
    free(msix);
    return;
}
...

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/10] vpci/header: add teardown cleanup
  2018-07-02  8:04     ` Roger Pau Monné
@ 2018-07-02 14:30       ` Wei Liu
  2018-07-02 14:42         ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-07-02 14:30 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 10:04:30AM +0200, Roger Pau Monné wrote:
> On Fri, Jun 29, 2018 at 12:15:34PM +0100, Wei Liu wrote:
> > On Wed, Jun 20, 2018 at 04:42:30PM +0200, Roger Pau Monne wrote:
> > > In order to unmap the BARs
> > > 
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > ---
> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > Cc: Jan Beulich <jbeulich@suse.com>
> > > Cc: Julien Grall <julien.grall@arm.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > Cc: Tim Deegan <tim@xen.org>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > ---
> > >  xen/drivers/vpci/header.c | 30 +++++++++++++++++++++++-------
> > >  1 file changed, 23 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> > > index 4363270a55..686e04e35a 100644
> > > --- a/xen/drivers/vpci/header.c
> > > +++ b/xen/drivers/vpci/header.c
> > > @@ -131,12 +131,15 @@ bool vpci_process_pending(struct vcpu *v)
> > >          if ( rc == -ERESTART )
> > >              return true;
> > >  
> > > -        spin_lock(&v->vpci.pdev->vpci_lock);
> > > -        if ( v->vpci.pdev->vpci )
> > > -            /* Disable memory decoding unconditionally on failure. */
> > > -            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > > -                            !rc && v->vpci.rom_only);
> > > -        spin_unlock(&v->vpci.pdev->vpci_lock);
> > > +        if ( v->vpci.pdev )
> > > +        {
> > > +            spin_lock(&v->vpci.pdev->vpci_lock);
> > > +            if ( v->vpci.pdev->vpci )
> > > +                /* Disable memory decoding unconditionally on failure. */
> > > +                modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > > +                                !rc && v->vpci.rom_only);
> > > +            spin_unlock(&v->vpci.pdev->vpci_lock);
> > > +        }
> > >  
> > >          rangeset_destroy(v->vpci.mem);
> > >          v->vpci.mem = NULL;
> > > @@ -560,7 +563,20 @@ static int init_bars(struct pci_dev *pdev)
> > >  
> > >      return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
> > >  }
> > > -REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
> > > +
> > > +static void teardown_bars(struct pci_dev *pdev)
> > > +{
> > > +    uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > +                                   PCI_FUNC(pdev->devfn), PCI_COMMAND);
> > > +
> > > +    if ( cmd & PCI_COMMAND_MEMORY )
> > > +    {
> > > +        /* Unmap all BARs from guest p2m. */
> > > +        modify_bars(pdev, false, false);
> > 
> > So modify_bars will eventually call defer_map in most cases (which I
> > believe are the ones your care about here).
> > 
> > But then the following line sets vpci.pdev to NULL, which means the
> > check in vpci_process_pending is false and modify_decoding is skipped.
> > If that what you want?
> 
> Yes, after the call to teardown_bars the pdev can be removed, so the
> deferred part of the unmap shouldn't rely on pdev being available.
> That means that the memory decoding bit is not switched off. I would
> expect that the device will get reset into a proper state (or in the
> SR-IOV just disappear).
> 

Which reset are you referring to? I suppose you mean VF FLR?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/10] vpci/header: add teardown cleanup
  2018-07-02 14:30       ` Wei Liu
@ 2018-07-02 14:42         ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-02 14:42 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Mon, Jul 02, 2018 at 03:30:19PM +0100, Wei Liu wrote:
> On Mon, Jul 02, 2018 at 10:04:30AM +0200, Roger Pau Monné wrote:
> > On Fri, Jun 29, 2018 at 12:15:34PM +0100, Wei Liu wrote:
> > > On Wed, Jun 20, 2018 at 04:42:30PM +0200, Roger Pau Monne wrote:
> > > > In order to unmap the BARs
> > > > 
> > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > > ---
> > > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > > Cc: Jan Beulich <jbeulich@suse.com>
> > > > Cc: Julien Grall <julien.grall@arm.com>
> > > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > > Cc: Tim Deegan <tim@xen.org>
> > > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > > ---
> > > >  xen/drivers/vpci/header.c | 30 +++++++++++++++++++++++-------
> > > >  1 file changed, 23 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> > > > index 4363270a55..686e04e35a 100644
> > > > --- a/xen/drivers/vpci/header.c
> > > > +++ b/xen/drivers/vpci/header.c
> > > > @@ -131,12 +131,15 @@ bool vpci_process_pending(struct vcpu *v)
> > > >          if ( rc == -ERESTART )
> > > >              return true;
> > > >  
> > > > -        spin_lock(&v->vpci.pdev->vpci_lock);
> > > > -        if ( v->vpci.pdev->vpci )
> > > > -            /* Disable memory decoding unconditionally on failure. */
> > > > -            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > > > -                            !rc && v->vpci.rom_only);
> > > > -        spin_unlock(&v->vpci.pdev->vpci_lock);
> > > > +        if ( v->vpci.pdev )
> > > > +        {
> > > > +            spin_lock(&v->vpci.pdev->vpci_lock);
> > > > +            if ( v->vpci.pdev->vpci )
> > > > +                /* Disable memory decoding unconditionally on failure. */
> > > > +                modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> > > > +                                !rc && v->vpci.rom_only);
> > > > +            spin_unlock(&v->vpci.pdev->vpci_lock);
> > > > +        }
> > > >  
> > > >          rangeset_destroy(v->vpci.mem);
> > > >          v->vpci.mem = NULL;
> > > > @@ -560,7 +563,20 @@ static int init_bars(struct pci_dev *pdev)
> > > >  
> > > >      return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
> > > >  }
> > > > -REGISTER_VPCI_INIT(init_bars, NULL, VPCI_PRIORITY_MIDDLE);
> > > > +
> > > > +static void teardown_bars(struct pci_dev *pdev)
> > > > +{
> > > > +    uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > > +                                   PCI_FUNC(pdev->devfn), PCI_COMMAND);
> > > > +
> > > > +    if ( cmd & PCI_COMMAND_MEMORY )
> > > > +    {
> > > > +        /* Unmap all BARs from guest p2m. */
> > > > +        modify_bars(pdev, false, false);
> > > 
> > > So modify_bars will eventually call defer_map in most cases (which I
> > > believe are the ones your care about here).
> > > 
> > > But then the following line sets vpci.pdev to NULL, which means the
> > > check in vpci_process_pending is false and modify_decoding is skipped.
> > > If that what you want?
> > 
> > Yes, after the call to teardown_bars the pdev can be removed, so the
> > deferred part of the unmap shouldn't rely on pdev being available.
> > That means that the memory decoding bit is not switched off. I would
> > expect that the device will get reset into a proper state (or in the
> > SR-IOV just disappear).
> > 
> 
> Which reset are you referring to? I suppose you mean VF FLR?

Whatever reset procedure is used is ATM out of the scope of this
patch. For example in the VF case the header teardown code will be
used when SR-IOV is disabled, and thus the VFs just disappear. If
SR-IOV is enabled afterwards the state of the VFs is clean.

For other devices (or when not disabling SR-IOV), then yes, some form
of reset (either FLR, PM or secondary bus reset) should be performed
before assigning the device again to a guest.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-06-20 14:42 ` [PATCH 10/10] vpci/sriov: add support for SR-IOV capability Roger Pau Monne
  2018-06-29 15:27   ` Wei Liu
@ 2018-07-03  9:27   ` Wei Liu
  2018-07-03 10:21     ` Roger Pau Monné
  1 sibling, 1 reply; 43+ messages in thread
From: Wei Liu @ 2018-07-03  9:27 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> So that a PCI device that supports SR-IOV (PF) can enable the capability
> and use the virtual functions.
> 
> This code is expected to only be used by privileged domains,
> unprivileged domains should not get access to the SR-IOV capability.
> 
> The current code detects enabling of the virtual functions feature and
> automatically adds the VFs to the domain. It also detects enabling of
> memory space and maps the VFs BARs into the domain p2m. Disabling of
> the VF enable bit removes the devices and the BAR memory map from the
> domain p2m.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
[...]
> +
> +static void modify_memory_mapping(const struct pci_dev *pdev, unsigned int pos,
> +                                  bool enable)
> +{
> +    const struct vpci_sriov *sriov = pdev->vpci->sriov;
> +    unsigned int i;
> +    int rc;
> +
> +    if ( enable )
> +    {
> +        struct pci_dev *pf_dev;
> +
> +        pcidevs_lock();
> +        /*
> +         * NB: a non-const pci_dev of the PF is needed in order to update
> +         * vf_rlen.
> +         */
> +        pf_dev = pci_get_pdev(pdev->seg, pdev->bus, pdev->devfn);
> +        pcidevs_unlock();
> +        ASSERT(pf_dev);
> +
> +        /* Set the BARs addresses and size. */
> +        for ( i = 0; i < PCI_SRIOV_NUM_BARS; i += rc )
> +        {
> +            unsigned int j, idx = pos + PCI_SRIOV_BAR + i * 4;
> +            const pci_sbdf_t sbdf = {
> +                .sbdf = PCI_SBDF3(pdev->seg, pdev->bus, pdev->devfn),
> +            };
> +            uint32_t bar = pci_conf_read32(pdev->seg, pdev->bus,
> +                                           PCI_SLOT(pdev->devfn),
> +                                           PCI_FUNC(pdev->devfn), idx);
> +            uint64_t addr, size;
> +
> +            rc = pci_size_mem_bar(sbdf, idx, &addr, &size,
> +                                  PCI_BAR_VF |
> +                                  ((i == PCI_SRIOV_NUM_BARS - 1) ?
> +                                   PCI_BAR_LAST : 0));

This only returns 1 or 2. The return type is unsigned int which means rc
is never going to be <= 0.

> +            if ( rc <= 0 )
> +            {
> +                gprintk(XENLOG_ERR,
> +                        "%04x:%02x:%02x.%u: failed to size VF BAR\n",
> +                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                        PCI_FUNC(pdev->devfn));
> +                domain_crash(pdev->domain);
> +                return;
> +            }
> +
> +            /*
> +             * Update vf_rlen on the PF. According to the spec the size of
> +             * the BARs can change if the system page size register is
> +             * modified, so always update rlen when enabling VFs.
> +             */
> +            pf_dev->vf_rlen[i] = size;
> +
> +            for ( j = 0; j < sriov->num_vfs; j++ )
> +            {
> +                struct vpci_header *header;
> +
> +                if ( !sriov->vf[j] )
> +                    /* Can happen if pci_add_device fails. */
> +                    continue;
> +
> +                spin_lock(&sriov->vf[j]->vpci_lock);
> +                header = &sriov->vf[j]->vpci->header;
> +
> +                if ( !size )
> +                {
> +                    header->bars[i].type = VPCI_BAR_EMPTY;
> +                    spin_unlock(&sriov->vf[j]->vpci_lock);
> +                    continue;
> +                }
> +
> +                header->bars[i].addr = addr + size * j;
> +                header->bars[i].size = size;
> +                header->bars[i].prefetchable =
> +                    bar & PCI_BASE_ADDRESS_MEM_PREFETCH;
> +
> +                switch ( rc )
> +                {
> +                case 1:
> +                    header->bars[i].type = VPCI_BAR_MEM32;
> +                    break;
> +
> +                case 2:
> +                    header->bars[i].type = VPCI_BAR_MEM64_LO;
> +                    header->bars[i + 1].type = VPCI_BAR_MEM64_HI;
> +                    break;
> +
> +                default:
> +                    ASSERT_UNREACHABLE();
> +                    spin_unlock(&sriov->vf[j]->vpci_lock);
> +                    domain_crash(pdev->domain);
> +                    return;
> +                }
> +                spin_unlock(&sriov->vf[j]->vpci_lock);
> +            }
> +        }
> +    }
> +
> +    /* Add/remove mappings for the VFs BARs into the p2m. */
> +    for ( i = 0; i < sriov->num_vfs; i++ )
> +    {
> +        struct pci_dev *vf_pdev = sriov->vf[i];
> +
> +        spin_lock(&vf_pdev->vpci_lock);
> +        rc = vpci_modify_bars(vf_pdev, enable, false);
> +        spin_unlock(&vf_pdev->vpci_lock);
> +        if ( rc )
> +            gprintk(XENLOG_ERR,
> +                    "failed to %smap BARs of VF %04x:%02x:%02x.%u: %d\n",
> +                    enable ? "" : "un", vf_pdev->seg, vf_pdev->bus,
> +                    PCI_SLOT(vf_pdev->devfn), PCI_FUNC(vf_pdev->devfn), rc);
> +    }
> +}
> +
> +static void control_write(const struct pci_dev *pdev, unsigned int reg,
> +                          uint32_t val, void *data)
> +{
> +    struct vpci_sriov *sriov = data;
> +    unsigned int i, pos = reg - PCI_SRIOV_CTRL;
> +    uint16_t control = pci_conf_read16(pdev->seg, pdev->bus,
> +                                       PCI_SLOT(pdev->devfn),
> +                                       PCI_FUNC(pdev->devfn),
> +                                       pos + PCI_SRIOV_CTRL);
> +    bool enabled = control & PCI_SRIOV_CTRL_VFE;
> +    bool mem_enabled = control & PCI_SRIOV_CTRL_MSE;
> +    bool new_enabled = val & PCI_SRIOV_CTRL_VFE;
> +    bool new_mem_enabled = val & PCI_SRIOV_CTRL_MSE;
> +    int rc;
> +
> +    if ( new_enabled != enabled )
> +    {
> +        uint16_t offset = pci_conf_read16(pdev->seg, pdev->bus,
> +                                          PCI_SLOT(pdev->devfn),
> +                                          PCI_FUNC(pdev->devfn),
> +                                          pos + PCI_SRIOV_VF_OFFSET);
> +        uint16_t stride = pci_conf_read16(pdev->seg, pdev->bus,
> +                                          PCI_SLOT(pdev->devfn),
> +                                          PCI_FUNC(pdev->devfn),
> +                                          pos + PCI_SRIOV_VF_STRIDE);
> +
> +        if ( new_enabled )
> +        {
> +            /*
> +             * Only update the number of active VFs when enabling, when
> +             * disabling use the cached value in order to always remove the
> +             * same number of VFs that where active.

where -> were.

> +             */
> +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> +                                             PCI_SLOT(pdev->devfn),
> +                                             PCI_FUNC(pdev->devfn),
> +                                             pos + PCI_SRIOV_NUM_VF);
> +
> +            /*
> +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> +             * can access the config space of VFs.
> +             */
> +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                             PCI_FUNC(pdev->devfn), reg,
> +                             control | PCI_SRIOV_CTRL_VFE);
> +
> +            /*
> +             * The spec states that the software must wait at least 100ms
> +             * before attempting to access VF registers when enabling virtual
> +             * functions on the PF.
> +             */
> +            mdelay(100);

IMHO delaying 100ms like this in an active system is far too long. It
would be better to put this into a loop and process softirqs in between
delays.

> +        }
> +
> +        for ( i = 0; i < sriov->num_vfs; i++ )
> +        {
> +            const pci_sbdf_t bdf = {
> +                .bdf = PCI_BDF2(pdev->bus, pdev->devfn) + offset + stride * i,
> +            };
> +
> +            if ( new_enabled )
> +            {
> +                const struct pci_dev_info info = {
> +                    .is_virtfn = true,
> +                    .physfn.bus = pdev->bus,
> +                    .physfn.devfn = pdev->devfn,
> +                };
> +
> +                rc = pci_add_device(pdev->seg, bdf.bus, bdf.extfunc, &info,
> +                                    pdev->node);
> +            }
> +            else
> +                rc = pci_remove_device(pdev->seg, bdf.bus, bdf.extfunc);
> +            if ( rc )
> +                gprintk(XENLOG_ERR, "failed to %s VF %04x:%02x:%02x.%u: %d\n",
> +                        new_enabled ? "add" : "remove", pdev->seg, bdf.bus,
> +                        bdf.dev, bdf.func, rc);
> +
> +            pcidevs_lock();
> +            sriov->vf[i] = pci_get_pdev(pdev->seg, bdf.bus, bdf.extfunc);
> +            pcidevs_unlock();

So the array is updated regardless of rc?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03  9:27   ` Wei Liu
@ 2018-07-03 10:21     ` Roger Pau Monné
  2018-07-03 10:48       ` Jan Beulich
  2018-07-03 10:54       ` Wei Liu
  0 siblings, 2 replies; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-03 10:21 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Tue, Jul 03, 2018 at 10:27:59AM +0100, Wei Liu wrote:
> On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> > +        /* Set the BARs addresses and size. */
> > +        for ( i = 0; i < PCI_SRIOV_NUM_BARS; i += rc )
> > +        {
> > +            unsigned int j, idx = pos + PCI_SRIOV_BAR + i * 4;
> > +            const pci_sbdf_t sbdf = {
> > +                .sbdf = PCI_SBDF3(pdev->seg, pdev->bus, pdev->devfn),
> > +            };
> > +            uint32_t bar = pci_conf_read32(pdev->seg, pdev->bus,
> > +                                           PCI_SLOT(pdev->devfn),
> > +                                           PCI_FUNC(pdev->devfn), idx);
> > +            uint64_t addr, size;
> > +
> > +            rc = pci_size_mem_bar(sbdf, idx, &addr, &size,
> > +                                  PCI_BAR_VF |
> > +                                  ((i == PCI_SRIOV_NUM_BARS - 1) ?
> > +                                   PCI_BAR_LAST : 0));
> 
> This only returns 1 or 2. The return type is unsigned int which means rc
> is never going to be <= 0.

Right... I will replace the if with an ASSERT(rc > 0 && rc <= 2);

> > +             */
> > +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> > +                                             PCI_SLOT(pdev->devfn),
> > +                                             PCI_FUNC(pdev->devfn),
> > +                                             pos + PCI_SRIOV_NUM_VF);
> > +
> > +            /*
> > +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> > +             * can access the config space of VFs.
> > +             */
> > +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +                             PCI_FUNC(pdev->devfn), reg,
> > +                             control | PCI_SRIOV_CTRL_VFE);
> > +
> > +            /*
> > +             * The spec states that the software must wait at least 100ms
> > +             * before attempting to access VF registers when enabling virtual
> > +             * functions on the PF.
> > +             */
> > +            mdelay(100);
> 
> IMHO delaying 100ms like this in an active system is far too long. It
> would be better to put this into a loop and process softirqs in between
> delays.

Ack, do you think 10ms delays would be OK?

> > +        }
> > +
> > +        for ( i = 0; i < sriov->num_vfs; i++ )
> > +        {
> > +            const pci_sbdf_t bdf = {
> > +                .bdf = PCI_BDF2(pdev->bus, pdev->devfn) + offset + stride * i,
> > +            };
> > +
> > +            if ( new_enabled )
> > +            {
> > +                const struct pci_dev_info info = {
> > +                    .is_virtfn = true,
> > +                    .physfn.bus = pdev->bus,
> > +                    .physfn.devfn = pdev->devfn,
> > +                };
> > +
> > +                rc = pci_add_device(pdev->seg, bdf.bus, bdf.extfunc, &info,
> > +                                    pdev->node);
> > +            }
> > +            else
> > +                rc = pci_remove_device(pdev->seg, bdf.bus, bdf.extfunc);
> > +            if ( rc )
> > +                gprintk(XENLOG_ERR, "failed to %s VF %04x:%02x:%02x.%u: %d\n",
> > +                        new_enabled ? "add" : "remove", pdev->seg, bdf.bus,
> > +                        bdf.dev, bdf.func, rc);
> > +
> > +            pcidevs_lock();
> > +            sriov->vf[i] = pci_get_pdev(pdev->seg, bdf.bus, bdf.extfunc);
> > +            pcidevs_unlock();
> 
> So the array is updated regardless of rc?

Yes, in the addition case the code is capable of dealing with a NULL
entry in the array (or that was the intention). In the case of a
failed removal an entry in the array won't be NULL, but I don't think
that's an issue.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 10:21     ` Roger Pau Monné
@ 2018-07-03 10:48       ` Jan Beulich
  2018-07-03 10:51         ` Roger Pau Monné
  2018-07-03 10:54       ` Wei Liu
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2018-07-03 10:48 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 03.07.18 at 12:21, <roger.pau@citrix.com> wrote:
> On Tue, Jul 03, 2018 at 10:27:59AM +0100, Wei Liu wrote:
>> On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
>> > +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
>> > +                                             PCI_SLOT(pdev->devfn),
>> > +                                             PCI_FUNC(pdev->devfn),
>> > +                                             pos + PCI_SRIOV_NUM_VF);
>> > +
>> > +            /*
>> > +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
>> > +             * can access the config space of VFs.
>> > +             */
>> > +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
>> > +                             PCI_FUNC(pdev->devfn), reg,
>> > +                             control | PCI_SRIOV_CTRL_VFE);
>> > +
>> > +            /*
>> > +             * The spec states that the software must wait at least 100ms
>> > +             * before attempting to access VF registers when enabling virtual
>> > +             * functions on the PF.
>> > +             */
>> > +            mdelay(100);
>> 
>> IMHO delaying 100ms like this in an active system is far too long. It
>> would be better to put this into a loop and process softirqs in between
>> delays.
> 
> Ack, do you think 10ms delays would be OK?

I haven't checked yet where we are here - generally such delays would
better be implemented by pausing the caller and resuming it once the
necessary time has passed. While in Dom0 context it might be acceptable
to do what Wei and you suggest, that'll be yet one more thing to deal
with once we want to re-use the code for DomU-s.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 10:48       ` Jan Beulich
@ 2018-07-03 10:51         ` Roger Pau Monné
  2018-07-03 10:56           ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-03 10:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

On Tue, Jul 03, 2018 at 04:48:19AM -0600, Jan Beulich wrote:
> >>> On 03.07.18 at 12:21, <roger.pau@citrix.com> wrote:
> > On Tue, Jul 03, 2018 at 10:27:59AM +0100, Wei Liu wrote:
> >> On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> >> > +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> >> > +                                             PCI_SLOT(pdev->devfn),
> >> > +                                             PCI_FUNC(pdev->devfn),
> >> > +                                             pos + PCI_SRIOV_NUM_VF);
> >> > +
> >> > +            /*
> >> > +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> >> > +             * can access the config space of VFs.
> >> > +             */
> >> > +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> >> > +                             PCI_FUNC(pdev->devfn), reg,
> >> > +                             control | PCI_SRIOV_CTRL_VFE);
> >> > +
> >> > +            /*
> >> > +             * The spec states that the software must wait at least 100ms
> >> > +             * before attempting to access VF registers when enabling virtual
> >> > +             * functions on the PF.
> >> > +             */
> >> > +            mdelay(100);
> >> 
> >> IMHO delaying 100ms like this in an active system is far too long. It
> >> would be better to put this into a loop and process softirqs in between
> >> delays.
> > 
> > Ack, do you think 10ms delays would be OK?
> 
> I haven't checked yet where we are here - generally such delays would
> better be implemented by pausing the caller and resuming it once the
> necessary time has passed. While in Dom0 context it might be acceptable
> to do what Wei and you suggest, that'll be yet one more thing to deal
> with once we want to re-use the code for DomU-s.

I don't think we ever want to allow DomU-s to manage the SR-IOV
capability, so this code is always going to be Dom0 only AFAICT. In
fact I think I'm going to add an assert to that effect in the SR-IOV
init handler.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 10:21     ` Roger Pau Monné
  2018-07-03 10:48       ` Jan Beulich
@ 2018-07-03 10:54       ` Wei Liu
  1 sibling, 0 replies; 43+ messages in thread
From: Wei Liu @ 2018-07-03 10:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Tue, Jul 03, 2018 at 12:21:08PM +0200, Roger Pau Monné wrote:
> On Tue, Jul 03, 2018 at 10:27:59AM +0100, Wei Liu wrote:
> > On Wed, Jun 20, 2018 at 04:42:34PM +0200, Roger Pau Monne wrote:
> > > +        /* Set the BARs addresses and size. */
> > > +        for ( i = 0; i < PCI_SRIOV_NUM_BARS; i += rc )
> > > +        {
> > > +            unsigned int j, idx = pos + PCI_SRIOV_BAR + i * 4;
> > > +            const pci_sbdf_t sbdf = {
> > > +                .sbdf = PCI_SBDF3(pdev->seg, pdev->bus, pdev->devfn),
> > > +            };
> > > +            uint32_t bar = pci_conf_read32(pdev->seg, pdev->bus,
> > > +                                           PCI_SLOT(pdev->devfn),
> > > +                                           PCI_FUNC(pdev->devfn), idx);
> > > +            uint64_t addr, size;
> > > +
> > > +            rc = pci_size_mem_bar(sbdf, idx, &addr, &size,
> > > +                                  PCI_BAR_VF |
> > > +                                  ((i == PCI_SRIOV_NUM_BARS - 1) ?
> > > +                                   PCI_BAR_LAST : 0));
> > 
> > This only returns 1 or 2. The return type is unsigned int which means rc
> > is never going to be <= 0.
> 
> Right... I will replace the if with an ASSERT(rc > 0 && rc <= 2);
> 
> > > +             */
> > > +            sriov->num_vfs = pci_conf_read16(pdev->seg, pdev->bus,
> > > +                                             PCI_SLOT(pdev->devfn),
> > > +                                             PCI_FUNC(pdev->devfn),
> > > +                                             pos + PCI_SRIOV_NUM_VF);
> > > +
> > > +            /*
> > > +             * NB: VFE needs to be enabled before calling pci_add_device so Xen
> > > +             * can access the config space of VFs.
> > > +             */
> > > +            pci_conf_write16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > > +                             PCI_FUNC(pdev->devfn), reg,
> > > +                             control | PCI_SRIOV_CTRL_VFE);
> > > +
> > > +            /*
> > > +             * The spec states that the software must wait at least 100ms
> > > +             * before attempting to access VF registers when enabling virtual
> > > +             * functions on the PF.
> > > +             */
> > > +            mdelay(100);
> > 
> > IMHO delaying 100ms like this in an active system is far too long. It
> > would be better to put this into a loop and process softirqs in between
> > delays.
> 
> Ack, do you think 10ms delays would be OK?

I think that's fine. But it would be better to have some input from Arm
folks since this could impact their real-time constraint.

This isn't a hot path so I think having shorter interval (more calls to
process_pending_softirqs) isn't going to be a problem.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 10:51         ` Roger Pau Monné
@ 2018-07-03 10:56           ` Jan Beulich
  2018-07-03 11:07             ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2018-07-03 10:56 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 03.07.18 at 12:51, <roger.pau@citrix.com> wrote:
> I don't think we ever want to allow DomU-s to manage the SR-IOV
> capability, so this code is always going to be Dom0 only AFAICT. In
> fact I think I'm going to add an assert to that effect in the SR-IOV
> init handler.

Did you consider nested virt here?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 10:56           ` Jan Beulich
@ 2018-07-03 11:07             ` Roger Pau Monné
  2018-07-03 11:47               ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2018-07-03 11:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

On Tue, Jul 03, 2018 at 04:56:38AM -0600, Jan Beulich wrote:
> >>> On 03.07.18 at 12:51, <roger.pau@citrix.com> wrote:
> > I don't think we ever want to allow DomU-s to manage the SR-IOV
> > capability, so this code is always going to be Dom0 only AFAICT. In
> > fact I think I'm going to add an assert to that effect in the SR-IOV
> > init handler.
> 
> Did you consider nested virt here?

I can see this making more sense for nested virt, still it's going to
require a non-trivial amount of work to make SR-IOV safe for DomUs to
manage, not to mention that DomUs would then be able to make pci
devices (VFs) appear and disappear, and likely collide with existing
devices?

In any case, I can see about pausing the vcpu and resuming after the
timeout, doesn't seem that complex.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/10] vpci/sriov: add support for SR-IOV capability
  2018-07-03 11:07             ` Roger Pau Monné
@ 2018-07-03 11:47               ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2018-07-03 11:47 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 03.07.18 at 13:07, <roger.pau@citrix.com> wrote:
> On Tue, Jul 03, 2018 at 04:56:38AM -0600, Jan Beulich wrote:
>> >>> On 03.07.18 at 12:51, <roger.pau@citrix.com> wrote:
>> > I don't think we ever want to allow DomU-s to manage the SR-IOV
>> > capability, so this code is always going to be Dom0 only AFAICT. In
>> > fact I think I'm going to add an assert to that effect in the SR-IOV
>> > init handler.
>> 
>> Did you consider nested virt here?
> 
> I can see this making more sense for nested virt, still it's going to
> require a non-trivial amount of work to make SR-IOV safe for DomUs to
> manage, not to mention that DomUs would then be able to make pci
> devices (VFs) appear and disappear, and likely collide with existing
> devices?

Right, avoiding such collisions would be one thing to address.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] vpci: move lock
  2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
  2018-06-29 10:19   ` Wei Liu
@ 2021-11-23 12:20   ` Oleksandr Andrushchenko
  1 sibling, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23 12:20 UTC (permalink / raw)
  To: Roger Pau Monne, Jan Beulich
  Cc: Stefano Stabellini, xen-devel, Julien Grall, Rahul Singh,
	Bertrand Marquis, Artem Mygaiev

Hi, Roger, Jan!

I think the below work will help with solving issues brought up by [1].

So, after thinking a bit more I concluded that indeed we do not actually
need a per-domain vPCI lock, but instead we just want to protect pdev->vpci
which is done in this patch.

At the moment I see four entities which may run concurrently and touch vpci:
1. hypercalls PHYSDEVOP_pci_device_{add|remove} - for Dom0 only
2. hypercalls XEN_DOMCTL_{de|}assign_device - any domain
3. MMIO trap handlers vpci_{read|write}
4. vpci_process_pending

 From the above #1 will update pdev->vpci and #4 may remove pdev->vpci
with vpci_remove_device on error path for Dom0. #2 and #3 seem to be
readers only. So, it is possible to improve this patch to not only take
pdev->vpci->lock out of struct vpci, but also to convert it into RW lock.

Hope this is what is needed in context of vpci and locking.

Thank you,
Oleksandr

On 20.06.18 17:42, Roger Pau Monne wrote:
> To the outside of the vpci struct. This way the lock can be used to
> check whether vpci is present, and removal can be performed while
> holding the lock, in order to make sure there are no accesses to the
> contents of the vpci struct. Previously removal could race with
> vpci_read for example, since the log was dropped prior to freeing
> pdev->vpci.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>   tools/tests/vpci/emul.h       |  5 ++--
>   tools/tests/vpci/main.c       |  4 +--
>   xen/arch/x86/hvm/vmsi.c       |  8 +++---
>   xen/drivers/passthrough/pci.c |  1 +
>   xen/drivers/vpci/header.c     | 19 +++++++++----
>   xen/drivers/vpci/msi.c        | 11 +++++---
>   xen/drivers/vpci/msix.c       |  8 +++---
>   xen/drivers/vpci/vpci.c       | 51 ++++++++++++++++++++++-------------
>   xen/include/xen/pci.h         |  1 +
>   xen/include/xen/vpci.h        |  3 +--
>   10 files changed, 70 insertions(+), 41 deletions(-)
>
> diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
> index 5d47544bf7..d344ef71c9 100644
> --- a/tools/tests/vpci/emul.h
> +++ b/tools/tests/vpci/emul.h
> @@ -44,6 +44,7 @@ struct domain {
>   };
>   
>   struct pci_dev {
> +    bool vpci_lock;
>       struct vpci *vpci;
>   };
>   
> @@ -53,10 +54,8 @@ struct vcpu
>   };
>   
>   extern const struct vcpu *current;
> -extern const struct pci_dev test_pdev;
> +extern struct pci_dev test_pdev;
>   
> -typedef bool spinlock_t;
> -#define spin_lock_init(l) (*(l) = false)
>   #define spin_lock(l) (*(l) = true)
>   #define spin_unlock(l) (*(l) = false)
>   
> diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
> index b9a0a6006b..26c95b08b6 100644
> --- a/tools/tests/vpci/main.c
> +++ b/tools/tests/vpci/main.c
> @@ -23,7 +23,8 @@ static struct vpci vpci;
>   
>   const static struct domain d;
>   
> -const struct pci_dev test_pdev = {
> +struct pci_dev test_pdev = {
> +    .vpci_lock = false,
>       .vpci = &vpci,
>   };
>   
> @@ -158,7 +159,6 @@ main(int argc, char **argv)
>       int rc;
>   
>       INIT_LIST_HEAD(&vpci.handlers);
> -    spin_lock_init(&vpci.lock);
>   
>       VPCI_ADD_REG(vpci_read32, vpci_write32, 0, 4, r0);
>       VPCI_READ_CHECK(0, 4, r0);
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 3001d5c488..94550cb8c4 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -893,14 +893,14 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
>           {
>               struct pci_dev *pdev = msix->pdev;
>   
> -            spin_unlock(&msix->pdev->vpci->lock);
> +            spin_unlock(&msix->pdev->vpci_lock);
>               process_pending_softirqs();
>               /* NB: we assume that pdev cannot go away for an alive domain. */
> -            if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
> +            if ( !spin_trylock(&pdev->vpci_lock) )
>                   return -EBUSY;
> -            if ( pdev->vpci->msix != msix )
> +            if ( !pdev->vpci || pdev->vpci->msix != msix )
>               {
> -                spin_unlock(&pdev->vpci->lock);
> +                spin_unlock(&pdev->vpci_lock);
>                   return -EAGAIN;
>               }
>           }
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index c4890a4295..a5d59b83b7 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -315,6 +315,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>       *((u8*) &pdev->devfn) = devfn;
>       pdev->domain = NULL;
>       INIT_LIST_HEAD(&pdev->msi_list);
> +    spin_lock_init(&pdev->vpci_lock);
>   
>       if ( pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>                                PCI_CAP_ID_MSIX) )
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 0ec4c082a6..9d5607d5f8 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -131,11 +131,12 @@ bool vpci_process_pending(struct vcpu *v)
>           if ( rc == -ERESTART )
>               return true;
>   
> -        spin_lock(&v->vpci.pdev->vpci->lock);
> -        /* Disable memory decoding unconditionally on failure. */
> -        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> -                        !rc && v->vpci.rom_only);
> -        spin_unlock(&v->vpci.pdev->vpci->lock);
> +        spin_lock(&v->vpci.pdev->vpci_lock);
> +        if ( v->vpci.pdev->vpci )
> +            /* Disable memory decoding unconditionally on failure. */
> +            modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> +                            !rc && v->vpci.rom_only);
> +        spin_unlock(&v->vpci.pdev->vpci_lock);
>   
>           rangeset_destroy(v->vpci.mem);
>           v->vpci.mem = NULL;
> @@ -267,6 +268,12 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
>                   continue;
>           }
>   
> +        spin_lock(&tmp->vpci_lock);
> +        if ( !tmp->vpci )
> +        {
> +            spin_unlock(&tmp->vpci_lock);
> +            continue;
> +        }
>           for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
>           {
>               const struct vpci_bar *bar = &tmp->vpci->header.bars[i];
> @@ -285,12 +292,14 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
>               rc = rangeset_remove_range(mem, start, end);
>               if ( rc )
>               {
> +                spin_unlock(&tmp->vpci_lock);
>                   printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
>                          start, end, rc);
>                   rangeset_destroy(mem);
>                   return rc;
>               }
>           }
> +        spin_unlock(&tmp->vpci_lock);
>       }
>   
>       ASSERT(dev);
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 8f15ad7bf2..108e871d1c 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -275,7 +275,7 @@ void vpci_dump_msi(void)
>       rcu_read_lock(&domlist_read_lock);
>       for_each_domain ( d )
>       {
> -        const struct pci_dev *pdev;
> +        struct pci_dev *pdev;
>   
>           if ( !has_vpci(d) )
>               continue;
> @@ -287,8 +287,13 @@ void vpci_dump_msi(void)
>               const struct vpci_msi *msi;
>               const struct vpci_msix *msix;
>   
> -            if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
> +            if ( !spin_trylock(&pdev->vpci_lock) )
>                   continue;
> +            if ( !pdev->vpci )
> +            {
> +                spin_unlock(&pdev->vpci_lock);
> +                continue;
> +            }
>   
>               msi = pdev->vpci->msi;
>               if ( msi && msi->enabled )
> @@ -330,7 +335,7 @@ void vpci_dump_msi(void)
>                   }
>               }
>   
> -            spin_unlock(&pdev->vpci->lock);
> +            spin_unlock(&pdev->vpci_lock);
>               process_pending_softirqs();
>           }
>       }
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index bcf63256f6..e28096329d 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -236,7 +236,7 @@ static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
>           return X86EMUL_OKAY;
>       }
>   
> -    spin_lock(&msix->pdev->vpci->lock);
> +    spin_lock(&msix->pdev->vpci_lock);
>       entry = get_entry(msix, addr);
>       offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
>   
> @@ -265,7 +265,7 @@ static int msix_read(struct vcpu *v, unsigned long addr, unsigned int len,
>           ASSERT_UNREACHABLE();
>           break;
>       }
> -    spin_unlock(&msix->pdev->vpci->lock);
> +    spin_unlock(&msix->pdev->vpci_lock);
>   
>       return X86EMUL_OKAY;
>   }
> @@ -308,7 +308,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
>           return X86EMUL_OKAY;
>       }
>   
> -    spin_lock(&msix->pdev->vpci->lock);
> +    spin_lock(&msix->pdev->vpci_lock);
>       entry = get_entry(msix, addr);
>       offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
>   
> @@ -384,7 +384,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
>           ASSERT_UNREACHABLE();
>           break;
>       }
> -    spin_unlock(&msix->pdev->vpci->lock);
> +    spin_unlock(&msix->pdev->vpci_lock);
>   
>       return X86EMUL_OKAY;
>   }
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 82607bdb9a..7d52bcf8d0 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -35,9 +35,8 @@ extern vpci_register_init_t *const __start_vpci_array[];
>   extern vpci_register_init_t *const __end_vpci_array[];
>   #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>   
> -void vpci_remove_device(struct pci_dev *pdev)
> +static void vpci_remove_device_locked(struct pci_dev *pdev)
>   {
> -    spin_lock(&pdev->vpci->lock);
>       while ( !list_empty(&pdev->vpci->handlers) )
>       {
>           struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
> @@ -47,13 +46,20 @@ void vpci_remove_device(struct pci_dev *pdev)
>           list_del(&r->node);
>           xfree(r);
>       }
> -    spin_unlock(&pdev->vpci->lock);
>       xfree(pdev->vpci->msix);
>       xfree(pdev->vpci->msi);
>       xfree(pdev->vpci);
>       pdev->vpci = NULL;
>   }
>   
> +void vpci_remove_device(struct pci_dev *pdev)
> +{
> +    spin_lock(&pdev->vpci_lock);
> +    vpci_remove_device_locked(pdev);
> +    spin_unlock(&pdev->vpci_lock);
> +}
> +
> +
>   int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>   {
>       unsigned int i;
> @@ -62,12 +68,15 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>       if ( !has_vpci(pdev->domain) )
>           return 0;
>   
> +    spin_lock(&pdev->vpci_lock);
>       pdev->vpci = xzalloc(struct vpci);
>       if ( !pdev->vpci )
> +    {
> +        spin_unlock(&pdev->vpci_lock);
>           return -ENOMEM;
> +    }
>   
>       INIT_LIST_HEAD(&pdev->vpci->handlers);
> -    spin_lock_init(&pdev->vpci->lock);
>   
>       for ( i = 0; i < NUM_VPCI_INIT; i++ )
>       {
> @@ -77,7 +86,8 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>       }
>   
>       if ( rc )
> -        vpci_remove_device(pdev);
> +        vpci_remove_device_locked(pdev);
> +    spin_unlock(&pdev->vpci_lock);
>   
>       return rc;
>   }
> @@ -148,8 +158,6 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>       r->offset = offset;
>       r->private = data;
>   
> -    spin_lock(&vpci->lock);
> -
>       /* The list of handlers must be kept sorted at all times. */
>       list_for_each ( prev, &vpci->handlers )
>       {
> @@ -161,14 +169,12 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>               break;
>           if ( cmp == 0 )
>           {
> -            spin_unlock(&vpci->lock);
>               xfree(r);
>               return -EEXIST;
>           }
>       }
>   
>       list_add_tail(&r->node, prev);
> -    spin_unlock(&vpci->lock);
>   
>       return 0;
>   }
> @@ -179,7 +185,6 @@ int vpci_remove_register(struct vpci *vpci, unsigned int offset,
>       const struct vpci_register r = { .offset = offset, .size = size };
>       struct vpci_register *rm;
>   
> -    spin_lock(&vpci->lock);
>       list_for_each_entry ( rm, &vpci->handlers, node )
>       {
>           int cmp = vpci_register_cmp(&r, rm);
> @@ -191,14 +196,12 @@ int vpci_remove_register(struct vpci *vpci, unsigned int offset,
>           if ( !cmp && rm->offset == offset && rm->size == size )
>           {
>               list_del(&rm->node);
> -            spin_unlock(&vpci->lock);
>               xfree(rm);
>               return 0;
>           }
>           if ( cmp <= 0 )
>               break;
>       }
> -    spin_unlock(&vpci->lock);
>   
>       return -ENOENT;
>   }
> @@ -315,7 +318,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
>   uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>   {
>       const struct domain *d = current->domain;
> -    const struct pci_dev *pdev;
> +    struct pci_dev *pdev;
>       const struct vpci_register *r;
>       unsigned int data_offset = 0;
>       uint32_t data = ~(uint32_t)0;
> @@ -331,7 +334,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>       if ( !pdev )
>           return vpci_read_hw(sbdf, reg, size);
>   
> -    spin_lock(&pdev->vpci->lock);
> +    spin_lock(&pdev->vpci_lock);
> +    if ( !pdev->vpci )
> +    {
> +        spin_unlock(&pdev->vpci_lock);
> +        return vpci_read_hw(sbdf, reg, size);
> +    }
>   
>       /* Read from the hardware or the emulated register handlers. */
>       list_for_each_entry ( r, &pdev->vpci->handlers, node )
> @@ -383,7 +391,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>   
>           data = merge_result(data, tmp_data, size - data_offset, data_offset);
>       }
> -    spin_unlock(&pdev->vpci->lock);
> +    spin_unlock(&pdev->vpci_lock);
>   
>       return data & (0xffffffff >> (32 - 8 * size));
>   }
> @@ -418,7 +426,7 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>                   uint32_t data)
>   {
>       const struct domain *d = current->domain;
> -    const struct pci_dev *pdev;
> +    struct pci_dev *pdev;
>       const struct vpci_register *r;
>       unsigned int data_offset = 0;
>   
> @@ -439,7 +447,14 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>           return;
>       }
>   
> -    spin_lock(&pdev->vpci->lock);
> +    spin_lock(&pdev->vpci_lock);
> +    if ( !pdev->vpci )
> +    {
> +        spin_unlock(&pdev->vpci_lock);
> +        vpci_write_hw(sbdf, reg, size, data);
> +        return;
> +    }
> +
>   
>       /* Write the value to the hardware or emulated registers. */
>       list_for_each_entry ( r, &pdev->vpci->handlers, node )
> @@ -480,7 +495,7 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>           vpci_write_hw(sbdf, reg + data_offset, size - data_offset,
>                         data >> (data_offset * 8));
>   
> -    spin_unlock(&pdev->vpci->lock);
> +    spin_unlock(&pdev->vpci_lock);
>   }
>   
>   /*
> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> index 4cfa774615..e554c14b2f 100644
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -114,6 +114,7 @@ struct pci_dev {
>       u64 vf_rlen[6];
>   
>       /* Data for vPCI. */
> +    spinlock_t vpci_lock;
>       struct vpci *vpci;
>   };
>   
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index af2b8580ee..98556d31ed 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -29,7 +29,7 @@ int __must_check vpci_add_handlers(struct pci_dev *dev);
>   /* Remove all handlers and free vpci related structures. */
>   void vpci_remove_device(struct pci_dev *pdev);
>   
> -/* Add/remove a register handler. */
> +/* Add/remove a register handler. Must be called holding the vpci_lock. */
>   int __must_check vpci_add_register(struct vpci *vpci,
>                                      vpci_read_t *read_handler,
>                                      vpci_write_t *write_handler,
> @@ -58,7 +58,6 @@ bool __must_check vpci_process_pending(struct vcpu *v);
>   struct vpci {
>       /* List of vPCI handlers for a device. */
>       struct list_head handlers;
> -    spinlock_t lock;
>   
>   #ifdef __XEN__
>       /* Hide the rest of the vpci struct from the user-space test harness. */
[1] https://patchwork.kernel.org/project/xen-devel/patch/20211105065629.940943-3-andr2000@gmail.com/

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-11-23 12:20 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-20 14:42 [PATCH 00/10] vpci: add support for SR-IOV capability Roger Pau Monne
2018-06-20 14:42 ` [PATCH 01/10] vpci: move lock Roger Pau Monne
2018-06-29 10:19   ` Wei Liu
2018-06-29 13:27     ` Roger Pau Monné
2018-06-29 13:37       ` Julien Grall
2021-11-23 12:20   ` Oleksandr Andrushchenko
2018-06-20 14:42 ` [PATCH 02/10] vpci/msix: add lock to protect the list of MSIX regions Roger Pau Monne
2018-06-29 10:29   ` Wei Liu
2018-06-20 14:42 ` [PATCH 03/10] vpci: add tear down functions Roger Pau Monne
2018-06-29 10:37   ` Wei Liu
2018-07-02  7:52     ` Roger Pau Monné
2018-06-20 14:42 ` [PATCH 04/10] vpci/msix: add teardown cleanup Roger Pau Monne
2018-06-29 10:52   ` Wei Liu
2018-07-02  7:54     ` Roger Pau Monné
2018-07-02 13:55       ` Wei Liu
2018-07-02 14:02         ` Roger Pau Monné
2018-07-02 14:07           ` Wei Liu
2018-07-02 14:12             ` Roger Pau Monné
2018-06-20 14:42 ` [PATCH 05/10] vpci/msi: " Roger Pau Monne
2018-06-29 10:52   ` Wei Liu
2018-07-02  7:55     ` Roger Pau Monné
2018-06-20 14:42 ` [PATCH 06/10] vpci/header: " Roger Pau Monne
2018-06-29 11:15   ` Wei Liu
2018-07-02  8:04     ` Roger Pau Monné
2018-07-02 14:30       ` Wei Liu
2018-07-02 14:42         ` Roger Pau Monné
2018-06-20 14:42 ` [PATCH 07/10] rangeset: introduce rangeset_merge Roger Pau Monne
2018-06-29 11:23   ` Wei Liu
2018-06-20 14:42 ` [PATCH 08/10] vpci/header: allow multiple map operations Roger Pau Monne
2018-06-29 14:44   ` Wei Liu
2018-06-20 14:42 ` [PATCH 09/10] pci: add vpci hooks for device addition/removal Roger Pau Monne
2018-06-29 14:50   ` Wei Liu
2018-06-20 14:42 ` [PATCH 10/10] vpci/sriov: add support for SR-IOV capability Roger Pau Monne
2018-06-29 15:27   ` Wei Liu
2018-07-02  8:12     ` Roger Pau Monné
2018-07-03  9:27   ` Wei Liu
2018-07-03 10:21     ` Roger Pau Monné
2018-07-03 10:48       ` Jan Beulich
2018-07-03 10:51         ` Roger Pau Monné
2018-07-03 10:56           ` Jan Beulich
2018-07-03 11:07             ` Roger Pau Monné
2018-07-03 11:47               ` Jan Beulich
2018-07-03 10:54       ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.