* [PATCHv4 0/3] Implement per-cpu reader-writer locks
@ 2015-12-18 10:06 Malcolm Crossley
  2015-12-18 10:06 ` [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Malcolm Crossley @ 2015-12-18 10:06 UTC (permalink / raw)
  To: malcolm.crossley, JBeulich, ian.campbell, andrew.cooper3,
	Marcos.Matsunaga, keir, konrad.wilk, george.dunlap
  Cc: xen-devel, dario.faggioli, stefano.stabellini

This patch series adds per-cpu reader-writer locks as a generic lock
implementation and then converts the grant table and p2m rwlocks to
use the percpu rwlocks, in order to improve multi-socket host performance.

CPU profiling has revealed that the rwlocks themselves suffer from severe cache
line bouncing due to the cmpxchg operation used even when taking a read lock.
Multiqueue paravirtualised I/O results in heavy contention on the grant table
and p2m read locks of a specific domain, and so I/O throughput is bottlenecked
by the overhead of the cache line bouncing itself.

Per-cpu read locks avoid lock cache line bouncing by using a per-cpu data
area to record that a CPU has taken the read lock. Correctness is enforced for
the write lock by using a per-lock barrier which forces the per-cpu read lock
to revert to using a standard read lock. The write lock then polls all of
the per-cpu data areas until all active readers for the lock have exited.
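
In outline, the scheme looks roughly like the sketch below (a simplified
illustration only, using a placeholder per-cpu variable name; the real
implementation in patch 1 additionally handles nested use of different
percpu rwlocks and uses a cpumask to poll only active readers):

    /* Simplified sketch; 'reader_slot' is a placeholder per-cpu pointer. */
    static DEFINE_PER_CPU(percpu_rwlock_t *, reader_slot);

    void sketch_read_lock(percpu_rwlock_t *l)
    {
        this_cpu(reader_slot) = l;          /* fast path: mark this CPU as reading */
        smp_mb();
        if ( l->writer_activating )         /* a writer has raised the barrier */
        {
            this_cpu(reader_slot) = NULL;   /* revert to the standard read lock */
            read_lock(&l->rwlock);
            this_cpu(reader_slot) = l;
            read_unlock(&l->rwlock);
        }
    }

    void sketch_write_lock(percpu_rwlock_t *l)
    {
        unsigned int cpu;

        write_lock(&l->rwlock);             /* exclude writers and slow-path readers */
        l->writer_activating = 1;           /* divert new readers to the slow path */
        smp_mb();
        for_each_online_cpu ( cpu )         /* wait for existing fast-path readers */
            while ( per_cpu(reader_slot, cpu) == l )
                cpu_relax();
    }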

Removing the cache line bouncing on a multi-socket Haswell-EP system
dramatically improves performance, with 16 vCPU network I/O performance going
from 15 Gbit/s to 64 Gbit/s! The host under test was fully utilising all 40
logical CPUs at 64 Gbit/s, so a host with more logical CPUs may see an even
bigger I/O improvement.

Note: benchmarking of these performance improvements should be done with
the non-debug version of the hypervisor, otherwise the map_domain_page spinlock
is the main bottleneck.

Changes in V4:
- Fix the ASSERTS for the percpu_owner check
 
Changes in V3:
- Add percpu rwlock owner for debug Xen builds
- Validate percpu rwlock owner at runtime for debug Xen builds
- Fix hard tab issues
- Use percpu rwlock wrappers for grant table rwlock users
- Add comments why rw_is_locked ASSERTS have been removed in grant table code

Changes in V2:
- Add Cover letter
- Convert p2m rwlock to percpu rwlock
- Improve percpu rwlock to safely handle simultaneously holding 2 or more 
  locks 
- Move percpu rwlock barrier from global to per lock
- Move write lock cpumask variable to a percpu variable
- Add macros to help initialise and use percpu rwlocks
- Updated IO benchmark results to cover revised locking implementation

Malcolm Crossley (3):
  rwlock: Add per-cpu reader-writer lock infrastructure
  grant_table: convert grant table rwlock to percpu rwlock
  p2m: convert p2m rwlock to percpu rwlock

 xen/arch/arm/mm.c             |   4 +-
 xen/arch/x86/mm.c             |   4 +-
 xen/arch/x86/mm/mm-locks.h    |  12 ++--
 xen/arch/x86/mm/p2m.c         |   1 +
 xen/common/grant_table.c      | 126 +++++++++++++++++++++++-------------------
 xen/common/spinlock.c         |  46 +++++++++++++++
 xen/include/asm-arm/percpu.h  |   5 ++
 xen/include/asm-x86/mm.h      |   2 +-
 xen/include/asm-x86/percpu.h  |   6 ++
 xen/include/xen/grant_table.h |  24 +++++++-
 xen/include/xen/percpu.h      |   4 ++
 xen/include/xen/spinlock.h    | 115 ++++++++++++++++++++++++++++++++++++++
 12 files changed, 282 insertions(+), 67 deletions(-)

-- 
1.7.12.4


* [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
  2015-12-18 10:06 [PATCHv4 0/3] Implement per-cpu reader-writer locks Malcolm Crossley
@ 2015-12-18 10:06 ` Malcolm Crossley
  2015-12-18 11:37   ` Jan Beulich
  2015-12-18 10:06 ` [PATCHv4 2/3] grant_table: convert grant table rwlock to percpu rwlock Malcolm Crossley
  2015-12-18 10:06 ` [PATCHv4 3/3] p2m: convert p2m " Malcolm Crossley
  2 siblings, 1 reply; 6+ messages in thread
From: Malcolm Crossley @ 2015-12-18 10:06 UTC (permalink / raw)
  To: malcolm.crossley, JBeulich, ian.campbell, andrew.cooper3,
	Marcos.Matsunaga, keir, konrad.wilk, george.dunlap
  Cc: xen-devel, dario.faggioli, stefano.stabellini

Per-cpu read-write locks allow the fast-path read case to have low
overhead, as taking the read lock only sets/clears a per-cpu variable.
The per-cpu read fast path also avoids locked compare-and-swap
operations, which can be particularly slow on coherent multi-socket
systems, especially under heavy usage of the read lock itself.

The per-cpu reader-writer lock uses a per-lock variable to control
the read-lock fast path. This allows a writer to disable the fast
path and ensures that readers switch to using the underlying
read-write lock implementation instead of the per-cpu variable.

Once the writer has taken the write lock and disabled the fast path,
it must poll the per-cpu variable for all CPUs which have entered
the critical section for the specific read-write lock the writer is
attempting to take. This design allows a single per-cpu variable
to be used for read/write locks belonging to separate data structures.
If two or more different per-cpu read locks are taken
simultaneously then the per-cpu data structure is not used and the
implementation takes the read lock of the underlying read-write lock;
this behaviour is equivalent to the slow path in terms of performance.
The per-cpu rwlock is not recursion safe for taking the per-cpu
read lock because there is no recursion count variable; this is
functionally equivalent to standard spin locks.

Slow-path readers, once unblocked, set the per-cpu variable and
drop the read lock. This simplifies the implementation and allows
the fairness of the underlying read-write lock to be taken
advantage of.

There is more overhead on the per-cpu write-lock path due to checking
each CPU's fast-path per-cpu variable, but this overhead is likely to be
hidden by the required delay of waiting for readers to exit the
critical section. The loop is optimised to only iterate over
the per-cpu data of active readers of the rwlock. The cpumask_t for
tracking the active readers is stored in a single per-cpu data
location and thus the write lock is not pre-emption safe. Therefore
the per-cpu write lock can only be used with interrupts disabled.
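
For illustration, a client of the new interface would use it roughly as in
the sketch below (identifiers such as 'my_rwlock' and 'my_percpu_owner' are
placeholders, not part of this patch):

    /* Hypothetical usage sketch; identifiers are placeholders. */
    DEFINE_PERCPU_RWLOCK_GLOBAL(my_percpu_owner); /* per-cpu reader tracking slot */

    static percpu_rwlock_t my_rwlock;

    void my_init(void)
    {
        percpu_rwlock_resource_init(&my_rwlock, my_percpu_owner);
    }

    void my_reader(void)
    {
        percpu_read_lock(my_percpu_owner, &my_rwlock);
        /* ... read-side critical section ... */
        percpu_read_unlock(my_percpu_owner, &my_rwlock);
    }

    void my_writer(void)
    {
        percpu_write_lock(my_percpu_owner, &my_rwlock);
        /* ... write-side critical section ... */
        percpu_write_unlock(my_percpu_owner, &my_rwlock);
    }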

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
--
Changes since v3:
- Fix the ASSERTS for the percpu_owner check

Changes since v2:
- Remove stray hard tabs
- Added owner field to percpu_rwlock_t for debug builds. This allows
  for ASSERTs to be added which verify the correct global percpu
  structure is used for the local initialised percpu_rwlock lock

Changes since v1:
- Removed restriction on taking two or more different percpu rw
locks simultaneously
- Moved fast-path/slow-path barrier to be per lock instead of global
- Created separate percpu_rwlock_t type and added macros to
initialise new type
- Added helper macros for using the percpu rwlock itself
- Moved active_readers cpumask off the stack and into a percpu
variable
---
 xen/common/spinlock.c        |  46 +++++++++++++++++
 xen/include/asm-arm/percpu.h |   5 ++
 xen/include/asm-x86/percpu.h |   6 +++
 xen/include/xen/percpu.h     |   4 ++
 xen/include/xen/spinlock.h   | 115 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 176 insertions(+)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 7f89694..c71afcb 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -10,6 +10,8 @@
 #include <asm/processor.h>
 #include <asm/atomic.h>
 
+static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
+
 #ifndef NDEBUG
 
 static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
@@ -492,6 +494,50 @@ int _rw_is_write_locked(rwlock_t *lock)
     return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
 }
 
+void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
+                percpu_rwlock_t *percpu_rwlock)
+{
+    unsigned int cpu;
+    cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
+
+#ifndef NDEBUG
+    /* Validate the correct per_cpudata variable has been provided. */
+    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
+#endif
+    /* 
+     * First take the write lock to protect against other writers or slow 
+     * path readers.
+     */
+    write_lock(&percpu_rwlock->rwlock);
+
+    /* Now set the global variable so that readers start using read_lock. */
+    percpu_rwlock->writer_activating = 1;
+    smp_mb();
+
+    /* Using a per cpu cpumask is only safe if there is no nesting. */
+    ASSERT(!in_irq());
+    cpumask_copy(rwlock_readers, &cpu_online_map);
+
+    /* Check if there are any percpu readers in progress on this rwlock. */
+    for ( ; ; )
+    {
+        for_each_cpu(cpu, rwlock_readers)
+        {
+            /* 
+             * Remove any percpu readers not contending on this rwlock
+             * from our check mask.
+             */
+            if ( per_cpu_ptr(per_cpudata, cpu) != percpu_rwlock )
+                __cpumask_clear_cpu(cpu, rwlock_readers);
+        }
+        /* Check if we've cleared all percpu readers from check mask. */
+        if ( cpumask_empty(rwlock_readers) )
+            break;
+        /* Give the coherency fabric a break. */
+        cpu_relax();
+    };
+}
+
 #ifdef LOCK_PROFILE
 
 struct lock_profile_anc {
diff --git a/xen/include/asm-arm/percpu.h b/xen/include/asm-arm/percpu.h
index 71e7649..c308a56 100644
--- a/xen/include/asm-arm/percpu.h
+++ b/xen/include/asm-arm/percpu.h
@@ -27,6 +27,11 @@ void percpu_init_areas(void);
 #define __get_cpu_var(var) \
     (*RELOC_HIDE(&per_cpu__##var, READ_SYSREG(TPIDR_EL2)))
 
+#define per_cpu_ptr(var, cpu)  \
+    (*RELOC_HIDE(&var, __per_cpu_offset[cpu]))
+#define __get_cpu_ptr(var) \
+    (*RELOC_HIDE(&var, READ_SYSREG(TPIDR_EL2)))
+
 #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
 
 DECLARE_PER_CPU(unsigned int, cpu_id);
diff --git a/xen/include/asm-x86/percpu.h b/xen/include/asm-x86/percpu.h
index 604ff0d..51562b9 100644
--- a/xen/include/asm-x86/percpu.h
+++ b/xen/include/asm-x86/percpu.h
@@ -20,4 +20,10 @@ void percpu_init_areas(void);
 
 #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
 
+#define __get_cpu_ptr(var) \
+    (*RELOC_HIDE(var, get_cpu_info()->per_cpu_offset))
+
+#define per_cpu_ptr(var, cpu)  \
+    (*RELOC_HIDE(var, __per_cpu_offset[cpu]))
+
 #endif /* __X86_PERCPU_H__ */
diff --git a/xen/include/xen/percpu.h b/xen/include/xen/percpu.h
index abe0b11..c896863 100644
--- a/xen/include/xen/percpu.h
+++ b/xen/include/xen/percpu.h
@@ -16,6 +16,10 @@
 /* Preferred on Xen. Also see arch-defined per_cpu(). */
 #define this_cpu(var)    __get_cpu_var(var)
 
+#define this_cpu_ptr(ptr)    __get_cpu_ptr(ptr)
+
+#define get_per_cpu_var(var)  (per_cpu__##var)
+
 /* Linux compatibility. */
 #define get_cpu_var(var) this_cpu(var)
 #define put_cpu_var(var)
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index fb0438e..fb1de92 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -3,6 +3,7 @@
 
 #include <asm/system.h>
 #include <asm/spinlock.h>
+#include <asm/types.h>
 
 #ifndef NDEBUG
 struct lock_debug {
@@ -261,4 +262,118 @@ int _rw_is_write_locked(rwlock_t *lock);
 #define rw_is_locked(l)               _rw_is_locked(l)
 #define rw_is_write_locked(l)         _rw_is_write_locked(l)
 
+typedef struct percpu_rwlock percpu_rwlock_t;
+
+struct percpu_rwlock {
+    rwlock_t            rwlock;
+    bool_t              writer_activating;
+#ifndef NDEBUG
+    percpu_rwlock_t     **percpu_owner;
+#endif
+};
+
+#ifndef NDEBUG
+#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0, owner }
+#else
+#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0 }
+#endif
+
+#define DEFINE_PERCPU_RWLOCK_RESOURCE(l, owner) \
+    percpu_rwlock_t l = PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner))
+#define percpu_rwlock_resource_init(l, owner) \
+    (*(l) = (percpu_rwlock_t)PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner)))
+
+
+static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
+                                         percpu_rwlock_t *percpu_rwlock)
+{
+#ifndef NDEBUG
+    /* Validate the correct per_cpudata variable has been provided. */
+    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
+#endif
+
+    /* We cannot support recursion on the same lock. */
+    ASSERT(this_cpu_ptr(per_cpudata) != percpu_rwlock);
+    /* 
+     * Detect using a second percpu_rwlock_t simultaneously and fall back
+     * to standard read_lock.
+     */
+    if ( unlikely(this_cpu_ptr(per_cpudata) != NULL ) )
+    {
+        read_lock(&percpu_rwlock->rwlock);
+        return;
+    }
+
+    /* Indicate this cpu is reading. */
+    this_cpu_ptr(per_cpudata) = percpu_rwlock;
+    smp_mb();
+    /* Check if a writer is waiting. */
+    if ( unlikely(percpu_rwlock->writer_activating) )
+    {
+        /* Let the waiting writer know we aren't holding the lock. */
+        this_cpu_ptr(per_cpudata) = NULL;
+        /* Wait using the read lock to keep the lock fair. */
+        read_lock(&percpu_rwlock->rwlock);
+        /* Set the per CPU data again and continue. */
+        this_cpu_ptr(per_cpudata) = percpu_rwlock;
+        /* Drop the read lock because we don't need it anymore. */
+        read_unlock(&percpu_rwlock->rwlock);
+    }
+}
+
+static inline void _percpu_read_unlock(percpu_rwlock_t **per_cpudata,
+                percpu_rwlock_t *percpu_rwlock)
+{
+#ifndef NDEBUG
+    /* Validate the correct per_cpudata variable has been provided. */
+    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
+#endif
+
+    /* Verify the read lock was taken for this lock */
+    ASSERT(this_cpu_ptr(per_cpudata) != NULL);
+    /* 
+     * Detect using a second percpu_rwlock_t simultaneously and fall back
+     * to standard read_unlock.
+     */
+    if ( unlikely(this_cpu_ptr(per_cpudata) != percpu_rwlock ) )
+    {
+        read_unlock(&percpu_rwlock->rwlock);
+        return;
+    }
+    this_cpu_ptr(per_cpudata) = NULL;
+    smp_wmb();
+}
+
+/* Don't inline percpu write lock as it's a complex function. */
+void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
+                        percpu_rwlock_t *percpu_rwlock);
+
+static inline void _percpu_write_unlock(percpu_rwlock_t **per_cpudata,
+                percpu_rwlock_t *percpu_rwlock)
+{
+#ifndef NDEBUG
+    /* Validate the correct per_cpudata variable has been provided. */
+    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
+#endif
+    ASSERT(percpu_rwlock->writer_activating);
+    percpu_rwlock->writer_activating = 0;
+    write_unlock(&percpu_rwlock->rwlock);
+}
+
+#define percpu_rw_is_write_locked(l)         _rw_is_write_locked(&((l)->rwlock))
+
+#define percpu_read_lock(percpu, lock) \
+    _percpu_read_lock(&get_per_cpu_var(percpu), lock)
+#define percpu_read_unlock(percpu, lock) \
+    _percpu_read_unlock(&get_per_cpu_var(percpu), lock)
+#define percpu_write_lock(percpu, lock) \
+    _percpu_write_lock(&get_per_cpu_var(percpu), lock)
+#define percpu_write_unlock(percpu, lock) \
+    _percpu_write_unlock(&get_per_cpu_var(percpu), lock)
+
+#define DEFINE_PERCPU_RWLOCK_GLOBAL(name) DEFINE_PER_CPU(percpu_rwlock_t *, \
+                                                         name)
+#define DECLARE_PERCPU_RWLOCK_GLOBAL(name) DECLARE_PER_CPU(percpu_rwlock_t *, \
+                                                         name)
+
 #endif /* __SPINLOCK_H__ */
-- 
1.7.12.4


* [PATCHv4 2/3] grant_table: convert grant table rwlock to percpu rwlock
  2015-12-18 10:06 [PATCHv4 0/3] Implement per-cpu reader-writer locks Malcolm Crossley
  2015-12-18 10:06 ` [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
@ 2015-12-18 10:06 ` Malcolm Crossley
  2015-12-18 11:41   ` Jan Beulich
  2015-12-18 10:06 ` [PATCHv4 3/3] p2m: convert p2m " Malcolm Crossley
  2 siblings, 1 reply; 6+ messages in thread
From: Malcolm Crossley @ 2015-12-18 10:06 UTC (permalink / raw)
  To: malcolm.crossley, JBeulich, ian.campbell, andrew.cooper3,
	Marcos.Matsunaga, keir, konrad.wilk, george.dunlap
  Cc: xen-devel, dario.faggioli, stefano.stabellini

The per-domain grant table read lock suffers from significant contention when
performing multi-queue block or network IO, due to the parallel
grant map/unmap/copy operations occurring on the DomU's grant table.

On multi-socket systems, the contention results in the locked compare-and-swap
operation failing frequently, which leads to a tight loop of retries of the
compare-and-swap operation. As the coherency fabric can only support a limited
rate of compare-and-swap operations for a particular data location, taking
the read lock itself becomes a bottleneck for grant operations.

Standard rwlock performance of a single VIF VM-VM transfer with 16 queues
configured was limited to approximately 15 gbit/s on a 2 socket Haswell-EP
host.

Percpu rwlock performance with the same configuration is approximately
48 gbit/s.

Oprofile was used to determine the initial overhead of the read-write locks
and to confirm the overhead was dramatically reduced by the percpu rwlocks.

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
--
Changes since v2:
 - Switched to using wrappers for taking percpu rwlock
 - Added percpu structure pointer to percpu rwlock initialisation
 - Added comment on removal of ASSERTS for grant table rw_is_locked()
Changes since v1:
 - Used new macros provided in updated percpu rwlock v2 patch
 - Converted grant table rwlock_t to percpu_rwlock_t
 - Patched a missed grant table rwlock_t usage site
---
 xen/arch/arm/mm.c             |   4 +-
 xen/arch/x86/mm.c             |   4 +-
 xen/common/grant_table.c      | 126 +++++++++++++++++++++++-------------------
 xen/include/xen/grant_table.h |  24 +++++++-
 4 files changed, 97 insertions(+), 61 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 47bfb27..928d58ed 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1055,7 +1055,7 @@ int xenmem_add_to_physmap_one(
     switch ( space )
     {
     case XENMAPSPACE_grant_table:
-        write_lock(&d->grant_table->lock);
+        grant_percpu_write_lock(&d->grant_table->lock);
 
         if ( d->grant_table->gt_version == 0 )
             d->grant_table->gt_version = 1;
@@ -1085,7 +1085,7 @@ int xenmem_add_to_physmap_one(
 
         t = p2m_ram_rw;
 
-        write_unlock(&d->grant_table->lock);
+        grant_percpu_write_unlock(&d->grant_table->lock);
         break;
     case XENMAPSPACE_shared_info:
         if ( idx != 0 )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 92df36f..8c968fa 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4654,7 +4654,7 @@ int xenmem_add_to_physmap_one(
                 mfn = virt_to_mfn(d->shared_info);
             break;
         case XENMAPSPACE_grant_table:
-            write_lock(&d->grant_table->lock);
+            grant_percpu_write_lock(&d->grant_table->lock);
 
             if ( d->grant_table->gt_version == 0 )
                 d->grant_table->gt_version = 1;
@@ -4676,7 +4676,7 @@ int xenmem_add_to_physmap_one(
                     mfn = virt_to_mfn(d->grant_table->shared_raw[idx]);
             }
 
-            write_unlock(&d->grant_table->lock);
+            grant_percpu_write_unlock(&d->grant_table->lock);
             break;
         case XENMAPSPACE_gmfn_range:
         case XENMAPSPACE_gmfn:
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 5d52d1e..fb6b808 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -178,6 +178,8 @@ struct active_grant_entry {
 #define _active_entry(t, e) \
     ((t)->active[(e)/ACGNT_PER_PAGE][(e)%ACGNT_PER_PAGE])
 
+DEFINE_PERCPU_RWLOCK_GLOBAL(grant_rwlock);
+
 static inline void gnttab_flush_tlb(const struct domain *d)
 {
     if ( !paging_mode_external(d) )
@@ -208,7 +210,13 @@ active_entry_acquire(struct grant_table *t, grant_ref_t e)
 {
     struct active_grant_entry *act;
 
-    ASSERT(rw_is_locked(&t->lock));
+    /* 
+     * The grant table for the active entry should be locked but the 
+     * percpu rwlock cannot be checked for read lock without race conditions
+     * or high overhead so we cannot use an ASSERT 
+     *
+     *   ASSERT(rw_is_locked(&t->lock));
+     */
 
     act = &_active_entry(t, e);
     spin_lock(&act->lock);
@@ -270,23 +278,23 @@ double_gt_lock(struct grant_table *lgt, struct grant_table *rgt)
      */
     if ( lgt < rgt )
     {
-        write_lock(&lgt->lock);
-        write_lock(&rgt->lock);
+        grant_percpu_write_lock(&lgt->lock);
+        grant_percpu_write_lock(&rgt->lock);
     }
     else
     {
         if ( lgt != rgt )
-            write_lock(&rgt->lock);
-        write_lock(&lgt->lock);
+            grant_percpu_write_lock(&rgt->lock);
+        grant_percpu_write_lock(&lgt->lock);
     }
 }
 
 static inline void
 double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
 {
-    write_unlock(&lgt->lock);
+    grant_percpu_write_unlock(&lgt->lock);
     if ( lgt != rgt )
-        write_unlock(&rgt->lock);
+        grant_percpu_write_unlock(&rgt->lock);
 }
 
 static inline int
@@ -659,8 +667,14 @@ static int grant_map_exists(const struct domain *ld,
                             unsigned int *ref_count)
 {
     unsigned int ref, max_iter;
-    
-    ASSERT(rw_is_locked(&rgt->lock));
+
+    /* 
+     * The remote grant table should be locked but the percpu rwlock
+     * cannot be checked for read lock without race conditions or high 
+     * overhead so we cannot use an ASSERT 
+     *
+     *   ASSERT(rw_is_locked(&rgt->lock));
+     */
 
     max_iter = min(*ref_count + (1 << GNTTABOP_CONTINUATION_ARG_SHIFT),
                    nr_grant_entries(rgt));
@@ -703,12 +717,12 @@ static unsigned int mapkind(
      * Must have the local domain's grant table write lock when
      * iterating over its maptrack entries.
      */
-    ASSERT(rw_is_write_locked(&lgt->lock));
+    ASSERT(percpu_rw_is_write_locked(&lgt->lock));
     /*
      * Must have the remote domain's grant table write lock while
      * counting its active entries.
      */
-    ASSERT(rw_is_write_locked(&rd->grant_table->lock));
+    ASSERT(percpu_rw_is_write_locked(&rd->grant_table->lock));
 
     for ( handle = 0; !(kind & MAPKIND_WRITE) &&
                       handle < lgt->maptrack_limit; handle++ )
@@ -796,7 +810,7 @@ __gnttab_map_grant_ref(
     }
 
     rgt = rd->grant_table;
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     /* Bounds check on the grant ref */
     if ( unlikely(op->ref >= nr_grant_entries(rgt)))
@@ -859,7 +873,7 @@ __gnttab_map_grant_ref(
     cache_flags = (shah->flags & (GTF_PAT | GTF_PWT | GTF_PCD) );
 
     active_entry_release(act);
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
 
     /* pg may be set, with a refcount included, from __get_paged_frame */
     if ( !pg )
@@ -1006,7 +1020,7 @@ __gnttab_map_grant_ref(
         put_page(pg);
     }
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, op->ref);
 
@@ -1029,7 +1043,7 @@ __gnttab_map_grant_ref(
     active_entry_release(act);
 
  unlock_out:
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
     op->status = rc;
     put_maptrack_handle(lgt, handle);
     rcu_unlock_domain(rd);
@@ -1080,18 +1094,18 @@ __gnttab_unmap_common(
 
     op->map = &maptrack_entry(lgt, op->handle);
 
-    read_lock(&lgt->lock);
+    grant_percpu_read_lock(&lgt->lock);
 
     if ( unlikely(!read_atomic(&op->map->flags)) )
     {
-        read_unlock(&lgt->lock);
+        grant_percpu_read_unlock(&lgt->lock);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
         op->status = GNTST_bad_handle;
         return;
     }
 
     dom = op->map->domid;
-    read_unlock(&lgt->lock);
+    grant_percpu_read_unlock(&lgt->lock);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
     {
@@ -1113,7 +1127,7 @@ __gnttab_unmap_common(
 
     rgt = rd->grant_table;
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     op->flags = read_atomic(&op->map->flags);
     if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
@@ -1165,7 +1179,7 @@ __gnttab_unmap_common(
  act_release_out:
     active_entry_release(act);
  unmap_out:
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
 
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
@@ -1220,7 +1234,7 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
     rcu_lock_domain(rd);
     rgt = rd->grant_table;
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
@@ -1286,7 +1300,7 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  act_release_out:
     active_entry_release(act);
  unlock_out:
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
 
     if ( put_handle )
     {
@@ -1585,7 +1599,7 @@ gnttab_setup_table(
     }
 
     gt = d->grant_table;
-    write_lock(&gt->lock);
+    grant_percpu_write_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         gt->gt_version = 1;
@@ -1613,7 +1627,7 @@ gnttab_setup_table(
     }
 
  out3:
-    write_unlock(&gt->lock);
+    grant_percpu_write_unlock(&gt->lock);
  out2:
     rcu_unlock_domain(d);
  out1:
@@ -1655,13 +1669,13 @@ gnttab_query_size(
         goto query_out_unlock;
     }
 
-    read_lock(&d->grant_table->lock);
+    grant_percpu_read_lock(&d->grant_table->lock);
 
     op.nr_frames     = nr_grant_frames(d->grant_table);
     op.max_nr_frames = max_grant_frames;
     op.status        = GNTST_okay;
 
-    read_unlock(&d->grant_table->lock);
+    grant_percpu_read_unlock(&d->grant_table->lock);
 
  
  query_out_unlock:
@@ -1687,7 +1701,7 @@ gnttab_prepare_for_transfer(
     union grant_combo   scombo, prev_scombo, new_scombo;
     int                 retries = 0;
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     if ( unlikely(ref >= nr_grant_entries(rgt)) )
     {
@@ -1730,11 +1744,11 @@ gnttab_prepare_for_transfer(
         scombo = prev_scombo;
     }
 
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
     return 1;
 
  fail:
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
     return 0;
 }
 
@@ -1925,7 +1939,7 @@ gnttab_transfer(
         TRACE_1D(TRC_MEM_PAGE_GRANT_TRANSFER, e->domain_id);
 
         /* Tell the guest about its new page frame. */
-        read_lock(&e->grant_table->lock);
+        grant_percpu_read_lock(&e->grant_table->lock);
         act = active_entry_acquire(e->grant_table, gop.ref);
 
         if ( e->grant_table->gt_version == 1 )
@@ -1949,7 +1963,7 @@ gnttab_transfer(
             GTF_transfer_completed;
 
         active_entry_release(act);
-        read_unlock(&e->grant_table->lock);
+        grant_percpu_read_unlock(&e->grant_table->lock);
 
         rcu_unlock_domain(e);
 
@@ -1987,7 +2001,7 @@ __release_grant_for_copy(
     released_read = 0;
     released_write = 0;
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, gref);
     sha = shared_entry_header(rgt, gref);
@@ -2029,7 +2043,7 @@ __release_grant_for_copy(
     }
 
     active_entry_release(act);
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
 
     if ( td != rd )
     {
@@ -2086,7 +2100,7 @@ __acquire_grant_for_copy(
 
     *page = NULL;
 
-    read_lock(&rgt->lock);
+    grant_percpu_read_lock(&rgt->lock);
 
     if ( unlikely(gref >= nr_grant_entries(rgt)) )
         PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
@@ -2168,20 +2182,20 @@ __acquire_grant_for_copy(
              * here and reacquire
              */
             active_entry_release(act);
-            read_unlock(&rgt->lock);
+            grant_percpu_read_unlock(&rgt->lock);
 
             rc = __acquire_grant_for_copy(td, trans_gref, rd->domain_id,
                                           readonly, &grant_frame, page,
                                           &trans_page_off, &trans_length, 0);
 
-            read_lock(&rgt->lock);
+            grant_percpu_read_lock(&rgt->lock);
             act = active_entry_acquire(rgt, gref);
 
             if ( rc != GNTST_okay ) {
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                read_unlock(&rgt->lock);
+                grant_percpu_read_unlock(&rgt->lock);
                 return rc;
             }
 
@@ -2194,7 +2208,7 @@ __acquire_grant_for_copy(
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                read_unlock(&rgt->lock);
+                grant_percpu_read_unlock(&rgt->lock);
                 put_page(*page);
                 return __acquire_grant_for_copy(rd, gref, ldom, readonly,
                                                 frame, page, page_off, length,
@@ -2258,7 +2272,7 @@ __acquire_grant_for_copy(
     *frame = act->frame;
 
     active_entry_release(act);
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
     return rc;
  
  unlock_out_clear:
@@ -2273,7 +2287,7 @@ __acquire_grant_for_copy(
     active_entry_release(act);
 
  gt_unlock_out:
-    read_unlock(&rgt->lock);
+    grant_percpu_read_unlock(&rgt->lock);
 
     return rc;
 }
@@ -2589,7 +2603,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
     if ( gt->gt_version == op.version )
         goto out;
 
-    write_lock(&gt->lock);
+    grant_percpu_write_lock(&gt->lock);
     /*
      * Make sure that the grant table isn't currently in use when we
      * change the version number, except for the first 8 entries which
@@ -2702,7 +2716,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
     gt->gt_version = op.version;
 
  out_unlock:
-    write_unlock(&gt->lock);
+    grant_percpu_write_unlock(&gt->lock);
 
  out:
     op.version = gt->gt_version;
@@ -2758,7 +2772,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) uop,
 
     op.status = GNTST_okay;
 
-    read_lock(&gt->lock);
+    grant_percpu_read_lock(&gt->lock);
 
     for ( i = 0; i < op.nr_frames; i++ )
     {
@@ -2767,7 +2781,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) uop,
             op.status = GNTST_bad_virt_addr;
     }
 
-    read_unlock(&gt->lock);
+    grant_percpu_read_unlock(&gt->lock);
 out2:
     rcu_unlock_domain(d);
 out1:
@@ -2817,7 +2831,7 @@ __gnttab_swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     struct active_grant_entry *act_b = NULL;
     s16 rc = GNTST_okay;
 
-    write_lock(&gt->lock);
+    grant_percpu_write_lock(&gt->lock);
 
     /* Bounds check on the grant refs */
     if ( unlikely(ref_a >= nr_grant_entries(d->grant_table)))
@@ -2865,7 +2879,7 @@ out:
         active_entry_release(act_b);
     if ( act_a != NULL )
         active_entry_release(act_a);
-    write_unlock(&gt->lock);
+    grant_percpu_write_unlock(&gt->lock);
 
     rcu_unlock_domain(d);
 
@@ -2936,12 +2950,12 @@ static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush,
 
     if ( d != owner )
     {
-        read_lock(&owner->grant_table->lock);
+        grant_percpu_read_lock(&owner->grant_table->lock);
 
         ret = grant_map_exists(d, owner->grant_table, mfn, ref_count);
         if ( ret != 0 )
         {
-            read_unlock(&owner->grant_table->lock);
+            grant_percpu_read_unlock(&owner->grant_table->lock);
             rcu_unlock_domain(d);
             put_page(page);
             return ret;
@@ -2961,7 +2975,7 @@ static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush,
         ret = 0;
 
     if ( d != owner )
-        read_unlock(&owner->grant_table->lock);
+        grant_percpu_read_unlock(&owner->grant_table->lock);
     unmap_domain_page(v);
     put_page(page);
 
@@ -3180,7 +3194,7 @@ grant_table_create(
         goto no_mem_0;
 
     /* Simple stuff. */
-    rwlock_init(&t->lock);
+    percpu_rwlock_resource_init(&t->lock, grant_rwlock);
     spin_lock_init(&t->maptrack_lock);
     t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
 
@@ -3282,7 +3296,7 @@ gnttab_release_mappings(
         }
 
         rgt = rd->grant_table;
-        read_lock(&rgt->lock);
+        grant_percpu_read_lock(&rgt->lock);
 
         act = active_entry_acquire(rgt, ref);
         sha = shared_entry_header(rgt, ref);
@@ -3343,7 +3357,7 @@ gnttab_release_mappings(
             gnttab_clear_flag(_GTF_reading, status);
 
         active_entry_release(act);
-        read_unlock(&rgt->lock);
+        grant_percpu_read_unlock(&rgt->lock);
 
         rcu_unlock_domain(rd);
 
@@ -3360,7 +3374,7 @@ void grant_table_warn_active_grants(struct domain *d)
 
 #define WARN_GRANT_MAX 10
 
-    read_lock(&gt->lock);
+    grant_percpu_read_lock(&gt->lock);
 
     for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
     {
@@ -3382,7 +3396,7 @@ void grant_table_warn_active_grants(struct domain *d)
         printk(XENLOG_G_DEBUG "Dom%d has too many (%d) active grants to report\n",
                d->domain_id, nr_active);
 
-    read_unlock(&gt->lock);
+    grant_percpu_read_unlock(&gt->lock);
 
 #undef WARN_GRANT_MAX
 }
@@ -3432,7 +3446,7 @@ static void gnttab_usage_print(struct domain *rd)
     printk("      -------- active --------       -------- shared --------\n");
     printk("[ref] localdom mfn      pin          localdom gmfn     flags\n");
 
-    read_lock(&gt->lock);
+    grant_percpu_read_lock(&gt->lock);
 
     for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
     {
@@ -3475,7 +3489,7 @@ static void gnttab_usage_print(struct domain *rd)
         active_entry_release(act);
     }
 
-    read_unlock(&gt->lock);
+    grant_percpu_read_unlock(&gt->lock);
 
     if ( first )
         printk("grant-table for remote domain:%5d ... "
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 1c29cee..831472e 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -51,13 +51,35 @@
 /* The maximum size of a grant table. */
 extern unsigned int max_grant_frames;
 
+DECLARE_PERCPU_RWLOCK_GLOBAL(grant_rwlock);
+
+static inline void grant_percpu_read_lock(percpu_rwlock_t *lock)
+{
+    percpu_read_lock(grant_rwlock, lock);
+}
+
+static inline void grant_percpu_read_unlock(percpu_rwlock_t *lock)
+{
+    percpu_read_unlock(grant_rwlock, lock);
+}
+
+static inline void grant_percpu_write_lock(percpu_rwlock_t *lock)
+{
+    percpu_write_lock(grant_rwlock, lock);
+}
+
+static inline void grant_percpu_write_unlock(percpu_rwlock_t *lock)
+{
+    percpu_write_unlock(grant_rwlock, lock);
+}
+
 /* Per-domain grant information. */
 struct grant_table {
     /*
      * Lock protecting updates to grant table state (version, active
      * entry list, etc.)
      */
-    rwlock_t              lock;
+    percpu_rwlock_t       lock;
     /* Table size. Number of frames shared with guest */
     unsigned int          nr_grant_frames;
     /* Shared grant table (see include/public/grant_table.h). */
-- 
1.7.12.4


* [PATCHv4 3/3] p2m: convert p2m rwlock to percpu rwlock
  2015-12-18 10:06 [PATCHv4 0/3] Implement per-cpu reader-writer locks Malcolm Crossley
  2015-12-18 10:06 ` [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
  2015-12-18 10:06 ` [PATCHv4 2/3] grant_table: convert grant table rwlock to percpu rwlock Malcolm Crossley
@ 2015-12-18 10:06 ` Malcolm Crossley
  2 siblings, 0 replies; 6+ messages in thread
From: Malcolm Crossley @ 2015-12-18 10:06 UTC (permalink / raw)
  To: malcolm.crossley, JBeulich, ian.campbell, andrew.cooper3,
	Marcos.Matsunaga, keir, konrad.wilk, george.dunlap
  Cc: xen-devel, dario.faggioli, stefano.stabellini

The per-domain p2m read lock suffers from significant contention when
performing multi-queue block or network IO, due to the parallel
grant map/unmap/copy operations occurring on the DomU's p2m.

On multi-socket systems, the contention results in the locked compare-and-swap
operation failing frequently, which leads to a tight loop of retries of the
compare-and-swap operation. As the coherency fabric can only support a limited
rate of compare-and-swap operations for a particular data location, taking
the read lock itself becomes a bottleneck for p2m operations.

Percpu rwlock p2m performance with the same configuration is approximately
64 gbit/s vs the 48 gbit/s with grant table percpu rwlocks only.

Oprofile was used to determine the initial overhead of the read-write locks
and to confirm the overhead was dramatically reduced by the percpu rwlocks.

Note: altp2m users will not achieve a gain if they take an altp2m read lock
simultaneously with the main p2m lock.

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
--
Changes since v2
- Updated local percpu rwlock initialisation
---
 xen/arch/x86/mm/mm-locks.h | 12 +++++++-----
 xen/arch/x86/mm/p2m.c      |  1 +
 xen/include/asm-x86/mm.h   |  2 +-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 76c7217..8a40986 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -31,6 +31,8 @@
 DECLARE_PER_CPU(int, mm_lock_level);
 #define __get_lock_level()  (this_cpu(mm_lock_level))
 
+DECLARE_PERCPU_RWLOCK_GLOBAL(p2m_percpu_rwlock);
+
 static inline void mm_lock_init(mm_lock_t *l)
 {
     spin_lock_init(&l->lock);
@@ -99,7 +101,7 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
 
 static inline void mm_rwlock_init(mm_rwlock_t *l)
 {
-    rwlock_init(&l->lock);
+    percpu_rwlock_resource_init(&l->lock, p2m_percpu_rwlock);
     l->locker = -1;
     l->locker_function = "nobody";
     l->unlock_level = 0;
@@ -115,7 +117,7 @@ static inline void _mm_write_lock(mm_rwlock_t *l, const char *func, int level)
     if ( !mm_write_locked_by_me(l) )
     {
         __check_lock_level(level);
-        write_lock(&l->lock);
+        percpu_write_lock(p2m_percpu_rwlock, &l->lock);
         l->locker = get_processor_id();
         l->locker_function = func;
         l->unlock_level = __get_lock_level();
@@ -131,20 +133,20 @@ static inline void mm_write_unlock(mm_rwlock_t *l)
     l->locker = -1;
     l->locker_function = "nobody";
     __set_lock_level(l->unlock_level);
-    write_unlock(&l->lock);
+    percpu_write_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
 static inline void _mm_read_lock(mm_rwlock_t *l, int level)
 {
     __check_lock_level(level);
-    read_lock(&l->lock);
+    percpu_read_lock(p2m_percpu_rwlock, &l->lock);
     /* There's nowhere to store the per-CPU unlock level so we can't
      * set the lock level. */
 }
 
 static inline void mm_read_unlock(mm_rwlock_t *l)
 {
-    read_unlock(&l->lock);
+    percpu_read_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
 /* This wrapper uses the line number to express the locking order below */
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ed0bbd7..a45ee35 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -54,6 +54,7 @@ boolean_param("hap_2mb", opt_hap_2mb);
 #undef page_to_mfn
 #define page_to_mfn(_pg) _mfn(__page_to_mfn(_pg))
 
+DEFINE_PERCPU_RWLOCK_GLOBAL(p2m_percpu_rwlock);
 
 /* Init the datastructures for later use by the p2m code */
 static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 67b34c6..78511dd 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -566,7 +566,7 @@ typedef struct mm_lock {
 } mm_lock_t;
 
 typedef struct mm_rwlock {
-    rwlock_t           lock;
+    percpu_rwlock_t    lock;
     int                unlock_level;
     int                recurse_count;
     int                locker; /* CPU that holds the write lock */
-- 
1.7.12.4


* Re: [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
  2015-12-18 10:06 ` [PATCHv4 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
@ 2015-12-18 11:37   ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2015-12-18 11:37 UTC (permalink / raw)
  To: malcolm.crossley
  Cc: keir, ian.campbell, george.dunlap, andrew.cooper3,
	dario.faggioli, Marcos.Matsunaga, stefano.stabellini, xen-devel

>>> On 18.12.15 at 11:06, <malcolm.crossley@citrix.com> wrote:
> @@ -492,6 +494,50 @@ int _rw_is_write_locked(rwlock_t *lock)
>      return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
>  }
>  
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> +                percpu_rwlock_t *percpu_rwlock)
> +{
> +    unsigned int cpu;
> +    cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
> +
> +#ifndef NDEBUG
> +    /* Validate the correct per_cpudata variable has been provided. */
> +    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
> +#endif

At first glance the #ifndef looks plain superfluous, but I
understand that it's needed because we once decided to make
ASSERT() evaluate its condition even on non-debug builds.
However, to unclutter the use sites, you should probably put
this into a macro or inline function.
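
(For illustration only, such a helper might look something like the sketch
below; this is just one possible shape of the suggestion, not committed code.)

    /* Hypothetical helper; only checks the owner in debug builds. */
    #ifndef NDEBUG
    #define percpu_rwlock_assert_owner(data, lock) \
        ASSERT((data) == (lock)->percpu_owner)
    #else
    #define percpu_rwlock_assert_owner(data, lock) ((void)0)
    #endif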

Everything else looks fine to me now.

Jan


* Re: [PATCHv4 2/3] grant_table: convert grant table rwlock to percpu rwlock
  2015-12-18 10:06 ` [PATCHv4 2/3] grant_table: convert grant table rwlock to percpu rwlock Malcolm Crossley
@ 2015-12-18 11:41   ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2015-12-18 11:41 UTC (permalink / raw)
  To: malcolm.crossley
  Cc: keir, ian.campbell, george.dunlap, andrew.cooper3,
	dario.faggioli, Marcos.Matsunaga, stefano.stabellini, xen-devel

>>> On 18.12.15 at 11:06, <malcolm.crossley@citrix.com> wrote:
> --- a/xen/include/xen/grant_table.h
> +++ b/xen/include/xen/grant_table.h
> @@ -51,13 +51,35 @@
>  /* The maximum size of a grant table. */
>  extern unsigned int max_grant_frames;
>  
> +DECLARE_PERCPU_RWLOCK_GLOBAL(grant_rwlock);
> +
> +static inline void grant_percpu_read_lock(percpu_rwlock_t *lock)
> +{
> +    percpu_read_lock(grant_rwlock, lock);
> +}
> +
> +static inline void grant_percpu_read_unlock(percpu_rwlock_t *lock)
> +{
> +    percpu_read_unlock(grant_rwlock, lock);
> +}
> +
> +static inline void grant_percpu_write_lock(percpu_rwlock_t *lock)
> +{
> +    percpu_write_lock(grant_rwlock, lock);
> +}
> +
> +static inline void grant_percpu_write_unlock(percpu_rwlock_t *lock)
> +{
> +    percpu_write_unlock(grant_rwlock, lock);
> +}

These should all be taking struct grant_table * arguments, so they
can't be accidentally used on some other per-CPU r/w lock. I also
question the need for the "percpu" in their names.
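
(Illustratively, the suggested shape might be along the lines of the sketch
below; the form eventually committed may well differ.)

    /* Hypothetical typed wrapper, per the suggestion above. */
    static inline void grant_read_lock(struct grant_table *gt)
    {
        percpu_read_lock(grant_rwlock, &gt->lock);
    }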

Jan

