xen-devel.lists.xenproject.org archive mirror
* [PATCH 0/3] x86/mm-locks: add a bias to current domain lock levels
@ 2018-12-18 16:05 Roger Pau Monne
  2018-12-18 16:05 ` [PATCH 1/3] x86/mm-locks: remove trailing whitespace Roger Pau Monne
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Roger Pau Monne @ 2018-12-18 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne

Hello,

The following series attempts to fix an mm lock level issue that prevents
paging callers (like a PVH Dom0) from using paging_log_dirty_op. The
discussion that led to this series can be found at:

https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01197.html

Thanks, Roger.

Roger Pau Monne (3):
  x86/mm-locks: remove trailing whitespace
  x86/mm-locks: convert some macros to inline functions
  x86/mm-locks: apply a bias to lock levels for current domain

 xen/arch/x86/mm/mm-locks.h | 217 ++++++++++++++++++++++---------------
 xen/arch/x86/mm/p2m-pod.c  |   5 +-
 2 files changed, 130 insertions(+), 92 deletions(-)

-- 
2.19.2



* [PATCH 1/3] x86/mm-locks: remove trailing whitespace
  2018-12-18 16:05 [PATCH 0/3] x86/mm-locks: add a bias to current domain lock levels Roger Pau Monne
@ 2018-12-18 16:05 ` Roger Pau Monne
  2018-12-19 10:55   ` George Dunlap
  2018-12-18 16:05 ` [PATCH 2/3] x86/mm-locks: convert some macros to inline functions Roger Pau Monne
  2018-12-18 16:05 ` [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain Roger Pau Monne
  2 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monne @ 2018-12-18 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm/mm-locks.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 95295b62d2..64b8775a6d 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -3,11 +3,11 @@
  *
  * Spinlocks used by the code in arch/x86/mm.
  *
- * Copyright (c) 2011 Citrix Systems, inc. 
+ * Copyright (c) 2011 Citrix Systems, inc.
  * Copyright (c) 2007 Advanced Micro Devices (Wei Huang)
  * Copyright (c) 2006-2007 XenSource Inc.
  * Copyright (c) 2006 Michael A Fetterman
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -41,7 +41,7 @@ static inline void mm_lock_init(mm_lock_t *l)
     l->unlock_level = 0;
 }
 
-static inline int mm_locked_by_me(mm_lock_t *l) 
+static inline int mm_locked_by_me(mm_lock_t *l)
 {
     return (l->lock.recurse_cpu == current->processor);
 }
@@ -67,7 +67,7 @@ do {                                \
 
 static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
 {
-    if ( !((mm_locked_by_me(l)) && rec) ) 
+    if ( !((mm_locked_by_me(l)) && rec) )
         __check_lock_level(level);
     spin_lock_recursive(&l->lock);
     if ( l->lock.recurse_cnt == 1 )
@@ -186,7 +186,7 @@ static inline void mm_unlock(mm_lock_t *l)
     spin_unlock_recursive(&l->lock);
 }
 
-static inline void mm_enforce_order_unlock(int unlock_level, 
+static inline void mm_enforce_order_unlock(int unlock_level,
                                             unsigned short *recurse_count)
 {
     if ( recurse_count )
@@ -310,7 +310,7 @@ declare_mm_rwlock(altp2m);
 #define gfn_locked_by_me(p,g) p2m_locked_by_me(p)
 
 /* PoD lock (per-p2m-table)
- * 
+ *
  * Protects private PoD data structs: entry and cache
  * counts, page lists, sweep parameters. */
 
@@ -322,7 +322,7 @@ declare_mm_lock(pod)
 
 /* Page alloc lock (per-domain)
  *
- * This is an external lock, not represented by an mm_lock_t. However, 
+ * This is an external lock, not represented by an mm_lock_t. However,
  * pod code uses it in conjunction with the p2m lock, and expecting
  * the ordering which we enforce here.
  * The lock is not recursive. */
@@ -338,13 +338,13 @@ declare_mm_order_constraint(page_alloc)
  * For shadow pagetables, this lock protects
  *   - all changes to shadow page table pages
  *   - the shadow hash table
- *   - the shadow page allocator 
+ *   - the shadow page allocator
  *   - all changes to guest page table pages
  *   - all changes to the page_info->tlbflush_timestamp
- *   - the page_info->count fields on shadow pages 
- * 
- * For HAP, it protects the NPT/EPT tables and mode changes. 
- * 
+ *   - the page_info->count fields on shadow pages
+ *
+ * For HAP, it protects the NPT/EPT tables and mode changes.
+ *
  * It also protects the log-dirty bitmap from concurrent accesses (and
  * teardowns, etc). */
 
-- 
2.19.2



* [PATCH 2/3] x86/mm-locks: convert some macros to inline functions
  2018-12-18 16:05 [PATCH 0/3] x86/mm-locks: add a bias to current domain lock levels Roger Pau Monne
  2018-12-18 16:05 ` [PATCH 1/3] x86/mm-locks: remove trailing whitespace Roger Pau Monne
@ 2018-12-18 16:05 ` Roger Pau Monne
  2018-12-19 11:01   ` George Dunlap
  2018-12-18 16:05 ` [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain Roger Pau Monne
  2 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monne @ 2018-12-18 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

And rename to have only one prefix underscore where applicable.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm/mm-locks.h | 96 ++++++++++++++++++++------------------
 1 file changed, 51 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 64b8775a6d..d3497713e9 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -29,7 +29,6 @@
 
 /* Per-CPU variable for enforcing the lock ordering */
 DECLARE_PER_CPU(int, mm_lock_level);
-#define __get_lock_level()  (this_cpu(mm_lock_level))
 
 DECLARE_PERCPU_RWLOCK_GLOBAL(p2m_percpu_rwlock);
 
@@ -46,43 +45,47 @@ static inline int mm_locked_by_me(mm_lock_t *l)
     return (l->lock.recurse_cpu == current->processor);
 }
 
+static inline int _get_lock_level(void)
+{
+    return this_cpu(mm_lock_level);
+}
+
 /*
  * If you see this crash, the numbers printed are order levels defined
  * in this file.
  */
-#define __check_lock_level(l)                           \
-do {                                                    \
-    if ( unlikely(__get_lock_level() > (l)) )           \
-    {                                                   \
-        printk("mm locking order violation: %i > %i\n", \
-               __get_lock_level(), (l));                \
-        BUG();                                          \
-    }                                                   \
-} while(0)
-
-#define __set_lock_level(l)         \
-do {                                \
-    __get_lock_level() = (l);       \
-} while(0)
+static inline void _check_lock_level(int l)
+{
+    if ( unlikely(_get_lock_level() > l) )
+    {
+        printk("mm locking order violation: %i > %i\n", _get_lock_level(), l);
+        BUG();
+    }
+}
+
+static inline void _set_lock_level(int l)
+{
+    this_cpu(mm_lock_level) = l;
+}
 
 static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
 {
     if ( !((mm_locked_by_me(l)) && rec) )
-        __check_lock_level(level);
+        _check_lock_level(level);
     spin_lock_recursive(&l->lock);
     if ( l->lock.recurse_cnt == 1 )
     {
         l->locker_function = func;
-        l->unlock_level = __get_lock_level();
+        l->unlock_level = _get_lock_level();
     }
     else if ( (unlikely(!rec)) )
         panic("mm lock already held by %s\n", l->locker_function);
-    __set_lock_level(level);
+    _set_lock_level(level);
 }
 
 static inline void _mm_enforce_order_lock_pre(int level)
 {
-    __check_lock_level(level);
+    _check_lock_level(level);
 }
 
 static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
@@ -92,12 +95,12 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
     {
         if ( (*recurse_count)++ == 0 )
         {
-            *unlock_level = __get_lock_level();
+            *unlock_level = _get_lock_level();
         }
     } else {
-        *unlock_level = __get_lock_level();
+        *unlock_level = _get_lock_level();
     }
-    __set_lock_level(level);
+    _set_lock_level(level);
 }
 
 
@@ -118,12 +121,12 @@ static inline void _mm_write_lock(mm_rwlock_t *l, const char *func, int level)
 {
     if ( !mm_write_locked_by_me(l) )
     {
-        __check_lock_level(level);
+        _check_lock_level(level);
         percpu_write_lock(p2m_percpu_rwlock, &l->lock);
         l->locker = get_processor_id();
         l->locker_function = func;
-        l->unlock_level = __get_lock_level();
-        __set_lock_level(level);
+        l->unlock_level = _get_lock_level();
+        _set_lock_level(level);
     }
     l->recurse_count++;
 }
@@ -134,13 +137,13 @@ static inline void mm_write_unlock(mm_rwlock_t *l)
         return;
     l->locker = -1;
     l->locker_function = "nobody";
-    __set_lock_level(l->unlock_level);
+    _set_lock_level(l->unlock_level);
     percpu_write_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
 static inline void _mm_read_lock(mm_rwlock_t *l, int level)
 {
-    __check_lock_level(level);
+    _check_lock_level(level);
     percpu_read_lock(p2m_percpu_rwlock, &l->lock);
     /* There's nowhere to store the per-CPU unlock level so we can't
      * set the lock level. */
@@ -181,7 +184,7 @@ static inline void mm_unlock(mm_lock_t *l)
     if ( l->lock.recurse_cnt == 1 )
     {
         l->locker_function = "nobody";
-        __set_lock_level(l->unlock_level);
+        _set_lock_level(l->unlock_level);
     }
     spin_unlock_recursive(&l->lock);
 }
@@ -194,10 +197,10 @@ static inline void mm_enforce_order_unlock(int unlock_level,
         BUG_ON(*recurse_count == 0);
         if ( (*recurse_count)-- == 1 )
         {
-            __set_lock_level(unlock_level);
+            _set_lock_level(unlock_level);
         }
     } else {
-        __set_lock_level(unlock_level);
+        _set_lock_level(unlock_level);
     }
 }
 
@@ -287,21 +290,24 @@ declare_mm_lock(altp2mlist)
 
 #define MM_LOCK_ORDER_altp2m                 40
 declare_mm_rwlock(altp2m);
-#define p2m_lock(p)                             \
-    do {                                        \
-        if ( p2m_is_altp2m(p) )                 \
-            mm_write_lock(altp2m, &(p)->lock);  \
-        else                                    \
-            mm_write_lock(p2m, &(p)->lock);     \
-        (p)->defer_flush++;                     \
-    } while (0)
-#define p2m_unlock(p)                           \
-    do {                                        \
-        if ( --(p)->defer_flush == 0 )          \
-            p2m_unlock_and_tlb_flush(p);        \
-        else                                    \
-            mm_write_unlock(&(p)->lock);        \
-    } while (0)
+
+static inline void p2m_lock(struct p2m_domain *p)
+{
+    if ( p2m_is_altp2m(p) )
+        mm_write_lock(altp2m, &p->lock);
+    else
+        mm_write_lock(p2m, &p->lock);
+    p->defer_flush++;
+}
+
+static inline void p2m_unlock(struct p2m_domain *p)
+{
+    if ( --p->defer_flush == 0 )
+        p2m_unlock_and_tlb_flush(p);
+    else
+        mm_write_unlock(&p->lock);
+}
+
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
 #define p2m_read_lock(p)      mm_read_lock(p2m, &(p)->lock)
-- 
2.19.2



* [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-18 16:05 [PATCH 0/3] x86/mm-locks: add a bias to current domain lock levels Roger Pau Monne
  2018-12-18 16:05 ` [PATCH 1/3] x86/mm-locks: remove trailing whitespace Roger Pau Monne
  2018-12-18 16:05 ` [PATCH 2/3] x86/mm-locks: convert some macros to inline functions Roger Pau Monne
@ 2018-12-18 16:05 ` Roger Pau Monne
  2018-12-19 11:40   ` George Dunlap
  2 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monne @ 2018-12-18 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	Roger Pau Monne

The paging_log_dirty_op function takes mm locks of a subject domain and
then attempts to perform copy-to operations against the caller domain
in order to copy the result of the hypercall into the caller-provided
buffer.

This works fine when the caller is a non-paging domain, but triggers a
lock order panic when the caller is a paging domain: at the point where
the copy is performed the subject domain's paging lock is already held,
and the copy operation requires taking the caller's p2m lock, which has
a lower level.

Fix this limitation by adding a bias to the level of the caller domain's
mm locks, so that the lowest caller domain mm lock always has a level
greater than the highest subject domain lock level. This allows locking
the subject domain's mm locks and then locking the caller domain's mm
locks, while keeping the same lock ordering and the changes mostly
confined to mm-locks.h.
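
As an illustration (not part of the patch itself, and assuming the
existing p2m lock order value of 16 from mm-locks.h), the previously
failing flow for a PVH Dom0 caller becomes:

    paging_lock(subject)       level 64 (MM_LOCK_ORDER_paging, no bias)
    copy to the caller buffer
      -> caller's p2m lock     level 16 + MM_LOCK_ORDER_MAX = 80

80 > 64, so _check_lock_level() accepts the nesting; without the bias
the caller's p2m level (16) would be below the already recorded 64 and
trigger the ordering BUG().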

Note that so far only this flow (locking a subject domain's locks and
then the caller domain's ones) has been identified, but not all possible
code paths have been inspected. Hence this solution attempts to be a
non-intrusive fix for the problem at hand, without ruling out further
changes in the future if other valid code paths are found that require
more complex lock level ordering.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/mm/mm-locks.h | 119 +++++++++++++++++++++++--------------
 xen/arch/x86/mm/p2m-pod.c  |   5 +-
 2 files changed, 78 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index d3497713e9..055f63e7ec 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -50,15 +50,35 @@ static inline int _get_lock_level(void)
     return this_cpu(mm_lock_level);
 }
 
+#define MM_LOCK_ORDER_MAX                    64
+/*
+ * Return the lock level taking the domain bias into account. If the domain
+ * matches the currently running one a bias of MM_LOCK_ORDER_MAX is applied to
+ * the lock level, so that mm locks that belong to the caller domain can be
+ * acquired after having acquired mm locks of a subject domain.
+ *
+ * This is required in order to use some hypercalls from a paging domain that
+ * take locks of a subject domain and then attempt to copy data to/from the
+ * caller domain.
+ */
+static inline int _lock_level(const struct domain *d, int l)
+{
+    ASSERT(l <= MM_LOCK_ORDER_MAX);
+
+    return l + (d == current->domain ? MM_LOCK_ORDER_MAX : 0);
+}
+
 /*
  * If you see this crash, the numbers printed are order levels defined
  * in this file.
  */
-static inline void _check_lock_level(int l)
+static inline void _check_lock_level(const struct domain *d, int l)
 {
-    if ( unlikely(_get_lock_level() > l) )
+    int lvl = _lock_level(d, l);
+
+    if ( unlikely(_get_lock_level() > lvl) )
     {
-        printk("mm locking order violation: %i > %i\n", _get_lock_level(), l);
+        printk("mm locking order violation: %i > %i\n", _get_lock_level(), lvl);
         BUG();
     }
 }
@@ -68,10 +88,11 @@ static inline void _set_lock_level(int l)
     this_cpu(mm_lock_level) = l;
 }
 
-static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
+static inline void _mm_lock(const struct domain *d, mm_lock_t *l,
+                            const char *func, int level, int rec)
 {
     if ( !((mm_locked_by_me(l)) && rec) )
-        _check_lock_level(level);
+        _check_lock_level(d, level);
     spin_lock_recursive(&l->lock);
     if ( l->lock.recurse_cnt == 1 )
     {
@@ -80,16 +101,17 @@ static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
     }
     else if ( (unlikely(!rec)) )
         panic("mm lock already held by %s\n", l->locker_function);
-    _set_lock_level(level);
+    _set_lock_level(_lock_level(d, level));
 }
 
-static inline void _mm_enforce_order_lock_pre(int level)
+static inline void _mm_enforce_order_lock_pre(const struct domain *d, int level)
 {
-    _check_lock_level(level);
+    _check_lock_level(d, level);
 }
 
-static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
-                                                unsigned short *recurse_count)
+static inline void _mm_enforce_order_lock_post(const struct domain *d, int level,
+                                               int *unlock_level,
+                                               unsigned short *recurse_count)
 {
     if ( recurse_count )
     {
@@ -100,7 +122,7 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
     } else {
         *unlock_level = _get_lock_level();
     }
-    _set_lock_level(level);
+    _set_lock_level(_lock_level(d, level));
 }
 
 
@@ -117,16 +139,17 @@ static inline int mm_write_locked_by_me(mm_rwlock_t *l)
     return (l->locker == get_processor_id());
 }
 
-static inline void _mm_write_lock(mm_rwlock_t *l, const char *func, int level)
+static inline void _mm_write_lock(const struct domain *d, mm_rwlock_t *l,
+                                  const char *func, int level)
 {
     if ( !mm_write_locked_by_me(l) )
     {
-        _check_lock_level(level);
+        _check_lock_level(d, level);
         percpu_write_lock(p2m_percpu_rwlock, &l->lock);
         l->locker = get_processor_id();
         l->locker_function = func;
         l->unlock_level = _get_lock_level();
-        _set_lock_level(level);
+        _set_lock_level(_lock_level(d, level));
     }
     l->recurse_count++;
 }
@@ -141,9 +164,10 @@ static inline void mm_write_unlock(mm_rwlock_t *l)
     percpu_write_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
-static inline void _mm_read_lock(mm_rwlock_t *l, int level)
+static inline void _mm_read_lock(const struct domain *d, mm_rwlock_t *l,
+                                 int level)
 {
-    _check_lock_level(level);
+    _check_lock_level(d, level);
     percpu_read_lock(p2m_percpu_rwlock, &l->lock);
     /* There's nowhere to store the per-CPU unlock level so we can't
      * set the lock level. */
@@ -156,28 +180,32 @@ static inline void mm_read_unlock(mm_rwlock_t *l)
 
 /* This wrapper uses the line number to express the locking order below */
 #define declare_mm_lock(name)                                                 \
-    static inline void mm_lock_##name(mm_lock_t *l, const char *func, int rec)\
-    { _mm_lock(l, func, MM_LOCK_ORDER_##name, rec); }
+    static inline void mm_lock_##name(const struct domain *d, mm_lock_t *l,   \
+                                      const char *func, int rec)              \
+    { _mm_lock(d, l, func, MM_LOCK_ORDER_##name, rec); }
 #define declare_mm_rwlock(name)                                               \
-    static inline void mm_write_lock_##name(mm_rwlock_t *l, const char *func) \
-    { _mm_write_lock(l, func, MM_LOCK_ORDER_##name); }                                    \
-    static inline void mm_read_lock_##name(mm_rwlock_t *l)                    \
-    { _mm_read_lock(l, MM_LOCK_ORDER_##name); }
+    static inline void mm_write_lock_##name(const struct domain *d,           \
+                                            mm_rwlock_t *l, const char *func) \
+    { _mm_write_lock(d, l, func, MM_LOCK_ORDER_##name); }                     \
+    static inline void mm_read_lock_##name(const struct domain *d,            \
+                                           mm_rwlock_t *l)                    \
+    { _mm_read_lock(d, l, MM_LOCK_ORDER_##name); }
 /* These capture the name of the calling function */
-#define mm_lock(name, l) mm_lock_##name(l, __func__, 0)
-#define mm_lock_recursive(name, l) mm_lock_##name(l, __func__, 1)
-#define mm_write_lock(name, l) mm_write_lock_##name(l, __func__)
-#define mm_read_lock(name, l) mm_read_lock_##name(l)
+#define mm_lock(name, d, l) mm_lock_##name(d, l, __func__, 0)
+#define mm_lock_recursive(name, d, l) mm_lock_##name(d, l, __func__, 1)
+#define mm_write_lock(name, d, l) mm_write_lock_##name(d, l, __func__)
+#define mm_read_lock(name, d, l) mm_read_lock_##name(d, l)
 
 /* This wrapper is intended for "external" locks which do not use
  * the mm_lock_t types. Such locks inside the mm code are also subject
  * to ordering constraints. */
-#define declare_mm_order_constraint(name)                                   \
-    static inline void mm_enforce_order_lock_pre_##name(void)               \
-    { _mm_enforce_order_lock_pre(MM_LOCK_ORDER_##name); }                               \
-    static inline void mm_enforce_order_lock_post_##name(                   \
-                        int *unlock_level, unsigned short *recurse_count)   \
-    { _mm_enforce_order_lock_post(MM_LOCK_ORDER_##name, unlock_level, recurse_count); } \
+#define declare_mm_order_constraint(name)                                       \
+    static inline void mm_enforce_order_lock_pre_##name(const struct domain *d) \
+    { _mm_enforce_order_lock_pre(d, MM_LOCK_ORDER_##name); }                    \
+    static inline void mm_enforce_order_lock_post_##name(const struct domain *d,\
+                        int *unlock_level, unsigned short *recurse_count)       \
+    { _mm_enforce_order_lock_post(d, MM_LOCK_ORDER_##name, unlock_level,        \
+                                  recurse_count); }
 
 static inline void mm_unlock(mm_lock_t *l)
 {
@@ -221,7 +249,7 @@ static inline void mm_enforce_order_unlock(int unlock_level,
 
 #define MM_LOCK_ORDER_nestedp2m               8
 declare_mm_lock(nestedp2m)
-#define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
+#define nestedp2m_lock(d)   mm_lock(nestedp2m, d, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
 /* P2M lock (per-non-alt-p2m-table)
@@ -260,9 +288,10 @@ declare_mm_rwlock(p2m);
 
 #define MM_LOCK_ORDER_per_page_sharing       24
 declare_mm_order_constraint(per_page_sharing)
-#define page_sharing_mm_pre_lock()   mm_enforce_order_lock_pre_per_page_sharing()
+#define page_sharing_mm_pre_lock() \
+        mm_enforce_order_lock_pre_per_page_sharing(NULL)
 #define page_sharing_mm_post_lock(l, r) \
-        mm_enforce_order_lock_post_per_page_sharing((l), (r))
+        mm_enforce_order_lock_post_per_page_sharing(NULL, (l), (r))
 #define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r))
 
 /* Alternate P2M list lock (per-domain)
@@ -275,7 +304,8 @@ declare_mm_order_constraint(per_page_sharing)
 
 #define MM_LOCK_ORDER_altp2mlist             32
 declare_mm_lock(altp2mlist)
-#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, d, \
+                                      &(d)->arch.altp2m_list_lock)
 #define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
 
 /* P2M lock (per-altp2m-table)
@@ -294,9 +324,9 @@ declare_mm_rwlock(altp2m);
 static inline void p2m_lock(struct p2m_domain *p)
 {
     if ( p2m_is_altp2m(p) )
-        mm_write_lock(altp2m, &p->lock);
+        mm_write_lock(altp2m, p->domain, &p->lock);
     else
-        mm_write_lock(p2m, &p->lock);
+        mm_write_lock(p2m, p->domain, &p->lock);
     p->defer_flush++;
 }
 
@@ -310,7 +340,7 @@ static inline void p2m_unlock(struct p2m_domain *p)
 
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
-#define p2m_read_lock(p)      mm_read_lock(p2m, &(p)->lock)
+#define p2m_read_lock(p)      mm_read_lock(p2m, (p)->domain, &(p)->lock)
 #define p2m_read_unlock(p)    mm_read_unlock(&(p)->lock)
 #define p2m_locked_by_me(p)   mm_write_locked_by_me(&(p)->lock)
 #define gfn_locked_by_me(p,g) p2m_locked_by_me(p)
@@ -322,7 +352,7 @@ static inline void p2m_unlock(struct p2m_domain *p)
 
 #define MM_LOCK_ORDER_pod                    48
 declare_mm_lock(pod)
-#define pod_lock(p)           mm_lock(pod, &(p)->pod.lock)
+#define pod_lock(p)           mm_lock(pod, (p)->domain, &(p)->pod.lock)
 #define pod_unlock(p)         mm_unlock(&(p)->pod.lock)
 #define pod_locked_by_me(p)   mm_locked_by_me(&(p)->pod.lock)
 
@@ -335,8 +365,9 @@ declare_mm_lock(pod)
 
 #define MM_LOCK_ORDER_page_alloc             56
 declare_mm_order_constraint(page_alloc)
-#define page_alloc_mm_pre_lock()   mm_enforce_order_lock_pre_page_alloc()
-#define page_alloc_mm_post_lock(l) mm_enforce_order_lock_post_page_alloc(&(l), NULL)
+#define page_alloc_mm_pre_lock(d)  mm_enforce_order_lock_pre_page_alloc(d)
+#define page_alloc_mm_post_lock(d, l) \
+        mm_enforce_order_lock_post_page_alloc(d, &(l), NULL)
 #define page_alloc_mm_unlock(l)    mm_enforce_order_unlock((l), NULL)
 
 /* Paging lock (per-domain)
@@ -356,9 +387,9 @@ declare_mm_order_constraint(page_alloc)
 
 #define MM_LOCK_ORDER_paging                 64
 declare_mm_lock(paging)
-#define paging_lock(d)         mm_lock(paging, &(d)->arch.paging.lock)
+#define paging_lock(d)         mm_lock(paging, d, &(d)->arch.paging.lock)
 #define paging_lock_recursive(d) \
-                    mm_lock_recursive(paging, &(d)->arch.paging.lock)
+                    mm_lock_recursive(paging, d, &(d)->arch.paging.lock)
 #define paging_unlock(d)       mm_unlock(&(d)->arch.paging.lock)
 #define paging_locked_by_me(d) mm_locked_by_me(&(d)->arch.paging.lock)
 
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 4c56cb58c6..4313863066 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -34,9 +34,10 @@
 /* Enforce lock ordering when grabbing the "external" page_alloc lock */
 static inline void lock_page_alloc(struct p2m_domain *p2m)
 {
-    page_alloc_mm_pre_lock();
+    page_alloc_mm_pre_lock(p2m->domain);
     spin_lock(&(p2m->domain->page_alloc_lock));
-    page_alloc_mm_post_lock(p2m->domain->arch.page_alloc_unlock_level);
+    page_alloc_mm_post_lock(p2m->domain,
+                            p2m->domain->arch.page_alloc_unlock_level);
 }
 
 static inline void unlock_page_alloc(struct p2m_domain *p2m)
-- 
2.19.2



* Re: [PATCH 1/3] x86/mm-locks: remove trailing whitespace
  2018-12-18 16:05 ` [PATCH 1/3] x86/mm-locks: remove trailing whitespace Roger Pau Monne
@ 2018-12-19 10:55   ` George Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: George Dunlap @ 2018-12-19 10:55 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> No functional change.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>


* Re: [PATCH 2/3] x86/mm-locks: convert some macros to inline functions
  2018-12-18 16:05 ` [PATCH 2/3] x86/mm-locks: convert some macros to inline functions Roger Pau Monne
@ 2018-12-19 11:01   ` George Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: George Dunlap @ 2018-12-19 11:01 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> And rename to have only one prefix underscore where applicable.
> 
> No functional change.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-18 16:05 ` [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain Roger Pau Monne
@ 2018-12-19 11:40   ` George Dunlap
  2018-12-19 12:10     ` Roger Pau Monné
  0 siblings, 1 reply; 28+ messages in thread
From: George Dunlap @ 2018-12-19 11:40 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich, Tim Deegan

On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> paging_log_dirty_op function takes mm locks from a subject domain and
> then attempts to perform copy to operations against the caller
> domain in order to copy the result of the hypercall into the caller
> provided buffer.
> 
> This works fine when the caller is a non-paging domain, but triggers a
> lock order panic when the caller is a paging domain due to the fact
> that at the point where the copy to operation is performed the subject
> domain paging lock is locked, and the copy operation requires locking
> the caller p2m lock which has a lower level.
> 
> Fix this limitation by adding a bias to the level of the caller domain
> mm locks, so that the lower caller domain mm lock always has a level
> greater than the higher subject domain lock level. This allows locking
> the subject domain mm locks and then locking the caller domain mm
> locks, while keeping the same lock ordering and the changes mostly
> confined to mm-locks.h.
> 
> Note that so far only this flow (locking a subject domain locks and
> then the caller domain ones) has been identified, but not all possible
> code paths have been inspected. Hence this solution attempts to be a
> non-intrusive fix for the problem at hand, without discarding further
> changes in the future if other valid code paths are found that require
> more complex lock level ordering.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

As a quick fix I think this general approach is OK; the thing I don't
like is that it's symmetric.  We don't *expect* to ever have a situation
where A grabs one of its own MM locks and then one of B's, *and* B then
grabs one of its own locks and then A's; but it could happen.

Since we've generally identified dom0, which may be grabbing locks of a
PVH stubdom, which may be grabbing locks of a normal domU, would it be
possible / make sense instead to give a 2x bonus for dom0, and a 1x
bonus for "is_priv_for" domains?

That way we enforce that all target locks are acquired before all
stubdomain locks, and all stubdomain locks are acquired before all dom0
locks: we have a total ordering of locks (not counting between "peer" VMs).
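
For concreteness, a minimal sketch of that idea (illustrative only; the
exact predicates are assumptions, with d->target standing in for the
"is_priv_for" relation and the NULL case covering callers such as the
page_sharing one that pass no domain):

    static inline int _lock_level(const struct domain *d, int l)
    {
        ASSERT(l <= MM_LOCK_ORDER_MAX);

        if ( !d )
            return l;
        if ( is_control_domain(d) )      /* dom0: 2x bonus */
            return l + 2 * MM_LOCK_ORDER_MAX;
        if ( d->target )                 /* stubdomain: 1x bonus */
            return l + MM_LOCK_ORDER_MAX;

        return l;                        /* plain domU: no bonus */
    }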

I suspect, however, that at some point we'll discover some path where
the order will need to be reversed; at which point we'll need to figure
out something else.

I also think it would be good to start recording specific instances of
ordering requirements, so that we don't have to go track them down.
(It's not immediately obvious to me, for instance, why the paging lock
is so far down the list.)  Could you add somewhere into the comments in
this section something like this:

paging_log_dirty_op grabs tgt paging_lock, then does copy_to_user which
grabs dom0's P2M lock.

The other thing we could do is generate the lock level as (OWNER_TYPE <<
11) | (LOCK_TYPE << 5) | (domain_id), where OWNER_TYPE is 2 for dom0, 1
for stub domains, 0 for normal domains, and LOCK_TYPE is the current
lock level; and then fail unless the new lock level is strictly greater
than the current one.
That would flag up any potential inter-domain deadlocks as well.
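
As a rough sketch of that (illustrative only; the field widths here are
widened compared with the shifts above so that the existing order
values, which run up to 64, and real domain IDs actually fit):

    /* Compose an effective level from (owner class, lock order, domid). */
    static inline int composite_lock_level(const struct domain *d, int order)
    {
        int owner = !d ? 0
                    : is_control_domain(d) ? 2
                    : d->target ? 1 : 0;

        return (owner << 24) | (order << 16) | (d ? d->domain_id : 0);
    }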

Thoughts?

 -George


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 11:40   ` George Dunlap
@ 2018-12-19 12:10     ` Roger Pau Monné
  2018-12-19 12:17       ` Andrew Cooper
  2018-12-19 12:38       ` George Dunlap
  0 siblings, 2 replies; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-19 12:10 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> > paging_log_dirty_op function takes mm locks from a subject domain and
> > then attempts to perform copy to operations against the caller
> > domain in order to copy the result of the hypercall into the caller
> > provided buffer.
> > 
> > This works fine when the caller is a non-paging domain, but triggers a
> > lock order panic when the caller is a paging domain due to the fact
> > that at the point where the copy to operation is performed the subject
> > domain paging lock is locked, and the copy operation requires locking
> > the caller p2m lock which has a lower level.
> > 
> > Fix this limitation by adding a bias to the level of the caller domain
> > mm locks, so that the lower caller domain mm lock always has a level
> > greater than the higher subject domain lock level. This allows locking
> > the subject domain mm locks and then locking the caller domain mm
> > locks, while keeping the same lock ordering and the changes mostly
> > confined to mm-locks.h.
> > 
> > Note that so far only this flow (locking a subject domain locks and
> > then the caller domain ones) has been identified, but not all possible
> > code paths have been inspected. Hence this solution attempts to be a
> > non-intrusive fix for the problem at hand, without discarding further
> > changes in the future if other valid code paths are found that require
> > more complex lock level ordering.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> As a quick fix I think this general approach is OK; the thing I don't
> like is that it's symmetric.  We don't *expect* to ever have a situation
> where A grabs one of its own MM locks and then one of B's, *and* B then
> grabs one of its own locks and then A's; but it could happen.

I have not identified such a scenario ATM, but we cannot rule out future
features needing such interlocking I guess. In any case, I think this
is something that would have to be solved when we come across such a
scenario IMO.

> Since we've generally identified dom0 which may be grabbing locks of a
> PVH stubdom, which may be grabbing logs of a normal domU, would it be
> possible / make sense instead to give a 2x bonus for dom0, and a 1x
> bonus for "is_priv_for" domains?

Jan pointed out such a case, but I'm not sure I can see how this is
supposed to happen even given the scenario above. I have to admit,
however, that I'm not that familiar with the mm code, so it's likely I'm
missing something.

Hypercalls AFAIK have a single target (or subject) domain, so even if
there's a stubdomain relation I'm not sure I see why that would
require this kind of locking: any domain can perform hypercalls
against a single subject domain, and the hypervisor itself doesn't
even know about stubdomain relations.

> That way we enforce that all target locks are acquired before all
> stubomain locks, and all stubdomain locks are acquired before all dom0
> locks: we have a total ordering of locks (not counting between "peer" VMs).
> 
> I suspect, however, that at some point we'll discover some path where
> the order will need to be reversed; at which point we'll need to figure
> out something else.

That's indeed possible; I've tried to note it in the last paragraph of
my commit message.

> I also think it would be good to start recording specific instances of
> ordering requirements, so that we don't have to go track them down.
> (It's not immediately obvious to me, for instance, why the paging lock
> is so far down the list.)  Could you add somewhere into the comments in
> this section something like this:
> 
> paging_log_dirty_op grabs tgt paging_lock, then does copy_to_user which
> grabs dom0's P2M lock.
> 
> The other thing we could do is generate the lock level as (OWNER_TYPE <<
> 11) | (LOCK_TYPE << 5) | (domain_id), where OWNER_TYPE is 2 for dom0, 1
> for stub domains, 0 for normal domains, and LOCK_TYPE is the current
> lock level; and then fail if the new lock level >= current lock level.
> That would flag up any potential inter-domain deadlocks as well.

I'm not sure it will catch all inter-domain locks. For example, Xen
would still allow taking the same lock type from a domain with a
higher domain_id than the currently locked one if both have the same
OWNER_TYPE.
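
(E.g., sticking with that encoding and assuming a p2m order of 16: a
path could take domU1's p2m lock, composite (0 << 11) | (16 << 5) | 1,
and then domU2's p2m lock, (0 << 11) | (16 << 5) | 2, and the check
would still pass because the second value is larger.)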

Thanks, Roger.


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:10     ` Roger Pau Monné
@ 2018-12-19 12:17       ` Andrew Cooper
  2018-12-19 14:35         ` Roger Pau Monné
  2018-12-19 12:38       ` George Dunlap
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2018-12-19 12:17 UTC (permalink / raw)
  To: Roger Pau Monné, George Dunlap
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Tim Deegan

On 19/12/2018 12:10, Roger Pau Monné wrote:
> Hypercalls AFAIK have a single target (or subject) domain, so even if
> there's a stubdomain relation I'm not sure I see why that would
> require this kind of locking, any domain can perform hypercalls
> against a single subject domain, and the hypervisor itself doesn't
> even know about stubdomain relations.

Grant copy has two domains, and neither needs to be the callee.
Specifically, we discovered during the SILO work that dom0 occasionally
issues a dom1 => dom2 grant copy.

~Andrew


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:10     ` Roger Pau Monné
  2018-12-19 12:17       ` Andrew Cooper
@ 2018-12-19 12:38       ` George Dunlap
  2018-12-19 12:42         ` Andrew Cooper
                           ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: George Dunlap @ 2018-12-19 12:38 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On 12/19/18 12:10 PM, Roger Pau Monné wrote:
> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>> then attempts to perform copy to operations against the caller
>>> domain in order to copy the result of the hypercall into the caller
>>> provided buffer.
>>>
>>> This works fine when the caller is a non-paging domain, but triggers a
>>> lock order panic when the caller is a paging domain due to the fact
>>> that at the point where the copy to operation is performed the subject
>>> domain paging lock is locked, and the copy operation requires locking
>>> the caller p2m lock which has a lower level.
>>>
>>> Fix this limitation by adding a bias to the level of the caller domain
>>> mm locks, so that the lower caller domain mm lock always has a level
>>> greater than the higher subject domain lock level. This allows locking
>>> the subject domain mm locks and then locking the caller domain mm
>>> locks, while keeping the same lock ordering and the changes mostly
>>> confined to mm-locks.h.
>>>
>>> Note that so far only this flow (locking a subject domain locks and
>>> then the caller domain ones) has been identified, but not all possible
>>> code paths have been inspected. Hence this solution attempts to be a
>>> non-intrusive fix for the problem at hand, without discarding further
>>> changes in the future if other valid code paths are found that require
>>> more complex lock level ordering.
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> As a quick fix I think this general approach is OK; the thing I don't
>> like is that it's symmetric.  We don't *expect* to ever have a situation
>> where A grabs one of its own MM locks and then one of B's, *and* B then
>> grabs one of its own locks and then A's; but it could happen.
> 
> I have not identified such scenario ATM, but we cannot discard future
> features needing such interlocking I guess. In any case, I think this
> is something that would have to be solved when we came across such
> scenario IMO.

Right -- and the purpose of these macros is to make sure that we
discover such potential deadlocks in testing rather than in production.

>> Since we've generally identified dom0 which may be grabbing locks of a
>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>> bonus for "is_priv_for" domains?
> 
> Jan pointed out such case, but I'm not sure I can see how this is
> supposedly to happen even given the scenario above, I have to admit
> however I'm not that familiar with the mm code, so it's likely I'm
> missing something.
> 
> Hypercalls AFAIK have a single target (or subject) domain, so even if
> there's a stubdomain relation I'm not sure I see why that would
> require this kind of locking, any domain can perform hypercalls
> against a single subject domain, and the hypervisor itself doesn't
> even know about stubdomain relations.

We're considering three potential cases:

A. dom0 makes a hypercall w/ domU as a target.
B. dom0 makes a hypercall w/ stubdom as a target.
C. stubdom makes a hypercall w/ domU as a target.

We could give only dom0 a bonus.  In that case, A and B would work, but
C might fail (since stubdom's lock values are the same as domU's).

We could give both dom0 and stubdoms a bonus.  In that case, A and C
would work, but B might fail (since the stubdom's lock values are the
same as dom0's).

Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
double bonus.  That way all 3 work, since dom0's lock values !=
stubdom's lock values, and stubdom's lock values != domU's lock values.
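
Concretely, as a sketch using the MM_LOCK_ORDER_MAX = 64 value from the
patch (and assuming the existing p2m order of 16 and paging order of 64):

    A: domU paging (64)               then dom0 p2m (16 + 128 = 144)
    B: stubdom paging (64 + 64 = 128) then dom0 p2m (16 + 128 = 144)
    C: domU paging (64)               then stubdom p2m (16 + 64 = 80)

Each nesting is strictly increasing, so none of the three cases trips
the order check.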

On the other hand, starting simple and adding things in as you find you
need them isn't a bad approach; so possibly just giving a bonus to dom0
is a good place to start.

>> I also think it would be good to start recording specific instances of
>> ordering requirements, so that we don't have to go track them down.
>> (It's not immediately obvious to me, for instance, why the paging lock
>> is so far down the list.)  Could you add somewhere into the comments in
>> this section something like this:
>>
>> paging_log_dirty_op grabs tgt paging_lock, then does copy_to_user which
>> grabs dom0's P2M lock.
>>
>> The other thing we could do is generate the lock level as (OWNER_TYPE <<
>> 11) | (LOCK_TYPE << 5) | (domain_id), where OWNER_TYPE is 2 for dom0, 1
>> for stub domains, 0 for normal domains, and LOCK_TYPE is the current
>> lock level; and then fail if the new lock level >= current lock level.
>> That would flag up any potential inter-domain deadlocks as well.
> 
> I'm not sure it will catch all inter-domain locks. For example Xen
> would still allow to take the same lock type from a domain with a
> higher domain_id than the currently locked one if both have the same
> OWNER_TYPE.

If we wanted to ASSERT on all interdomain locks, we could just change
the '>' to a '>=' and call it a day.  The idea instead is to allow
inter-domain locking *if* it follows the "grab the lower domid lock
first" discipline (in which case there won't be any deadlocks).

But maybe that's beginning to get into 'feature creep'.  Best leave that
for an improvement patch, not a bug fix.

 -George


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:38       ` George Dunlap
@ 2018-12-19 12:42         ` Andrew Cooper
  2018-12-19 13:55           ` George Dunlap
  2018-12-19 14:30         ` Jan Beulich
  2018-12-19 14:40         ` Roger Pau Monné
  2 siblings, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2018-12-19 12:42 UTC (permalink / raw)
  To: George Dunlap, Roger Pau Monné
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Tim Deegan

On 19/12/2018 12:38, George Dunlap wrote:
> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>> then attempts to perform copy to operations against the caller
>>>> domain in order to copy the result of the hypercall into the caller
>>>> provided buffer.
>>>>
>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>> lock order panic when the caller is a paging domain due to the fact
>>>> that at the point where the copy to operation is performed the subject
>>>> domain paging lock is locked, and the copy operation requires locking
>>>> the caller p2m lock which has a lower level.
>>>>
>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>> greater than the higher subject domain lock level. This allows locking
>>>> the subject domain mm locks and then locking the caller domain mm
>>>> locks, while keeping the same lock ordering and the changes mostly
>>>> confined to mm-locks.h.
>>>>
>>>> Note that so far only this flow (locking a subject domain locks and
>>>> then the caller domain ones) has been identified, but not all possible
>>>> code paths have been inspected. Hence this solution attempts to be a
>>>> non-intrusive fix for the problem at hand, without discarding further
>>>> changes in the future if other valid code paths are found that require
>>>> more complex lock level ordering.
>>>>
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>> As a quick fix I think this general approach is OK; the thing I don't
>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>> grabs one of its own locks and then A's; but it could happen.
>> I have not identified such scenario ATM, but we cannot discard future
>> features needing such interlocking I guess. In any case, I think this
>> is something that would have to be solved when we came across such
>> scenario IMO.
> Right -- and the purpose of these macros is to make sure that we
> discover such potential deadlocks in testing rather than in production.
>
>>> Since we've generally identified dom0 which may be grabbing locks of a
>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>> bonus for "is_priv_for" domains?
>> Jan pointed out such case, but I'm not sure I can see how this is
>> supposedly to happen even given the scenario above, I have to admit
>> however I'm not that familiar with the mm code, so it's likely I'm
>> missing something.
>>
>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>> there's a stubdomain relation I'm not sure I see why that would
>> require this kind of locking, any domain can perform hypercalls
>> against a single subject domain, and the hypervisor itself doesn't
>> even know about stubdomain relations.
> We're considering three potential cases:
>
> A. dom0 makes a hypercall w/ domU as a target.
> B. dom0 makes a hypercall w/ stubdom as a target.
> c. stubdom makes a hypercall w/ domU as a target.

I'm afraid that this approach isn't appropriate.

The privilege of the callee has no bearing on the correctness of the
locking.  Any logic based on IS_PRIV/target is buggy.  (Consider the
case where XSM lets an otherwise plain HVM domain use some of the more
interesting hypercalls.)

~Andrew


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:42         ` Andrew Cooper
@ 2018-12-19 13:55           ` George Dunlap
  2018-12-19 14:07             ` Andrew Cooper
  0 siblings, 1 reply; 28+ messages in thread
From: George Dunlap @ 2018-12-19 13:55 UTC (permalink / raw)
  To: Andrew Cooper, Roger Pau Monné
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Tim Deegan

On 12/19/18 12:42 PM, Andrew Cooper wrote:
> On 19/12/2018 12:38, George Dunlap wrote:
>> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>>> then attempts to perform copy to operations against the caller
>>>>> domain in order to copy the result of the hypercall into the caller
>>>>> provided buffer.
>>>>>
>>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>>> lock order panic when the caller is a paging domain due to the fact
>>>>> that at the point where the copy to operation is performed the subject
>>>>> domain paging lock is locked, and the copy operation requires locking
>>>>> the caller p2m lock which has a lower level.
>>>>>
>>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>>> greater than the higher subject domain lock level. This allows locking
>>>>> the subject domain mm locks and then locking the caller domain mm
>>>>> locks, while keeping the same lock ordering and the changes mostly
>>>>> confined to mm-locks.h.
>>>>>
>>>>> Note that so far only this flow (locking a subject domain locks and
>>>>> then the caller domain ones) has been identified, but not all possible
>>>>> code paths have been inspected. Hence this solution attempts to be a
>>>>> non-intrusive fix for the problem at hand, without discarding further
>>>>> changes in the future if other valid code paths are found that require
>>>>> more complex lock level ordering.
>>>>>
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>> As a quick fix I think this general approach is OK; the thing I don't
>>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>>> grabs one of its own locks and then A's; but it could happen.
>>> I have not identified such scenario ATM, but we cannot discard future
>>> features needing such interlocking I guess. In any case, I think this
>>> is something that would have to be solved when we came across such
>>> scenario IMO.
>> Right -- and the purpose of these macros is to make sure that we
>> discover such potential deadlocks in testing rather than in production.
>>
>>>> Since we've generally identified dom0 which may be grabbing locks of a
>>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>>> bonus for "is_priv_for" domains?
>>> Jan pointed out such case, but I'm not sure I can see how this is
>>> supposedly to happen even given the scenario above, I have to admit
>>> however I'm not that familiar with the mm code, so it's likely I'm
>>> missing something.
>>>
>>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>>> there's a stubdomain relation I'm not sure I see why that would
>>> require this kind of locking, any domain can perform hypercalls
>>> against a single subject domain, and the hypervisor itself doesn't
>>> even know about stubdomain relations.
>> We're considering three potential cases:
>>
>> A. dom0 makes a hypercall w/ domU as a target.
>> B. dom0 makes a hypercall w/ stubdom as a target.
>> c. stubdom makes a hypercall w/ domU as a target.
> 
> I'm afraid that this approach isn't appropriate.
> 
> The privilege of the callee has no bearing on the correctness of the
> locking.  Any logic based on IS_PRIV/target is buggy.  (Consider the
> case where XSM lets an otherwise plain HVM domain use some of the more
> interesting hypercalls.)

You're not using the word "buggy" correctly.

The only *actual* bug is a concrete instance of two locking orders which
may deadlock.  But with this many locks (8!), proving to ourselves that
there is no possible combination of paths which would cause a deadlock
is too difficult.

The purpose of these checks is to flag up *potential* bugs during
testing, rather than relying on finding *actual* deadlocks in
production.  The test is *sufficient* to show that there are no
deadlocks, but not *necessary*; that is, the test is actually stricter
than it needs to be, but it's simple enough for us to understand.

The current case is an example of this: We don't think that anybody
grabs dom0's p2m lock and then grabs domU's paging lock; and so it
should be safe to grab domU's paging lock and then dom0's p2m lock.  But
the locking discipline doesn't know that.  It's doing its job by
flagging up a path we hadn't considered before.

Yes, if someone uses XSM to bypass the IS_PRIV() functionality to give
one domain access over another, then the lock checking will trigger.
That's what we want -- to force people to think carefully about new,
as-yet unconsidered cases when they come up.

It should be noted that my proposal for dom0 / stubdom bonuses will
strictly relax the requirements.  Anything that would trigger the lock
order checking under the new rules would also trigger the mm lock order
checking under the current rules.  So this won't cause any regressions.

 -George




* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 13:55           ` George Dunlap
@ 2018-12-19 14:07             ` Andrew Cooper
  2018-12-19 14:32               ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2018-12-19 14:07 UTC (permalink / raw)
  To: George Dunlap, Roger Pau Monné
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Tim Deegan

On 19/12/2018 13:55, George Dunlap wrote:
> On 12/19/18 12:42 PM, Andrew Cooper wrote:
>> On 19/12/2018 12:38, George Dunlap wrote:
>>> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>>>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>>>> then attempts to perform copy to operations against the caller
>>>>>> domain in order to copy the result of the hypercall into the caller
>>>>>> provided buffer.
>>>>>>
>>>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>>>> lock order panic when the caller is a paging domain due to the fact
>>>>>> that at the point where the copy to operation is performed the subject
>>>>>> domain paging lock is locked, and the copy operation requires locking
>>>>>> the caller p2m lock which has a lower level.
>>>>>>
>>>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>>>> greater than the higher subject domain lock level. This allows locking
>>>>>> the subject domain mm locks and then locking the caller domain mm
>>>>>> locks, while keeping the same lock ordering and the changes mostly
>>>>>> confined to mm-locks.h.
>>>>>>
>>>>>> Note that so far only this flow (locking a subject domain locks and
>>>>>> then the caller domain ones) has been identified, but not all possible
>>>>>> code paths have been inspected. Hence this solution attempts to be a
>>>>>> non-intrusive fix for the problem at hand, without discarding further
>>>>>> changes in the future if other valid code paths are found that require
>>>>>> more complex lock level ordering.
>>>>>>
>>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>> As a quick fix I think this general approach is OK; the thing I don't
>>>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>>>> grabs one of its own locks and then A's; but it could happen.
>>>> I have not identified such scenario ATM, but we cannot discard future
>>>> features needing such interlocking I guess. In any case, I think this
>>>> is something that would have to be solved when we came across such
>>>> scenario IMO.
>>> Right -- and the purpose of these macros is to make sure that we
>>> discover such potential deadlocks in testing rather than in production.
>>>
>>>>> Since we've generally identified dom0 which may be grabbing locks of a
>>>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>>>> bonus for "is_priv_for" domains?
>>>> Jan pointed out such case, but I'm not sure I can see how this is
>>>> supposedly to happen even given the scenario above, I have to admit
>>>> however I'm not that familiar with the mm code, so it's likely I'm
>>>> missing something.
>>>>
>>>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>>>> there's a stubdomain relation I'm not sure I see why that would
>>>> require this kind of locking, any domain can perform hypercalls
>>>> against a single subject domain, and the hypervisor itself doesn't
>>>> even know about stubdomain relations.
>>> We're considering three potential cases:
>>>
>>> A. dom0 makes a hypercall w/ domU as a target.
>>> B. dom0 makes a hypercall w/ stubdom as a target.
>>> c. stubdom makes a hypercall w/ domU as a target.
>> I'm afraid that this approach isn't appropriate.
>>
>> The privilege of the callee has no bearing on the correctness of the
>> locking.  Any logic based on IS_PRIV/target is buggy.  (Consider the
>> case where XSM lets an otherwise plain HVM domain use some of the more
>> interesting hypercalls.)
> You're not using the word "buggy" correctly.

"buggy" means that the logic is incorrectly, not that it manifests the
incorrectness in all cases.

> <snip>
>
> Yes, if someone uses XSM to bypass the IS_PRIV() functionality to give
> one domain access over another, then the lock checking will trigger.

No one should be able to trigger assertions in the hypervisor by simply
editing the XSM policy.

This quite clearly demonstrates that the proposed logic isn't appropriate.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:38       ` George Dunlap
  2018-12-19 12:42         ` Andrew Cooper
@ 2018-12-19 14:30         ` Jan Beulich
  2018-12-19 14:37           ` George Dunlap
  2018-12-19 14:40         ` Roger Pau Monné
  2 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2018-12-19 14:30 UTC (permalink / raw)
  To: george.dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Roger Pau Monne

>>> On 19.12.18 at 13:38, <george.dunlap@citrix.com> wrote:
> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>> then attempts to perform copy to operations against the caller
>>>> domain in order to copy the result of the hypercall into the caller
>>>> provided buffer.
>>>>
>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>> lock order panic when the caller is a paging domain due to the fact
>>>> that at the point where the copy to operation is performed the subject
>>>> domain paging lock is locked, and the copy operation requires locking
>>>> the caller p2m lock which has a lower level.
>>>>
>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>> greater than the higher subject domain lock level. This allows locking
>>>> the subject domain mm locks and then locking the caller domain mm
>>>> locks, while keeping the same lock ordering and the changes mostly
>>>> confined to mm-locks.h.
>>>>
>>>> Note that so far only this flow (locking a subject domain locks and
>>>> then the caller domain ones) has been identified, but not all possible
>>>> code paths have been inspected. Hence this solution attempts to be a
>>>> non-intrusive fix for the problem at hand, without discarding further
>>>> changes in the future if other valid code paths are found that require
>>>> more complex lock level ordering.
>>>>
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>
>>> As a quick fix I think this general approach is OK; the thing I don't
>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>> grabs one of its own locks and then A's; but it could happen.
>> 
>> I have not identified such scenario ATM, but we cannot discard future
>> features needing such interlocking I guess. In any case, I think this
>> is something that would have to be solved when we came across such
>> scenario IMO.
> 
> Right -- and the purpose of these macros is to make sure that we
> discover such potential deadlocks in testing rather than in production.
> 
>>> Since we've generally identified dom0 which may be grabbing locks of a
>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>> bonus for "is_priv_for" domains?
>> 
>> Jan pointed out such case, but I'm not sure I can see how this is
>> supposedly to happen even given the scenario above, I have to admit
>> however I'm not that familiar with the mm code, so it's likely I'm
>> missing something.
>> 
>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>> there's a stubdomain relation I'm not sure I see why that would
>> require this kind of locking, any domain can perform hypercalls
>> against a single subject domain, and the hypervisor itself doesn't
>> even know about stubdomain relations.
> 
> We're considering three potential cases:
> 
> A. dom0 makes a hypercall w/ domU as a target.
> B. dom0 makes a hypercall w/ stubdom as a target.
> c. stubdom makes a hypercall w/ domU as a target.
> 
> We could give only dom0 a bonus.  In that case, A and B would work, but
> C might fail (since stubdom's lock values are the same as domU's).
> 
> We could give both dom0 and stubdoms a bonus.  In that case, A and C
> would work, but B might fail (since the stubdom's lock values are the
> same as dom0's).
> 
> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
> double bonus.  That way all 3 work, since dom0's lock values !=
> stubdom's lock values, and stubdom's lock values != domU's lock values.
> 
> On the other hand, starting simple and adding things in as you find you
> need them isn't a bad approach; so possibly just giving a bonus to dom0
> is a good place to start.

I'm not sure whether your reply intentionally doesn't match the way
Roger has currently implemented it: there's no Dom0 bonus right now,
but instead a "current domain" one (covering Dom0 or a stubdom,
whenever they act on their target domain(s)). It was for this reason
that I suggested (though your A/B/C scheme expresses it better) the
two layers of what you call a bonus - it is more obviously correct
that way, despite there not being any scenario we can think of right
now where all 3 layers would actually get involved.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 14:07             ` Andrew Cooper
@ 2018-12-19 14:32               ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-19 14:32 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, George Dunlap, Tim Deegan, george.dunlap, xen-devel,
	Roger Pau Monne

>>> On 19.12.18 at 15:07, <andrew.cooper3@citrix.com> wrote:
> On 19/12/2018 13:55, George Dunlap wrote:
>> Yes, if someone uses XSM to bypass the IS_PRIV() functionality to give
>> one domain access over another, then the lock checking will trigger.
> 
> Noone should be able to trigger assertions in the hypervisor by simply
> editing the XSM policy.
> 
> This quite clearly demonstrates that the proposed logic isn't appropriate.

You could view it the other way around: Anyone who writes an
XSM policy violating assumptions in the hypervisor is shooting
themselves in the foot (read: introduces a bug). In particular I
don't think something like fuzzed XSM policies would make any
sense (which is no different to someone making random edits).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:17       ` Andrew Cooper
@ 2018-12-19 14:35         ` Roger Pau Monné
  0 siblings, 0 replies; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-19 14:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, George Dunlap, Tim Deegan, George Dunlap, Jan Beulich,
	xen-devel

On Wed, Dec 19, 2018 at 12:17:51PM +0000, Andrew Cooper wrote:
> On 19/12/2018 12:10, Roger Pau Monné wrote:
> > Hypercalls AFAIK have a single target (or subject) domain, so even if
> > there's a stubdomain relation I'm not sure I see why that would
> > require this kind of locking, any domain can perform hypercalls
> > against a single subject domain, and the hypervisor itself doesn't
> > even know about stubdomain relations.
> 
> Grant copy has two domains, and neither need to be the callee. 
> Specifically, we discovered during the SILO work that dom0 occasonally
> issues a dom1 => dom2 grant copy.

Grant copy doesn't attempt to lock multiple domains at the same time:
it takes a reference on each page and maps it into the Xen address
space, but there's no cross-domain mm locking at all AFAICT. It's
indeed an operation that *could* do interleaved domain locking, but it
doesn't.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 14:30         ` Jan Beulich
@ 2018-12-19 14:37           ` George Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: George Dunlap @ 2018-12-19 14:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Roger Pau Monne

On 12/19/18 2:30 PM, Jan Beulich wrote:
>>>> On 19.12.18 at 13:38, <george.dunlap@citrix.com> wrote:
>> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>>> then attempts to perform copy to operations against the caller
>>>>> domain in order to copy the result of the hypercall into the caller
>>>>> provided buffer.
>>>>>
>>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>>> lock order panic when the caller is a paging domain due to the fact
>>>>> that at the point where the copy to operation is performed the subject
>>>>> domain paging lock is locked, and the copy operation requires locking
>>>>> the caller p2m lock which has a lower level.
>>>>>
>>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>>> greater than the higher subject domain lock level. This allows locking
>>>>> the subject domain mm locks and then locking the caller domain mm
>>>>> locks, while keeping the same lock ordering and the changes mostly
>>>>> confined to mm-locks.h.
>>>>>
>>>>> Note that so far only this flow (locking a subject domain locks and
>>>>> then the caller domain ones) has been identified, but not all possible
>>>>> code paths have been inspected. Hence this solution attempts to be a
>>>>> non-intrusive fix for the problem at hand, without discarding further
>>>>> changes in the future if other valid code paths are found that require
>>>>> more complex lock level ordering.
>>>>>
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> As a quick fix I think this general approach is OK; the thing I don't
>>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>>> grabs one of its own locks and then A's; but it could happen.
>>>
>>> I have not identified such scenario ATM, but we cannot discard future
>>> features needing such interlocking I guess. In any case, I think this
>>> is something that would have to be solved when we came across such
>>> scenario IMO.
>>
>> Right -- and the purpose of these macros is to make sure that we
>> discover such potential deadlocks in testing rather than in production.
>>
>>>> Since we've generally identified dom0 which may be grabbing locks of a
>>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>>> bonus for "is_priv_for" domains?
>>>
>>> Jan pointed out such case, but I'm not sure I can see how this is
>>> supposedly to happen even given the scenario above, I have to admit
>>> however I'm not that familiar with the mm code, so it's likely I'm
>>> missing something.
>>>
>>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>>> there's a stubdomain relation I'm not sure I see why that would
>>> require this kind of locking, any domain can perform hypercalls
>>> against a single subject domain, and the hypervisor itself doesn't
>>> even know about stubdomain relations.
>>
>> We're considering three potential cases:
>>
>> A. dom0 makes a hypercall w/ domU as a target.
>> B. dom0 makes a hypercall w/ stubdom as a target.
>> c. stubdom makes a hypercall w/ domU as a target.
>>
>> We could give only dom0 a bonus.  In that case, A and B would work, but
>> C might fail (since stubdom's lock values are the same as domU's).
>>
>> We could give both dom0 and stubdoms a bonus.  In that case, A and C
>> would work, but B might fail (since the stubdom's lock values are the
>> same as dom0's).
>>
>> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
>> double bonus.  That way all 3 work, since dom0's lock values !=
>> stubdom's lock values, and stubdom's lock values != domU's lock values.
>>
>> On the other hand, starting simple and adding things in as you find you
>> need them isn't a bad approach; so possibly just giving a bonus to dom0
>> is a good place to start.
> 
> I'm not sure your reply is intentionally not matching the way Roger
> has currently implemented it: There's no Dom0 bonus right now, but
> instead a "current domain" one (covering Dom0 or stubdom, whenever
> they act on their target domain(s)). 

It is intentional; my first response to Roger's patch said that I didn't
like the "current domain" bonus, and suggested a fixed "dom0" and
"stubdomain" bonus instead.  Roger said he didn't understand the use
case, so this is me explaining the situation and how my suggestion
addresses it.

> It was for this reason that I did
> suggest (but you better express by your A/B/C scheme) the two
> layers of what you call bonus - it is more obviously correct that way,
> despite there not being any scenario that we can think of right now
> where all 3 layers would actually get involved.

OK, glad you like it. :-)

"Bonus" is probably not the right word.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 12:38       ` George Dunlap
  2018-12-19 12:42         ` Andrew Cooper
  2018-12-19 14:30         ` Jan Beulich
@ 2018-12-19 14:40         ` Roger Pau Monné
  2018-12-19 14:46           ` George Dunlap
  2 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-19 14:40 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On Wed, Dec 19, 2018 at 12:38:50PM +0000, George Dunlap wrote:
> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
> > On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
> >> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> >>> paging_log_dirty_op function takes mm locks from a subject domain and
> >>> then attempts to perform copy to operations against the caller
> >>> domain in order to copy the result of the hypercall into the caller
> >>> provided buffer.
> >>>
> >>> This works fine when the caller is a non-paging domain, but triggers a
> >>> lock order panic when the caller is a paging domain due to the fact
> >>> that at the point where the copy to operation is performed the subject
> >>> domain paging lock is locked, and the copy operation requires locking
> >>> the caller p2m lock which has a lower level.
> >>>
> >>> Fix this limitation by adding a bias to the level of the caller domain
> >>> mm locks, so that the lower caller domain mm lock always has a level
> >>> greater than the higher subject domain lock level. This allows locking
> >>> the subject domain mm locks and then locking the caller domain mm
> >>> locks, while keeping the same lock ordering and the changes mostly
> >>> confined to mm-locks.h.
> >>>
> >>> Note that so far only this flow (locking a subject domain locks and
> >>> then the caller domain ones) has been identified, but not all possible
> >>> code paths have been inspected. Hence this solution attempts to be a
> >>> non-intrusive fix for the problem at hand, without discarding further
> >>> changes in the future if other valid code paths are found that require
> >>> more complex lock level ordering.
> >>>
> >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> As a quick fix I think this general approach is OK; the thing I don't
> >> like is that it's symmetric.  We don't *expect* to ever have a situation
> >> where A grabs one of its own MM locks and then one of B's, *and* B then
> >> grabs one of its own locks and then A's; but it could happen.
> > 
> > I have not identified such scenario ATM, but we cannot discard future
> > features needing such interlocking I guess. In any case, I think this
> > is something that would have to be solved when we came across such
> > scenario IMO.
> 
> Right -- and the purpose of these macros is to make sure that we
> discover such potential deadlocks in testing rather than in production.
> 
> >> Since we've generally identified dom0 which may be grabbing locks of a
> >> PVH stubdom, which may be grabbing logs of a normal domU, would it be
> >> possible / make sense instead to give a 2x bonus for dom0, and a 1x
> >> bonus for "is_priv_for" domains?
> > 
> > Jan pointed out such case, but I'm not sure I can see how this is
> > supposedly to happen even given the scenario above, I have to admit
> > however I'm not that familiar with the mm code, so it's likely I'm
> > missing something.
> > 
> > Hypercalls AFAIK have a single target (or subject) domain, so even if
> > there's a stubdomain relation I'm not sure I see why that would
> > require this kind of locking, any domain can perform hypercalls
> > against a single subject domain, and the hypervisor itself doesn't
> > even know about stubdomain relations.
> 
> We're considering three potential cases:
> 
> A. dom0 makes a hypercall w/ domU as a target.
> B. dom0 makes a hypercall w/ stubdom as a target.
> c. stubdom makes a hypercall w/ domU as a target.
> 
> We could give only dom0 a bonus.  In that case, A and B would work, but
> C might fail (since stubdom's lock values are the same as domU's).
> 
> We could give both dom0 and stubdoms a bonus.  In that case, A and C
> would work, but B might fail (since the stubdom's lock values are the
> same as dom0's).
> 
> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
> double bonus.  That way all 3 work, since dom0's lock values !=
> stubdom's lock values, and stubdom's lock values != domU's lock values.
> 
> On the other hand, starting simple and adding things in as you find you
> need them isn't a bad approach; so possibly just giving a bonus to dom0
> is a good place to start.

IMO just giving a bonus to the caller domain (current->domain) is even
easier. I've only spotted a single case where there's such interleaved
domain locking, which is due to a copy_to into a caller-provided buffer
while some of the subject domain's mm locks are held.
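
As a rough stand-alone sketch of that flow (the names and level values
here are made up and don't match the actual mm-locks.h interface):

/*
 * Toy model of biasing the caller's lock levels, applied to the
 * paging_log_dirty_op flow.
 */
#include <assert.h>
#include <stdio.h>

#define CALLER_BIAS 64                      /* above any base level */

struct domain { int domain_id; };

static struct domain dom0 = { 0 }, domU = { 1 };
static const struct domain *caller = &dom0; /* current->domain */
static int last_level;                      /* per-CPU in the real code */

static void take_lock(const struct domain *d, int base_level)
{
    int level = base_level + (d == caller ? CALLER_BIAS : 0);

    assert(level > last_level);             /* the lock order check */
    last_level = level;
}

int main(void)
{
    take_lock(&domU, 16);  /* subject domain's paging lock */
    take_lock(&dom0, 8);   /* caller's p2m lock for the copy_to, now 72 */
    printf("no lock order violation\n");
    return 0;
}

Without the bias the second call would go from level 16 down to level
8 and trip the check, which is the lock order panic this patch is
trying to address.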

Along the lines of your reply below, I would leave more complex lock
level adjustments to further patches if such a need arises.

> >> I also think it would be good to start recording specific instances of
> >> ordering requirements, so that we don't have to go track them down.
> >> (It's not immediately obvious to me, for instance, why the paging lock
> >> is so far down the list.)  Could you add somewhere into the comments in
> >> this section something like this:
> >>
> >> paging_log_dirty_op grabs tgt paging_lock, then does copy_to_user which
> >> grabs dom0's P2M lock.
> >>
> >> The other thing we could do is generate the lock level as (OWNER_TYPE <<
> >> 11) | (LOCK_TYPE << 5) | (domain_id), where OWNER_TYPE is 2 for dom0, 1
> >> for stub domains, 0 for normal domains, and LOCK_TYPE is the current
> >> lock level; and then fail if the new lock level >= current lock level.
> >> That would flag up any potential inter-domain deadlocks as well.
> > 
> > I'm not sure it will catch all inter-domain locks. For example Xen
> > would still allow to take the same lock type from a domain with a
> > higher domain_id than the currently locked one if both have the same
> > OWNER_TYPE.
> 
> If we wanted to ASSERT on all interdomain locks, we could just change
> the '>' to a '>=' and call it a day.  The idea instead is to allow
> inter-domain locking *if* it follows the "grab the lower domid lock
> first" discipline (in which case there won't be any deadlocks).
> 
> But maybe that's beginning to get into 'feature creep'.  Best leave that
> for an improvement patch, not a bug fix.

As you say, I would rather perform such changes when there's a real
need for them.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 14:40         ` Roger Pau Monné
@ 2018-12-19 14:46           ` George Dunlap
  2018-12-19 14:59             ` Roger Pau Monné
  0 siblings, 1 reply; 28+ messages in thread
From: George Dunlap @ 2018-12-19 14:46 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On 12/19/18 2:40 PM, Roger Pau Monné wrote:
> On Wed, Dec 19, 2018 at 12:38:50PM +0000, George Dunlap wrote:
>> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
>>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
>>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
>>>>> paging_log_dirty_op function takes mm locks from a subject domain and
>>>>> then attempts to perform copy to operations against the caller
>>>>> domain in order to copy the result of the hypercall into the caller
>>>>> provided buffer.
>>>>>
>>>>> This works fine when the caller is a non-paging domain, but triggers a
>>>>> lock order panic when the caller is a paging domain due to the fact
>>>>> that at the point where the copy to operation is performed the subject
>>>>> domain paging lock is locked, and the copy operation requires locking
>>>>> the caller p2m lock which has a lower level.
>>>>>
>>>>> Fix this limitation by adding a bias to the level of the caller domain
>>>>> mm locks, so that the lower caller domain mm lock always has a level
>>>>> greater than the higher subject domain lock level. This allows locking
>>>>> the subject domain mm locks and then locking the caller domain mm
>>>>> locks, while keeping the same lock ordering and the changes mostly
>>>>> confined to mm-locks.h.
>>>>>
>>>>> Note that so far only this flow (locking a subject domain locks and
>>>>> then the caller domain ones) has been identified, but not all possible
>>>>> code paths have been inspected. Hence this solution attempts to be a
>>>>> non-intrusive fix for the problem at hand, without discarding further
>>>>> changes in the future if other valid code paths are found that require
>>>>> more complex lock level ordering.
>>>>>
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> As a quick fix I think this general approach is OK; the thing I don't
>>>> like is that it's symmetric.  We don't *expect* to ever have a situation
>>>> where A grabs one of its own MM locks and then one of B's, *and* B then
>>>> grabs one of its own locks and then A's; but it could happen.
>>>
>>> I have not identified such scenario ATM, but we cannot discard future
>>> features needing such interlocking I guess. In any case, I think this
>>> is something that would have to be solved when we came across such
>>> scenario IMO.
>>
>> Right -- and the purpose of these macros is to make sure that we
>> discover such potential deadlocks in testing rather than in production.
>>
>>>> Since we've generally identified dom0 which may be grabbing locks of a
>>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
>>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
>>>> bonus for "is_priv_for" domains?
>>>
>>> Jan pointed out such case, but I'm not sure I can see how this is
>>> supposedly to happen even given the scenario above, I have to admit
>>> however I'm not that familiar with the mm code, so it's likely I'm
>>> missing something.
>>>
>>> Hypercalls AFAIK have a single target (or subject) domain, so even if
>>> there's a stubdomain relation I'm not sure I see why that would
>>> require this kind of locking, any domain can perform hypercalls
>>> against a single subject domain, and the hypervisor itself doesn't
>>> even know about stubdomain relations.
>>
>> We're considering three potential cases:
>>
>> A. dom0 makes a hypercall w/ domU as a target.
>> B. dom0 makes a hypercall w/ stubdom as a target.
>> c. stubdom makes a hypercall w/ domU as a target.
>>
>> We could give only dom0 a bonus.  In that case, A and B would work, but
>> C might fail (since stubdom's lock values are the same as domU's).
>>
>> We could give both dom0 and stubdoms a bonus.  In that case, A and C
>> would work, but B might fail (since the stubdom's lock values are the
>> same as dom0's).
>>
>> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
>> double bonus.  That way all 3 work, since dom0's lock values !=
>> stubdom's lock values, and stubdom's lock values != domU's lock values.
>>
>> On the other hand, starting simple and adding things in as you find you
>> need them isn't a bad approach; so possibly just giving a bonus to dom0
>> is a good place to start.
> 
> IMO just giving a bonus to the caller domain (current->domain) is even
> easier. I've only spotted a single case where there's such
> interleaved domain locking, which is due to a copy_to into a caller
> provided buffer while having some subject's domain mm locks taken.
> 
> On the line of your reply below, I would leave more complex locking
> level adjustment to further patches if there's such a need.

Not sure how it's easier -- one is (current == d), the other is
is_control_domain(d).

Using 'current' means that potential deadlocks which would currently
cause a BUG() won't do so anymore.  I'm fine with not adding extra
protections that aren't there now; but I don't want to remove
protections that are.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 14:46           ` George Dunlap
@ 2018-12-19 14:59             ` Roger Pau Monné
  2018-12-19 18:10               ` George Dunlap
  0 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-19 14:59 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On Wed, Dec 19, 2018 at 02:46:55PM +0000, George Dunlap wrote:
> On 12/19/18 2:40 PM, Roger Pau Monné wrote:
> > On Wed, Dec 19, 2018 at 12:38:50PM +0000, George Dunlap wrote:
> >> On 12/19/18 12:10 PM, Roger Pau Monné wrote:
> >>> On Wed, Dec 19, 2018 at 11:40:14AM +0000, George Dunlap wrote:
> >>>> On 12/18/18 4:05 PM, Roger Pau Monne wrote:
> >>>>> paging_log_dirty_op function takes mm locks from a subject domain and
> >>>>> then attempts to perform copy to operations against the caller
> >>>>> domain in order to copy the result of the hypercall into the caller
> >>>>> provided buffer.
> >>>>>
> >>>>> This works fine when the caller is a non-paging domain, but triggers a
> >>>>> lock order panic when the caller is a paging domain due to the fact
> >>>>> that at the point where the copy to operation is performed the subject
> >>>>> domain paging lock is locked, and the copy operation requires locking
> >>>>> the caller p2m lock which has a lower level.
> >>>>>
> >>>>> Fix this limitation by adding a bias to the level of the caller domain
> >>>>> mm locks, so that the lower caller domain mm lock always has a level
> >>>>> greater than the higher subject domain lock level. This allows locking
> >>>>> the subject domain mm locks and then locking the caller domain mm
> >>>>> locks, while keeping the same lock ordering and the changes mostly
> >>>>> confined to mm-locks.h.
> >>>>>
> >>>>> Note that so far only this flow (locking a subject domain locks and
> >>>>> then the caller domain ones) has been identified, but not all possible
> >>>>> code paths have been inspected. Hence this solution attempts to be a
> >>>>> non-intrusive fix for the problem at hand, without discarding further
> >>>>> changes in the future if other valid code paths are found that require
> >>>>> more complex lock level ordering.
> >>>>>
> >>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>>
> >>>> As a quick fix I think this general approach is OK; the thing I don't
> >>>> like is that it's symmetric.  We don't *expect* to ever have a situation
> >>>> where A grabs one of its own MM locks and then one of B's, *and* B then
> >>>> grabs one of its own locks and then A's; but it could happen.
> >>>
> >>> I have not identified such scenario ATM, but we cannot discard future
> >>> features needing such interlocking I guess. In any case, I think this
> >>> is something that would have to be solved when we came across such
> >>> scenario IMO.
> >>
> >> Right -- and the purpose of these macros is to make sure that we
> >> discover such potential deadlocks in testing rather than in production.
> >>
> >>>> Since we've generally identified dom0 which may be grabbing locks of a
> >>>> PVH stubdom, which may be grabbing logs of a normal domU, would it be
> >>>> possible / make sense instead to give a 2x bonus for dom0, and a 1x
> >>>> bonus for "is_priv_for" domains?
> >>>
> >>> Jan pointed out such case, but I'm not sure I can see how this is
> >>> supposedly to happen even given the scenario above, I have to admit
> >>> however I'm not that familiar with the mm code, so it's likely I'm
> >>> missing something.
> >>>
> >>> Hypercalls AFAIK have a single target (or subject) domain, so even if
> >>> there's a stubdomain relation I'm not sure I see why that would
> >>> require this kind of locking, any domain can perform hypercalls
> >>> against a single subject domain, and the hypervisor itself doesn't
> >>> even know about stubdomain relations.
> >>
> >> We're considering three potential cases:
> >>
> >> A. dom0 makes a hypercall w/ domU as a target.
> >> B. dom0 makes a hypercall w/ stubdom as a target.
> >> c. stubdom makes a hypercall w/ domU as a target.
> >>
> >> We could give only dom0 a bonus.  In that case, A and B would work, but
> >> C might fail (since stubdom's lock values are the same as domU's).
> >>
> >> We could give both dom0 and stubdoms a bonus.  In that case, A and C
> >> would work, but B might fail (since the stubdom's lock values are the
> >> same as dom0's).
> >>
> >> Or, we could do what I've proposed: give stubdom a bonus, and dom0 a
> >> double bonus.  That way all 3 work, since dom0's lock values !=
> >> stubdom's lock values, and stubdom's lock values != domU's lock values.
> >>
> >> On the other hand, starting simple and adding things in as you find you
> >> need them isn't a bad approach; so possibly just giving a bonus to dom0
> >> is a good place to start.
> > 
> > IMO just giving a bonus to the caller domain (current->domain) is even
> > easier. I've only spotted a single case where there's such
> > interleaved domain locking, which is due to a copy_to into a caller
> > provided buffer while having some subject's domain mm locks taken.
> > 
> > On the line of your reply below, I would leave more complex locking
> > level adjustment to further patches if there's such a need.
> 
> Not sure how it's easier -- one is (current == d), the other is
> is_control_domain(d).

Using is_control_domain will limit the usage of paging_log_dirty_op to
Dom0 (because that's the only domain that has is_privileged set, and
there's no way for the toolstack to create a domain with is_privileged
set): no other paging domain will be able to use such a hypercall
without triggering the lock level check. Is that really what we
want?

I assume that under a distributed Xen system it would be possible to
issue this hypercall from a domain different from Dom0 if it's given
the right privileges?

> Using 'current' means that potential deadlocks which would now cause a
> BUG() won't anymore.  I'm fine with not adding extra protections that
> aren't there now; but I don't want to remove protections that are.

The lock ordering enforcement is still kept as-is, but Xen is allowed
to take the caller's mm locks in the right order (be it privileged or
not) after having taken the subject domain's locks, also in the
correct order.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 14:59             ` Roger Pau Monné
@ 2018-12-19 18:10               ` George Dunlap
  2018-12-20  9:05                 ` Roger Pau Monné
  0 siblings, 1 reply; 28+ messages in thread
From: George Dunlap @ 2018-12-19 18:10 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>> Using 'current' means that potential deadlocks which would now cause a
>> BUG() won't anymore.  I'm fine with not adding extra protections that
>> aren't there now; but I don't want to remove protections that are.
> 
> The lock ordering enforcement is still kept as-is, but Xen is allowed
> to lock the caller mm locks in the right order (be it privileged or
> not) after having locked a subject domain ones also in the correct
> order.

Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.

Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.

And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.

And suppose domA calls H1 on domB at the same time that domB calls H2 on
domA.  We could have the following sequence:

* H1: grab A->mm_lock_x
* H2: grab B->mm_lock_y
* H1: wait on B->mm_lock_y
* H2: wait on A->mm_lock_x #DEADLOCK

With the current mm lock checking, H2 would cause a BUG() the first time
it was called.

With my "special privilege only" boosting you'd also get a BUG() in at
least one of the invocations.

With your "current bias" patch, no BUG() would be encountered; we'd only
discover the deadlock once a live server had actually deadlocked.

This is what I'm talking about -- with "boost dom0", you have a global
order to the locks.  It goes:

1-8: All domU locks (in the order listed in mm-locks.h)
9-16: All dom0 locks

Thus we know for certain that there can't be a caller / callee deadlock
that's not detected.  With your patch, there isn't a global order: the
order changes based on who made the hypercall, so it's more difficult to
reason about whether there's a deadlock or not.
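
(Concretely: under the caller bias, the relative order of, say, domA's
p2m lock and domB's paging lock depends on which of the two made the
hypercall, so "take the lower level first" no longer corresponds to a
single system-wide ordering of all the locks.)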

So, do H1 and H2 exist right now?  I think probably not, but I can't
immediately say.  Will such a pair *never* exist?  I don't think I can
guarantee that either.  That's why I want something to check.

> Using is_control_domain will limit the usage of paging_log_dirty_op to
> Dom0 (because that's the only domain that has is_privileged set,
> there's no way for the toolstack to create a domain with is_privileged
> set), no other paging domain will be able to use such hypercall
> without triggering the lock level check. Is that really what we
> want?
>
> I assume that under a distributed Xen system it would be possible to
> issue this hypercall from a domain different that Dom0 if it's given
> the right privileges?

I take it you mean a "disaggregated" system?  This is what Andy was
talking about with XSM.  As I said to him, at the moment, this will
_already_ fail, so it's not a regression.

I'm not willing to open up new potential holes for deadlocks,
particularly for a hypothetical use case that nobody has asked for and
doesn't work upstream at the moment.

If you want to make the calls succeed under a more "peer-to-peer"
arrangement, you're going to have to come up with something else:
change the order of the mm locks, grab the caller's p2m lock before
grabbing the paging lock, refactor what's covered by what, or
something.

Or, we can fix what's actually broken at the moment, and fix other
things when we encounter them.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-19 18:10               ` George Dunlap
@ 2018-12-20  9:05                 ` Roger Pau Monné
  2018-12-20  9:16                   ` Jan Beulich
  2018-12-20 13:47                   ` George Dunlap
  0 siblings, 2 replies; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-20  9:05 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
> >> Using 'current' means that potential deadlocks which would now cause a
> >> BUG() won't anymore.  I'm fine with not adding extra protections that
> >> aren't there now; but I don't want to remove protections that are.
> > 
> > The lock ordering enforcement is still kept as-is, but Xen is allowed
> > to lock the caller mm locks in the right order (be it privileged or
> > not) after having locked a subject domain ones also in the correct
> > order.
> 
> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
> 
> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.

With my proposal H1 won't be a valid lock sequence, since
current->mm_lock_x level > tgt->mm_lock_y level.

With your proposal whether H1 is valid depends on whether 'current' is
Dom0 or not. If current == Dom0, H1 is invalid, since
current->mm_lock_x level > tgt->mm_lock_y level. If current != Dom0
then the H1 sequence would be valid.

I think this is not ideal, since the correctness of the H1 sequence
with your proposal depends on whether the caller is Dom0 or an
unprivileged domain. With my proposal there's no such difference, and
the validity of sequences doesn't depend on whether the caller is
privileged or not, which makes it easier to reason about.

> And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.

H2 won't be a valid locking sequence with my proposal, since
current->mm_lock_y level > tgt->mm_lock_x level, so the H2 sequence
would be invalid and always trigger the BUG.

With my proposal all current domain mm locks always have a level
higher than any target mm locks.
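
To put numbers on it, with your mm_lock_x = 8 / mm_lock_y = 16 values
and an illustrative bias of 64: the caller's locks become 72 and 80
while the target's stay at 8 and 16.  H1 takes 72 and then tries 16,
H2 takes 80 and then tries 8; both drop to a lower level, so both hit
the check the first time they run.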

> 
> And suppose domA calls H1 on domB at the same time that domB calls H2 on
> domA.  We could have the following sequence:
> 
> * H1: grab A->mm_lock_x
> * H2: grab B->mm_lock_y
> * H1: wait on B->mm_lock_y
> * H2: wait on A->mm_lock_x #DEADLOCK
> 
> With the current mm lock checking, H2 would cause a BUG() the first time
> it was called.
> 
> With my "special privilege only" boosting you'd also get a BUG() in at
> least one of the invocations.
> 
> With your "current bias" patch, no BUG() would be encountered; we'd only
> discover the deadlock once a live server had actually deadlocked.
> 
> This is what I'm talking about -- with "boost dom0", you have a global
> order to the locks.  It goes:
> 
> 1-8: All domU locks (in the order listed in mm-locks.h)
> 9-16: All dom0 locks
> 
> Thus we know for certain that there can't be a caller / callee deadlock
> that's not detected.  With your patch, there isn't a global order: the
> order changes based on who made the hypercall, so it's more difficult to
> reason about whether there's a deadlock or not.
> 
> So, do H1 and H2 exist right now?  I think probably not, but I can't
> immediately say.  Will such a pair *never* exist?  I don't think I can
> guarantee that either.  That's why I want something to check.

I'm fine with this proposal; it's just that if it's safe for Dom0 to
take any other domain's mm lock and then any Dom0 mm lock, it should
also be safe for any domain, not just Dom0.

> > Using is_control_domain will limit the usage of paging_log_dirty_op to
> > Dom0 (because that's the only domain that has is_privileged set,
> > there's no way for the toolstack to create a domain with is_privileged
> > set), no other paging domain will be able to use such hypercall
> > without triggering the lock level check. Is that really what we
> > want?
> >
> > I assume that under a distributed Xen system it would be possible to
> > issue this hypercall from a domain different that Dom0 if it's given
> > the right privileges?
> 
> I take it you mean a "disaggregated" system?  This is what Andy was
> talking about with XSM.  As I said to him, at the moment, this will
> _already_ fail, so it's not a regression.
> 
> I'm not willing to open up new potential holes for deadlocks,
> particularly for a hypothetical use case that nobody has asked for and
> doesn't work upstream at the moment.
> 
> If you want to make the calls succeed under a more "peer-to-peer"
> arrangement, you're going to have to come up with something else:
> Change the order of the mm locks, or grab the caller's p2m lock before
> grabbing the paging lock, or refactoring what's covered by what, or
> something.
> 
> Or, we can fix what's actually broken at the moment, and fix other
> things when we encounter them.

I'm fine with only boosting locks that belong to a privileged domain;
TBH the code is mostly the same, and we can always change the boosting
easily when required.
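
Roughly, only the predicate deciding which locks get the boost changes
(again, the names and values below are invented for illustration, not
taken from the patch):

/* Sketch: the level check itself stays the same in both variants. */
#include <stdbool.h>
#include <stdio.h>

#define LEVEL_BIAS 64

struct domain { int domain_id; bool is_privileged; };

/* Current patch: boost the locks owned by the hypercall caller. */
static int level_caller_bias(const struct domain *d,
                             const struct domain *caller, int base)
{
    return base + (d == caller ? LEVEL_BIAS : 0);
}

/* Privileged-only variant: boost the locks owned by a control domain. */
static int level_privileged_bias(const struct domain *d, int base)
{
    return base + (d->is_privileged ? LEVEL_BIAS : 0);
}

int main(void)
{
    struct domain dom0 = { 0, true }, domU = { 1, false };

    printf("caller bias (dom0 calling): dom0 %d, domU %d\n",
           level_caller_bias(&dom0, &dom0, 8),
           level_caller_bias(&domU, &dom0, 8));
    printf("privileged bias:            dom0 %d, domU %d\n",
           level_privileged_bias(&dom0, 8),
           level_privileged_bias(&domU, 8));
    return 0;
}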

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-20  9:05                 ` Roger Pau Monné
@ 2018-12-20  9:16                   ` Jan Beulich
  2018-12-20  9:58                     ` Roger Pau Monné
  2018-12-20 13:47                   ` George Dunlap
  1 sibling, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2018-12-20  9:16 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, george.dunlap,
	xen-devel

>>> On 20.12.18 at 10:05, <roger.pau@citrix.com> wrote:
> On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
>> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>> >> Using 'current' means that potential deadlocks which would now cause a
>> >> BUG() won't anymore.  I'm fine with not adding extra protections that
>> >> aren't there now; but I don't want to remove protections that are.
>> > 
>> > The lock ordering enforcement is still kept as-is, but Xen is allowed
>> > to lock the caller mm locks in the right order (be it privileged or
>> > not) after having locked a subject domain ones also in the correct
>> > order.
>> 
>> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
>> 
>> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.
> 
> With my proposal H1 won't be a valid lock sequence, since
> current->mm_lock_x level > tgt->mm_lock_y level.
> 
> With your proposal H1 would be valid depending on whether 'current' is
> Dom0 or not. If current == Dom0, H1 is invalid, since
> current->mm_lock_x level > tgt->mm_lock_y level. If current != Dom0
> then the H1 sequence would be valid.
> 
> I think this is not ideal, since the correctness of the H1 sequence
> with your proposal depends on whether the caller is Dom0 or an
> unprivileged domain. With my proposal there's no such difference, and
> the validity of sequences doesn't depend on whether the caller is
> privileged or not, which makes it easier to reason about.
> 
>> And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.
> 
> H2 won't be a valid locking sequence with my proposal, since
> current->mm_lock_y level > tgt->mm_lock_x level, so the H2 sequence
> would be invalid and always trigger the BUG.
> 
> With my proposal all current domain mm locks always have a level
> higher than any target mm locks.
> 
>> 
>> And suppose domA calls H1 on domB at the same time that domB calls H2 on
>> domA.  We could have the following sequence:
>> 
>> * H1: grab A->mm_lock_x
>> * H2: grab B->mm_lock_y
>> * H1: wait on B->mm_lock_y
>> * H2: wait on A->mm_lock_x #DEADLOCK
>> 
>> With the current mm lock checking, H2 would cause a BUG() the first time
>> it was called.
>> 
>> With my "special privilege only" boosting you'd also get a BUG() in at
>> least one of the invocations.
>> 
>> With your "current bias" patch, no BUG() would be encountered; we'd only
>> discover the deadlock once a live server had actually deadlocked.
>> 
>> This is what I'm talking about -- with "boost dom0", you have a global
>> order to the locks.  It goes:
>> 
>> 1-8: All domU locks (in the order listed in mm-locks.h)
>> 9-16: All dom0 locks
>> 
>> Thus we know for certain that there can't be a caller / callee deadlock
>> that's not detected.  With your patch, there isn't a global order: the
>> order changes based on who made the hypercall, so it's more difficult to
>> reason about whether there's a deadlock or not.
>> 
>> So, do H1 and H2 exist right now?  I think probably not, but I can't
>> immediately say.  Will such a pair *never* exist?  I don't think I can
>> guarantee that either.  That's why I want something to check.
> 
> I'm fine with this proposal, it's just that if it's safe for Dom0 to
> pick any other domain mm lock and then any Dom0 mm lock it should also
> be safe for any domain and not Dom0 only.

Since I'm not sure whether this was implied here: "Any" is of
course wrong here, or else there'd be ABBA deadlock potential
between two such arbitrary domains. I suppose you still mean
"any domain controlling another domain", and for such a domain
to only acquire locks of the domain it controls.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-20  9:16                   ` Jan Beulich
@ 2018-12-20  9:58                     ` Roger Pau Monné
       [not found]                       ` <E6041B7A020000919527FA34@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 28+ messages in thread
From: Roger Pau Monné @ 2018-12-20  9:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, george.dunlap,
	xen-devel

On Thu, Dec 20, 2018 at 02:16:15AM -0700, Jan Beulich wrote:
> >>> On 20.12.18 at 10:05, <roger.pau@citrix.com> wrote:
> > On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
> >> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
> >> >> Using 'current' means that potential deadlocks which would now cause a
> >> >> BUG() won't anymore.  I'm fine with not adding extra protections that
> >> >> aren't there now; but I don't want to remove protections that are.
> >> > 
> >> > The lock ordering enforcement is still kept as-is, but Xen is allowed
> >> > to lock the caller mm locks in the right order (be it privileged or
> >> > not) after having locked a subject domain ones also in the correct
> >> > order.
> >> 
> >> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
> >> 
> >> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.
> > 
> > With my proposal H1 won't be a valid lock sequence, since
> > current->mm_lock_x level > tgt->mm_lock_y level.
> > 
> > With your proposal H1 would be valid depending on whether 'current' is
> > Dom0 or not. If current == Dom0, H1 is invalid, since
> > current->mm_lock_x level > tgt->mm_lock_y level. If current != Dom0
> > then the H1 sequence would be valid.
> > 
> > I think this is not ideal, since the correctness of the H1 sequence
> > with your proposal depends on whether the caller is Dom0 or an
> > unprivileged domain. With my proposal there's no such difference, and
> > the validity of sequences doesn't depend on whether the caller is
> > privileged or not, which makes it easier to reason about.
> > 
> >> And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.
> > 
> > H2 won't be a valid locking sequence with my proposal, since
> > current->mm_lock_y level > tgt->mm_lock_x level, so the H2 sequence
> > would be invalid and always trigger the BUG.
> > 
> > With my proposal all current domain mm locks always have a level
> > higher than any target mm locks.
> > 
> >> 
> >> And suppose domA calls H1 on domB at the same time that domB calls H2 on
> >> domA.  We could have the following sequence:
> >> 
> >> * H1: grab A->mm_lock_x
> >> * H2: grab B->mm_lock_y
> >> * H1: wait on B->mm_lock_y
> >> * H2: wait on A->mm_lock_x #DEADLOCK
> >> 
> >> With the current mm lock checking, H2 would cause a BUG() the first time
> >> it was called.
> >> 
> >> With my "special privilege only" boosting you'd also get a BUG() in at
> >> least one of the invocations.
> >> 
> >> With your "current bias" patch, no BUG() would be encountered; we'd only
> >> discover the deadlock once a live server had actually deadlocked.
> >> 
> >> This is what I'm talking about -- with "boost dom0", you have a global
> >> order to the locks.  It goes:
> >> 
> >> 1-8: All domU locks (in the order listed in mm-locks.h)
> >> 9-16: All dom0 locks
> >> 
> >> Thus we know for certain that there can't be a caller / callee deadlock
> >> that's not detected.  With your patch, there isn't a global order: the
> >> order changes based on who made the hypercall, so it's more difficult to
> >> reason about whether there's a deadlock or not.
> >> 
> >> So, do H1 and H2 exist right now?  I think probably not, but I can't
> >> immediately say.  Will such a pair *never* exist?  I don't think I can
> >> guarantee that either.  That's why I want something to check.
> > 
> > I'm fine with this proposal, it's just that if it's safe for Dom0 to
> > pick any other domain mm lock and then any Dom0 mm lock it should also
> > be safe for any domain and not Dom0 only.
> 
> Since I'm not sure whether this was implied here: "Any" is of
> course wrong here, or else there'd be ABBA deadlock potential
> between such two arbitrary domains. I suppose you still mean
> "any domain controlling another domain", and for such a domain
> to only acquire locks of the domain it controls.

Yes, the last "any" should be "any domain controlling another domain",
but it's not the job of the mm lock checker to assert whether a domain
is controlling another domain or not.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
       [not found]                       ` <E6041B7A020000919527FA34@prv1-mh.provo.novell.com>
@ 2018-12-20 10:06                         ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 10:06 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, george.dunlap,
	xen-devel

>>> On 20.12.18 at 10:58, <roger.pau@citrix.com> wrote:
> On Thu, Dec 20, 2018 at 02:16:15AM -0700, Jan Beulich wrote:
>> >>> On 20.12.18 at 10:05, <roger.pau@citrix.com> wrote:
>> > On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
>> >> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>> >> >> Using 'current' means that potential deadlocks which would now cause a
>> >> >> BUG() won't anymore.  I'm fine with not adding extra protections that
>> >> >> aren't there now; but I don't want to remove protections that are.
>> >> > 
>> >> > The lock ordering enforcement is still kept as-is, but Xen is allowed
>> >> > to lock the caller mm locks in the right order (be it privileged or
>> >> > not) after having locked a subject domain ones also in the correct
>> >> > order.
>> >> 
>> >> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
>> >> 
>> >> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.
>> > 
>> > With my proposal H1 won't be a valid lock sequence, since
>> > current->mm_lock_x level > tgt->mm_lock_y level.
>> > 
>> > With your proposal H1 would be valid depending on whether 'current' is
>> > Dom0 or not. If current == Dom0, H1 is invalid, since
>> > current->mm_lock_x level > tgt->mm_lock_y level. If current != Dom0
>> > then the H1 sequence would be valid.
>> > 
>> > I think this is not ideal, since the correctness of the H1 sequence
>> > with your proposal depends on whether the caller is Dom0 or an
>> > unprivileged domain. With my proposal there's no such difference, and
>> > the validity of sequences doesn't depend on whether the caller is
>> > privileged or not, which makes it easier to reason about.
>> > 
>> >> And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.
>> > 
>> > H2 won't be a valid locking sequence with my proposal, since
>> > current->mm_lock_y level > tgt->mm_lock_x level, so the H2 sequence
>> > would be invalid and always trigger the BUG.
>> > 
>> > With my proposal all current domain mm locks always have a level
>> > higher than any target mm locks.
>> > 
>> >> 
>> >> And suppose domA calls H1 on domB at the same time that domB calls H2 on
>> >> domA.  We could have the following sequence:
>> >> 
>> >> * H1: grab A->mm_lock_x
>> >> * H2: grab B->mm_lock_y
>> >> * H1: wait on B->mm_lock_y
>> >> * H2: wait on A->mm_lock_x #DEADLOCK
>> >> 
>> >> With the current mm lock checking, H2 would cause a BUG() the first time
>> >> it was called.
>> >> 
>> >> With my "special privilege only" boosting you'd also get a BUG() in at
>> >> least one of the invocations.
>> >> 
>> >> With your "current bias" patch, no BUG() would be encountered; we'd only
>> >> discover the deadlock once a live server had actually deadlocked.
>> >> 
>> >> This is what I'm talking about -- with "boost dom0", you have a global
>> >> order to the locks.  It goes:
>> >> 
>> >> 1-8: All domU locks (in the order listed in mm-locks.h)
>> >> 9-16: All dom0 locks
>> >> 
>> >> Thus we know for certain that there can't be a caller / callee deadlock
>> >> that's not detected.  With your patch, there isn't a global order: the
>> >> order changes based on who made the hypercall, so it's more difficult to
>> >> reason about whether there's a deadlock or not.
>> >> 
>> >> So, do H1 and H2 exist right now?  I think probably not, but I can't
>> >> immediately say.  Will such a pair *never* exist?  I don't think I can
>> >> guarantee that either.  That's why I want something to check.
>> > 
>> > I'm fine with this proposal, it's just that if it's safe for Dom0 to
>> > pick any other domain mm lock and then any Dom0 mm lock it should also
>> > be safe for any domain and not Dom0 only.
>> 
>> Since I'm not sure whether this was implied here: "Any" is of
>> course wrong here, or else there'd be ABBA deadlock potential
>> between such two arbitrary domains. I suppose you still mean
>> "any domain controlling another domain", and for such a domain
>> to only acquire locks of the domain it controls.
> 
> Yes, the last "any" should be "any domain controlling another domain",
> but it's not the job of the mm lock checker to assert whether a domain
> is controlling another domain or not.

Well, that depends on the model we pick.

Jan



* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-20  9:05                 ` Roger Pau Monné
  2018-12-20  9:16                   ` Jan Beulich
@ 2018-12-20 13:47                   ` George Dunlap
  2018-12-20 14:01                     ` Jan Beulich
  1 sibling, 1 reply; 28+ messages in thread
From: George Dunlap @ 2018-12-20 13:47 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	xen-devel

On 12/20/18 9:05 AM, Roger Pau Monné wrote:
> On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
>> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>>>> Using 'current' means that potential deadlocks which would now cause a
>>>> BUG() won't anymore.  I'm fine with not adding extra protections that
>>>> aren't there now; but I don't want to remove protections that are.
>>>
>>> The lock ordering enforcement is still kept as-is, but Xen is allowed
>>> to lock the caller mm locks in the right order (be it privileged or
>>> not) after having locked a subject domain ones also in the correct
>>> order.
>>
>> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
>>
>> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.
> 
> With my proposal H1 won't be a valid lock sequence, since
> current->mm_lock_x level > tgt->mm_lock_y level.

I made a mistake; here's a corrected and expanded example.

Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.

Suppose we have H1, which grabs tgt->mm_lock_x and current->mm_lock_y.

And suppose we have H2, which grabs tgt->mm_lock_y, and current->mm_lock_x.

And suppose domA calls H1 on domB at the same time that domB calls H2 on
domA.  We could have the following sequence:

1. H1: grab B->mm_lock_x
2. H2: grab A->mm_lock_y
3. H1: wait on A->mm_lock_y
4. H2: wait on B->mm_lock_x #DEADLOCK
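
To make the pattern concrete, here is a purely hypothetical sketch of such
a pair of handlers; h1/h2, mm_lock_x()/mm_lock_y() and the unlock variants
are made-up names for illustration, not existing Xen functions:

/* Hypothetical handlers: mm_lock_x()/mm_lock_y() stand in for any two mm
 * locks with levels 8 and 16; tgt is the domain the hypercall acts on. */
static long h1(struct domain *tgt)
{
    mm_lock_x(tgt);                 /* level 8 */
    mm_lock_y(current->domain);     /* level 16 */
    /* ... actual work ... */
    mm_unlock_y(current->domain);
    mm_unlock_x(tgt);
    return 0;
}

static long h2(struct domain *tgt)
{
    mm_lock_y(tgt);                 /* level 16 */
    mm_lock_x(current->domain);     /* level 8, taken after a level-16
                                     * lock: the existing check BUG()s */
    /* ... actual work ... */
    mm_unlock_x(current->domain);
    mm_unlock_y(tgt);
    return 0;
}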

With the current mm lock checking:
* H1 grabs a lock of level 8, then a lock of level 16, so no BUG check.
* H2 grabs a lock of level 16 and then a lock of level 8, and so BUG
checks, catching this potential deadlock.

With your "current bias" patch:
* H1 grabs a lock of level 8, then a lock of level 80 (16 + 64), so no
BUG check.
* H2 grabs a lock of level 16, then a lock of level 72 (8 + 64), so no
BUG check, missing the potential deadlock.

With "dom0 bias", you basically have to choose one of the hypercalls to
be privileged only.  If we choose H2 to be privileged, then B must be
dom0 and A must be a domU.  So we get:
* H1 grabs lock level 72 (8 + 64), then lock level 16, causing a BUG check.
* H2 grabs lock level 16, then lock level 72 (8 + 64), no BUG check.

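In case it helps to compare the two schemes side by side, here is a rough
sketch of the level calculation under each; MM_LOCK_LEVEL_BIAS and the
function names below are made up for illustration and are not what the
actual patch uses (is_control_domain() does exist, the rest doesn't):

#define MM_LOCK_LEVEL_BIAS 64

/* "current bias": boost the levels of whichever domain made the call. */
static inline int lock_level_current_bias(const struct domain *d, int level)
{
    return level + (d == current->domain ? MM_LOCK_LEVEL_BIAS : 0);
}

/* "dom0 bias": boost only the control domain's levels, giving one global
 * order: all domU locks below all dom0 locks. */
static inline int lock_level_dom0_bias(const struct domain *d, int level)
{
    return level + (is_control_domain(d) ? MM_LOCK_LEVEL_BIAS : 0);
}

With the first variant the resulting order depends on who made the
hypercall; with the second it does not, which is the global-order property
I keep coming back to.
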
The practical implication of dom0 bias is that for any hypercall which grabs
two mm locks, one of two things must be true:
* When it is called from one domU to another domU, locks must follow the
"normal" locking order listed in mm-locks.h
* When it is called between a dom0 and a domU, it must be consistently
called either one way or the other; i.e., it must always be dom0 calling
it with a domU target, or always domU calling it with a dom0 target
(otherwise the dom0 / domU locking order is violated).

Now that I state it that way, it's not immediately obvious that's a
property we have or want.

> I'm fine with this proposal, it's just that if it's safe for Dom0 to
> pick any other domain mm lock and then any Dom0 mm lock it should also
> be safe for any domain and not Dom0 only.

If this were true, I'd be happy with your proposal; but I'm pretty sure
it's not true.  If you can give me a proof that no two hypercalls H1 and
H2 can exist under your proposal, then we can consider it.

 -George


* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-20 13:47                   ` George Dunlap
@ 2018-12-20 14:01                     ` Jan Beulich
  2018-12-20 14:24                       ` George Dunlap
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 14:01 UTC (permalink / raw)
  To: george.dunlap
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Roger Pau Monne

>>> On 20.12.18 at 14:47, <george.dunlap@citrix.com> wrote:
> On 12/20/18 9:05 AM, Roger Pau Monné wrote:
>> On Wed, Dec 19, 2018 at 06:10:04PM +0000, George Dunlap wrote:
>>> On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>>>>> Using 'current' means that potential deadlocks which would now cause a
>>>>> BUG() won't anymore.  I'm fine with not adding extra protections that
>>>>> aren't there now; but I don't want to remove protections that are.
>>>>
>>>> The lock ordering enforcement is still kept as-is, but Xen is allowed
>>>> to lock the caller mm locks in the right order (be it privileged or
>>>> not) after having locked a subject domain ones also in the correct
>>>> order.
>>>
>>> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
>>>
>>> Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.
>> 
>> With my proposal H1 won't be a valid lock sequence, since
>> current->mm_lock_x level > tgt->mm_lock_y level.
> 
> I made a mistake; here's a corrected and expanded example.
> 
> Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.
> 
> Suppose we have H1, which grabs tgt->mm_lock_x and current->mm_lock_y.
> 
> And suppose we have H2, which grabs tgt->mm_lock_y, and current->mm_lock_x.
> 
> And suppose domA calls H1 on domB at the same time that domB calls H2 on
> domA.  We could have the following sequence:
> 
> 1. H1: grab B->mm_lock_x
> 2. H2: grab A->mm_lock_y
> 3. H1: wait on A->mm_lock_y
> 4. H2: wait on B->mm_lock_x #DEADLOCK
> 
> With the current mm lock checking:
> * H1 grabs a lock of level 8, then a lock of level 16, so no BUG check.
> * H2 grabs a lock of level 16 and then a lock of level 8, and so BUG
> checks, catching this potential deadlock.
> 
> With your "current bias" patch:
> * H1 grabs a lock of level 8, then a lock of level 80 (16 + 64), so no
> BUG check.
> * H2 grabs a lock of level 16, then a lock of level 72 (8 + 64), so no
> BUG check, missing the potential deadlock.
> 
> With "dom0 bias", you basically have to choose one of the hypercalls to
> be privileged only.  If we choose H2 to be privileged, then B must be
> dom0 and A must be a domU.  So we get:
> * H1 grabs lock level 72 (8 + 64), then lock level 16, causing a BUG check.
> * H2 grabs lock level 16, then lock level 72 (8 + 64), no BUG check.
> 
> The practical implication of dom0 bias is that for any hypercall which grabs
> two mm locks, one of two things must be true:
> * When it is called from one domU to another domU, locks must follow the
> "normal" locking order listed in mm-locks.h

This would be a problem in the stubdom case.

> * When it is called between a dom0 and a domU, it must be consistently
> called either one way or the other; i.e., it must always be dom0 calling
> it with a domU target, or always domU calling it with a dom0 target
> (otherwise the dom0 / domU locking order is violated).

Obviously (I hope) "domU calling it with a dom0 target" is an
uninteresting case, because we don't allow a domU to act on
dom0. Hence the consistency is achieved by it always being
"dom0 calling it with a domU target".

Jan



* Re: [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain
  2018-12-20 14:01                     ` Jan Beulich
@ 2018-12-20 14:24                       ` George Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: George Dunlap @ 2018-12-20 14:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Roger Pau Monne

On 12/20/18 2:01 PM, Jan Beulich wrote:
>> The practical implication of dom0 bias is that for any hypercall which grabs
>> two mm locks, one of two things must be true:
>> * When it is called from one domU to another domU, locks must follow the
>> "normal" locking order listed in mm-locks.h
> 
> This would be a problem in the stubdom case.

Sure; but everything extends naturally from the 2-level case (dom0 /
domU) to the 3-level case (dom0 / stubdom / domU).
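
Conceptually I'd expect the bias simply to scale with how privileged the
domain is; just a sketch with made-up helpers (nothing like
privilege_depth() exists today, and MM_LOCK_LEVEL_BIAS is the same
made-up 64 as in the earlier sketch):

/* Made-up helper: 0 for a plain domU, 1 for a stubdom-like domain that
 * controls another domain, 2 for the control domain itself. */
static inline int privilege_depth(const struct domain *d)
{
    if ( is_control_domain(d) )
        return 2;
    if ( d->target != NULL )
        return 1;
    return 0;
}

static inline int lock_level_biased(const struct domain *d, int level)
{
    return level + MM_LOCK_LEVEL_BIAS * privilege_depth(d);
}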

>> * When it is called between a dom0 and a domU, it must be consistently
>> called either one way or the other; i.e., it must always be dom0 calling
>> it with a domU target, or always domU calling it with a dom0 target
>> (otherwise the dom0 / domU locking order is violated).
> 
> Obviously (I hope) "domU calling it with a dom0 target" is an
> uninteresting case, because we don't allow a domU to act on
> dom0. Hence the consistency is achieved by it always going
> to be "dom0 calling it with a domU target".

We allow domU to send event channels to dom0, for instance; and
presumably Argos allows domU to send things to dom0 as well as for dom0
to send things to domU.  I don't know if those require the mm locks of
both domains, but if they did, then we'd run into a problem again.

 -George


end of thread

Thread overview: 28+ messages
2018-12-18 16:05 [PATCH 0/3] x86/mm-locks: add a bias to current domain lock levels Roger Pau Monne
2018-12-18 16:05 ` [PATCH 1/3] x86/mm-locks: remove trailing whitespace Roger Pau Monne
2018-12-19 10:55   ` George Dunlap
2018-12-18 16:05 ` [PATCH 2/3] x86/mm-locks: convert some macros to inline functions Roger Pau Monne
2018-12-19 11:01   ` George Dunlap
2018-12-18 16:05 ` [PATCH 3/3] x86/mm-locks: apply a bias to lock levels for current domain Roger Pau Monne
2018-12-19 11:40   ` George Dunlap
2018-12-19 12:10     ` Roger Pau Monné
2018-12-19 12:17       ` Andrew Cooper
2018-12-19 14:35         ` Roger Pau Monné
2018-12-19 12:38       ` George Dunlap
2018-12-19 12:42         ` Andrew Cooper
2018-12-19 13:55           ` George Dunlap
2018-12-19 14:07             ` Andrew Cooper
2018-12-19 14:32               ` Jan Beulich
2018-12-19 14:30         ` Jan Beulich
2018-12-19 14:37           ` George Dunlap
2018-12-19 14:40         ` Roger Pau Monné
2018-12-19 14:46           ` George Dunlap
2018-12-19 14:59             ` Roger Pau Monné
2018-12-19 18:10               ` George Dunlap
2018-12-20  9:05                 ` Roger Pau Monné
2018-12-20  9:16                   ` Jan Beulich
2018-12-20  9:58                     ` Roger Pau Monné
     [not found]                       ` <E6041B7A020000919527FA34@prv1-mh.provo.novell.com>
2018-12-20 10:06                         ` Jan Beulich
2018-12-20 13:47                   ` George Dunlap
2018-12-20 14:01                     ` Jan Beulich
2018-12-20 14:24                       ` George Dunlap
