xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] [PATCH for-4.13 0/2] xen/livepatch: Safety fixes
@ 2019-11-05 19:43 Andrew Cooper
  2019-11-05 19:43 ` [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks Andrew Cooper
  2019-11-05 19:43 ` [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues Andrew Cooper
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Cooper @ 2019-11-05 19:43 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Andrew Cooper, Ross Lagerwall, Konrad Rzeszutek Wilk

This pair of patches is long overdue for posting upstream.  For several years
now, XenServer has shipped the 2nd patch as a safety check (seeing as we have
both livepatching and introspection), implemented with some return address
manipulation to turn a void load hook into one which can return -EBUSY.
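
As a rough sketch (not part of this series), a payload-side safety check written
against the new interface looks something like the following.  unsafe_to_patch()
stands in for whatever product-specific condition needs testing, and the include
list is approximate:

    #include <xen/errno.h>
    #include <xen/types.h>
    #include <xen/livepatch_payload.h>

    /* Stand-in predicate; a real payload would test its own invariant here. */
    static bool unsafe_to_patch(void)
    {
        return false;
    }

    /*
     * Runs with the system quiesced, before any function is patched.
     * A non-zero return value aborts apply_payload().
     */
    static int safety_check_hook(void)
    {
        if ( unsafe_to_patch() )
            return -EBUSY;

        return 0;
    }
    LIVEPATCH_LOAD_HOOK(safety_check_hook);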

Andrew Cooper (2):
  xen/livepatch: Add a return value to load hooks
  x86/livepatch: Prevent patching with active waitqueues

 xen/arch/arm/livepatch.c             |  5 +++++
 xen/arch/x86/livepatch.c             | 39 ++++++++++++++++++++++++++++++++++++
 xen/common/livepatch.c               | 37 ++++++++++++++++++++++++----------
 xen/include/xen/livepatch.h          |  1 +
 xen/include/xen/livepatch_payload.h  |  2 +-
 xen/test/livepatch/xen_hello_world.c | 12 ++++++++---
 6 files changed, 81 insertions(+), 15 deletions(-)

-- 
2.11.0



* [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks
  2019-11-05 19:43 [Xen-devel] [PATCH for-4.13 0/2] xen/livepatch: Safety fixes Andrew Cooper
@ 2019-11-05 19:43 ` Andrew Cooper
  2019-11-06 14:20   ` Konrad Rzeszutek Wilk
                     ` (2 more replies)
  2019-11-05 19:43 ` [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues Andrew Cooper
  1 sibling, 3 replies; 12+ messages in thread
From: Andrew Cooper @ 2019-11-05 19:43 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Andrew Cooper, Ross Lagerwall, Konrad Rzeszutek Wilk

One use of load hooks is to perform a safety check of the system in its
quiesced state.  Any non-zero return value from a load hook will abort the
apply attempt.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Juergen Gross <jgross@suse.com>

For several years, the following patch in the series has been shipped in every
XenServer livepatch, implemented as a void load hook which modifies its return
address to skip to the end of apply_payload().  This method is distinctly less
hacky.
---
 xen/common/livepatch.c               | 30 +++++++++++++++++++-----------
 xen/include/xen/livepatch_payload.h  |  2 +-
 xen/test/livepatch/xen_hello_world.c | 12 +++++++++---
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 7caa30c202..962647616a 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1076,25 +1076,33 @@ static int apply_payload(struct payload *data)
      * temporarily disable the spin locks IRQ state checks.
      */
     spin_debug_disable();
-    for ( i = 0; i < data->n_load_funcs; i++ )
-        data->load_funcs[i]();
+    for ( i = 0; !rc && i < data->n_load_funcs; i++ )
+        rc = data->load_funcs[i]();
     spin_debug_enable();
 
+    if ( rc )
+        printk(XENLOG_ERR LIVEPATCH "%s: load_funcs[%u] failed: %d\n",
+               data->name, i, rc);
+
     ASSERT(!local_irq_is_enabled());
 
-    for ( i = 0; i < data->nfuncs; i++ )
-        arch_livepatch_apply(&data->funcs[i]);
+    if ( !rc )
+        for ( i = 0; i < data->nfuncs; i++ )
+            arch_livepatch_apply(&data->funcs[i]);
 
     arch_livepatch_revive();
 
-    /*
-     * We need RCU variant (which has barriers) in case we crash here.
-     * The applied_list is iterated by the trap code.
-     */
-    list_add_tail_rcu(&data->applied_list, &applied_list);
-    register_virtual_region(&data->region);
+    if ( !rc )
+    {
+        /*
+         * We need RCU variant (which has barriers) in case we crash here.
+         * The applied_list is iterated by the trap code.
+         */
+        list_add_tail_rcu(&data->applied_list, &applied_list);
+        register_virtual_region(&data->region);
+    }
 
-    return 0;
+    return rc;
 }
 
 static int revert_payload(struct payload *data)
diff --git a/xen/include/xen/livepatch_payload.h b/xen/include/xen/livepatch_payload.h
index 4a1a96d054..3788b52ddf 100644
--- a/xen/include/xen/livepatch_payload.h
+++ b/xen/include/xen/livepatch_payload.h
@@ -9,7 +9,7 @@
  * The following definitions are to be used in patches. They are taken
  * from kpatch.
  */
-typedef void livepatch_loadcall_t(void);
+typedef int livepatch_loadcall_t(void);
 typedef void livepatch_unloadcall_t(void);
 
 /*
diff --git a/xen/test/livepatch/xen_hello_world.c b/xen/test/livepatch/xen_hello_world.c
index 02f3f85dc0..d720001f07 100644
--- a/xen/test/livepatch/xen_hello_world.c
+++ b/xen/test/livepatch/xen_hello_world.c
@@ -16,9 +16,11 @@ static const char hello_world_patch_this_fnc[] = "xen_extra_version";
 extern const char *xen_hello_world(void);
 static unsigned int cnt;
 
-static void apply_hook(void)
+static int apply_hook(void)
 {
     printk(KERN_DEBUG "Hook executing.\n");
+
+    return 0;
 }
 
 static void revert_hook(void)
@@ -26,10 +28,14 @@ static void revert_hook(void)
     printk(KERN_DEBUG "Hook unloaded.\n");
 }
 
-static void  hi_func(void)
+static int hi_func(void)
 {
     printk(KERN_DEBUG "%s: Hi! (called %u times)\n", __func__, ++cnt);
+
+    return 0;
 };
+/* Safe to cast away the return value for this trivial case. */
+void (void_hi_func)(void) __attribute__((alias("hi_func")));
 
 static void check_fnc(void)
 {
@@ -43,7 +49,7 @@ LIVEPATCH_UNLOAD_HOOK(revert_hook);
 /* Imbalance here. Two load and three unload. */
 
 LIVEPATCH_LOAD_HOOK(hi_func);
-LIVEPATCH_UNLOAD_HOOK(hi_func);
+LIVEPATCH_UNLOAD_HOOK(void_hi_func);
 
 LIVEPATCH_UNLOAD_HOOK(check_fnc);
 
-- 
2.11.0



* [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:43 [Xen-devel] [PATCH for-4.13 0/2] xen/livepatch: Safety fixes Andrew Cooper
  2019-11-05 19:43 ` [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks Andrew Cooper
@ 2019-11-05 19:43 ` Andrew Cooper
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Andrew Cooper @ 2019-11-05 19:43 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Andrew Cooper, Ross Lagerwall, Konrad Rzeszutek Wilk

The safety of livepatching depends on every stack having been unwound, but
there is one corner case where this is not true.  The Sharing/Paging/Monitor
infrastructure may use waitqueues, which copy the stack frame sideways and
longjmp() to a different vcpu.

This case is rare, and can be worked around by pausing the offending
domain(s), waiting for their rings to drain, then performing a livepatch.

In the case that there is an active waitqueue, fail the livepatch attempt with
-EBUSY, which is preferable to the fireworks which occur from trying to unwind
the old stack frame at a later point.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Juergen Gross <jgross@suse.com>

This fix wants backporting, and is long overdue for posting upstream.
---
 xen/arch/arm/livepatch.c    |  5 +++++
 xen/arch/x86/livepatch.c    | 39 +++++++++++++++++++++++++++++++++++++++
 xen/common/livepatch.c      |  7 +++++++
 xen/include/xen/livepatch.h |  1 +
 4 files changed, 52 insertions(+)

diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
index 00c5e2bc45..915e9d926a 100644
--- a/xen/arch/arm/livepatch.c
+++ b/xen/arch/arm/livepatch.c
@@ -18,6 +18,11 @@
 
 void *vmap_of_xen_text;
 
+int arch_livepatch_safety_check(void)
+{
+    return 0;
+}
+
 int arch_livepatch_quiesce(void)
 {
     mfn_t text_mfn;
diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index c82cf53b9e..0f129fa6b2 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -14,6 +14,45 @@
 #include <asm/nmi.h>
 #include <asm/livepatch.h>
 
+static bool has_active_waitqueue(const struct vm_event_domain *ved)
+{
+    /* ved may be xzalloc()'d without INIT_LIST_HEAD() yet. */
+    return (ved && !list_head_is_null(&ved->wq.list) &&
+            !list_empty(&ved->wq.list));
+}
+
+/*
+ * x86's implementation of waitqueue violates the livepatching safety principle
+ * of having unwound every CPU's stack before modifying live content.
+ *
+ * Search through every domain and check that no vCPUs have an active
+ * waitqueue.
+ */
+int arch_livepatch_safety_check(void);
+{
+    struct domain *d;
+
+    for_each_domain ( d )
+    {
+#ifdef CONFIG_MEM_SHARING
+        if ( has_active_waitqueue(d->vm_event_share) )
+            goto fail;
+#endif
+#ifdef CONFIG_MEM_PAGING
+        if ( has_active_waitqueue(d->vm_event_paging) )
+            goto fail;
+#endif
+        if ( has_active_waitqueue(d->vm_event_monitor) )
+            goto fail;
+    }
+
+    return 0;
+
+ fail:
+    printk(XENLOG_ERR LIVEPATCH "%pd found with active waitqueue\n", d);
+    return -EBUSY;
+}
+
 int arch_livepatch_quiesce(void)
 {
     /* Disable WP to allow changes to read-only pages. */
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 962647616a..27ee5bdeb7 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1060,6 +1060,13 @@ static int apply_payload(struct payload *data)
     unsigned int i;
     int rc;
 
+    rc = apply_safety_checks();
+    if ( rc )
+    {
+        printk(XENLOG_ERR LIVEPATCH "%s: Safety checks failed\n", data->name);
+        return rc;
+    }
+
     printk(XENLOG_INFO LIVEPATCH "%s: Applying %u functions\n",
             data->name, data->nfuncs);
 
diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
index 1b1817ca0d..69ede75d20 100644
--- a/xen/include/xen/livepatch.h
+++ b/xen/include/xen/livepatch.h
@@ -104,6 +104,7 @@ static inline int livepatch_verify_distance(const struct livepatch_func *func)
  * These functions are called around the critical region patching live code,
  * for an architecture to make appropriate global state adjustments.
  */
+int arch_livepatch_safety_check(void);
 int arch_livepatch_quiesce(void);
 void arch_livepatch_revive(void);
 
-- 
2.11.0



* [Xen-devel] [PATCH v1.5] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:43 ` [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues Andrew Cooper
@ 2019-11-05 19:49   ` Andrew Cooper
  2019-11-22 10:13     ` Jürgen Groß
                       ` (2 more replies)
  2019-11-06 14:25   ` [Xen-devel] [PATCH 2/2] " Konrad Rzeszutek Wilk
  2019-11-08 10:18   ` Ross Lagerwall
  2 siblings, 3 replies; 12+ messages in thread
From: Andrew Cooper @ 2019-11-05 19:49 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, Andrew Cooper, Ross Lagerwall, Konrad Rzeszutek Wilk

The safety of livepatching depends on every stack having been unwound, but
there is one corner case where this is not true.  The Sharing/Paging/Monitor
infrastructure may use waitqueues, which copy the stack frame sideways and
longjmp() to a different vcpu.

This case is rare, and can be worked around by pausing the offending
domain(s), waiting for their rings to drain, then performing a livepatch.

In the case that there is an active waitqueue, fail the livepatch attempt with
-EBUSY, which is preferable to the fireworks which occur from trying to unwind
the old stack frame at a later point.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Juergen Gross <jgross@suse.com>

This fix wants backporting, and is long overdue for posting upstream.

v1.5:
 * Send out a non-stale patch this time.
---
 xen/arch/arm/livepatch.c    |  5 +++++
 xen/arch/x86/livepatch.c    | 40 ++++++++++++++++++++++++++++++++++++++++
 xen/common/livepatch.c      |  8 ++++++++
 xen/include/xen/livepatch.h |  1 +
 4 files changed, 54 insertions(+)

diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
index 00c5e2bc45..915e9d926a 100644
--- a/xen/arch/arm/livepatch.c
+++ b/xen/arch/arm/livepatch.c
@@ -18,6 +18,11 @@
 
 void *vmap_of_xen_text;
 
+int arch_livepatch_safety_check(void)
+{
+    return 0;
+}
+
 int arch_livepatch_quiesce(void)
 {
     mfn_t text_mfn;
diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index c82cf53b9e..2749cbc5cf 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -10,10 +10,50 @@
 #include <xen/vmap.h>
 #include <xen/livepatch_elf.h>
 #include <xen/livepatch.h>
+#include <xen/sched.h>
 
 #include <asm/nmi.h>
 #include <asm/livepatch.h>
 
+static bool has_active_waitqueue(const struct vm_event_domain *ved)
+{
+    /* ved may be xzalloc()'d without INIT_LIST_HEAD() yet. */
+    return (ved && !list_head_is_null(&ved->wq.list) &&
+            !list_empty(&ved->wq.list));
+}
+
+/*
+ * x86's implementation of waitqueue violates the livepatching safety principle
+ * of having unwound every CPU's stack before modifying live content.
+ *
+ * Search through every domain and check that no vCPUs have an active
+ * waitqueue.
+ */
+int arch_livepatch_safety_check(void)
+{
+    struct domain *d;
+
+    for_each_domain ( d )
+    {
+#ifdef CONFIG_MEM_SHARING
+        if ( has_active_waitqueue(d->vm_event_share) )
+            goto fail;
+#endif
+#ifdef CONFIG_MEM_PAGING
+        if ( has_active_waitqueue(d->vm_event_paging) )
+            goto fail;
+#endif
+        if ( has_active_waitqueue(d->vm_event_monitor) )
+            goto fail;
+    }
+
+    return 0;
+
+ fail:
+    printk(XENLOG_ERR LIVEPATCH "%pd found with active waitqueue\n", d);
+    return -EBUSY;
+}
+
 int arch_livepatch_quiesce(void)
 {
     /* Disable WP to allow changes to read-only pages. */
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 962647616a..8386e611f2 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1060,6 +1060,14 @@ static int apply_payload(struct payload *data)
     unsigned int i;
     int rc;
 
+    rc = arch_livepatch_safety_check();
+    if ( rc )
+    {
+        printk(XENLOG_ERR LIVEPATCH "%s: Safety checks failed: %d\n",
+               data->name, rc);
+        return rc;
+    }
+
     printk(XENLOG_INFO LIVEPATCH "%s: Applying %u functions\n",
             data->name, data->nfuncs);
 
diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
index 1b1817ca0d..69ede75d20 100644
--- a/xen/include/xen/livepatch.h
+++ b/xen/include/xen/livepatch.h
@@ -104,6 +104,7 @@ static inline int livepatch_verify_distance(const struct livepatch_func *func)
  * These functions are called around the critical region patching live code,
  * for an architecture to make appropriate global state adjustments.
  */
+int arch_livepatch_safety_check(void);
 int arch_livepatch_quiesce(void);
 void arch_livepatch_revive(void);
 
-- 
2.11.0



* Re: [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks
  2019-11-05 19:43 ` [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks Andrew Cooper
@ 2019-11-06 14:20   ` Konrad Rzeszutek Wilk
  2019-11-06 14:35   ` Jan Beulich
  2019-11-08 10:21   ` Ross Lagerwall
  2 siblings, 0 replies; 12+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-11-06 14:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Juergen Gross, Xen-devel, Ross Lagerwall

On Tue, Nov 05, 2019 at 07:43:16PM +0000, Andrew Cooper wrote:
> One use of load hooks is to perform a safety check of the system in its
> quiesced state.  Any non-zero return value from a load hook will abort the
> apply attempt.
> 

Shouldn't you also update the documentation (the design doc)?

Thanks.
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> 
> For several years, the following patch in the series has been shipped in every
> XenServer livepatch, implemented as a void load hook which modifies its return
> address to skip to the end of apply_payload().  This method is distinctly less
> hacky.
> ---
>  xen/common/livepatch.c               | 30 +++++++++++++++++++-----------
>  xen/include/xen/livepatch_payload.h  |  2 +-
>  xen/test/livepatch/xen_hello_world.c | 12 +++++++++---
>  3 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
> index 7caa30c202..962647616a 100644
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -1076,25 +1076,33 @@ static int apply_payload(struct payload *data)
>       * temporarily disable the spin locks IRQ state checks.
>       */
>      spin_debug_disable();
> -    for ( i = 0; i < data->n_load_funcs; i++ )
> -        data->load_funcs[i]();
> +    for ( i = 0; !rc && i < data->n_load_funcs; i++ )
> +        rc = data->load_funcs[i]();
>      spin_debug_enable();
>  
> +    if ( rc )
> +        printk(XENLOG_ERR LIVEPATCH "%s: load_funcs[%u] failed: %d\n",
> +               data->name, i, rc);
> +
>      ASSERT(!local_irq_is_enabled());
>  
> -    for ( i = 0; i < data->nfuncs; i++ )
> -        arch_livepatch_apply(&data->funcs[i]);
> +    if ( !rc )
> +        for ( i = 0; i < data->nfuncs; i++ )
> +            arch_livepatch_apply(&data->funcs[i]);
>  
>      arch_livepatch_revive();
>  
> -    /*
> -     * We need RCU variant (which has barriers) in case we crash here.
> -     * The applied_list is iterated by the trap code.
> -     */
> -    list_add_tail_rcu(&data->applied_list, &applied_list);
> -    register_virtual_region(&data->region);
> +    if ( !rc )
> +    {
> +        /*
> +         * We need RCU variant (which has barriers) in case we crash here.
> +         * The applied_list is iterated by the trap code.
> +         */
> +        list_add_tail_rcu(&data->applied_list, &applied_list);
> +        register_virtual_region(&data->region);
> +    }
>  
> -    return 0;
> +    return rc;
>  }
>  
>  static int revert_payload(struct payload *data)
> diff --git a/xen/include/xen/livepatch_payload.h b/xen/include/xen/livepatch_payload.h
> index 4a1a96d054..3788b52ddf 100644
> --- a/xen/include/xen/livepatch_payload.h
> +++ b/xen/include/xen/livepatch_payload.h
> @@ -9,7 +9,7 @@
>   * The following definitions are to be used in patches. They are taken
>   * from kpatch.
>   */
> -typedef void livepatch_loadcall_t(void);
> +typedef int livepatch_loadcall_t(void);
>  typedef void livepatch_unloadcall_t(void);
>  
>  /*
> diff --git a/xen/test/livepatch/xen_hello_world.c b/xen/test/livepatch/xen_hello_world.c
> index 02f3f85dc0..d720001f07 100644
> --- a/xen/test/livepatch/xen_hello_world.c
> +++ b/xen/test/livepatch/xen_hello_world.c
> @@ -16,9 +16,11 @@ static const char hello_world_patch_this_fnc[] = "xen_extra_version";
>  extern const char *xen_hello_world(void);
>  static unsigned int cnt;
>  
> -static void apply_hook(void)
> +static int apply_hook(void)
>  {
>      printk(KERN_DEBUG "Hook executing.\n");
> +
> +    return 0;
>  }
>  
>  static void revert_hook(void)
> @@ -26,10 +28,14 @@ static void revert_hook(void)
>      printk(KERN_DEBUG "Hook unloaded.\n");
>  }
>  
> -static void  hi_func(void)
> +static int hi_func(void)
>  {
>      printk(KERN_DEBUG "%s: Hi! (called %u times)\n", __func__, ++cnt);
> +
> +    return 0;
>  };
> +/* Safe to cast away the return value for this trivial case. */
> +void (void_hi_func)(void) __attribute__((alias("hi_func")));
>  
>  static void check_fnc(void)
>  {
> @@ -43,7 +49,7 @@ LIVEPATCH_UNLOAD_HOOK(revert_hook);
>  /* Imbalance here. Two load and three unload. */
>  
>  LIVEPATCH_LOAD_HOOK(hi_func);
> -LIVEPATCH_UNLOAD_HOOK(hi_func);
> +LIVEPATCH_UNLOAD_HOOK(void_hi_func);
>  
>  LIVEPATCH_UNLOAD_HOOK(check_fnc);
>  
> -- 
> 2.11.0
> 


* Re: [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:43 ` [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues Andrew Cooper
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
@ 2019-11-06 14:25   ` Konrad Rzeszutek Wilk
  2019-11-08 10:18   ` Ross Lagerwall
  2 siblings, 0 replies; 12+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-11-06 14:25 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Juergen Gross, Xen-devel, Ross Lagerwall

On Tue, Nov 05, 2019 at 07:43:17PM +0000, Andrew Cooper wrote:
> The safety of livepatching depends on every stack having been unwound, but
> there is one corner case where this is not true.  The Sharing/Paging/Monitor
> infrastructure may use waitqueues, which copy the stack frame sideways and
> longjmp() to a different vcpu.
> 
> This case is rare, and can be worked around by pausing the offending
> domain(s), waiting for their rings to drain, then performing a livepatch.
> 
> In the case that there is an active waitqueue, fail the livepatch attempt with
> -EBUSY, which is preferable to the fireworks which occur from trying to unwind
> the old stack frame at a later point.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> 
> This fix wants backporting, and is long overdue for posting upstream.
> ---
>  xen/arch/arm/livepatch.c    |  5 +++++
>  xen/arch/x86/livepatch.c    | 39 +++++++++++++++++++++++++++++++++++++++
>  xen/common/livepatch.c      |  7 +++++++
>  xen/include/xen/livepatch.h |  1 +
>  4 files changed, 52 insertions(+)
> 
> diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
> index 00c5e2bc45..915e9d926a 100644
> --- a/xen/arch/arm/livepatch.c
> +++ b/xen/arch/arm/livepatch.c
> @@ -18,6 +18,11 @@
>  
>  void *vmap_of_xen_text;
>  
> +int arch_livepatch_safety_check(void)
> +{
> +    return 0;
> +}
> +
>  int arch_livepatch_quiesce(void)
>  {
>      mfn_t text_mfn;
> diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
> index c82cf53b9e..0f129fa6b2 100644
> --- a/xen/arch/x86/livepatch.c
> +++ b/xen/arch/x86/livepatch.c
> @@ -14,6 +14,45 @@
>  #include <asm/nmi.h>
>  #include <asm/livepatch.h>
>  
> +static bool has_active_waitqueue(const struct vm_event_domain *ved)
> +{
> +    /* ved may be xzalloc()'d without INIT_LIST_HEAD() yet. */
> +    return (ved && !list_head_is_null(&ved->wq.list) &&
> +            !list_empty(&ved->wq.list));
> +}
> +
> +/*
> + * x86's implementation of waitqueue violates the livepatching safety principle
> + * of having unwound every CPU's stack before modifying live content.
> + *
> + * Search through every domain and check that no vCPUs have an active
> + * waitqueue.
> + */
> +int arch_livepatch_safety_check(void);
> +{
> +    struct domain *d;
> +
> +    for_each_domain ( d )
> +    {
> +#ifdef CONFIG_MEM_SHARING
> +        if ( has_active_waitqueue(d->vm_event_share) )
> +            goto fail;
> +#endif
> +#ifdef CONFIG_MEM_PAGING
> +        if ( has_active_waitqueue(d->vm_event_paging) )
> +            goto fail;
> +#endif
> +        if ( has_active_waitqueue(d->vm_event_monitor) )
> +            goto fail;
> +    }
> +
> +    return 0;
> +
> + fail:
> +    printk(XENLOG_ERR LIVEPATCH "%pd found with active waitqueue\n", d);
> +    return -EBUSY;
> +}
> +
>  int arch_livepatch_quiesce(void)
>  {
>      /* Disable WP to allow changes to read-only pages. */
> diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
> index 962647616a..27ee5bdeb7 100644
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -1060,6 +1060,13 @@ static int apply_payload(struct payload *data)
>      unsigned int i;
>      int rc;
>  
> +    rc = apply_safety_checks();
> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR LIVEPATCH "%s: Safety checks failed\n", data->name);
> +        return rc;
> +    }
> +
>      printk(XENLOG_INFO LIVEPATCH "%s: Applying %u functions\n",
>              data->name, data->nfuncs);
>  
> diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
> index 1b1817ca0d..69ede75d20 100644
> --- a/xen/include/xen/livepatch.h
> +++ b/xen/include/xen/livepatch.h
> @@ -104,6 +104,7 @@ static inline int livepatch_verify_distance(const struct livepatch_func *func)
>   * These functions are called around the critical region patching live code,
>   * for an architecture to make appropriate global state adjustments.
>   */
> +int arch_livepatch_safety_check(void);
>  int arch_livepatch_quiesce(void);
>  void arch_livepatch_revive(void);
>  
> -- 
> 2.11.0
> 


* Re: [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks
  2019-11-05 19:43 ` [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks Andrew Cooper
  2019-11-06 14:20   ` Konrad Rzeszutek Wilk
@ 2019-11-06 14:35   ` Jan Beulich
  2019-11-08 10:21   ` Ross Lagerwall
  2 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2019-11-06 14:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Xen-devel, Konrad Rzeszutek Wilk, Ross Lagerwall

On 05.11.2019 20:43, Andrew Cooper wrote:
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -1076,25 +1076,33 @@ static int apply_payload(struct payload *data)
>       * temporarily disable the spin locks IRQ state checks.
>       */
>      spin_debug_disable();
> -    for ( i = 0; i < data->n_load_funcs; i++ )
> -        data->load_funcs[i]();
> +    for ( i = 0; !rc && i < data->n_load_funcs; i++ )
> +        rc = data->load_funcs[i]();
>      spin_debug_enable();
>  
> +    if ( rc )
> +        printk(XENLOG_ERR LIVEPATCH "%s: load_funcs[%u] failed: %d\n",
> +               data->name, i, rc);

Is there a possible problem here if some of the load_funcs() succeed before one
fails?  Or are those required not to make any state change to the system (which
would need rolling back)?
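
As a concrete illustration (hook bodies made up for the example): if load_funcs[0]
changes some state and load_funcs[1] then fails, the loop above stops and
apply_payload() returns the error, but nothing undoes what the first hook already
did.

    static bool state_changed;          /* illustrative state only */

    static int hook_a(void)             /* load_funcs[0] */
    {
        state_changed = true;           /* side effect that sticks */
        return 0;
    }

    static int hook_b(void)             /* load_funcs[1] */
    {
        return -EBUSY;                  /* aborts the apply; hook_a() is not rolled back */
    }

    LIVEPATCH_LOAD_HOOK(hook_a);
    LIVEPATCH_LOAD_HOOK(hook_b);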

Jan


* Re: [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:43 ` [Xen-devel] [PATCH 2/2] x86/livepatch: Prevent patching with active waitqueues Andrew Cooper
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
  2019-11-06 14:25   ` [Xen-devel] [PATCH 2/2] " Konrad Rzeszutek Wilk
@ 2019-11-08 10:18   ` Ross Lagerwall
  2 siblings, 0 replies; 12+ messages in thread
From: Ross Lagerwall @ 2019-11-08 10:18 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Juergen Gross, Konrad Rzeszutek Wilk

On 11/5/19 7:43 PM, Andrew Cooper wrote:
> The safety of livepatching depends on every stack having been unwound, but
> there is one corner case where this is not true.  The Sharing/Paging/Monitor
> infrastructure may use waitqueues, which copy the stack frame sideways and
> longjmp() to a different vcpu.
> 
> This case is rare, and can be worked around by pausing the offending
> domain(s), waiting for their rings to drain, then performing a livepatch.
> 
> In the case that there is an active waitqueue, fail the livepatch attempt with
> -EBUSY, which is preferable to the fireworks which occur from trying to unwind
> the old stack frame at a later point.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>


* Re: [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks
  2019-11-05 19:43 ` [Xen-devel] [PATCH 1/2] xen/livepatch: Add a return value to load hooks Andrew Cooper
  2019-11-06 14:20   ` Konrad Rzeszutek Wilk
  2019-11-06 14:35   ` Jan Beulich
@ 2019-11-08 10:21   ` Ross Lagerwall
  2 siblings, 0 replies; 12+ messages in thread
From: Ross Lagerwall @ 2019-11-08 10:21 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Juergen Gross, Konrad Rzeszutek Wilk

On 11/5/19 7:43 PM, Andrew Cooper wrote:
> One use of load hooks is to perform a safety check of the system in its
> quiesced state.  Any non-zero return value from a load hook will abort the
> apply attempt.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>


* Re: [Xen-devel] [PATCH v1.5] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
@ 2019-11-22 10:13     ` Jürgen Groß
  2019-11-23  3:23     ` Sarah Newman
  2019-11-23  4:48     ` Sarah Newman
  2 siblings, 0 replies; 12+ messages in thread
From: Jürgen Groß @ 2019-11-22 10:13 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Ross Lagerwall, Konrad Rzeszutek Wilk

On 05.11.19 20:49, Andrew Cooper wrote:
> The safety of livepatching depends on every stack having been unwound, but
> there is one corner case where this is not true.  The Sharing/Paging/Monitor
> infrastructure may use waitqueues, which copy the stack frame sideways and
> longjmp() to a different vcpu.
> 
> This case is rare, and can be worked around by pausing the offending
> domain(s), waiting for their rings to drain, then performing a livepatch.
> 
> In the case that there is an active waitqueue, fail the livepatch attempt with
> -EBUSY, which is preferable to the fireworks which occur from trying to unwind
> the old stack frame at a later point.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen


* Re: [Xen-devel] [PATCH v1.5] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
  2019-11-22 10:13     ` Jürgen Groß
@ 2019-11-23  3:23     ` Sarah Newman
  2019-11-23  4:48     ` Sarah Newman
  2 siblings, 0 replies; 12+ messages in thread
From: Sarah Newman @ 2019-11-23  3:23 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Ross Lagerwall, Konrad Rzeszutek Wilk

On 11/5/19 11:49 AM, Andrew Cooper wrote:
> The safety of livepatching depends on every stack having been unwound, but
> there is one corner case where this is not true.  The Sharing/Paging/Monitor
> infrastructure may use waitqueues, which copy the stack frame sideways and
> longjmp() to a different vcpu.
> 
> This case is rare, and can be worked around by pausing the offending
> domain(s), waiting for their rings to drain, then performing a livepatch.
> 
> In the case that there is an active waitqueue, fail the livepatch attempt with
> -EBUSY, which is preferable to the fireworks which occur from trying to unwind
> the old stack frame at a later point.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> 
> This fix wants backporting, and is long overdue for posting upstream.
> 
> v1.5:
>   * Send out a non-stale patch this time.
> ---
>   xen/arch/arm/livepatch.c    |  5 +++++
>   xen/arch/x86/livepatch.c    | 40 ++++++++++++++++++++++++++++++++++++++++
>   xen/common/livepatch.c      |  8 ++++++++
>   xen/include/xen/livepatch.h |  1 +
>   4 files changed, 54 insertions(+)
> 
> diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
> index 00c5e2bc45..915e9d926a 100644
> --- a/xen/arch/arm/livepatch.c
> +++ b/xen/arch/arm/livepatch.c
> @@ -18,6 +18,11 @@
>   
>   void *vmap_of_xen_text;
>   
> +int arch_livepatch_safety_check(void)
> +{
> +    return 0;
> +}
> +
>   int arch_livepatch_quiesce(void)
>   {
>       mfn_t text_mfn;
> diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
> index c82cf53b9e..2749cbc5cf 100644
> --- a/xen/arch/x86/livepatch.c
> +++ b/xen/arch/x86/livepatch.c
> @@ -10,10 +10,50 @@
>   #include <xen/vmap.h>
>   #include <xen/livepatch_elf.h>
>   #include <xen/livepatch.h>
> +#include <xen/sched.h>
>   
>   #include <asm/nmi.h>
>   #include <asm/livepatch.h>
>   
> +static bool has_active_waitqueue(const struct vm_event_domain *ved)
> +{
> +    /* ved may be xzalloc()'d without INIT_LIST_HEAD() yet. */
> +    return (ved && !list_head_is_null(&ved->wq.list) &&
> +            !list_empty(&ved->wq.list));
> +}
> +
> +/*
> + * x86's implementation of waitqueue violates the livepatching safety principle
> + * of having unwound every CPU's stack before modifying live content.
> + *
> + * Search through every domain and check that no vCPUs have an active
> + * waitqueue.
> + */
> +int arch_livepatch_safety_check(void)
> +{
> +    struct domain *d;
> +
> +    for_each_domain ( d )
> +    {
> +#ifdef CONFIG_MEM_SHARING
> +        if ( has_active_waitqueue(d->vm_event_share) )
> +            goto fail;
> +#endif
> +#ifdef CONFIG_MEM_PAGING
> +        if ( has_active_waitqueue(d->vm_event_paging) )

Is this supposed to be CONFIG_HAS_MEM_PAGING?


* Re: [Xen-devel] [PATCH v1.5] x86/livepatch: Prevent patching with active waitqueues
  2019-11-05 19:49   ` [Xen-devel] [PATCH v1.5] " Andrew Cooper
  2019-11-22 10:13     ` Jürgen Groß
  2019-11-23  3:23     ` Sarah Newman
@ 2019-11-23  4:48     ` Sarah Newman
  2 siblings, 0 replies; 12+ messages in thread
From: Sarah Newman @ 2019-11-23  4:48 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Juergen Gross, Ross Lagerwall, Konrad Rzeszutek Wilk

On 11/5/19 11:49 AM, Andrew Cooper wrote:
> The safety of livepatching depends on every stack having been unwound, but
> there is one corner case where this is not true.  The Sharing/Paging/Monitor
> infrastructure may use waitqueues, which copy the stack frame sideways and
> longjmp() to a different vcpu.
> 
> This case is rare, and can be worked around by pausing the offending
> domain(s), waiting for their rings to drain, then performing a livepatch.
> 
> In the case that there is an active waitqueue, fail the livepatch attempt with
> -EBUSY, which is preferable to the fireworks which occur from trying to unwind
> the old stack frame at a later point.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> 
> This fix wants backporting, and is long overdue for posting upstream.
> 
> v1.5:
>   * Send out a non-stale patch this time.
> ---
>   xen/arch/arm/livepatch.c    |  5 +++++
>   xen/arch/x86/livepatch.c    | 40 ++++++++++++++++++++++++++++++++++++++++
>   xen/common/livepatch.c      |  8 ++++++++
>   xen/include/xen/livepatch.h |  1 +
>   4 files changed, 54 insertions(+)
> 
> diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
> index 00c5e2bc45..915e9d926a 100644
> --- a/xen/arch/arm/livepatch.c
> +++ b/xen/arch/arm/livepatch.c
> @@ -18,6 +18,11 @@
>   
>   void *vmap_of_xen_text;
>   
> +int arch_livepatch_safety_check(void)
> +{
> +    return 0;
> +}
> +
>   int arch_livepatch_quiesce(void)
>   {
>       mfn_t text_mfn;
> diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
> index c82cf53b9e..2749cbc5cf 100644
> --- a/xen/arch/x86/livepatch.c
> +++ b/xen/arch/x86/livepatch.c
> @@ -10,10 +10,50 @@
>   #include <xen/vmap.h>
>   #include <xen/livepatch_elf.h>
>   #include <xen/livepatch.h>
> +#include <xen/sched.h>
>   
>   #include <asm/nmi.h>
>   #include <asm/livepatch.h>
>   
> +static bool has_active_waitqueue(const struct vm_event_domain *ved)
> +{
> +    /* ved may be xzalloc()'d without INIT_LIST_HEAD() yet. */
> +    return (ved && !list_head_is_null(&ved->wq.list) &&
> +            !list_empty(&ved->wq.list));
> +}
> +
> +/*
> + * x86's implementation of waitqueue violates the livepatching safety principle
> + * of having unwound every CPU's stack before modifying live content.
> + *
> + * Search through every domain and check that no vCPUs have an active
> + * waitqueue.
> + */
> +int arch_livepatch_safety_check(void)
> +{
> +    struct domain *d;
> +
> +    for_each_domain ( d )
> +    {
> +#ifdef CONFIG_MEM_SHARING
> +        if ( has_active_waitqueue(d->vm_event_share) )
> +            goto fail;
> +#endif
> +#ifdef CONFIG_MEM_PAGING
> +        if ( has_active_waitqueue(d->vm_event_paging) )
> +            goto fail;
> +#endif
> +        if ( has_active_waitqueue(d->vm_event_monitor) )
> +            goto fail;
> +    }
> +
> +    return 0;
> +
> + fail:
> +    printk(XENLOG_ERR LIVEPATCH "%pd found with active waitqueue\n", d);
> +    return -EBUSY;
> +}
> +
>   int arch_livepatch_quiesce(void)
>   {
>       /* Disable WP to allow changes to read-only pages. */
> diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
> index 962647616a..8386e611f2 100644
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -1060,6 +1060,14 @@ static int apply_payload(struct payload *data)
>       unsigned int i;
>       int rc;
>   
> +    rc = arch_livepatch_safety_check();
> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR LIVEPATCH "%s: Safety checks failed: %d\n",
> +               data->name, rc);
> +        return rc;
> +    }
> +

Would it make sense to call arch_livepatch_safety_check from
arch_livepatch_quiesce rather than directly, so that
arch_livepatch_safety_check is also called from revert_payload?
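
Something like the following sketch (existing quiesce work elided): with the check
folded into the x86 arch_livepatch_quiesce(), both apply_payload() and
revert_payload() would inherit it.

    int arch_livepatch_quiesce(void)
    {
        int rc = arch_livepatch_safety_check();

        if ( rc )
            return rc;

        /* ... existing quiesce work (e.g. x86's CR0.WP handling) ... */
        return 0;
    }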

--Sarah


