[PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
@ 2015-06-03 13:35 Vitaly Kuznetsov
  2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-03 13:35 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Tim Deegan
  Cc: Andrew Jones, Julien Grall, Wei Liu, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Keir Fraser, Daniel De Graaf

Jan and Tim,

last week you expressed some concerns about if the toolstack-based
approach to PVHVM guest kexec is the best. Here you can see the 'reset
everything' approach to the same problem. It is the bare minimum of what
should be done to make it possible for the new kernel to boot. Despite
the fact that this implementation is much smaller in size and that it
doesn't require any toolstack changes I'm personally in favor of the
previous toolstack-based approach as it looks more general, less fragile
and easier to support to me. Please let me know your thoughts.

I used SCHEDOP_ interface here as DOMCTL_* is not currently supported in
Linux kernel and I seriously doubt we need to support something different
from DOMID_SELF for soft reset.

Current Linux kernel requires two changes to make use of these hypervisor
changes:
1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
the watch in case add operation failed, e.g.:
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -712,6 +712,10 @@ int register_xenbus_watch(struct xenbus_watch *watch)
        spin_unlock(&watches_lock);
 
        err = xs_watch(watch->node, token);
+       if (err) {
+               if (!xs_unwatch(watch->node, token))
+                       err = xs_watch(watch->node, token);
+       }
 
        if (err) {
                spin_lock(&watches_lock);

2) HYPERVISOR_sched_op(SCHEDOP_soft_reset, NULL) is supposed to be called in
kexec/kdump situation (via machine_ops.shutdown and machine_ops.crash_shutdown
handlers).

v7 of the toolstack-based approach is available here:
http://comments.gmane.org/gmane.comp.emulators.xen.devel/245183
(for some reason it is missing on http://lists.xen.org/archives/html/xen-devel/2015-05/,
there are no messages in May after May, 26 and this series was sent on May, 27)

Vitaly Kuznetsov (4):
  xen: evtchn: make evtchn_reset() ready for soft reset
  xen: grant_table: implement grant_table_soft_reset()
  xen: implement SCHEDOP_soft_reset
  xen: arch-specific hooks for domain_soft_reset()

 xen/arch/arm/domain.c         |  5 +++
 xen/arch/x86/domain.c         | 73 +++++++++++++++++++++++++++++++++++++++++++
 xen/common/domain.c           | 39 ++++++++++++++++++++---
 xen/common/event_channel.c    | 52 +++++++++++++++++-------------
 xen/common/grant_table.c      | 62 ++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c         |  4 +++
 xen/include/public/sched.h    |  7 +++++
 xen/include/xen/domain.h      |  4 +++
 xen/include/xen/event.h       |  3 ++
 xen/include/xen/grant_table.h |  6 ++++
 xen/include/xen/sched.h       |  1 +
 11 files changed, 231 insertions(+), 25 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
@ 2015-06-03 13:35 ` Vitaly Kuznetsov
  2015-06-04 14:05   ` Tim Deegan
  2015-06-08 14:10   ` Jan Beulich
  2015-06-03 13:35 ` [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset() Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-03 13:35 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Tim Deegan
  Cc: Andrew Jones, Julien Grall, Wei Liu, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Keir Fraser, Daniel De Graaf

We need to close all event channel so the domain performing soft reset
will be able to open them back. Interdomain channels are, however,
special. We need to keep track of who opened it as in (the most common)
case it was opened by the control domain we won't be able (and allowed) to
re-establish it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/common/event_channel.c | 52 +++++++++++++++++++++++++++-------------------
 xen/include/xen/event.h    |  3 +++
 xen/include/xen/sched.h    |  1 +
 3 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index fae242d..3204c74 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -274,11 +274,13 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
 
     lchn->u.interdomain.remote_dom  = rd;
     lchn->u.interdomain.remote_port = rport;
+    lchn->u.interdomain.opened_by   = current->domain;
     lchn->state                     = ECS_INTERDOMAIN;
     evtchn_port_init(ld, lchn);
     
     rchn->u.interdomain.remote_dom  = ld;
     rchn->u.interdomain.remote_port = lport;
+    rchn->u.interdomain.opened_by   = current->domain;
     rchn->state                     = ECS_INTERDOMAIN;
 
     /*
@@ -933,26 +935,30 @@ int evtchn_unmask(unsigned int port)
 }
 
 
-static long evtchn_reset(evtchn_reset_t *r)
+void evtchn_reset(struct domain *d, bool_t soft_reset)
 {
-    domid_t dom = r->dom;
-    struct domain *d;
-    int i, rc;
-
-    d = rcu_lock_domain_by_any_id(dom);
-    if ( d == NULL )
-        return -ESRCH;
-
-    rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
-    if ( rc )
-        goto out;
+    int i;
+    struct evtchn *chn;
 
+    /*
+     * ECS_INTERDOMAIN channels with port number suitable for the 2-level ABI
+     * opened by other domains should remain opened as the domain doing soft
+     * reset won't be able to reopen them.
+     * __evtchn_close() also leaves consumer_is_xen() channels open.
+     */
     for ( i = 0; port_is_valid(d, i); i++ )
+    {
+        chn = evtchn_from_port(d, i);
+        if ( !soft_reset ||
+             i >= (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) ||
+             chn->state != ECS_INTERDOMAIN ||
+             chn->u.interdomain.opened_by == d )
         (void)__evtchn_close(d, i);
+    }
 
     spin_lock(&d->event_lock);
 
-    if ( (dom == DOMID_SELF) && d->evtchn_fifo )
+    if ( (d == current->domain) && d->evtchn_fifo )
     {
         /*
          * Guest domain called EVTCHNOP_reset with DOMID_SELF, destroying
@@ -964,13 +970,6 @@ static long evtchn_reset(evtchn_reset_t *r)
     }
 
     spin_unlock(&d->event_lock);
-
-    rc = 0;
-
-out:
-    rcu_unlock_domain(d);
-
-    return rc;
 }
 
 static long evtchn_set_priority(const struct evtchn_set_priority *set_priority)
@@ -1097,9 +1096,20 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case EVTCHNOP_reset: {
         struct evtchn_reset reset;
+        struct domain *d;
+
         if ( copy_from_guest(&reset, arg, 1) != 0 )
             return -EFAULT;
-        rc = evtchn_reset(&reset);
+
+        d = rcu_lock_domain_by_any_id(reset.dom);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
+        if ( !rc )
+            evtchn_reset(d, 0);
+
+        rcu_unlock_domain(d);
         break;
     }
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 690f865..d0479a6 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -130,6 +130,9 @@ void evtchn_check_pollers(struct domain *d, unsigned int port);
 
 void evtchn_2l_init(struct domain *d);
 
+/* Close all event channels and reset to 2-level ABI */
+void evtchn_reset(struct domain *d, bool_t soft_reset);
+
 /*
  * Low-level event channel port ops.
  */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 80c6f62..13b6b86 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -98,6 +98,7 @@ struct evtchn
         struct {
             evtchn_port_t  remote_port;
             struct domain *remote_dom;
+            struct domain *opened_by;
         } interdomain; /* state == ECS_INTERDOMAIN */
         struct {
             u32            irq;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
  2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
@ 2015-06-03 13:35 ` Vitaly Kuznetsov
  2015-06-04 14:11   ` Tim Deegan
  2015-06-08 14:26   ` Jan Beulich
  2015-06-03 13:35 ` [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-03 13:35 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Tim Deegan
  Cc: Andrew Jones, Julien Grall, Wei Liu, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Keir Fraser, Daniel De Graaf

When soft reset is being performed we need to replace all actively
granted pages with empty pages to prevent possible future memory
corruption as the newly started kernel won't be aware of these
granted pages.

We make the tot_pages < max_pages assumption here: previously granted pages
need to belong to someone and we don't want to implement possible DoS by
reassigning them to the grantee/anonymous domain/xen/.. (the malicious guest
will be able to consume all host's memory).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/common/grant_table.c      | 62 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/grant_table.h |  6 +++++
 2 files changed, 68 insertions(+)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index dfb45f8..815d5f3 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3120,6 +3120,68 @@ gnttab_release_mappings(
     }
 }
 
+int grant_table_soft_reset(struct domain *d)
+{
+    struct grant_table *gt = d->grant_table;
+    struct active_grant_entry *act;
+    grant_ref_t ref;
+    unsigned long mfn_new;
+    struct page_info *page, *new_page;
+    int ret = 0;
+
+    spin_lock(&gt->lock);
+
+    for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
+    {
+        act = &active_entry(gt, ref);
+        if ( !act->pin )
+            continue;
+
+        page = mfn_to_page(act->frame);
+        if ( !page || !get_page(page, d) )
+        {
+            printk(XENLOG_G_ERR "Dom%d's grant frame %lx (gfn %lx) is invalid",
+                   d->domain_id, act->gfn, act->frame);
+            ret = -EINVAL;
+            break;
+        }
+
+        /*
+         * Here we assume the domain has tot_pages < max_pages. Previously
+         * granted page will still belong to us until it is unmapped from
+         * grantee.
+         */
+        new_page = alloc_domheap_page(d, 0);
+        if ( !new_page )
+        {
+            printk(XENLOG_G_ERR "Failed to alloc a page to replace"
+                   " Dom%d's grant frame %lx\n", d->domain_id, act->frame);
+            ret = -ENOMEM;
+            put_page(page);
+            break;
+        }
+        mfn_new = page_to_mfn(new_page);
+        guest_remove_page(d, act->gfn);
+
+        if ( guest_physmap_add_page(d, act->gfn, mfn_new, 0) )
+        {
+            printk(XENLOG_G_ERR "Failed to replace Dom%d's grant frame %lx"
+                   " with new MFN %lx\n", d->domain_id, act->frame, mfn_new);
+            ret = -EFAULT;
+        }
+
+        gnttab_flush_tlb(d);
+
+        put_page(new_page);
+        put_page(page);
+
+        if (ret)
+            break;
+    }
+
+    spin_unlock(&gt->lock);
+    return ret;
+}
 
 void
 grant_table_destroy(
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 32f5786..3f58c54 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -95,6 +95,12 @@ int grant_table_create(
 void grant_table_destroy(
     struct domain *d);
 
+/*
+ * Replace all granted pages with empty pages for a domain to prevent possible
+ * future memory corruption in case of soft reset.
+ */
+int grant_table_soft_reset(struct domain *d);
+
 /* Domain death release of granted mappings of other domains' memory. */
 void
 gnttab_release_mappings(
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
  2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
  2015-06-03 13:35 ` [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset() Vitaly Kuznetsov
@ 2015-06-03 13:35 ` Vitaly Kuznetsov
  2015-06-08 14:31   ` Jan Beulich
  2015-06-03 13:35 ` [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset() Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-03 13:35 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Tim Deegan
  Cc: Andrew Jones, Julien Grall, Wei Liu, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Keir Fraser, Daniel De Graaf

Reset event channels, replace grant pages with empty ones, unmount
previously mounted vcpu_info.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/common/domain.c        | 37 +++++++++++++++++++++++++++++++++----
 xen/common/schedule.c      |  4 ++++
 xen/include/public/sched.h |  7 +++++++
 xen/include/xen/domain.h   |  2 ++
 4 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6803c4d..44e532b 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1116,12 +1116,12 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
 
 /*
  * Unmap the vcpu info page if the guest decided to place it somewhere
- * else.  This is only used from arch_domain_destroy, so there's no
- * need to do anything clever.
+ * else.
  */
 void unmap_vcpu_info(struct vcpu *v)
 {
     unsigned long mfn;
+    struct domain *d = v->domain;
 
     if ( v->vcpu_info_mfn == INVALID_MFN )
         return;
@@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
     mfn = v->vcpu_info_mfn;
     unmap_domain_page_global((void *)
                              ((unsigned long)v->vcpu_info & PAGE_MASK));
-
-    v->vcpu_info = &dummy_vcpu_info;
+    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
+                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
+                    : &dummy_vcpu_info);
     v->vcpu_info_mfn = INVALID_MFN;
 
     put_page_and_type(mfn_to_page(mfn));
@@ -1446,6 +1447,34 @@ int continue_hypercall_on_cpu(
     return 0;
 }
 
+int domain_soft_reset(struct domain *d)
+{
+    struct vcpu *v;
+    int ret = 0;
+
+    for_each_vcpu ( d, v )
+    {
+        if ( v != current )
+            vcpu_pause(v);
+    }
+
+    evtchn_reset(d, 1);
+
+    ret = grant_table_soft_reset(d);
+    if ( ret )
+        goto vcpu_unpause;
+
+    for_each_vcpu ( d, v )
+            unmap_vcpu_info(v);
+
+ vcpu_unpause:
+    for_each_vcpu ( d, v )
+        if ( v != current )
+            vcpu_unpause(v);
+
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index f5a2e55..a76de60 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1030,6 +1030,10 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case SCHEDOP_soft_reset:
+        ret = domain_soft_reset(current->domain);
+        break;
+
     default:
         ret = -ENOSYS;
     }
diff --git a/xen/include/public/sched.h b/xen/include/public/sched.h
index 4000ac9..2fa65f3 100644
--- a/xen/include/public/sched.h
+++ b/xen/include/public/sched.h
@@ -118,6 +118,13 @@
  * With id != 0 and timeout != 0, poke watchdog timer and set new timeout.
  */
 #define SCHEDOP_watchdog    6
+
+/*
+ * Do soft reset.
+ */
+
+#define SCHEDOP_soft_reset  7
+
 /* ` } */
 
 struct sched_shutdown {
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 848db8a..57f8ddd 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -78,6 +78,8 @@ void arch_dump_domain_info(struct domain *d);
 
 int arch_vcpu_reset(struct vcpu *);
 
+int domain_soft_reset(struct domain *d);
+
 extern spinlock_t vcpu_alloc_lock;
 bool_t domctl_lock_acquire(void);
 void domctl_lock_release(void);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2015-06-03 13:35 ` [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset Vitaly Kuznetsov
@ 2015-06-03 13:35 ` Vitaly Kuznetsov
  2015-06-04 14:19   ` Tim Deegan
  2015-06-08 15:32   ` Jan Beulich
  2015-06-04 14:22 ` [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Tim Deegan
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-03 13:35 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Tim Deegan
  Cc: Andrew Jones, Julien Grall, Wei Liu, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Keir Fraser, Daniel De Graaf

x86-specific hook cleans up the pirq-emuirq mappings and replaces the
shared_info frame with an empty page to support subsequent
XENMAPSPACE_shared_info call.

ARM-specific hook is a noop for now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/arch/arm/domain.c    |  5 ++++
 xen/arch/x86/domain.c    | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/domain.c      |  2 ++
 xen/include/xen/domain.h |  2 ++
 4 files changed, 82 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 24b8938..c112afa 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -653,6 +653,11 @@ void arch_domain_unpause(struct domain *d)
 {
 }
 
+int arch_domain_soft_reset(struct domain *d)
+{
+    return 0;
+}
+
 static int is_guest_pv32_psr(uint32_t psr)
 {
     switch (psr & PSR_MODE_MASK)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1f1550e..2d23c17 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -679,6 +679,79 @@ void arch_domain_unpause(struct domain *d)
         viridian_time_ref_count_thaw(d);
 }
 
+int arch_domain_soft_reset(struct domain *d)
+{
+    struct page_info *page = virt_to_page(d->shared_info), *new_page;
+    int ret = 0;
+    struct domain *owner;
+    unsigned long mfn, mfn_new, gfn;
+    p2m_type_t __p2mt;
+
+    if ( is_hvm_domain(d) )
+    {
+        int i;
+
+        spin_lock(&d->event_lock);
+        for ( i = 1; i < d->nr_pirqs ; i ++ )
+            if ( domain_pirq_to_emuirq(d, i) != IRQ_UNBOUND )
+            {
+                ret = unmap_domain_pirq_emuirq(d, i);
+                if (ret)
+                    break;
+            }
+        spin_unlock(&d->event_lock);
+    }
+
+    if ( ret || !d->shared_info || !page )
+        return ret;
+
+    /*
+     * shared_info page needs to be replaced with a new page, otherwise we
+     * will get a hole if the domain does XENMAPSPACE_shared_info
+     */
+
+    owner = page_get_owner_and_reference(page);
+    if ( owner != d )
+    {
+        put_page(page);
+        return 0;
+    }
+
+    mfn = page_to_mfn(page);
+    if ( !mfn_valid(mfn) )
+    {
+        printk(XENLOG_G_ERR "Dom%d's shared_info page points to invalid MFN\n",
+               d->domain_id);
+        ret = -EINVAL;
+        goto fail_put_page;
+    }
+
+    gfn = mfn_to_gmfn(d, mfn);
+    get_gfn_query(d, gfn, &__p2mt);
+
+    new_page = alloc_domheap_page(d, 0);
+    if ( !new_page )
+    {
+        printk(XENLOG_G_ERR "Failed to alloc a page to replace"
+               " Dom%d's shared_info frame %lx\n", d->domain_id, gfn);
+        ret = -ENOMEM;
+        goto fail_put_gfn;
+    }
+    mfn_new = page_to_mfn(new_page);
+    guest_physmap_remove_page(d, gfn, mfn, 0);
+
+    ret = guest_physmap_add_page(d, gfn, mfn_new, 0);
+    if ( ret )
+        printk(XENLOG_G_ERR "Failed to add a page to replace"
+               " Dom%d's shared_info frame %lx\n", d->domain_id, gfn);
+ fail_put_gfn:
+    put_gfn(d, gfn);
+ fail_put_page:
+    put_page(page);
+
+    return ret;
+}
+
 unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
 {
     unsigned long hv_cr4_mask, hv_cr4 = real_cr4_to_pv_guest_cr4(read_cr4());
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 44e532b..0ab1a71 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1467,6 +1467,8 @@ int domain_soft_reset(struct domain *d)
     for_each_vcpu ( d, v )
             unmap_vcpu_info(v);
 
+    ret = arch_domain_soft_reset(d);
+
  vcpu_unpause:
     for_each_vcpu ( d, v )
         if ( v != current )
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 57f8ddd..311bead 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -65,6 +65,8 @@ void arch_domain_shutdown(struct domain *d);
 void arch_domain_pause(struct domain *d);
 void arch_domain_unpause(struct domain *d);
 
+int arch_domain_soft_reset(struct domain *d);
+
 int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
 void arch_get_info_guest(struct vcpu *, vcpu_guest_context_u);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
@ 2015-06-04 14:05   ` Tim Deegan
  2015-06-04 15:19     ` David Vrabel
  2015-06-08 14:10   ` Jan Beulich
  1 sibling, 1 reply; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 14:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
> We need to close all event channel so the domain performing soft reset
> will be able to open them back. Interdomain channels are, however,
> special. We need to keep track of who opened it as in (the most common)
> case it was opened by the control domain we won't be able (and allowed) to
> re-establish it.


I'm not sure I understand -- can you give an example of what this is
avoiding?  I would have thought that the kexec'ing VM needs to tear
down _all_ its connections and then restart the ones it wantrs in the
new OS.

Cheers,

Tim.

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  xen/common/event_channel.c | 52 +++++++++++++++++++++++++++-------------------
>  xen/include/xen/event.h    |  3 +++
>  xen/include/xen/sched.h    |  1 +
>  3 files changed, 35 insertions(+), 21 deletions(-)
> 
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index fae242d..3204c74 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -274,11 +274,13 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>  
>      lchn->u.interdomain.remote_dom  = rd;
>      lchn->u.interdomain.remote_port = rport;
> +    lchn->u.interdomain.opened_by   = current->domain;
>      lchn->state                     = ECS_INTERDOMAIN;
>      evtchn_port_init(ld, lchn);
>      
>      rchn->u.interdomain.remote_dom  = ld;
>      rchn->u.interdomain.remote_port = lport;
> +    rchn->u.interdomain.opened_by   = current->domain;
>      rchn->state                     = ECS_INTERDOMAIN;
>  
>      /*
> @@ -933,26 +935,30 @@ int evtchn_unmask(unsigned int port)
>  }
>  
>  
> -static long evtchn_reset(evtchn_reset_t *r)
> +void evtchn_reset(struct domain *d, bool_t soft_reset)
>  {
> -    domid_t dom = r->dom;
> -    struct domain *d;
> -    int i, rc;
> -
> -    d = rcu_lock_domain_by_any_id(dom);
> -    if ( d == NULL )
> -        return -ESRCH;
> -
> -    rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
> -    if ( rc )
> -        goto out;
> +    int i;
> +    struct evtchn *chn;
>  
> +    /*
> +     * ECS_INTERDOMAIN channels with port number suitable for the 2-level ABI
> +     * opened by other domains should remain opened as the domain doing soft
> +     * reset won't be able to reopen them.
> +     * __evtchn_close() also leaves consumer_is_xen() channels open.
> +     */
>      for ( i = 0; port_is_valid(d, i); i++ )
> +    {
> +        chn = evtchn_from_port(d, i);
> +        if ( !soft_reset ||
> +             i >= (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d)) ||
> +             chn->state != ECS_INTERDOMAIN ||
> +             chn->u.interdomain.opened_by == d )
>          (void)__evtchn_close(d, i);
> +    }
>  
>      spin_lock(&d->event_lock);
>  
> -    if ( (dom == DOMID_SELF) && d->evtchn_fifo )
> +    if ( (d == current->domain) && d->evtchn_fifo )
>      {
>          /*
>           * Guest domain called EVTCHNOP_reset with DOMID_SELF, destroying
> @@ -964,13 +970,6 @@ static long evtchn_reset(evtchn_reset_t *r)
>      }
>  
>      spin_unlock(&d->event_lock);
> -
> -    rc = 0;
> -
> -out:
> -    rcu_unlock_domain(d);
> -
> -    return rc;
>  }
>  
>  static long evtchn_set_priority(const struct evtchn_set_priority *set_priority)
> @@ -1097,9 +1096,20 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      case EVTCHNOP_reset: {
>          struct evtchn_reset reset;
> +        struct domain *d;
> +
>          if ( copy_from_guest(&reset, arg, 1) != 0 )
>              return -EFAULT;
> -        rc = evtchn_reset(&reset);
> +
> +        d = rcu_lock_domain_by_any_id(reset.dom);
> +        if ( d == NULL )
> +            return -ESRCH;
> +
> +        rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
> +        if ( !rc )
> +            evtchn_reset(d, 0);
> +
> +        rcu_unlock_domain(d);
>          break;
>      }
>  
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 690f865..d0479a6 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -130,6 +130,9 @@ void evtchn_check_pollers(struct domain *d, unsigned int port);
>  
>  void evtchn_2l_init(struct domain *d);
>  
> +/* Close all event channels and reset to 2-level ABI */
> +void evtchn_reset(struct domain *d, bool_t soft_reset);
> +
>  /*
>   * Low-level event channel port ops.
>   */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 80c6f62..13b6b86 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -98,6 +98,7 @@ struct evtchn
>          struct {
>              evtchn_port_t  remote_port;
>              struct domain *remote_dom;
> +            struct domain *opened_by;
>          } interdomain; /* state == ECS_INTERDOMAIN */
>          struct {
>              u32            irq;
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-03 13:35 ` [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset() Vitaly Kuznetsov
@ 2015-06-04 14:11   ` Tim Deegan
  2015-06-04 15:22     ` David Vrabel
  2015-06-08 14:26   ` Jan Beulich
  1 sibling, 1 reply; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 14:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

At 15:35 +0200 on 03 Jun (1433345720), Vitaly Kuznetsov wrote:
> When soft reset is being performed we need to replace all actively
> granted pages with empty pages to prevent possible future memory
> corruption as the newly started kernel won't be aware of these
> granted pages.

I'm not sure about this one.  IMO it's the guest's responsibility to
find and disable whatever holds these grants.  This is by analogy with
a bus-master-capable device in a real PC: until the device is found
and reset, the new OS can't use memory outside its guaranteed safe
pool.

There has been talk before of an operation along the lines of "revoke
this grant entry and wait for all users to release it" but that would
only help with backends that are guaranteed to release grants in a
reasonable time.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-03 13:35 ` [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset() Vitaly Kuznetsov
@ 2015-06-04 14:19   ` Tim Deegan
  2015-06-22  9:44     ` Vitaly Kuznetsov
  2015-06-08 15:32   ` Jan Beulich
  1 sibling, 1 reply; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 14:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

At 15:35 +0200 on 03 Jun (1433345722), Vitaly Kuznetsov wrote:
> x86-specific hook cleans up the pirq-emuirq mappings and replaces the
> shared_info frame with an empty page to support subsequent
> XENMAPSPACE_shared_info call.

That's a bit roundabout.  I think we might be better off allocating a
new shared-info page and abandoning the old one as an ordinary guest
RAM page.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2015-06-03 13:35 ` [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset() Vitaly Kuznetsov
@ 2015-06-04 14:22 ` Tim Deegan
  2015-06-08 15:38 ` Ian Jackson
  2015-06-08 15:53 ` Wei Liu
  6 siblings, 0 replies; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 14:22 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

Hi,

At 15:35 +0200 on 03 Jun (1433345718), Vitaly Kuznetsov wrote:
> last week you expressed some concerns about if the toolstack-based
> approach to PVHVM guest kexec is the best. Here you can see the 'reset
> everything' approach to the same problem. It is the bare minimum of what
> should be done to make it possible for the new kernel to boot.

Thansk for prototyping this - it's very helpful.

> Despite
> the fact that this implementation is much smaller in size and that it
> doesn't require any toolstack changes I'm personally in favor of the
> previous toolstack-based approach as it looks more general, less fragile
> and easier to support to me. Please let me know your thoughts.

On the contrary, this looks neater, faster and with fewer special
cases (at least in the hypervisor) than the p2m-copying code.
I prefer it.  (I do appreciate that there may be more things needed to
get to a fully working system).

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-04 14:05   ` Tim Deegan
@ 2015-06-04 15:19     ` David Vrabel
  2015-06-04 15:47       ` Tim Deegan
  0 siblings, 1 reply; 39+ messages in thread
From: David Vrabel @ 2015-06-04 15:19 UTC (permalink / raw)
  To: Tim Deegan, Vitaly Kuznetsov
  Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Wei Liu, Daniel De Graaf

On 04/06/15 15:05, Tim Deegan wrote:
> At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
>> We need to close all event channel so the domain performing soft reset
>> will be able to open them back. Interdomain channels are, however,
>> special. We need to keep track of who opened it as in (the most common)
>> case it was opened by the control domain we won't be able (and allowed) to
>> re-establish it.
> 
> 
> I'm not sure I understand -- can you give an example of what this is
> avoiding?  I would have thought that the kexec'ing VM needs to tear
> down _all_ its connections and then restart the ones it wantrs in the
> new OS.

There are some that are in state ECS_UNBOUND (console, xenstore, I
think) at start of day that are allocated on behalf of the domain by the
toolstack.  The kexec'd image will expect them in these same state.

David

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-04 14:11   ` Tim Deegan
@ 2015-06-04 15:22     ` David Vrabel
  2015-06-04 15:44       ` Tim Deegan
  0 siblings, 1 reply; 39+ messages in thread
From: David Vrabel @ 2015-06-04 15:22 UTC (permalink / raw)
  To: Tim Deegan, Vitaly Kuznetsov
  Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Wei Liu, Daniel De Graaf

On 04/06/15 15:11, Tim Deegan wrote:
> 
> There has been talk before of an operation along the lines of "revoke
> this grant entry and wait for all users to release it" but that would
> only help with backends that are guaranteed to release grants in a
> reasonable time.

Are you talking about my idea? This would be immediate because the grant
mapper would provide a local page to atomically switch the foreign
mapping to when the grant is revoked.

But, this isn't something that would be generally applicable to all grants.

David

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-04 15:22     ` David Vrabel
@ 2015-06-04 15:44       ` Tim Deegan
  0 siblings, 0 replies; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 15:44 UTC (permalink / raw)
  To: David Vrabel
  Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Olaf Hering,
	Jan Beulich, xen-devel, Wei Liu, Vitaly Kuznetsov,
	Daniel De Graaf

At 16:22 +0100 on 04 Jun (1433434934), David Vrabel wrote:
> On 04/06/15 15:11, Tim Deegan wrote:
> > 
> > There has been talk before of an operation along the lines of "revoke
> > this grant entry and wait for all users to release it" but that would
> > only help with backends that are guaranteed to release grants in a
> > reasonable time.
> 
> Are you talking about my idea? This would be immediate because the grant
> mapper would provide a local page to atomically switch the foreign
> mapping to when the grant is revoked.

That's a stronger operation than I remembered, and similar to what's
happening here.

If this reset operation is to be called from the guest, then all the
guest can do is reshuffle its own memory -- it can't expect to have
enough spare memory to cover all outstanding granted pages.

But TBH this is a bit of a red herring; I ought not to have mentioned
it. :)

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-04 15:19     ` David Vrabel
@ 2015-06-04 15:47       ` Tim Deegan
  2015-06-05  8:52         ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Tim Deegan @ 2015-06-04 15:47 UTC (permalink / raw)
  To: David Vrabel
  Cc: Andrew Jones, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Olaf Hering,
	Jan Beulich, xen-devel, Wei Liu, Vitaly Kuznetsov,
	Daniel De Graaf

At 16:19 +0100 on 04 Jun (1433434780), David Vrabel wrote:
> On 04/06/15 15:05, Tim Deegan wrote:
> > At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
> >> We need to close all event channel so the domain performing soft reset
> >> will be able to open them back. Interdomain channels are, however,
> >> special. We need to keep track of who opened it as in (the most common)
> >> case it was opened by the control domain we won't be able (and allowed) to
> >> re-establish it.
> > 
> > 
> > I'm not sure I understand -- can you give an example of what this is
> > avoiding?  I would have thought that the kexec'ing VM needs to tear
> > down _all_ its connections and then restart the ones it wantrs in the
> > new OS.
> 
> There are some that are in state ECS_UNBOUND (console, xenstore, I
> think) at start of day that are allocated on behalf of the domain by the
> toolstack.  The kexec'd image will expect them in these same state.

I see.  But does that not come under "tear down all its connections
and then restart the ones it wants"?  I.e. should the guest not reset
all its channels and then allocate fresh console & xenstore ones?

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-04 15:47       ` Tim Deegan
@ 2015-06-05  8:52         ` Jan Beulich
  2015-06-05  8:58           ` Ian Campbell
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-05  8:52 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	David Vrabel, xen-devel, Vitaly Kuznetsov, Keir Fraser,
	Daniel De Graaf

>>> On 04.06.15 at 17:47, <tim@xen.org> wrote:
> At 16:19 +0100 on 04 Jun (1433434780), David Vrabel wrote:
>> On 04/06/15 15:05, Tim Deegan wrote:
>> > At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
>> >> We need to close all event channel so the domain performing soft reset
>> >> will be able to open them back. Interdomain channels are, however,
>> >> special. We need to keep track of who opened it as in (the most common)
>> >> case it was opened by the control domain we won't be able (and allowed) to
>> >> re-establish it.
>> > 
>> > 
>> > I'm not sure I understand -- can you give an example of what this is
>> > avoiding?  I would have thought that the kexec'ing VM needs to tear
>> > down _all_ its connections and then restart the ones it wantrs in the
>> > new OS.
>> 
>> There are some that are in state ECS_UNBOUND (console, xenstore, I
>> think) at start of day that are allocated on behalf of the domain by the
>> toolstack.  The kexec'd image will expect them in these same state.
> 
> I see.  But does that not come under "tear down all its connections
> and then restart the ones it wants"?  I.e. should the guest not reset
> all its channels and then allocate fresh console & xenstore ones?

But just like during boot the guest doesn't (and can't) created these,
it also can't do so after kexec. The remote side needs to be set up
for it, so it can simply bind to them. I don't think this can work without
tool stack involvement.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-05  8:52         ` Jan Beulich
@ 2015-06-05  8:58           ` Ian Campbell
  2015-06-05  9:07             ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Ian Campbell @ 2015-06-05  8:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Ian Jackson, Andrew Jones, Tim Deegan,
	David Vrabel, xen-devel, Vitaly Kuznetsov, Keir Fraser,
	Daniel De Graaf

On Fri, 2015-06-05 at 09:52 +0100, Jan Beulich wrote:
> >>> On 04.06.15 at 17:47, <tim@xen.org> wrote:
> > At 16:19 +0100 on 04 Jun (1433434780), David Vrabel wrote:
> >> On 04/06/15 15:05, Tim Deegan wrote:
> >> > At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
> >> >> We need to close all event channel so the domain performing soft reset
> >> >> will be able to open them back. Interdomain channels are, however,
> >> >> special. We need to keep track of who opened it as in (the most common)
> >> >> case it was opened by the control domain we won't be able (and allowed) to
> >> >> re-establish it.
> >> > 
> >> > 
> >> > I'm not sure I understand -- can you give an example of what this is
> >> > avoiding?  I would have thought that the kexec'ing VM needs to tear
> >> > down _all_ its connections and then restart the ones it wantrs in the
> >> > new OS.
> >> 
> >> There are some that are in state ECS_UNBOUND (console, xenstore, I
> >> think) at start of day that are allocated on behalf of the domain by the
> >> toolstack.  The kexec'd image will expect them in these same state.
> > 
> > I see.  But does that not come under "tear down all its connections
> > and then restart the ones it wants"?  I.e. should the guest not reset
> > all its channels and then allocate fresh console & xenstore ones?
> 
> But just like during boot the guest doesn't (and can't) created these,
> it also can't do so after kexec. The remote side needs to be set up
> for it, so it can simply bind to them. I don't think this can work without
> tool stack involvement.

Which IIRC is pretty much the logic which lead to the more toolstack
centric approach in the other series. Are you suggesting something in
the middle?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-05  8:58           ` Ian Campbell
@ 2015-06-05  9:07             ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2015-06-05  9:07 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Olaf Hering, WeiLiu, Stefano Stabellini, Andrew Cooper,
	Julien Grall, Ian Jackson, Andrew Jones, Tim Deegan,
	David Vrabel, xen-devel, Vitaly Kuznetsov, Keir Fraser,
	Daniel DeGraaf

>>> On 05.06.15 at 10:58, <ian.campbell@citrix.com> wrote:
> On Fri, 2015-06-05 at 09:52 +0100, Jan Beulich wrote:
>> >>> On 04.06.15 at 17:47, <tim@xen.org> wrote:
>> > At 16:19 +0100 on 04 Jun (1433434780), David Vrabel wrote:
>> >> On 04/06/15 15:05, Tim Deegan wrote:
>> >> > At 15:35 +0200 on 03 Jun (1433345719), Vitaly Kuznetsov wrote:
>> >> >> We need to close all event channel so the domain performing soft reset
>> >> >> will be able to open them back. Interdomain channels are, however,
>> >> >> special. We need to keep track of who opened it as in (the most common)
>> >> >> case it was opened by the control domain we won't be able (and allowed) to
>> >> >> re-establish it.
>> >> > 
>> >> > 
>> >> > I'm not sure I understand -- can you give an example of what this is
>> >> > avoiding?  I would have thought that the kexec'ing VM needs to tear
>> >> > down _all_ its connections and then restart the ones it wantrs in the
>> >> > new OS.
>> >> 
>> >> There are some that are in state ECS_UNBOUND (console, xenstore, I
>> >> think) at start of day that are allocated on behalf of the domain by the
>> >> toolstack.  The kexec'd image will expect them in these same state.
>> > 
>> > I see.  But does that not come under "tear down all its connections
>> > and then restart the ones it wants"?  I.e. should the guest not reset
>> > all its channels and then allocate fresh console & xenstore ones?
>> 
>> But just like during boot the guest doesn't (and can't) created these,
>> it also can't do so after kexec. The remote side needs to be set up
>> for it, so it can simply bind to them. I don't think this can work without
>> tool stack involvement.
> 
> Which IIRC is pretty much the logic which lead to the more toolstack
> centric approach in the other series. Are you suggesting something in
> the middle?

The main issue with the other series was the moving of all the
memory pages of the guest; I didn't mind there being tool stack
parts (and considering the split of duties between hypervisor and
tool stack it would also seem suspicious if domain reset could be
implemented entirely in the hypervisor).

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
  2015-06-04 14:05   ` Tim Deegan
@ 2015-06-08 14:10   ` Jan Beulich
  2015-06-08 15:05     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-08 14:10 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -274,11 +274,13 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>  
>      lchn->u.interdomain.remote_dom  = rd;
>      lchn->u.interdomain.remote_port = rport;
> +    lchn->u.interdomain.opened_by   = current->domain;
>      lchn->state                     = ECS_INTERDOMAIN;
>      evtchn_port_init(ld, lchn);
>      
>      rchn->u.interdomain.remote_dom  = ld;
>      rchn->u.interdomain.remote_port = lport;
> +    rchn->u.interdomain.opened_by   = current->domain;
>      rchn->state                     = ECS_INTERDOMAIN;

For one ld == current->domain. And I don't think you need to store
domain pointers here; storing the domain ID would seem sufficient
(with the nice benefit of not growing struct evtchn). Plus
"opened_by" doesn't really reflect what is being done here - the
event channels are being bound, not opened.

> @@ -933,26 +935,30 @@ int evtchn_unmask(unsigned int port)
>  }
>  
>  
> -static long evtchn_reset(evtchn_reset_t *r)
> +void evtchn_reset(struct domain *d, bool_t soft_reset)
>  {
> -    domid_t dom = r->dom;
> -    struct domain *d;
> -    int i, rc;
> -
> -    d = rcu_lock_domain_by_any_id(dom);
> -    if ( d == NULL )
> -        return -ESRCH;
> -
> -    rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
> -    if ( rc )
> -        goto out;
> +    int i;

unsigned int

> +    struct evtchn *chn;

const (and perhaps moved inside the loop below)

> +    /*
> +     * ECS_INTERDOMAIN channels with port number suitable for the 2-level ABI
> +     * opened by other domains should remain opened as the domain doing soft
> +     * reset won't be able to reopen them.

One question of course is - does this really apply to _all_ interdomain
event channels (rather than just to the xenstore and xenconsole ones)?

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-03 13:35 ` [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset() Vitaly Kuznetsov
  2015-06-04 14:11   ` Tim Deegan
@ 2015-06-08 14:26   ` Jan Beulich
  2015-06-08 14:58     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-08 14:26 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
> When soft reset is being performed we need to replace all actively
> granted pages with empty pages to prevent possible future memory
> corruption as the newly started kernel won't be aware of these
> granted pages.
> 
> We make the tot_pages < max_pages assumption here: previously granted pages
> need to belong to someone and we don't want to implement possible DoS by
> reassigning them to the grantee/anonymous domain/xen/.. (the malicious guest
> will be able to consume all host's memory).

How is that going to look in practice? I.e. won't this cause frequent
failures?

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3120,6 +3120,68 @@ gnttab_release_mappings(
>      }
>  }
>  
> +int grant_table_soft_reset(struct domain *d)
> +{
> +    struct grant_table *gt = d->grant_table;
> +    struct active_grant_entry *act;
> +    grant_ref_t ref;
> +    unsigned long mfn_new;
> +    struct page_info *page, *new_page;
> +    int ret = 0;
> +
> +    spin_lock(&gt->lock);
> +
> +    for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
> +    {
> +        act = &active_entry(gt, ref);
> +        if ( !act->pin )
> +            continue;
> +
> +        page = mfn_to_page(act->frame);
> +        if ( !page || !get_page(page, d) )

Under what conditions would mfn_to_page() return NULL? I.e. don't
you rather need to check mfn_valid(act->frame) if you want some
sort of guarantee that the pointer you obtain can be dereferenced?
(Quite likely this could be an ASSERT() instead of an if(), as is done
elsewhere.)

> +        {
> +            printk(XENLOG_G_ERR "Dom%d's grant frame %lx (gfn %lx) is invalid",
> +                   d->domain_id, act->gfn, act->frame);

Either you mean "mfn" in the format string, or you got ->gfn and
->frame swapped here.

> +            ret = -EINVAL;
> +            break;

If this happens after a successful replacement of another granted
page, what is the expected disposition of the domain? I.e. is it
correct to just return failure to the caller, rather than killing the
domain (which is now in an inconsistent state)?

> +        }
> +
> +        /*
> +         * Here we assume the domain has tot_pages < max_pages. Previously
> +         * granted page will still belong to us until it is unmapped from
> +         * grantee.
> +         */
> +        new_page = alloc_domheap_page(d, 0);
> +        if ( !new_page )
> +        {
> +            printk(XENLOG_G_ERR "Failed to alloc a page to replace"
> +                   " Dom%d's grant frame %lx\n", d->domain_id, act->frame);
> +            ret = -ENOMEM;
> +            put_page(page);
> +            break;
> +        }
> +        mfn_new = page_to_mfn(new_page);
> +        guest_remove_page(d, act->gfn);
> +
> +        if ( guest_physmap_add_page(d, act->gfn, mfn_new, 0) )
> +        {
> +            printk(XENLOG_G_ERR "Failed to replace Dom%d's grant frame %lx"
> +                   " with new MFN %lx\n", d->domain_id, act->frame, mfn_new);
> +            ret = -EFAULT;
> +        }
> +
> +        gnttab_flush_tlb(d);

Is this needed (a) at all or (b) on each iteration, considering that you
don't remove from the grantee or de-allocate the old page?

> +        put_page(new_page);
> +        put_page(page);

Do you really need to hold onto the page until here?

> +        if (ret)

Missing spaces inside the parentheses.

> +            break;
> +    }
> +
> +    spin_unlock(&gt->lock);
> +    return ret;

Blank line before final return statement please.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-03 13:35 ` [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset Vitaly Kuznetsov
@ 2015-06-08 14:31   ` Jan Beulich
  2015-06-22 16:00     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-08 14:31 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1116,12 +1116,12 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
>  
>  /*
>   * Unmap the vcpu info page if the guest decided to place it somewhere
> - * else.  This is only used from arch_domain_destroy, so there's no
> - * need to do anything clever.
> + * else.
>   */
>  void unmap_vcpu_info(struct vcpu *v)

Please retain the full comment, extending it by the new caller's name.

> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>      mfn = v->vcpu_info_mfn;
>      unmap_domain_page_global((void *)
>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
> -
> -    v->vcpu_info = &dummy_vcpu_info;
> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])

Is this cast really needed?

> @@ -1446,6 +1447,34 @@ int continue_hypercall_on_cpu(
>      return 0;
>  }
>  
> +int domain_soft_reset(struct domain *d)
> +{
> +    struct vcpu *v;
> +    int ret = 0;

Pointless initializer.

> +    for_each_vcpu ( d, v )
> +    {
> +        if ( v != current )
> +            vcpu_pause(v);
> +    }
> +
> +    evtchn_reset(d, 1);
> +
> +    ret = grant_table_soft_reset(d);
> +    if ( ret )
> +        goto vcpu_unpause;
> +
> +    for_each_vcpu ( d, v )
> +            unmap_vcpu_info(v);

Indentation.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-08 14:26   ` Jan Beulich
@ 2015-06-08 14:58     ` Vitaly Kuznetsov
  2015-06-08 15:35       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-08 14:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>> When soft reset is being performed we need to replace all actively
>> granted pages with empty pages to prevent possible future memory
>> corruption as the newly started kernel won't be aware of these
>> granted pages.
>> 
>> We make the tot_pages < max_pages assumption here: previously granted pages
>> need to belong to someone and we don't want to implement possible DoS by
>> reassigning them to the grantee/anonymous domain/xen/.. (the malicious guest
>> will be able to consume all host's memory).
>
> How is that going to look in practice? I.e. won't this cause frequent
> failures?
>

I'm not sure we actually need that in practice. In my testing backends
(even with persistent grants enabled) collaborate nicely and release all
grants. I can see a single page still being held and I suppose it's
being held by QEMU (haven't checked what that but I think it is the
console ring). In case we go for the toolstack-assisted approach we can
restart qemu and add some warning when there are active grants.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset
  2015-06-08 14:10   ` Jan Beulich
@ 2015-06-08 15:05     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-08 15:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -274,11 +274,13 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>  
>>      lchn->u.interdomain.remote_dom  = rd;
>>      lchn->u.interdomain.remote_port = rport;
>> +    lchn->u.interdomain.opened_by   = current->domain;
>>      lchn->state                     = ECS_INTERDOMAIN;
>>      evtchn_port_init(ld, lchn);
>>      
>>      rchn->u.interdomain.remote_dom  = ld;
>>      rchn->u.interdomain.remote_port = lport;
>> +    rchn->u.interdomain.opened_by   = current->domain;
>>      rchn->state                     = ECS_INTERDOMAIN;
>
> For one ld == current->domain. And I don't think you need to store
> domain pointers here; storing the domain ID would seem sufficient
> (with the nice benefit of not growing struct evtchn). Plus
> "opened_by" doesn't really reflect what is being done here - the
> event channels are being bound, not opened.
>
>> @@ -933,26 +935,30 @@ int evtchn_unmask(unsigned int port)
>>  }
>>  
>>  
>> -static long evtchn_reset(evtchn_reset_t *r)
>> +void evtchn_reset(struct domain *d, bool_t soft_reset)
>>  {
>> -    domid_t dom = r->dom;
>> -    struct domain *d;
>> -    int i, rc;
>> -
>> -    d = rcu_lock_domain_by_any_id(dom);
>> -    if ( d == NULL )
>> -        return -ESRCH;
>> -
>> -    rc = xsm_evtchn_reset(XSM_TARGET, current->domain, d);
>> -    if ( rc )
>> -        goto out;
>> +    int i;
>
> unsigned int
>
>> +    struct evtchn *chn;
>
> const (and perhaps moved inside the loop below)
>
>> +    /*
>> +     * ECS_INTERDOMAIN channels with port number suitable for the 2-level ABI
>> +     * opened by other domains should remain opened as the domain doing soft
>> +     * reset won't be able to reopen them.
>
> One question of course is - does this really apply to _all_ interdomain
> event channels (rather than just to the xenstore and xenconsole ones)?

The wild guess here is that if a channel was bound by someone else the
domain doing this reset operation won't probably be able (and allowed)
to bind such interdomain connectivity so we need to keep it, from the
hypervisor PoV xenstore and xenconsole channels are not special. We,
however, won't probably need that in case we go for the
toolstack-assisted approach (and as far as I understand there are no
objections against such approach for now but I'll have to prototype it
so we can make a decision.)

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-03 13:35 ` [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset() Vitaly Kuznetsov
  2015-06-04 14:19   ` Tim Deegan
@ 2015-06-08 15:32   ` Jan Beulich
  2015-06-08 15:59     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-08 15:32 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
> +int arch_domain_soft_reset(struct domain *d)
> +{
> +    struct page_info *page = virt_to_page(d->shared_info), *new_page;
> +    int ret = 0;
> +    struct domain *owner;
> +    unsigned long mfn, mfn_new, gfn;
> +    p2m_type_t __p2mt;

No need for leading underscores here.

> +    if ( is_hvm_domain(d) )
> +    {

Isn't the shared_info manipulation also HVM specific? Or at least
paging_mode_translate()-specific? xenmem_add_to_physmap_one()
bails on !paging_mode_translate()... But then I'm not really
understanding the issue you try to address there anyway, and
hence I may be overlooking something: Why does allocating a new
page help with a subsequent XENMAPSPACE_shared_info add-to-
physmap request?

> +        int i;

unsigned int

> +
> +        spin_lock(&d->event_lock);
> +        for ( i = 1; i < d->nr_pirqs ; i ++ )
> +            if ( domain_pirq_to_emuirq(d, i) != IRQ_UNBOUND )
> +            {
> +                ret = unmap_domain_pirq_emuirq(d, i);
> +                if (ret)

Coding style.

> +                    break;
> +            }
> +        spin_unlock(&d->event_lock);
> +    }
> +
> +    if ( ret || !d->shared_info || !page )

Again - how would page end up being NULL here?

> +        return ret;
> +
> +    /*
> +     * shared_info page needs to be replaced with a new page, otherwise we
> +     * will get a hole if the domain does XENMAPSPACE_shared_info
> +     */
> +
> +    owner = page_get_owner_and_reference(page);
> +    if ( owner != d )
> +    {
> +        put_page(page);
> +        return 0;

Isn't this an error?

> +    }
> +
> +    mfn = page_to_mfn(page);
> +    if ( !mfn_valid(mfn) )
> +    {
> +        printk(XENLOG_G_ERR "Dom%d's shared_info page points to invalid MFN\n",
> +               d->domain_id);
> +        ret = -EINVAL;
> +        goto fail_put_page;
> +    }
> +
> +    gfn = mfn_to_gmfn(d, mfn);
> +    get_gfn_query(d, gfn, &__p2mt);

Ignoring the return value? I think you should do any consistency
check you can in code like this.

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1467,6 +1467,8 @@ int domain_soft_reset(struct domain *d)
>      for_each_vcpu ( d, v )
>              unmap_vcpu_info(v);
>  
> +    ret = arch_domain_soft_reset(d);
> +
>   vcpu_unpause:
>      for_each_vcpu ( d, v )
>          if ( v != current )

Similar question as on an earlier patch - is it really correct to unpause
the vCPU-s again after a possible failure from arch_domain_soft_reset()?

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset()
  2015-06-08 14:58     ` Vitaly Kuznetsov
@ 2015-06-08 15:35       ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2015-06-08 15:35 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 08.06.15 at 16:58, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
> 
>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>> When soft reset is being performed we need to replace all actively
>>> granted pages with empty pages to prevent possible future memory
>>> corruption as the newly started kernel won't be aware of these
>>> granted pages.
>>> 
>>> We make the tot_pages < max_pages assumption here: previously granted pages
>>> need to belong to someone and we don't want to implement possible DoS by
>>> reassigning them to the grantee/anonymous domain/xen/.. (the malicious guest
>>> will be able to consume all host's memory).
>>
>> How is that going to look in practice? I.e. won't this cause frequent
>> failures?
>>
> 
> I'm not sure we actually need that in practice. In my testing backends
> (even with persistent grants enabled) collaborate nicely and release all
> grants. I can see a single page still being held and I suppose it's
> being held by QEMU (haven't checked what that but I think it is the
> console ring). In case we go for the toolstack-assisted approach we can
> restart qemu and add some warning when there are active grants.

But even a single page could cause the allocation to fail because of
otherwise going over the set limit.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2015-06-04 14:22 ` [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Tim Deegan
@ 2015-06-08 15:38 ` Ian Jackson
  2015-06-08 15:53 ` Wei Liu
  6 siblings, 0 replies; 39+ messages in thread
From: Ian Jackson @ 2015-06-08 15:38 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Tim Deegan, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

Vitaly Kuznetsov writes ("[PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec"):
> 1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
> the watch in case add operation failed, e.g.:

It would surely be better to fix oxenstored.  IIRC the approach of
trying to track down watches is not, in general, sufficient.  (BICBW.)

Ian.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2015-06-08 15:38 ` Ian Jackson
@ 2015-06-08 15:53 ` Wei Liu
  2015-06-08 17:43   ` Wei Liu
  6 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-06-08 15:53 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Tim Deegan, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, Andrew Cooper, xen-devel,
	Daniel De Graaf, Dave.Scott

On Wed, Jun 03, 2015 at 03:35:18PM +0200, Vitaly Kuznetsov wrote:
> Jan and Tim,
> 
> last week you expressed some concerns about if the toolstack-based
> approach to PVHVM guest kexec is the best. Here you can see the 'reset
> everything' approach to the same problem. It is the bare minimum of what
> should be done to make it possible for the new kernel to boot. Despite
> the fact that this implementation is much smaller in size and that it
> doesn't require any toolstack changes I'm personally in favor of the
> previous toolstack-based approach as it looks more general, less fragile
> and easier to support to me. Please let me know your thoughts.
> 
> I used SCHEDOP_ interface here as DOMCTL_* is not currently supported in
> Linux kernel and I seriously doubt we need to support something different
> from DOMID_SELF for soft reset.
> 
> Current Linux kernel requires two changes to make use of these hypervisor
> changes:
> 1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
> the watch in case add operation failed, e.g.:

The changeset to implement XS_RESET_WATCHES in cxenstored is
1f9d04fb021cbf35cc420d401a88c696d6524c14

It doesn't look too complicated to do that in oxenstored. Dave
(oxenstored maintainer, CC'ed) might have insight.

Wei.

> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -712,6 +712,10 @@ int register_xenbus_watch(struct xenbus_watch *watch)
>         spin_unlock(&watches_lock);
>  
>         err = xs_watch(watch->node, token);
> +       if (err) {
> +               if (!xs_unwatch(watch->node, token))
> +                       err = xs_watch(watch->node, token);
> +       }
>  
>         if (err) {
>                 spin_lock(&watches_lock);
> 
> 2) HYPERVISOR_sched_op(SCHEDOP_soft_reset, NULL) is supposed to be called in
> kexec/kdump situation (via machine_ops.shutdown and machine_ops.crash_shutdown
> handlers).
> 
> v7 of the toolstack-based approach is available here:
> http://comments.gmane.org/gmane.comp.emulators.xen.devel/245183
> (for some reason it is missing on http://lists.xen.org/archives/html/xen-devel/2015-05/,
> there are no messages in May after May, 26 and this series was sent on May, 27)
> 
> Vitaly Kuznetsov (4):
>   xen: evtchn: make evtchn_reset() ready for soft reset
>   xen: grant_table: implement grant_table_soft_reset()
>   xen: implement SCHEDOP_soft_reset
>   xen: arch-specific hooks for domain_soft_reset()
> 
>  xen/arch/arm/domain.c         |  5 +++
>  xen/arch/x86/domain.c         | 73 +++++++++++++++++++++++++++++++++++++++++++
>  xen/common/domain.c           | 39 ++++++++++++++++++++---
>  xen/common/event_channel.c    | 52 +++++++++++++++++-------------
>  xen/common/grant_table.c      | 62 ++++++++++++++++++++++++++++++++++++
>  xen/common/schedule.c         |  4 +++
>  xen/include/public/sched.h    |  7 +++++
>  xen/include/xen/domain.h      |  4 +++
>  xen/include/xen/event.h       |  3 ++
>  xen/include/xen/grant_table.h |  6 ++++
>  xen/include/xen/sched.h       |  1 +
>  11 files changed, 231 insertions(+), 25 deletions(-)
> 
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-08 15:32   ` Jan Beulich
@ 2015-06-08 15:59     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-08 15:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>> +int arch_domain_soft_reset(struct domain *d)
>> +{
>> +    struct page_info *page = virt_to_page(d->shared_info), *new_page;
>> +    int ret = 0;
>> +    struct domain *owner;
>> +    unsigned long mfn, mfn_new, gfn;
>> +    p2m_type_t __p2mt;
>
> No need for leading underscores here.
>
>> +    if ( is_hvm_domain(d) )
>> +    {
>
> Isn't the shared_info manipulation also HVM specific? Or at least
> paging_mode_translate()-specific? xenmem_add_to_physmap_one()
> bails on !paging_mode_translate()... But then I'm not really
> understanding the issue you try to address there anyway, and
> hence I may be overlooking something: Why does allocating a new
> page help with a subsequent XENMAPSPACE_shared_info add-to-
> physmap request?

When an HVM domain starts it selects one of its pages and sacrifices it
in favor of the shared_info page which belongs to xen heap,
XENMAPSPACE_shared_info maps this page instead of the sacrificed one
which is being freed. After kexec the domain can select another page it
is willing to sacrifice and ask to map shared_info there but if we just
remap it from its previous location we get a hole. So here I'm trying to
unmap the shared_info page from domain's address space replacing it with
an empty page.

The other possible solution was suggested by Tim: instead of sucha a
swap reassign the currently-used shared_info page to the domain and
allocate a new shared_info page in xen heap. I haven't tried this
approach.

>> +        return ret;
>> +
>> +    /*
>> +     * shared_info page needs to be replaced with a new page, otherwise we
>> +     * will get a hole if the domain does XENMAPSPACE_shared_info
>> +     */
>> +
>> +    owner = page_get_owner_and_reference(page);
>> +    if ( owner != d )
>> +    {
>> +        put_page(page);
>> +        return 0;
>
> Isn't this an error?
>

It probably means the shared_info page was never mapped to the domain's
space, no need to replace it.

>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -1467,6 +1467,8 @@ int domain_soft_reset(struct domain *d)
>>      for_each_vcpu ( d, v )
>>              unmap_vcpu_info(v);
>>  
>> +    ret = arch_domain_soft_reset(d);
>> +
>>   vcpu_unpause:
>>      for_each_vcpu ( d, v )
>>          if ( v != current )
>
> Similar question as on an earlier patch - is it really correct to unpause
> the vCPU-s again after a possible failure from arch_domain_soft_reset()?

Not sure what's the best here: unpausing and hoping the domain can cope
(e.g. in kdump case we don't need much) or destroying the domain as we
can't recover from all possible failures.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-08 15:53 ` Wei Liu
@ 2015-06-08 17:43   ` Wei Liu
  2015-06-09  8:19     ` Dave Scott
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-06-08 17:43 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Tim Deegan, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, Andrew Cooper, xen-devel,
	Daniel De Graaf, Dave.Scott

On Mon, Jun 08, 2015 at 04:53:46PM +0100, Wei Liu wrote:
> On Wed, Jun 03, 2015 at 03:35:18PM +0200, Vitaly Kuznetsov wrote:
> > Jan and Tim,
> > 
> > last week you expressed some concerns about if the toolstack-based
> > approach to PVHVM guest kexec is the best. Here you can see the 'reset
> > everything' approach to the same problem. It is the bare minimum of what
> > should be done to make it possible for the new kernel to boot. Despite
> > the fact that this implementation is much smaller in size and that it
> > doesn't require any toolstack changes I'm personally in favor of the
> > previous toolstack-based approach as it looks more general, less fragile
> > and easier to support to me. Please let me know your thoughts.
> > 
> > I used SCHEDOP_ interface here as DOMCTL_* is not currently supported in
> > Linux kernel and I seriously doubt we need to support something different
> > from DOMID_SELF for soft reset.
> > 
> > Current Linux kernel requires two changes to make use of these hypervisor
> > changes:
> > 1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
> > the watch in case add operation failed, e.g.:
> 
> The changeset to implement XS_RESET_WATCHES in cxenstored is
> 1f9d04fb021cbf35cc420d401a88c696d6524c14
> 
> It doesn't look too complicated to do that in oxenstored. Dave
> (oxenstored maintainer, CC'ed) might have insight.
> 
> Wei.
> 

And I took a stab at it. Here is my oxenstored patch. May contain
bugs. :-)

---8<---
>From 8dd35370442340493f856b0be8d99c87243e79f6 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Mon, 8 Jun 2015 18:39:45 +0100
Subject: [PATCH] oxenstored: implement XS_RESET_WATCHES

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/ocaml/libs/xb/op.ml           | 6 ++++--
 tools/ocaml/libs/xb/xb.mli          | 1 +
 tools/ocaml/xenstored/connection.ml | 8 ++++++++
 tools/ocaml/xenstored/process.ml    | 6 ++++++
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/ocaml/libs/xb/op.ml b/tools/ocaml/libs/xb/op.ml
index 0ee8666..69346d8 100644
--- a/tools/ocaml/libs/xb/op.ml
+++ b/tools/ocaml/libs/xb/op.ml
@@ -19,7 +19,8 @@ type operation = Debug | Directory | Read | Getperms |
                  Transaction_end | Introduce | Release |
                  Getdomainpath | Write | Mkdir | Rm |
                  Setperms | Watchevent | Error | Isintroduced |
-                 Resume | Set_target | Restrict | Invalid
+                 Resume | Set_target | Restrict | Reset_watches |
+                 Invalid
 
 let operation_c_mapping =
 	[| Debug; Directory; Read; Getperms;
@@ -27,7 +28,7 @@ let operation_c_mapping =
            Transaction_end; Introduce; Release;
            Getdomainpath; Write; Mkdir; Rm;
            Setperms; Watchevent; Error; Isintroduced;
-           Resume; Set_target; Restrict |]
+           Resume; Set_target; Restrict; Reset_watches |]
 let size = Array.length operation_c_mapping
 
 let array_search el a =
@@ -68,4 +69,5 @@ let to_string ty =
 	| Resume		-> "RESUME"
 	| Set_target		-> "SET_TARGET"
 	| Restrict		-> "RESTRICT"
+	| Reset_watches         -> "RESET_WATCHES"
 	| Invalid		-> "INVALID"
diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
index 4e1f833..6c242da 100644
--- a/tools/ocaml/libs/xb/xb.mli
+++ b/tools/ocaml/libs/xb/xb.mli
@@ -23,6 +23,7 @@ module Op :
       | Resume
       | Set_target
       | Restrict
+      | Reset_watches
       | Invalid
     val operation_c_mapping : operation array
     val size : int
diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index b4dc9cb..5e31588 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -186,6 +186,14 @@ let del_watch con path token =
 	con.nb_watches <- con.nb_watches - 1;
 	apath, w
 
+let del_watches con =
+  Hashtbl.clear con.watches;
+  con.next_tid <- initial_next_tid
+
+let del_transactions con =
+  Hashtbl.clear con.transactions;
+  con.nb_watches <- 0
+
 let list_watches con =
 	let ll = Hashtbl.fold 
 		(fun _ watches acc -> List.map (fun watch -> watch.path, watch.token) watches :: acc)
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 0620585..e827678 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -272,6 +272,11 @@ let do_restrict con t domains cons data =
 	in
 	Connection.restrict con domid
 
+(* only in xen >= 4.2 *)
+let do_reset_watches con t domains cons data =
+  Connection.del_watches con;
+  Connection.del_transactions con
+
 (* only in >= xen3.3                                                                                    *)
 (* we ensure backward compatibility with restrict by counting the number of argument of set_target ...  *)
 (* This is not very elegant, but it is safe as 'restrict' only restricts permission of dom0 connections *)
@@ -324,6 +329,7 @@ let function_of_type ty =
 	| Xenbus.Xb.Op.Resume            -> reply_ack do_resume
 	| Xenbus.Xb.Op.Set_target        -> reply_ack do_set_target
 	| Xenbus.Xb.Op.Restrict          -> reply_ack do_restrict
+	| Xenbus.Xb.Op.Reset_watches     -> reply_ack do_reset_watches
 	| Xenbus.Xb.Op.Invalid           -> reply_ack do_error
 	| _                              -> reply_ack do_error
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-08 17:43   ` Wei Liu
@ 2015-06-09  8:19     ` Dave Scott
  2015-06-09  9:29       ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Dave Scott @ 2015-06-09  8:19 UTC (permalink / raw)
  To: Wei Liu
  Cc: Andrew Jones, Keir (Xen.org),
	Ian Campbell, Andrew Cooper, Julien Grall, Tim (Xen.org),
	Olaf Hering, Stefano Stabellini, David Vrabel, Jan Beulich,
	Ian Jackson, xen-devel, Vitaly Kuznetsov, Daniel De Graaf,
	Dave Scott


> On 8 Jun 2015, at 18:43, Wei Liu <wei.liu2@citrix.com> wrote:
> 
> On Mon, Jun 08, 2015 at 04:53:46PM +0100, Wei Liu wrote:
>> On Wed, Jun 03, 2015 at 03:35:18PM +0200, Vitaly Kuznetsov wrote:
>>> Jan and Tim,
>>> 
>>> last week you expressed some concerns about if the toolstack-based
>>> approach to PVHVM guest kexec is the best. Here you can see the 'reset
>>> everything' approach to the same problem. It is the bare minimum of what
>>> should be done to make it possible for the new kernel to boot. Despite
>>> the fact that this implementation is much smaller in size and that it
>>> doesn't require any toolstack changes I'm personally in favor of the
>>> previous toolstack-based approach as it looks more general, less fragile
>>> and easier to support to me. Please let me know your thoughts.
>>> 
>>> I used SCHEDOP_ interface here as DOMCTL_* is not currently supported in
>>> Linux kernel and I seriously doubt we need to support something different
>>> from DOMID_SELF for soft reset.
>>> 
>>> Current Linux kernel requires two changes to make use of these hypervisor
>>> changes:
>>> 1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
>>> the watch in case add operation failed, e.g.:
>> 
>> The changeset to implement XS_RESET_WATCHES in cxenstored is
>> 1f9d04fb021cbf35cc420d401a88c696d6524c14
>> 
>> It doesn't look too complicated to do that in oxenstored. Dave
>> (oxenstored maintainer, CC'ed) might have insight.
>> 
>> Wei.
>> 
> 
> And I took a stab at it. Here is my oxenstored patch. May contain
> bugs. :-)

That was quick :-)

I have only one question, see below:

> 
> ---8<---
> From 8dd35370442340493f856b0be8d99c87243e79f6 Mon Sep 17 00:00:00 2001
> From: Wei Liu <wei.liu2@citrix.com>
> Date: Mon, 8 Jun 2015 18:39:45 +0100
> Subject: [PATCH] oxenstored: implement XS_RESET_WATCHES
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
> tools/ocaml/libs/xb/op.ml           | 6 ++++--
> tools/ocaml/libs/xb/xb.mli          | 1 +
> tools/ocaml/xenstored/connection.ml | 8 ++++++++
> tools/ocaml/xenstored/process.ml    | 6 ++++++
> 4 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/ocaml/libs/xb/op.ml b/tools/ocaml/libs/xb/op.ml
> index 0ee8666..69346d8 100644
> --- a/tools/ocaml/libs/xb/op.ml
> +++ b/tools/ocaml/libs/xb/op.ml
> @@ -19,7 +19,8 @@ type operation = Debug | Directory | Read | Getperms |
>                  Transaction_end | Introduce | Release |
>                  Getdomainpath | Write | Mkdir | Rm |
>                  Setperms | Watchevent | Error | Isintroduced |
> -                 Resume | Set_target | Restrict | Invalid
> +                 Resume | Set_target | Restrict | Reset_watches |
> +                 Invalid
> 
> let operation_c_mapping =
> 	[| Debug; Directory; Read; Getperms;
> @@ -27,7 +28,7 @@ let operation_c_mapping =
>            Transaction_end; Introduce; Release;
>            Getdomainpath; Write; Mkdir; Rm;
>            Setperms; Watchevent; Error; Isintroduced;
> -           Resume; Set_target; Restrict |]
> +           Resume; Set_target; Restrict; Reset_watches |]
> let size = Array.length operation_c_mapping
> 
> let array_search el a =
> @@ -68,4 +69,5 @@ let to_string ty =
> 	| Resume		-> "RESUME"
> 	| Set_target		-> "SET_TARGET"
> 	| Restrict		-> "RESTRICT"
> +	| Reset_watches         -> "RESET_WATCHES"
> 	| Invalid		-> "INVALID"
> diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
> index 4e1f833..6c242da 100644
> --- a/tools/ocaml/libs/xb/xb.mli
> +++ b/tools/ocaml/libs/xb/xb.mli
> @@ -23,6 +23,7 @@ module Op :
>       | Resume
>       | Set_target
>       | Restrict
> +      | Reset_watches
>       | Invalid
>     val operation_c_mapping : operation array
>     val size : int
> diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
> index b4dc9cb..5e31588 100644
> --- a/tools/ocaml/xenstored/connection.ml
> +++ b/tools/ocaml/xenstored/connection.ml
> @@ -186,6 +186,14 @@ let del_watch con path token =
> 	con.nb_watches <- con.nb_watches - 1;
> 	apath, w
> 
> +let del_watches con =
> +  Hashtbl.clear con.watches;
> +  con.next_tid <- initial_next_tid

I couldn’t spot the part in the C version where the next transaction id is reset — is this a (minor) difference between the two implementations, or have I misread the code?

If you could clarify that for me, and assuming this all compiled ok, then feel free to add

Acked-by: David Scott <dave.scott@citrix.com>

Cheers,
Dave

> +
> +let del_transactions con =
> +  Hashtbl.clear con.transactions;
> +  con.nb_watches <- 0
> +
> let list_watches con =
> 	let ll = Hashtbl.fold 
> 		(fun _ watches acc -> List.map (fun watch -> watch.path, watch.token) watches :: acc)
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index 0620585..e827678 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -272,6 +272,11 @@ let do_restrict con t domains cons data =
> 	in
> 	Connection.restrict con domid
> 
> +(* only in xen >= 4.2 *)
> +let do_reset_watches con t domains cons data =
> +  Connection.del_watches con;
> +  Connection.del_transactions con
> +
> (* only in >= xen3.3                                                                                    *)
> (* we ensure backward compatibility with restrict by counting the number of argument of set_target ...  *)
> (* This is not very elegant, but it is safe as 'restrict' only restricts permission of dom0 connections *)
> @@ -324,6 +329,7 @@ let function_of_type ty =
> 	| Xenbus.Xb.Op.Resume            -> reply_ack do_resume
> 	| Xenbus.Xb.Op.Set_target        -> reply_ack do_set_target
> 	| Xenbus.Xb.Op.Restrict          -> reply_ack do_restrict
> +	| Xenbus.Xb.Op.Reset_watches     -> reply_ack do_reset_watches
> 	| Xenbus.Xb.Op.Invalid           -> reply_ack do_error
> 	| _                              -> reply_ack do_error
> 
> -- 
> 1.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-09  8:19     ` Dave Scott
@ 2015-06-09  9:29       ` Wei Liu
  2015-06-09  9:38         ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-06-09  9:29 UTC (permalink / raw)
  To: Dave Scott
  Cc: Andrew Jones, Wei Liu, Ian Campbell, Andrew Cooper, Julien Grall,
	Tim (Xen.org),
	Olaf Hering, Stefano Stabellini, David Vrabel, Jan Beulich,
	Ian Jackson, xen-devel, Vitaly Kuznetsov, Keir (Xen.org),
	Daniel De Graaf

On Tue, Jun 09, 2015 at 09:19:19AM +0100, Dave Scott wrote:
> 
> > On 8 Jun 2015, at 18:43, Wei Liu <wei.liu2@citrix.com> wrote:
> > 
> > On Mon, Jun 08, 2015 at 04:53:46PM +0100, Wei Liu wrote:
> >> On Wed, Jun 03, 2015 at 03:35:18PM +0200, Vitaly Kuznetsov wrote:
> >>> Jan and Tim,
> >>> 
> >>> last week you expressed some concerns about if the toolstack-based
> >>> approach to PVHVM guest kexec is the best. Here you can see the 'reset
> >>> everything' approach to the same problem. It is the bare minimum of what
> >>> should be done to make it possible for the new kernel to boot. Despite
> >>> the fact that this implementation is much smaller in size and that it
> >>> doesn't require any toolstack changes I'm personally in favor of the
> >>> previous toolstack-based approach as it looks more general, less fragile
> >>> and easier to support to me. Please let me know your thoughts.
> >>> 
> >>> I used SCHEDOP_ interface here as DOMCTL_* is not currently supported in
> >>> Linux kernel and I seriously doubt we need to support something different
> >>> from DOMID_SELF for soft reset.
> >>> 
> >>> Current Linux kernel requires two changes to make use of these hypervisor
> >>> changes:
> >>> 1) As XS_RESET_WATCHES is not supported by oxenstored we need to try removing
> >>> the watch in case add operation failed, e.g.:
> >> 
> >> The changeset to implement XS_RESET_WATCHES in cxenstored is
> >> 1f9d04fb021cbf35cc420d401a88c696d6524c14
> >> 
> >> It doesn't look too complicated to do that in oxenstored. Dave
> >> (oxenstored maintainer, CC'ed) might have insight.
> >> 
> >> Wei.
> >> 
> > 
> > And I took a stab at it. Here is my oxenstored patch. May contain
> > bugs. :-)
> 
> That was quick :-)
> 
> I have only one question, see below:
> 
> > 
> > ---8<---
> > From 8dd35370442340493f856b0be8d99c87243e79f6 Mon Sep 17 00:00:00 2001
> > From: Wei Liu <wei.liu2@citrix.com>
> > Date: Mon, 8 Jun 2015 18:39:45 +0100
> > Subject: [PATCH] oxenstored: implement XS_RESET_WATCHES
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> > tools/ocaml/libs/xb/op.ml           | 6 ++++--
> > tools/ocaml/libs/xb/xb.mli          | 1 +
> > tools/ocaml/xenstored/connection.ml | 8 ++++++++
> > tools/ocaml/xenstored/process.ml    | 6 ++++++
> > 4 files changed, 19 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/ocaml/libs/xb/op.ml b/tools/ocaml/libs/xb/op.ml
> > index 0ee8666..69346d8 100644
> > --- a/tools/ocaml/libs/xb/op.ml
> > +++ b/tools/ocaml/libs/xb/op.ml
> > @@ -19,7 +19,8 @@ type operation = Debug | Directory | Read | Getperms |
> >                  Transaction_end | Introduce | Release |
> >                  Getdomainpath | Write | Mkdir | Rm |
> >                  Setperms | Watchevent | Error | Isintroduced |
> > -                 Resume | Set_target | Restrict | Invalid
> > +                 Resume | Set_target | Restrict | Reset_watches |
> > +                 Invalid
> > 
> > let operation_c_mapping =
> > 	[| Debug; Directory; Read; Getperms;
> > @@ -27,7 +28,7 @@ let operation_c_mapping =
> >            Transaction_end; Introduce; Release;
> >            Getdomainpath; Write; Mkdir; Rm;
> >            Setperms; Watchevent; Error; Isintroduced;
> > -           Resume; Set_target; Restrict |]
> > +           Resume; Set_target; Restrict; Reset_watches |]
> > let size = Array.length operation_c_mapping
> > 
> > let array_search el a =
> > @@ -68,4 +69,5 @@ let to_string ty =
> > 	| Resume		-> "RESUME"
> > 	| Set_target		-> "SET_TARGET"
> > 	| Restrict		-> "RESTRICT"
> > +	| Reset_watches         -> "RESET_WATCHES"
> > 	| Invalid		-> "INVALID"
> > diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
> > index 4e1f833..6c242da 100644
> > --- a/tools/ocaml/libs/xb/xb.mli
> > +++ b/tools/ocaml/libs/xb/xb.mli
> > @@ -23,6 +23,7 @@ module Op :
> >       | Resume
> >       | Set_target
> >       | Restrict
> > +      | Reset_watches
> >       | Invalid
> >     val operation_c_mapping : operation array
> >     val size : int
> > diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
> > index b4dc9cb..5e31588 100644
> > --- a/tools/ocaml/xenstored/connection.ml
> > +++ b/tools/ocaml/xenstored/connection.ml
> > @@ -186,6 +186,14 @@ let del_watch con path token =
> > 	con.nb_watches <- con.nb_watches - 1;
> > 	apath, w
> > 
> > +let del_watches con =
> > +  Hashtbl.clear con.watches;
> > +  con.next_tid <- initial_next_tid
> 
> I couldn’t spot the part in the C version where the next transaction id is reset — is this a (minor) difference between the two implementations, or have I misread the code?
> 
> If you could clarify that for me, and assuming this all compiled ok, then feel free to add
> 

Yes, it compiles OK.

I think I will just delete that line. I couldn't recollect why I added
that. Maybe it's copy-n-paste error.

Can I still have your ack if I drop that line?

> Acked-by: David Scott <dave.scott@citrix.com>
> 
> Cheers,
> Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-09  9:29       ` Wei Liu
@ 2015-06-09  9:38         ` Wei Liu
  2015-06-09 10:02           ` Dave Scott
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-06-09  9:38 UTC (permalink / raw)
  To: Dave Scott
  Cc: Andrew Jones, Wei Liu, Ian Campbell, Andrew Cooper, Julien Grall,
	Tim (Xen.org),
	Olaf Hering, Stefano Stabellini, David Vrabel, Jan Beulich,
	Ian Jackson, xen-devel, Vitaly Kuznetsov, Keir (Xen.org),
	Daniel De Graaf

On Tue, Jun 09, 2015 at 10:29:56AM +0100, Wei Liu wrote:
[...]
> > > +  con.next_tid <- initial_next_tid
> > 
> > I couldn’t spot the part in the C version where the next transaction id is reset — is this a (minor) difference between the two implementations, or have I misread the code?
> > 
> > If you could clarify that for me, and assuming this all compiled ok, then feel free to add
> > 
> 
> Yes, it compiles OK.
> 
> I think I will just delete that line. I couldn't recollect why I added
> that. Maybe it's copy-n-paste error.
> 
> Can I still have your ack if I drop that line?
> 

And a new hunk will be added in the next version to deal with
logging.ml.

diff --git a/tools/ocaml/xenstored/logging.ml b/tools/ocaml/xenstored/logging.ml
index 665b922..4c90032 100644
--- a/tools/ocaml/xenstored/logging.ml
+++ b/tools/ocaml/xenstored/logging.ml
@@ -242,6 +242,7 @@ let string_of_access_type = function
        | Xenbus.Xb.Op.Rm                -> "rm       "
        | Xenbus.Xb.Op.Setperms          -> "setperms "
        | Xenbus.Xb.Op.Restrict          -> "restrict "
+       | Xenbus.Xb.Op.Reset_watches     -> "reset watches"
        | Xenbus.Xb.Op.Set_target        -> "settarget"
 
        | Xenbus.Xb.Op.Error             -> "error    "


> > Acked-by: David Scott <dave.scott@citrix.com>
> > 
> > Cheers,
> > Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec
  2015-06-09  9:38         ` Wei Liu
@ 2015-06-09 10:02           ` Dave Scott
  0 siblings, 0 replies; 39+ messages in thread
From: Dave Scott @ 2015-06-09 10:02 UTC (permalink / raw)
  To: Wei Liu
  Cc: Andrew Jones, Keir (Xen.org),
	Ian Campbell, Andrew Cooper, Julien Grall, Tim (Xen.org),
	Olaf Hering, Stefano Stabellini, David Vrabel, Jan Beulich,
	Ian Jackson, xen-devel, Vitaly Kuznetsov, Daniel De Graaf


> On 9 Jun 2015, at 10:38, Wei Liu <wei.liu2@citrix.com> wrote:
> 
> On Tue, Jun 09, 2015 at 10:29:56AM +0100, Wei Liu wrote:
> [...]
>>>> +  con.next_tid <- initial_next_tid
>>> 
>>> I couldn’t spot the part in the C version where the next transaction id is reset — is this a (minor) difference between the two implementations, or have I misread the code?
>>> 
>>> If you could clarify that for me, and assuming this all compiled ok, then feel free to add
>>> 
>> 
>> Yes, it compiles OK.
>> 
>> I think I will just delete that line. I couldn't recollect why I added
>> that. Maybe it's copy-n-paste error.
>> 
>> Can I still have your ack if I drop that line?

Yes

>> 
> 
> And a new hunk will be added in the next version to deal with
> logging.ml.
> 
> diff --git a/tools/ocaml/xenstored/logging.ml b/tools/ocaml/xenstored/logging.ml
> index 665b922..4c90032 100644
> --- a/tools/ocaml/xenstored/logging.ml
> +++ b/tools/ocaml/xenstored/logging.ml
> @@ -242,6 +242,7 @@ let string_of_access_type = function
>        | Xenbus.Xb.Op.Rm                -> "rm       "
>        | Xenbus.Xb.Op.Setperms          -> "setperms "
>        | Xenbus.Xb.Op.Restrict          -> "restrict "
> +       | Xenbus.Xb.Op.Reset_watches     -> "reset watches"
>        | Xenbus.Xb.Op.Set_target        -> "settarget"
> 
>        | Xenbus.Xb.Op.Error             -> "error    “

All still fine:

Acked-by: David Scott <dave.scott@citrix.com>

Cheers,
Dave



> 
> 
>>> Acked-by: David Scott <dave.scott@citrix.com>
>>> 
>>> Cheers,
>>> Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-04 14:19   ` Tim Deegan
@ 2015-06-22  9:44     ` Vitaly Kuznetsov
  2015-06-25  9:57       ` Tim Deegan
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-22  9:44 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

Tim Deegan <tim@xen.org> writes:

> At 15:35 +0200 on 03 Jun (1433345722), Vitaly Kuznetsov wrote:
>> x86-specific hook cleans up the pirq-emuirq mappings and replaces the
>> shared_info frame with an empty page to support subsequent
>> XENMAPSPACE_shared_info call.
>
> That's a bit roundabout.  I think we might be better off allocating a
> new shared-info page and abandoning the old one as an ordinary guest
> RAM page.
>

I've tried looking into such approach and there is an issue: shared_info
page is being allocated from Xen heap and in case it's built with
CONFIG_SEPARATE_XENHEAP we can't just reassign the page to the domain's
pagelist or we'll open a possibility for a domain to move itself
entirely to Xen's heap by e.g looping over all its gfns, doing
XENMAPSPACE_shared_info and perforting soft reset.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-08 14:31   ` Jan Beulich
@ 2015-06-22 16:00     ` Vitaly Kuznetsov
  2015-06-22 16:06       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-22 16:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>      mfn = v->vcpu_info_mfn;
>>      unmap_domain_page_global((void *)
>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>> -
>> -    v->vcpu_info = &dummy_vcpu_info;
>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>
> Is this cast really needed?
>

Without it my gcc-4.8.3 complains:

domain.c: In function ‘unmap_vcpu_info’:
domain.c:1158:21: error: pointer type mismatch in conditional expression [-Werror]
                     : &dummy_vcpu_info);
                     ^
cc1: all warnings being treated as errors

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-22 16:00     ` Vitaly Kuznetsov
@ 2015-06-22 16:06       ` Jan Beulich
  2015-06-22 16:24         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-22 16:06 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 22.06.15 at 18:00, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
> 
>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>>      mfn = v->vcpu_info_mfn;
>>>      unmap_domain_page_global((void *)
>>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>>> -
>>> -    v->vcpu_info = &dummy_vcpu_info;
>>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>>
>> Is this cast really needed?
>>
> 
> Without it my gcc-4.8.3 complains:
> 
> domain.c: In function ‘unmap_vcpu_info’:
> domain.c:1158:21: error: pointer type mismatch in conditional expression 
> [-Werror]
>                      : &dummy_vcpu_info);
>                      ^
> cc1: all warnings being treated as errors

Which is the kind of warning one normally should _not_ work
around by adding a cast.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-22 16:06       ` Jan Beulich
@ 2015-06-22 16:24         ` Vitaly Kuznetsov
  2015-06-23  7:13           ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-22 16:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 22.06.15 at 18:00, <vkuznets@redhat.com> wrote:
>> "Jan Beulich" <JBeulich@suse.com> writes:
>> 
>>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>>>      mfn = v->vcpu_info_mfn;
>>>>      unmap_domain_page_global((void *)
>>>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>>>> -
>>>> -    v->vcpu_info = &dummy_vcpu_info;
>>>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>>>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>>>
>>> Is this cast really needed?
>>>
>> 
>> Without it my gcc-4.8.3 complains:
>> 
>> domain.c: In function ‘unmap_vcpu_info’:
>> domain.c:1158:21: error: pointer type mismatch in conditional expression 
>> [-Werror]
>>                      : &dummy_vcpu_info);
>>                      ^
>> cc1: all warnings being treated as errors
>
> Which is the kind of warning one normally should _not_ work
> around by adding a cast.

In this (and in alloc_vcpu() from where this expression was copied)
particular case this is probably OK: in struct shared_info we have
'struct vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS]' as its first
member. But I may be missing something..

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-22 16:24         ` Vitaly Kuznetsov
@ 2015-06-23  7:13           ` Jan Beulich
  2015-06-23 12:10             ` Vitaly Kuznetsov
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-06-23  7:13 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 22.06.15 at 18:24, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
> 
>>>>> On 22.06.15 at 18:00, <vkuznets@redhat.com> wrote:
>>> "Jan Beulich" <JBeulich@suse.com> writes:
>>> 
>>>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>>>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>>>>      mfn = v->vcpu_info_mfn;
>>>>>      unmap_domain_page_global((void *)
>>>>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>>>>> -
>>>>> -    v->vcpu_info = &dummy_vcpu_info;
>>>>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>>>>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>>>>
>>>> Is this cast really needed?
>>>>
>>> 
>>> Without it my gcc-4.8.3 complains:
>>> 
>>> domain.c: In function ‘unmap_vcpu_info’:
>>> domain.c:1158:21: error: pointer type mismatch in conditional expression 
>>> [-Werror]
>>>                      : &dummy_vcpu_info);
>>>                      ^
>>> cc1: all warnings being treated as errors
>>
>> Which is the kind of warning one normally should _not_ work
>> around by adding a cast.
> 
> In this (and in alloc_vcpu() from where this expression was copied)
> particular case this is probably OK: in struct shared_info we have
> 'struct vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS]' as its first
> member. But I may be missing something..

Did you read the comment accompanying the definition of
__shared_info()?

The cast is presumably safe here, but it doesn't _look_ safe. And
for future readers (and future changes) it would be better if it did.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-23  7:13           ` Jan Beulich
@ 2015-06-23 12:10             ` Vitaly Kuznetsov
  2015-06-23 12:52               ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2015-06-23 12:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 22.06.15 at 18:24, <vkuznets@redhat.com> wrote:
>> "Jan Beulich" <JBeulich@suse.com> writes:
>> 
>>>>>> On 22.06.15 at 18:00, <vkuznets@redhat.com> wrote:
>>>> "Jan Beulich" <JBeulich@suse.com> writes:
>>>> 
>>>>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>>>>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>>>>>      mfn = v->vcpu_info_mfn;
>>>>>>      unmap_domain_page_global((void *)
>>>>>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>>>>>> -
>>>>>> -    v->vcpu_info = &dummy_vcpu_info;
>>>>>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>>>>>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>>>>>
>>>>> Is this cast really needed?
>>>>>
>>>> 
>>>> Without it my gcc-4.8.3 complains:
>>>> 
>>>> domain.c: In function ‘unmap_vcpu_info’:
>>>> domain.c:1158:21: error: pointer type mismatch in conditional expression 
>>>> [-Werror]
>>>>                      : &dummy_vcpu_info);
>>>>                      ^
>>>> cc1: all warnings being treated as errors
>>>
>>> Which is the kind of warning one normally should _not_ work
>>> around by adding a cast.
>> 
>> In this (and in alloc_vcpu() from where this expression was copied)
>> particular case this is probably OK: in struct shared_info we have
>> 'struct vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS]' as its first
>> member. But I may be missing something..
>
> Did you read the comment accompanying the definition of
> __shared_info()?
>
> The cast is presumably safe here, but it doesn't _look_ safe. And
> for future readers (and future changes) it would be better if it did.

Sorry, it seems I don't undestand the suggestion on how to make this
look better.

With CONFIG_COMPAT in both struct shared_info and struct
compat_shared_info vcpu_info array is of type 'struct vcpu_info[]' but
v->vcpu_info is of type 'vcpu_info_t *' which means union of 'struct
vcpu_info' and 'struct compat_vcpu_info' in CONFIG_COMPAT case. Both
these structures have same size of '64' on x86 so it's really up to its
user how to treat this data...

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset
  2015-06-23 12:10             ` Vitaly Kuznetsov
@ 2015-06-23 12:52               ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2015-06-23 12:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Olaf Hering, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Julien Grall, Ian Jackson, Andrew Jones,
	Tim Deegan, David Vrabel, xen-devel, Daniel De Graaf,
	Keir Fraser

>>> On 23.06.15 at 14:10, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
> 
>>>>> On 22.06.15 at 18:24, <vkuznets@redhat.com> wrote:
>>> "Jan Beulich" <JBeulich@suse.com> writes:
>>> 
>>>>>>> On 22.06.15 at 18:00, <vkuznets@redhat.com> wrote:
>>>>> "Jan Beulich" <JBeulich@suse.com> writes:
>>>>> 
>>>>>>>>> On 03.06.15 at 15:35, <vkuznets@redhat.com> wrote:
>>>>>>> @@ -1129,8 +1129,9 @@ void unmap_vcpu_info(struct vcpu *v)
>>>>>>>      mfn = v->vcpu_info_mfn;
>>>>>>>      unmap_domain_page_global((void *)
>>>>>>>                               ((unsigned long)v->vcpu_info & PAGE_MASK));
>>>>>>> -
>>>>>>> -    v->vcpu_info = &dummy_vcpu_info;
>>>>>>> +    v->vcpu_info = ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS)
>>>>>>> +                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id])
>>>>>>
>>>>>> Is this cast really needed?
>>>>>>
>>>>> 
>>>>> Without it my gcc-4.8.3 complains:
>>>>> 
>>>>> domain.c: In function ‘unmap_vcpu_info’:
>>>>> domain.c:1158:21: error: pointer type mismatch in conditional expression 
>>>>> [-Werror]
>>>>>                      : &dummy_vcpu_info);
>>>>>                      ^
>>>>> cc1: all warnings being treated as errors
>>>>
>>>> Which is the kind of warning one normally should _not_ work
>>>> around by adding a cast.
>>> 
>>> In this (and in alloc_vcpu() from where this expression was copied)
>>> particular case this is probably OK: in struct shared_info we have
>>> 'struct vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS]' as its first
>>> member. But I may be missing something..
>>
>> Did you read the comment accompanying the definition of
>> __shared_info()?
>>
>> The cast is presumably safe here, but it doesn't _look_ safe. And
>> for future readers (and future changes) it would be better if it did.
> 
> Sorry, it seems I don't undestand the suggestion on how to make this
> look better.

I didn't suggest anything in particular; all I care about is for such
casts to be avoided.

> With CONFIG_COMPAT in both struct shared_info and struct
> compat_shared_info vcpu_info array is of type 'struct vcpu_info[]' but
> v->vcpu_info is of type 'vcpu_info_t *' which means union of 'struct
> vcpu_info' and 'struct compat_vcpu_info' in CONFIG_COMPAT case. Both
> these structures have same size of '64' on x86 so it's really up to its
> user how to treat this data...

Sure, but that doesn't make the cast any more safe or look any
better. But then again I see we already have a similar construct in
alloc_vcpu(), i.e. - okay, keep it as is.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset()
  2015-06-22  9:44     ` Vitaly Kuznetsov
@ 2015-06-25  9:57       ` Tim Deegan
  0 siblings, 0 replies; 39+ messages in thread
From: Tim Deegan @ 2015-06-25  9:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Andrew Jones, Julien Grall, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Andrew Cooper, Ian Jackson, Olaf Hering,
	David Vrabel, Jan Beulich, xen-devel, Daniel De Graaf

At 11:44 +0200 on 22 Jun (1434973458), Vitaly Kuznetsov wrote:
> Tim Deegan <tim@xen.org> writes:
> 
> > At 15:35 +0200 on 03 Jun (1433345722), Vitaly Kuznetsov wrote:
> >> x86-specific hook cleans up the pirq-emuirq mappings and replaces the
> >> shared_info frame with an empty page to support subsequent
> >> XENMAPSPACE_shared_info call.
> >
> > That's a bit roundabout.  I think we might be better off allocating a
> > new shared-info page and abandoning the old one as an ordinary guest
> > RAM page.
> >
> 
> I've tried looking into such approach and there is an issue: shared_info
> page is being allocated from Xen heap and in case it's built with
> CONFIG_SEPARATE_XENHEAP we can't just reassign the page to the domain's
> pagelist or we'll open a possibility for a domain to move itself
> entirely to Xen's heap by e.g looping over all its gfns, doing
> XENMAPSPACE_shared_info and perforting soft reset.

Ah, true.  So your approach of replacing it is better.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-06-25  9:57 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-03 13:35 [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Vitaly Kuznetsov
2015-06-03 13:35 ` [PATCH RFC 1/4] xen: evtchn: make evtchn_reset() ready for soft reset Vitaly Kuznetsov
2015-06-04 14:05   ` Tim Deegan
2015-06-04 15:19     ` David Vrabel
2015-06-04 15:47       ` Tim Deegan
2015-06-05  8:52         ` Jan Beulich
2015-06-05  8:58           ` Ian Campbell
2015-06-05  9:07             ` Jan Beulich
2015-06-08 14:10   ` Jan Beulich
2015-06-08 15:05     ` Vitaly Kuznetsov
2015-06-03 13:35 ` [PATCH RFC 2/4] xen: grant_table: implement grant_table_soft_reset() Vitaly Kuznetsov
2015-06-04 14:11   ` Tim Deegan
2015-06-04 15:22     ` David Vrabel
2015-06-04 15:44       ` Tim Deegan
2015-06-08 14:26   ` Jan Beulich
2015-06-08 14:58     ` Vitaly Kuznetsov
2015-06-08 15:35       ` Jan Beulich
2015-06-03 13:35 ` [PATCH RFC 3/4] xen: implement SCHEDOP_soft_reset Vitaly Kuznetsov
2015-06-08 14:31   ` Jan Beulich
2015-06-22 16:00     ` Vitaly Kuznetsov
2015-06-22 16:06       ` Jan Beulich
2015-06-22 16:24         ` Vitaly Kuznetsov
2015-06-23  7:13           ` Jan Beulich
2015-06-23 12:10             ` Vitaly Kuznetsov
2015-06-23 12:52               ` Jan Beulich
2015-06-03 13:35 ` [PATCH RFC 4/4] xen: arch-specific hooks for domain_soft_reset() Vitaly Kuznetsov
2015-06-04 14:19   ` Tim Deegan
2015-06-22  9:44     ` Vitaly Kuznetsov
2015-06-25  9:57       ` Tim Deegan
2015-06-08 15:32   ` Jan Beulich
2015-06-08 15:59     ` Vitaly Kuznetsov
2015-06-04 14:22 ` [PATCH RFC 0/4] 'reset everything' approach to PVHVM guest kexec Tim Deegan
2015-06-08 15:38 ` Ian Jackson
2015-06-08 15:53 ` Wei Liu
2015-06-08 17:43   ` Wei Liu
2015-06-09  8:19     ` Dave Scott
2015-06-09  9:29       ` Wei Liu
2015-06-09  9:38         ` Wei Liu
2015-06-09 10:02           ` Dave Scott

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).