* [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
@ 2014-08-06 13:08 Vitaly Kuznetsov
  2014-08-06 13:08 ` [PATCH RFC 1/1] " Vitaly Kuznetsov
  2014-08-06 14:09 ` [PATCH RFC 0/1] " Jan Beulich
  0 siblings, 2 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2014-08-06 13:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Jones, Stefano Stabellini, David Vrabel

The new VCPUOP_reset_vcpu_info operation is required to support kexec performed
by an SMP PVHVM guest. I'm sending an almost unmodified version of Konrad's
prototype. It was tested with the following guest code:

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index c3a67c9..d90addb 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
         * This path is called twice on PVHVM - first during bootup via
         * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
         * hotplugged: cpu_up -> xen_hvm_cpu_notify.
-        * As we can only do the VCPUOP_register_vcpu_info once lets
-        * not over-write its result.
         *
         * For PV it is called during restore (xen_vcpu_restore) and bootup
         * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
@@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
        info.mfn = arbitrary_virt_to_mfn(vcpup);
        info.offset = offset_in_page(vcpup);
 
+       /*
+        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, this
+        * is required if we boot after kexec.
+        */
+
+       if (cpu != 0) {
+               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
+               if (err)
+                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: %d\n",
+                               cpu, err);
+       }
+
        /* Check to see if the hypervisor will put the vcpu_info
           structure where we want it, which allows direct access via
           a percpu-variable.
-          N.B. This hypercall can _only_ be called once per CPU. Subsequent
-          calls will error out with -EINVAL. This is due to the fact that
-          hypervisor has no unregister variant and this hypercall does not
-          allow to over-write info.mfn and info.offset.
         */
+
        err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
 
        if (err) {
@@ -228,6 +235,22 @@ static void xen_vcpu_setup(int cpu)
        }
 }
 
+void xen_teardown_vcpu_setup(int cpu)
+{
+       int err;
+
+       if (!have_vcpu_info_placement)
+               return;
+
+       err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
+       if (err) {
+               xen_raw_printk("%s: VCPUOP_reset_vcpu_info rc: %d\n", __func__, err);
+               return;
+       }
+       if (cpu < MAX_VIRT_CPUS)
+               per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+}
+
 /*
  * On restore, set the vcpu placement up again.
  * If it fails, then we're in a bad state, since
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index bc5e897..cd57801 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -763,16 +763,32 @@ static int xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle)
 static void xen_hvm_cpu_die(unsigned int cpu)
 {
        xen_cpu_die(cpu);
+       xen_teardown_vcpu_setup(cpu);
        native_cpu_die(cpu);
 }
 
 #ifdef CONFIG_KEXEC
 void xen_kexec_shutdown(void)
 {
+       int first_cpu;
+
        if (!kexec_in_progress)
                return;
 
+       first_cpu = cpumask_first(cpu_online_mask);
+
+       gnttab_suspend();
+       xen_arch_pre_suspend();
+
+       /* Stop all CPUs except for the first one */
+       disable_nonboot_cpus();
+
        xen_hvm_reset_eventchannels();
+
+       /* Bring down the IPIs, PIRQs, on the BSP. */
+       xen_raw_printk("CPU0 down.\n");
+       xen_teardown_vcpu_setup(first_cpu);
+       xen_raw_printk("CPU0 down done.\n");
 }
 #endif
 
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index d083e82..36dd380 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -53,6 +53,7 @@ void xen_init_irq_ops(void);
 void xen_setup_timer(int cpu);
 void xen_setup_runstate_info(int cpu);
 void xen_teardown_timer(int cpu);
+void xen_teardown_vcpu_setup(int cpu);
 cycle_t xen_clocksource_read(void);
 void xen_setup_cpu_clockevents(void);
 void __init xen_init_time_ops(void);
diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h
index b05288c..a0a374c 100644
--- a/include/xen/interface/vcpu.h
+++ b/include/xen/interface/vcpu.h
@@ -172,4 +172,6 @@ DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info);
 
 /* Send an NMI to the specified VCPU. @extra_arg == NULL. */
 #define VCPUOP_send_nmi             11
+
+#define VCPUOP_reset_vcpu_info      14
 #endif /* __XEN_PUBLIC_VCPU_H__ */

Konrad Rzeszutek Wilk (1):
  Introduce VCPUOP_reset_vcpu_info

 xen/arch/x86/hvm/hvm.c    |  1 +
 xen/common/domain.c       | 22 +++++++++++++++++++---
 xen/include/public/vcpu.h |  6 ++++++
 3 files changed, 26 insertions(+), 3 deletions(-)

-- 
1.9.3


* [PATCH RFC 1/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 13:08 [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info Vitaly Kuznetsov
@ 2014-08-06 13:08 ` Vitaly Kuznetsov
  2014-08-06 14:36   ` Jan Beulich
  2014-08-06 14:09 ` [PATCH RFC 0/1] " Jan Beulich
  1 sibling, 1 reply; 16+ messages in thread
From: Vitaly Kuznetsov @ 2014-08-06 13:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Jones, Stefano Stabellini, David Vrabel

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

When an SMP guest performs kexec/kdump, it tries issuing VCPUOP_register_vcpu_info.
This fails because vcpu_info is already registered. Introduce a new vcpu operation
to reset vcpu_info to its default state.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/arch/x86/hvm/hvm.c    |  1 +
 xen/common/domain.c       | 22 +++++++++++++++++++---
 xen/include/public/vcpu.h |  6 ++++++
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 216c3f2..7917272 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3332,6 +3332,7 @@ static long hvm_vcpu_op(
     case VCPUOP_set_singleshot_timer:
     case VCPUOP_stop_singleshot_timer:
     case VCPUOP_register_vcpu_info:
+    case VCPUOP_reset_vcpu_info:
     case VCPUOP_register_vcpu_time_memory_area:
         rc = do_vcpu_op(cmd, vcpuid, arg);
         break;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ac444ba..024f366 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -961,9 +961,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
 }
 
 /*
- * Unmap the vcpu info page if the guest decided to place it somewhere
- * else.  This is only used from arch_domain_destroy, so there's no
- * need to do anything clever.
+ * Unmap the vcpu info page if the guest decided to place it somewhere else.
+ * This is used from arch_domain_destroy and from VCPUOP_reset_vcpu_info
+ * handler.
  */
 void unmap_vcpu_info(struct vcpu *v)
 {
@@ -1121,6 +1121,22 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case VCPUOP_reset_vcpu_info:
+    {
+        struct domain *d = v->domain;
+
+        if ( !test_bit(_VPF_down, &v->pause_flags) )
+            return -EFAULT;
+
+        domain_lock(d);
+        unmap_vcpu_info(v);
+        if ( vcpuid < XEN_LEGACY_MAX_VCPUS )
+            v->vcpu_info = (vcpu_info_t *)&shared_info(d, vcpu_info[vcpuid]);
+        domain_unlock(d);
+        rc = 0;
+        break;
+    }
+
     case VCPUOP_register_runstate_memory_area:
     {
         struct vcpu_register_runstate_memory_area area;
diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h
index e888daf..c0283bc 100644
--- a/xen/include/public/vcpu.h
+++ b/xen/include/public/vcpu.h
@@ -227,6 +227,12 @@ struct vcpu_register_time_memory_area {
 typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t;
 DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
 
+/*
+ * Reset all of the vcpu_info information from their previous location
+ * to the default one used at bootup.
+ */
+#define VCPUOP_reset_vcpu_info   14
+
 #endif /* __XEN_PUBLIC_VCPU_H__ */
 
 /*
-- 
1.9.3


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 13:08 [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info Vitaly Kuznetsov
  2014-08-06 13:08 ` [PATCH RFC 1/1] " Vitaly Kuznetsov
@ 2014-08-06 14:09 ` Jan Beulich
  2014-08-06 14:49   ` Konrad Rzeszutek Wilk
                     ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 14:09 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

>>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
>          * This path is called twice on PVHVM - first during bootup via
>          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
>          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
> -        * As we can only do the VCPUOP_register_vcpu_info once lets
> -        * not over-write its result.
>          *
>          * For PV it is called during restore (xen_vcpu_restore) and bootup
>          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
> @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
>         info.mfn = arbitrary_virt_to_mfn(vcpup);
>         info.offset = offset_in_page(vcpup);
>  
> +       /*
> +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, this
> +        * is required if we boot after kexec.
> +        */
> +
> +       if (cpu != 0) {
> +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
> +               if (err)
> +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: %d\n",
> +                               cpu, err);
> +       }

Just for my understanding of why exactly you need the new operation:
Why is this being done here, when you already do the reset in the
cpu-die/shutdown paths? And why not for CPU 0?

Furthermore, what is the state of vCPU-s beyond 31 going to be after
they got their vCPU info reset? They won't have any other area as
fallback. Yet I don't think you can now and forever guarantee that
native_cpu_die() won't do anything requiring that structure.

Jan


* Re: [PATCH RFC 1/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 13:08 ` [PATCH RFC 1/1] " Vitaly Kuznetsov
@ 2014-08-06 14:36   ` Jan Beulich
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 14:36 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

>>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
> +    case VCPUOP_reset_vcpu_info:
> +    {
> +        struct domain *d = v->domain;

There is a suitable variable in scope already; please don't repeat the
mistake of the code handling VCPUOP_register_vcpu_info.

> +
> +        if ( !test_bit(_VPF_down, &v->pause_flags) )
> +            return -EFAULT;

So how is that going to work then for CPU 0 in the guest? And why
-EFAULT?

> +
> +        domain_lock(d);
> +        unmap_vcpu_info(v);
> +        if ( vcpuid < XEN_LEGACY_MAX_VCPUS )
> +            v->vcpu_info = (vcpu_info_t *)&shared_info(d, vcpu_info[vcpuid]);

This would clearly need to be done before unmapping, and you'd have
to avoid unmap_vcpu_info() overwriting it again (and the order of
operations there doesn't fit your need anyway). Furthermore you'd
need to deal with the guest having set up FIFO event channels when
re-registering the vCPU info - this is a shortcoming in map_vcpu_info()
which I suppose doesn't matter much right now because interested
guests set their vCPU info area before switching away from 2-level
event channels. You'd also need to put consideration into 2-level
event channel behavior - it would seem to me that the tail part of
map_vcpu_info() would now also be needed in unmap_vcpu_info(),
since you won't know when or in which manner the vCPU would be
brought back up again. And that of course along with the copying of
the contents.
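
For illustration only, a rough sketch of the ordering described above might
look like the following (assuming unmap_vcpu_info() were reworked so that it
finds the old mapping without relying on v->vcpu_info and no longer
overwrites it, and leaving the event channel questions aside):

    case VCPUOP_reset_vcpu_info:
    {
        if ( !test_bit(_VPF_down, &v->pause_flags) )
            return -EINVAL;

        domain_lock(d);
        /* Re-point vcpu_info at its boot-time location first ... */
        v->vcpu_info = ((vcpuid < XEN_LEGACY_MAX_VCPUS)
                        ? (vcpu_info_t *)&shared_info(d, vcpu_info[vcpuid])
                        : &dummy_vcpu_info);
        /* ... then drop the mapping of the previously registered page. */
        unmap_vcpu_info(v);
        domain_unlock(d);
        rc = 0;
        break;
    }

Here d is the domain variable already in scope in do_vcpu_op(),
dummy_vcpu_info is the file-scope placeholder used for vCPUs without a
legacy shared_info slot, and whether -EINVAL is the right error code is
left open.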

> --- a/xen/include/public/vcpu.h
> +++ b/xen/include/public/vcpu.h
> @@ -227,6 +227,12 @@ struct vcpu_register_time_memory_area {
>  typedef struct vcpu_register_time_memory_area 
> vcpu_register_time_memory_area_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
>  
> +/*
> + * Reset all of the vcpu_info information from their previous location
> + * to the default one used at bootup.
> + */
> +#define VCPUOP_reset_vcpu_info   14

Without explicitly explaining all the limitations this shouldn't be put in
a public header.

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:09 ` [PATCH RFC 0/1] " Jan Beulich
@ 2014-08-06 14:49   ` Konrad Rzeszutek Wilk
  2014-08-06 14:54     ` Jan Beulich
  2014-08-06 14:51   ` Konrad Rzeszutek Wilk
  2014-08-06 15:03   ` Vitaly Kuznetsov
  2 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 14:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Vitaly Kuznetsov, Stefano Stabellini, Andrew Jones,
	David Vrabel

On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
> >>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
> >          * This path is called twice on PVHVM - first during bootup via
> >          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
> >          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
> > -        * As we can only do the VCPUOP_register_vcpu_info once lets
> > -        * not over-write its result.
> >          *
> >          * For PV it is called during restore (xen_vcpu_restore) and bootup
> >          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
> > @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
> >         info.mfn = arbitrary_virt_to_mfn(vcpup);
> >         info.offset = offset_in_page(vcpup);
> >  
> > +       /*
> > +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, this
> > +        * is required if we boot after kexec.
> > +        */
> > +
> > +       if (cpu != 0) {
> > +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
> > +               if (err)
> > +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: %d\n",
> > +                               cpu, err);
> > +       }
> 
> Just for my understanding of why exactly you need the new operation:
> Why is this being done here, when you already do the reset in the
> cpu-die/shutdown paths? And why not for CPU 0?
> 
> Furthermore, what is the state of vCPU-s beyond 31 going to be after
> they got their vCPU info reset? They won't have any other area as
> fallback. Yet I don't think you can now and forever guarantee that
> native_cpu_die() won't do anything requiring that structure.

native_cpu_die() just spins around in an infinite halt sequence. The
call could be replaced with a VCPUOP_down call instead, something like:

	if (xen_teardown_cpu(cpu) == -ENOSYS)
		native_cpu_die(cpu);

	(void)HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL);


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:09 ` [PATCH RFC 0/1] " Jan Beulich
  2014-08-06 14:49   ` Konrad Rzeszutek Wilk
@ 2014-08-06 14:51   ` Konrad Rzeszutek Wilk
  2014-08-06 15:00     ` Jan Beulich
  2014-08-06 15:03   ` Vitaly Kuznetsov
  2 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 14:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Vitaly Kuznetsov, Stefano Stabellini, Andrew Jones,
	David Vrabel

On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
> >>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
> >          * This path is called twice on PVHVM - first during bootup via
> >          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
> >          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
> > -        * As we can only do the VCPUOP_register_vcpu_info once lets
> > -        * not over-write its result.
> >          *
> >          * For PV it is called during restore (xen_vcpu_restore) and bootup
> >          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
> > @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
> >         info.mfn = arbitrary_virt_to_mfn(vcpup);
> >         info.offset = offset_in_page(vcpup);
> >  
> > +       /*
> > +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, this
> > +        * is required if we boot after kexec.
> > +        */
> > +
> > +       if (cpu != 0) {
> > +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
> > +               if (err)
> > +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: %d\n",
> > +                               cpu, err);
> > +       }
> 
> Just for my understanding of why exactly you need the new operation:
> Why is this being done here, when you already do the reset in the
> cpu-die/shutdown paths? And why not for CPU 0?

For CPU0 it will always error out in the hypervisor (as we can't do the
hypercall on v != current, since we are doing this on the BSP path).

This path could also be taken when doing a kdump instead of a kexec -
hence the reset hadn't happened in the cpu-die/shutdown paths.

> 
> Furthermore, what is the state of vCPU-s beyond 31 going to be after
> they got their vCPU info reset? They won't have any other area as
> fallback. Yet I don't think you can now and forever guarantee that
> native_cpu_die() won't do anything requiring that structure.
> 
> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:49   ` Konrad Rzeszutek Wilk
@ 2014-08-06 14:54     ` Jan Beulich
  2014-08-06 15:03       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 14:54 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Andrew Jones, Vitaly Kuznetsov, David Vrabel,
	Stefano Stabellini

>>> On 06.08.14 at 16:49, <konrad.wilk@oracle.com> wrote:
> On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
>> Just for my understanding of why exactly you need the new operation:
>> Why is this being done here, when you already do the reset in the
>> cpu-die/shutdown paths? And why not for CPU 0?
>> 
>> Furthermore, what is the state of vCPU-s beyond 31 going to be after
>> they got their vCPU info reset? They won't have any other area as
>> fallback. Yet I don't think you can now and forever guarantee that
>> native_cpu_die() won't do anything requiring that structure.
> 
> native_cpu_die() just spins around in an infinite halt sequence. The
> call could be replaced with a VCPUOP_down call instead, something like:
> 
> 	if (xen_teardown_cpu(cpu) == -ENOSYS)
> 		native_cpu_die(cpu);
> 
> 	(void)HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL);

But the suggested new hypercall requires the vCPU to be down in
order to allow the reset...

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:51   ` Konrad Rzeszutek Wilk
@ 2014-08-06 15:00     ` Jan Beulich
  2014-08-06 15:02       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 15:00 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Andrew Jones, Vitaly Kuznetsov, David Vrabel,
	Stefano Stabellini

>>> On 06.08.14 at 16:51, <konrad.wilk@oracle.com> wrote:
> On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
>> >>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
>> > --- a/arch/x86/xen/enlighten.c
>> > +++ b/arch/x86/xen/enlighten.c
>> > @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
>> >          * This path is called twice on PVHVM - first during bootup via
>> >          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
>> >          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
>> > -        * As we can only do the VCPUOP_register_vcpu_info once lets
>> > -        * not over-write its result.
>> >          *
>> >          * For PV it is called during restore (xen_vcpu_restore) and bootup
>> >          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
>> > @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
>> >         info.mfn = arbitrary_virt_to_mfn(vcpup);
>> >         info.offset = offset_in_page(vcpup);
>> >  
>> > +       /*
>> > +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, 
> this
>> > +        * is required if we boot after kexec.
>> > +        */
>> > +
>> > +       if (cpu != 0) {
>> > +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, 
> NULL);
>> > +               if (err)
>> > +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: 
> %d\n",
>> > +                               cpu, err);
>> > +       }
>> 
>> Just for my understanding of why exactly you need the new operation:
>> Why is this being done here, when you already do the reset in the
>> cpu-die/shutdown paths? And why not for CPU 0?
> 
> For CPU0 it will always error out in the hypervisor (as we can't do the
> hypercall on v != current, since we are doing this on the BSP path).
> 
> This path could also be taken when doing a kdump instead of a kexec -
> hence the reset hadn't happened in the cpu-die/shutdown paths.

But that's unsafe, as the hypervisor would continue to write fields
in that structure if it finds a need to.

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 15:00     ` Jan Beulich
@ 2014-08-06 15:02       ` Konrad Rzeszutek Wilk
  2014-08-06 15:38         ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 15:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Jones, Vitaly Kuznetsov, David Vrabel,
	Stefano Stabellini

On Wed, Aug 06, 2014 at 04:00:12PM +0100, Jan Beulich wrote:
> >>> On 06.08.14 at 16:51, <konrad.wilk@oracle.com> wrote:
> > On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
> >> >>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
> >> > --- a/arch/x86/xen/enlighten.c
> >> > +++ b/arch/x86/xen/enlighten.c
> >> > @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
> >> >          * This path is called twice on PVHVM - first during bootup via
> >> >          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
> >> >          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
> >> > -        * As we can only do the VCPUOP_register_vcpu_info once lets
> >> > -        * not over-write its result.
> >> >          *
> >> >          * For PV it is called during restore (xen_vcpu_restore) and bootup
> >> >          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
> >> > @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
> >> >         info.mfn = arbitrary_virt_to_mfn(vcpup);
> >> >         info.offset = offset_in_page(vcpup);
> >> >  
> >> > +       /*
> >> > +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, 
> > this
> >> > +        * is required if we boot after kexec.
> >> > +        */
> >> > +
> >> > +       if (cpu != 0) {
> >> > +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, 
> > NULL);
> >> > +               if (err)
> >> > +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: 
> > %d\n",
> >> > +                               cpu, err);
> >> > +       }
> >> 
> >> Just for my understanding of why exactly you need the new operation:
> >> Why is this being done here, when you already do the reset in the
> >> cpu-die/shutdown paths? And why not for CPU 0?
> > 
> > For CPU0 it will always error out in the hypervisor (as we can't do the
> > hypercall on v != current, since we are doing this on the BSP path).
> > 
> > This path could also be taken when doing a kdump instead of a kexec -
> > hence the reset hadn't happened in the cpu-die/shutdown paths.
> 
> But that's unsafe, as the hypervisor would continue to write fields
> in that structure if it finds a need to.

That is OK. The kdump kernel boots from a reserved region, so the
area the hypervisor keeps on writing to is in the 'outside-kdump-kernel'
boot area.
> 
> Jan
> 


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:54     ` Jan Beulich
@ 2014-08-06 15:03       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 15:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Jones, Vitaly Kuznetsov, David Vrabel,
	Stefano Stabellini

On Wed, Aug 06, 2014 at 03:54:57PM +0100, Jan Beulich wrote:
> >>> On 06.08.14 at 16:49, <konrad.wilk@oracle.com> wrote:
> > On Wed, Aug 06, 2014 at 03:09:43PM +0100, Jan Beulich wrote:
> >> Just for my understanding of why exactly you need the new operation:
> >> Why is this being done here, when you already do the reset in the
> >> cpu-die/shutdown paths? And why not for CPU 0?
> >> 
> >> Furthermore, what is the state of vCPU-s beyond 31 going to be after
> >> they got their vCPU info reset? They won't have any other area as
> >> fallback. Yet I don't think you can now and forever guarantee that
> >> native_cpu_die() won't do anything requiring that structure.
> > 
> > native_cpu_die() just spins around in an infinite halt sequence. The
> > call could be replaced with a VCPUOP_down call instead, something like:
> > 
> > 	if (xen_teardown_cpu(cpu) == -ENOSYS)
> > 		native_cpu_die(cpu);
> > 
> > 	(void)HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL);
> 
> But the suggested new hypercall requires the vCPU to be down in
> order to allow the reset...

Duh! I wrote it and I should have remembered that :-) It is an RFC for a reason!
> 
> Jan
> 


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 14:09 ` [PATCH RFC 0/1] " Jan Beulich
  2014-08-06 14:49   ` Konrad Rzeszutek Wilk
  2014-08-06 14:51   ` Konrad Rzeszutek Wilk
@ 2014-08-06 15:03   ` Vitaly Kuznetsov
  2014-08-06 15:43     ` Jan Beulich
  2 siblings, 1 reply; 16+ messages in thread
From: Vitaly Kuznetsov @ 2014-08-06 15:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 06.08.14 at 15:08, <vkuznets@redhat.com> wrote:
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -183,8 +183,6 @@ static void xen_vcpu_setup(int cpu)
>>          * This path is called twice on PVHVM - first during bootup via
>>          * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
>>          * hotplugged: cpu_up -> xen_hvm_cpu_notify.
>> -        * As we can only do the VCPUOP_register_vcpu_info once lets
>> -        * not over-write its result.
>>          *
>>          * For PV it is called during restore (xen_vcpu_restore) and bootup
>>          * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
>> @@ -207,14 +205,23 @@ static void xen_vcpu_setup(int cpu)
>>         info.mfn = arbitrary_virt_to_mfn(vcpup);
>>         info.offset = offset_in_page(vcpup);
>>  
>> +       /*
>> +        * Call VCPUOP_reset_vcpu_info before VCPUOP_register_vcpu_info, this
>> +        * is required if we boot after kexec.
>> +        */
>> +
>> +       if (cpu != 0) {
>> +               err = HYPERVISOR_vcpu_op(VCPUOP_reset_vcpu_info, cpu, NULL);
>> +               if (err)
>> +                       pr_warn("VCPUOP_reset_vcpu_info for CPU%d failed: %d\n",
>> +                               cpu, err);
>> +       }
>
> Just for my understanding of why exactly you need the new operation:
> Why is this being done here, when you already do the reset in the
> cpu-die/shutdown paths? 

We can avoid doing it here if we add the VCPUOP_reset_vcpu_info call to
the kdump handler in addition to the kexec path.
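
Just as a hypothetical sketch (the hook name below is made up, and as
discussed elsewhere in the thread the reset only succeeds for vCPUs that
are marked down), that could be something along the lines of:

	/* Reset the secondary CPUs' vcpu_info registrations from the
	 * crash/kdump path as well, so the next kernel starts from the
	 * boot-time layout. */
	static void xen_kdump_reset_vcpu_info(void)
	{
		unsigned int cpu;

		for_each_online_cpu(cpu) {
			if (cpu == smp_processor_id())
				continue;
			xen_teardown_vcpu_setup(cpu);
		}
	}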

> And why not for CPU 0?
>

Because the suggested op will fail for already running CPU0..

> Furthermore, what is the state of vCPU-s beyond 31 going to be after
> they got their vCPU info reset? They won't have any other area as
> fallback.

In xen their vcpu_info will point to dummy_vcpu_info (that's what we
have there before VCPUOP_register_vcpu_info I guess). I tested kexec
with 64 vcpus case, no issues noticed.

> Yet I don't think you can now and forever guarantee that
> native_cpu_die() won't do anything requiring that structure.

I agree with Konrad: we can exclude native_cpu_die(), as it does
almost nothing (or call VCPUOP_reset_vcpu_info after it).

Thank you for your comments (especially for your other email), I'll try
addressing them.

-- 
  Vitaly


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 15:02       ` Konrad Rzeszutek Wilk
@ 2014-08-06 15:38         ` Jan Beulich
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 15:38 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Andrew Jones, Vitaly Kuznetsov, David Vrabel,
	Stefano Stabellini

>>> On 06.08.14 at 17:02, <konrad.wilk@oracle.com> wrote:
> On Wed, Aug 06, 2014 at 04:00:12PM +0100, Jan Beulich wrote:
>> >>> On 06.08.14 at 16:51, <konrad.wilk@oracle.com> wrote:
>> > This path could also be taken when doing a kdump instead of a kexec -
>> > hence the reset hadn't happened in the cpu-die/shutdown paths.
>> 
>> But that's unsafe, as the hypervisor would continue to write fields
>> in that structure if it finds a need to.
> 
> That is OK. The kdump kernel boots from a reserved region so the
> area where the hypervisor keeps on writting is in the 'outside-kdump-kernel'
> boot area.

But it may invalidate what ends up in the dump.

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 15:03   ` Vitaly Kuznetsov
@ 2014-08-06 15:43     ` Jan Beulich
  2014-08-07 11:41       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-08-06 15:43 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

>>> On 06.08.14 at 17:03, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
>> And why not for CPU 0?
>>
> 
> Because the suggested op will fail for already running CPU0..

Sure, but how do you deal with that?

>> Furthermore, what is the state of vCPU-s beyond 31 going to be after
>> they got their vCPU info reset? They won't have any other area as
>> fallback.
> 
> In xen their vcpu_info will point to dummy_vcpu_info (that's what we
> have there before VCPUOP_register_vcpu_info I guess).

Right, but such a vCPU can't be brought up. Did you verify that
this holds with your additions?

> I tested kexec with 64 vcpus case, no issues noticed.

That's purely observation based then, in an environment where
(for the specific operation here) not much can go wrong. But I
guess we'll see how things are once you properly document all
limitations.

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-06 15:43     ` Jan Beulich
@ 2014-08-07 11:41       ` Vitaly Kuznetsov
  2014-08-07 11:48         ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Vitaly Kuznetsov @ 2014-08-07 11:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 06.08.14 at 17:03, <vkuznets@redhat.com> wrote:
>> "Jan Beulich" <JBeulich@suse.com> writes:
>>> And why not for CPU 0?
>>>
>> 
>> Because the suggested op will fail for already running CPU0..
>
> Sure, but how do you deal with that?
>

Currently PVHVM guests don't call VCPUOP_register_vcpu_info for CPU0 on
boot, so it always points to shared_info and there is no need to do
VCPUOP_reset_vcpu_info.

>>> Furthermore, what is the state of vCPU-s beyond 31 going to be after
>>> they got their vCPU info reset? They won't have any other area as
>>> fallback.
>> 
>> In xen their vcpu_info will point to dummy_vcpu_info (that's what we
>> have there before VCPUOP_register_vcpu_info I guess).
>
> Right, but such a vCPU can't be brought up. Did you verify that
> this holds with your additions?

That's fine, as the new kernel will call VCPUOP_register_vcpu_info on
bootup.

>
>> I tested kexec with 64 vcpus case, no issues noticed.
>
> That's purely observation based then, in an environment where
> (for the specific operation here) not much can go wrong. But I guess
> we'll see how things are once you properly document all limitations.

Actually my intention is to bring the state of affairs back to the point
it was at before the first kernel booted (so the newly booted kernel will
succeed). vCPUs >= XEN_LEGACY_MAX_VCPUS have their vcpu_info pointing to
dummy_vcpu_info before the first VCPUOP_register_vcpu_info call, so let's
say that is the desired state when doing kexec.
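
For reference, the boot-time default being described here is, roughly, what
the hypervisor sets up when a vCPU is allocated in xen/common/domain.c (the
exact code may differ between versions):

    v->vcpu_info = ((vcpu_id < XEN_LEGACY_MAX_VCPUS)
                    ? (vcpu_info_t *)&shared_info(d, vcpu_info[vcpu_id])
                    : &dummy_vcpu_info);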

>
> Jan

-- 
  Vitaly


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-07 11:41       ` Vitaly Kuznetsov
@ 2014-08-07 11:48         ` Jan Beulich
  2014-08-07 11:54           ` Vitaly Kuznetsov
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-08-07 11:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

>>> On 07.08.14 at 13:41, <vkuznets@redhat.com> wrote:
> "Jan Beulich" <JBeulich@suse.com> writes:
>>>>> On 06.08.14 at 17:03, <vkuznets@redhat.com> wrote:
>>> In xen their vcpu_info will point to dummy_vcpu_info (that's what we
>>> have there before VCPUOP_register_vcpu_info I guess).
>>
>> Right, but such a vCPU can't be brought up. Did you verify that
>> this holds with your additions?
> 
> That's fine, as the new kernel will call VCPUOP_register_vcpu_info on
> bootup.

But my question was targeting the hypervisor side, i.e. I was
asking that you get away from your "works fine in my limited use
case" way of thinking, and rather check that you don't subtly
break anything else.

Jan


* Re: [PATCH RFC 0/1] Introduce VCPUOP_reset_vcpu_info
  2014-08-07 11:48         ` Jan Beulich
@ 2014-08-07 11:54           ` Vitaly Kuznetsov
  0 siblings, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2014-08-07 11:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Jones, David Vrabel, Stefano Stabellini

"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 07.08.14 at 13:41, <vkuznets@redhat.com> wrote:
>> "Jan Beulich" <JBeulich@suse.com> writes:
>>>>>> On 06.08.14 at 17:03, <vkuznets@redhat.com> wrote:
>>>> In xen their vcpu_info will point to dummy_vcpu_info (that's what we
>>>> have there before VCPUOP_register_vcpu_info I guess).
>>>
>>> Right, but such a vCPU can't be brought up. Did you verify that
>>> this holds with your additions?
>> 
>> That's fine, as the new kernel will call VCPUOP_register_vcpu_info on
>> bootup.
>
> But my question was targeting the hypervisor side, i.e. I was
> asking that you get away from your "works fine in my limited use
> case" way of thinking, and rather check that you don't subtly
> break anything else.

Thanks, that's fine, that's what the RFC is about. I'll definitely try
looking deeper and I'll be grateful for some examples of what can
actually go wrong.

-- 
  Vitaly

