* [PATCH v1 0/3] x86/xen: pvclock vdso support @ 2017-01-25 17:33 Joao Martins 2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Joao Martins @ 2017-01-25 17:33 UTC (permalink / raw) To: linux-kernel, xen-devel Cc: Joao Martins, Boris Ostrovsky, Juergen Gross, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Andy Lutomirski Hey, This small series presents support for vDSO for Xen domains. PVCLOCK_TSC_STABLE_BIT can be set starting Xen 4.8 which is required for vdso time related calls. In order to have it on, you need to have the hypervisor clocksource be TSC e.g. with the following boot params "clocksource=tsc tsc=stable:socket". Series is structured as following: Patch 1 streamlines pvti page get/set in pvclock for both of its users Patch 2 registers the pvti page on Xen and sets it in pvclock accordingly and Patch 3 (new in this version) adds an entry to maintainers for tracking pvclock ABI changes. Changelog since RFC is included in individual patches. Any comments/suggestions are welcome. Thanks, Joao Joao Martins (3): x86/pvclock: add setter for pvclock_pvti_cpu0_va x86/xen/time: setup vcpu 0 time info page MAINTAINERS: xen, kvm: track pvclock-abi.h changes MAINTAINERS | 2 ++ arch/x86/include/asm/pvclock.h | 22 ++++++++++-------- arch/x86/kernel/kvmclock.c | 6 +---- arch/x86/kernel/pvclock.c | 13 +++++++++++ arch/x86/xen/enlighten.c | 2 ++ arch/x86/xen/time.c | 51 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/xen/xen-ops.h | 1 + include/xen/interface/vcpu.h | 28 +++++++++++++++++++++++ 8 files changed, 111 insertions(+), 14 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va 2017-01-25 17:33 [PATCH v1 0/3] x86/xen: pvclock vdso support Joao Martins @ 2017-01-25 17:33 ` Joao Martins 2017-01-26 17:25 ` Andy Lutomirski 2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins 2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins 2 siblings, 1 reply; 12+ messages in thread From: Joao Martins @ 2017-01-25 17:33 UTC (permalink / raw) To: linux-kernel, kvm Cc: Joao Martins, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Paolo Bonzini, Radim Krcmar, Andy Lutomirski, xen-devel Right now there is only a pvclock_pvti_cpu0_va() which is defined on kvmclock since: commit dac16fba6fc5 ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap") The only user of this interface was kvm. This commit moves pvclock_pvti_cpu0_va to pvclock which is a more generic place to have it and adds the correspondent setter routine for it. This allows other pvclock-based clocksources to use it, such as Xen. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- Changes since RFC: (Comments from Andy Lutomirski) * Add WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)) to pvclock_set_pvti_cpu0_va --- arch/x86/include/asm/pvclock.h | 22 +++++++++++++--------- arch/x86/kernel/kvmclock.c | 6 +----- arch/x86/kernel/pvclock.c | 13 +++++++++++++ 3 files changed, 27 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h index 448cfe1..58399e1 100644 --- a/arch/x86/include/asm/pvclock.h +++ b/arch/x86/include/asm/pvclock.h @@ -4,15 +4,6 @@ #include <linux/clocksource.h> #include <asm/pvclock-abi.h> -#ifdef CONFIG_KVM_GUEST -extern struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void); -#else -static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) -{ - return NULL; -} -#endif - /* some helper functions for xen and kvm pv clock sources */ u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src); u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src); @@ -101,4 +92,17 @@ struct pvclock_vsyscall_time_info { #define PVTI_SIZE sizeof(struct pvclock_vsyscall_time_info) +#ifdef CONFIG_PARAVIRT_CLOCK +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti); +struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void); +#else +static inline void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti) +{ +} +static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) +{ + return NULL; +} +#endif + #endif /* _ASM_X86_PVCLOCK_H */ diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index 2a5cafd..9dfbb79 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -45,11 +45,6 @@ early_param("no-kvmclock", parse_no_kvmclock); static struct pvclock_vsyscall_time_info *hv_clock; static struct pvclock_wall_clock wall_clock; -struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) -{ - return hv_clock; -} - /* * The wallclock is the time of day when we booted. Since then, some time may * have elapsed since the hypervisor wrote the data. So we try to account for @@ -330,6 +325,7 @@ int __init kvm_setup_vsyscall_timeinfo(void) return 1; } + pvclock_set_pvti_cpu0_va(hv_clock); put_cpu(); kvm_clock.archdata.vclock_mode = VCLOCK_PVCLOCK; diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c index 9e93fe5..b281060 100644 --- a/arch/x86/kernel/pvclock.c +++ b/arch/x86/kernel/pvclock.c @@ -23,8 +23,10 @@ #include <linux/bootmem.h> #include <asm/fixmap.h> #include <asm/pvclock.h> +#include <asm/vgtod.h> static u8 valid_flags __read_mostly = 0; +static struct pvclock_vsyscall_time_info *pvti_cpu0_va __read_mostly = NULL; void pvclock_set_flags(u8 flags) { @@ -142,3 +144,14 @@ void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock, set_normalized_timespec(ts, now.tv_sec, now.tv_nsec); } + +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti) +{ + WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)); + pvti_cpu0_va = pvti; +} + +struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) +{ + return pvti_cpu0_va; +} -- 2.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va 2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins @ 2017-01-26 17:25 ` Andy Lutomirski 2017-01-26 19:58 ` Joao Martins 0 siblings, 1 reply; 12+ messages in thread From: Andy Lutomirski @ 2017-01-26 17:25 UTC (permalink / raw) To: Joao Martins Cc: linux-kernel, kvm list, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, X86 ML, Paolo Bonzini, Radim Krcmar, xen-devel On Wed, Jan 25, 2017 at 9:33 AM, Joao Martins <joao.m.martins@oracle.com> wrote: > Right now there is only a pvclock_pvti_cpu0_va() which is defined > on kvmclock since: > > commit dac16fba6fc5 > ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap") > > The only user of this interface was kvm. This commit moves > pvclock_pvti_cpu0_va to pvclock which is a more generic place to have it > and adds the correspondent setter routine for it. This allows other > pvclock-based clocksources to use it, such as Xen. With a minor nit: Acked-by: Andy Lutomirski <luto@kernel.org> > +#else > +static inline void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti) > +{ > +} How about just not providing pvclock_set_pvti_cpu0_va() in this case? It'll save three lines of code, and, more importantly, it will force us to notice if we screw up the Kconfig stuff. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va 2017-01-26 17:25 ` Andy Lutomirski @ 2017-01-26 19:58 ` Joao Martins 0 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2017-01-26 19:58 UTC (permalink / raw) To: Andy Lutomirski Cc: linux-kernel, kvm list, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, X86 ML, Paolo Bonzini, Radim Krcmar, xen-devel On 01/26/2017 05:25 PM, Andy Lutomirski wrote: > On Wed, Jan 25, 2017 at 9:33 AM, Joao Martins <joao.m.martins@oracle.com> wrote: >> Right now there is only a pvclock_pvti_cpu0_va() which is defined >> on kvmclock since: >> >> commit dac16fba6fc5 >> ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap") >> >> The only user of this interface was kvm. This commit moves >> pvclock_pvti_cpu0_va to pvclock which is a more generic place to have it >> and adds the correspondent setter routine for it. This allows other >> pvclock-based clocksources to use it, such as Xen. > > With a minor nit: > > Acked-by: Andy Lutomirski <luto@kernel.org> > >> +#else >> +static inline void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti) >> +{ >> +} > > How about just not providing pvclock_set_pvti_cpu0_va() in this case? > It'll save three lines of code, and, more importantly, it will force > us to notice if we screw up the Kconfig stuff. Sounds good, will remove this then. Thanks! Joao ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page 2017-01-25 17:33 [PATCH v1 0/3] x86/xen: pvclock vdso support Joao Martins 2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins @ 2017-01-25 17:33 ` Joao Martins 2017-01-25 19:26 ` Boris Ostrovsky 2017-01-26 17:18 ` Andy Lutomirski 2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins 2 siblings, 2 replies; 12+ messages in thread From: Joao Martins @ 2017-01-25 17:33 UTC (permalink / raw) To: linux-kernel, xen-devel Cc: Joao Martins, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Boris Ostrovsky, Juergen Gross, Andy Lutomirski In order to support pvclock vdso on xen we need to setup the time info page for vcpu 0 and register the page with Xen using the VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall will also forcefully update the pvti which will set some of the necessary flags for vdso. Afterwards we check if it supports the PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having vdso/vsyscall support. And if so, it will set the cpu 0 pvti that will be later on used when mapping the vdso image. The xen headers are also updated to include the new hypercall for registering the secondary vcpu_time_info struct. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- Changes since RFC: (Comments from Boris and David) * Remove Kconfig option * Use get_zeroed_page/free/page * Remove the hypercall availability check * Unregister pvti with arg.addr.v = NULL if stable bit isn't supported. (New) * Set secondary copy on restore such that it works on migration. * Drop global xen_clock variable and stash it locally on xen_setup_vsyscall_time_info. * WARN_ON(ret) if we fail to unregister the pvti. --- arch/x86/xen/enlighten.c | 2 ++ arch/x86/xen/time.c | 51 ++++++++++++++++++++++++++++++++++++++++++++ arch/x86/xen/xen-ops.h | 1 + include/xen/interface/vcpu.h | 28 ++++++++++++++++++++++++ 4 files changed, 82 insertions(+) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 51ef952..15d271d 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -270,6 +270,8 @@ void xen_vcpu_restore(void) HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL)) BUG(); } + + xen_setup_vsyscall_time_info(0); } static void __init xen_banner(void) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 1e69956..e90f703 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -367,6 +367,56 @@ static const struct pv_time_ops xen_time_ops __initconst = { .steal_clock = xen_steal_clock, }; +int xen_setup_vsyscall_time_info(int cpu) +{ + struct pvclock_vsyscall_time_info *xen_clock; + struct vcpu_register_time_memory_area t; + struct pvclock_vcpu_time_info *pvti; + unsigned long addr; + u8 flags; + int ret; + + addr = get_zeroed_page(GFP_KERNEL); + if (!addr) + return -ENOMEM; + + xen_clock = (struct pvclock_vsyscall_time_info *) addr; + memset(xen_clock, 0, PAGE_SIZE); + + t.addr.v = &xen_clock->pvti; + + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, + cpu, &t); + + if (ret) { + pr_debug("xen: cannot register vcpu_time_info err %d\n", ret); + free_page(addr); + return ret; + } + + pvti = &xen_clock->pvti; + flags = pvti->flags; + + if (!(flags & PVCLOCK_TSC_STABLE_BIT)) { + t.addr.v = NULL; + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, + cpu, &t); + if (!ret) + free_page(addr); + + WARN_ON(ret); + pr_debug("xen: VCLOCK_PVCLOCK not supported\n"); + return -ENOTSUPP; + } + + pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); + pvclock_set_pvti_cpu0_va(xen_clock); + + xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK; + + return 0; +} + static void __init xen_time_init(void) { int cpu = smp_processor_id(); @@ -393,6 +443,7 @@ static void __init xen_time_init(void) setup_force_cpu_cap(X86_FEATURE_TSC); xen_setup_runstate_info(cpu); + xen_setup_vsyscall_time_info(cpu); xen_setup_timer(cpu); xen_setup_cpu_clockevents(); diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index ac0a2b0..4036d15 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -66,6 +66,7 @@ void __init xen_vmalloc_p2m_tree(void); void xen_init_irq_ops(void); void xen_setup_timer(int cpu); void xen_setup_runstate_info(int cpu); +int xen_setup_vsyscall_time_info(int cpu); void xen_teardown_timer(int cpu); u64 xen_clocksource_read(void); void xen_setup_cpu_clockevents(void); diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h index 98188c8..8da788c 100644 --- a/include/xen/interface/vcpu.h +++ b/include/xen/interface/vcpu.h @@ -178,4 +178,32 @@ DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info); /* Send an NMI to the specified VCPU. @extra_arg == NULL. */ #define VCPUOP_send_nmi 11 + +/* + * Register a memory location to get a secondary copy of the vcpu time + * parameters. The master copy still exists as part of the vcpu shared + * memory area, and this secondary copy is updated whenever the master copy + * is updated (and using the same versioning scheme for synchronisation). + * + * The intent is that this copy may be mapped (RO) into userspace so + * that usermode can compute system time using the time info and the + * tsc. Usermode will see an array of vcpu_time_info structures, one + * for each vcpu, and choose the right one by an existing mechanism + * which allows it to get the current vcpu number (such as via a + * segment limit). It can then apply the normal algorithm to compute + * system time from the tsc. + * + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure. + */ +#define VCPUOP_register_vcpu_time_memory_area 13 +DEFINE_GUEST_HANDLE_STRUCT(vcpu_time_info_t); +struct vcpu_register_time_memory_area { + union { + GUEST_HANDLE(vcpu_time_info_t) h; + struct pvclock_vcpu_time_info *v; + uint64_t p; + } addr; +}; +DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_time_memory_area_t); + #endif /* __XEN_PUBLIC_VCPU_H__ */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page 2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins @ 2017-01-25 19:26 ` Boris Ostrovsky 2017-01-26 13:22 ` Joao Martins 2017-01-26 17:18 ` Andy Lutomirski 1 sibling, 1 reply; 12+ messages in thread From: Boris Ostrovsky @ 2017-01-25 19:26 UTC (permalink / raw) To: Joao Martins, linux-kernel, xen-devel Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Juergen Gross, Andy Lutomirski On 01/25/2017 12:33 PM, Joao Martins wrote: > In order to support pvclock vdso on xen we need to setup the time > info page for vcpu 0 and register the page with Xen using the > VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall > will also forcefully update the pvti which will set some of the > necessary flags for vdso. Afterwards we check if it supports the > PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having > vdso/vsyscall support. And if so, it will set the cpu 0 pvti that > will be later on used when mapping the vdso image. > > The xen headers are also updated to include the new hypercall for > registering the secondary vcpu_time_info struct. > > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > --- > Changes since RFC: > (Comments from Boris and David) > * Remove Kconfig option > * Use get_zeroed_page/free/page > * Remove the hypercall availability check > * Unregister pvti with arg.addr.v = NULL if stable bit isn't supported. > (New) > * Set secondary copy on restore such that it works on migration. > * Drop global xen_clock variable and stash it locally on > xen_setup_vsyscall_time_info. > * WARN_ON(ret) if we fail to unregister the pvti. > --- > arch/x86/xen/enlighten.c | 2 ++ > arch/x86/xen/time.c | 51 ++++++++++++++++++++++++++++++++++++++++++++ > arch/x86/xen/xen-ops.h | 1 + > include/xen/interface/vcpu.h | 28 ++++++++++++++++++++++++ > 4 files changed, 82 insertions(+) > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index 51ef952..15d271d 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -270,6 +270,8 @@ void xen_vcpu_restore(void) This is called for PV only. What about HVM? > HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL)) > BUG(); > } > + > + xen_setup_vsyscall_time_info(0); Do we need to tear down time memory area on VCPU suspend? > } > > static void __init xen_banner(void) > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > index 1e69956..e90f703 100644 > --- a/arch/x86/xen/time.c > +++ b/arch/x86/xen/time.c > @@ -367,6 +367,56 @@ static const struct pv_time_ops xen_time_ops __initconst = { > .steal_clock = xen_steal_clock, > }; > > +int xen_setup_vsyscall_time_info(int cpu) > +{ > + struct pvclock_vsyscall_time_info *xen_clock; > + struct vcpu_register_time_memory_area t; > + struct pvclock_vcpu_time_info *pvti; > + unsigned long addr; > + u8 flags; > + int ret; > + > + addr = get_zeroed_page(GFP_KERNEL); > + if (!addr) > + return -ENOMEM; > + > + xen_clock = (struct pvclock_vsyscall_time_info *) addr; > + memset(xen_clock, 0, PAGE_SIZE); You don't really need addr and there is no reason to memset the page to zero, given that you got it with get_zeroed_page(). > + > + t.addr.v = &xen_clock->pvti; > + > + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, > + cpu, &t); > + > + if (ret) { > + pr_debug("xen: cannot register vcpu_time_info err %d\n", ret); pr_warn() would be more appropriate I think. You also have blank line before 'if'. > + free_page(addr); > + return ret; > + } > + > + pvti = &xen_clock->pvti; > + flags = pvti->flags; I don't think you need these, given that you only reference flags once below. > + > + if (!(flags & PVCLOCK_TSC_STABLE_BIT)) { > + t.addr.v = NULL; > + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, > + cpu, &t); > + if (!ret) > + free_page(addr); > + > + WARN_ON(ret); Did someone ask for WARN_ON()? (from your changelog it looks like it could have been either David or me). I think pr_warn() is sufficient, just like above. > + pr_debug("xen: VCLOCK_PVCLOCK not supported\n"); pr_notice() -boris > + return -ENOTSUPP; > + } > + > + pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); > + pvclock_set_pvti_cpu0_va(xen_clock); > + > + xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK; > + > + return 0; > +} > + > static void __init xen_time_init(void) > { > int cpu = smp_processor_id(); > @@ -393,6 +443,7 @@ static void __init xen_time_init(void) > setup_force_cpu_cap(X86_FEATURE_TSC); > > xen_setup_runstate_info(cpu); > + xen_setup_vsyscall_time_info(cpu); > xen_setup_timer(cpu); > xen_setup_cpu_clockevents(); > > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h > index ac0a2b0..4036d15 100644 > --- a/arch/x86/xen/xen-ops.h > +++ b/arch/x86/xen/xen-ops.h > @@ -66,6 +66,7 @@ void __init xen_vmalloc_p2m_tree(void); > void xen_init_irq_ops(void); > void xen_setup_timer(int cpu); > void xen_setup_runstate_info(int cpu); > +int xen_setup_vsyscall_time_info(int cpu); > void xen_teardown_timer(int cpu); > u64 xen_clocksource_read(void); > void xen_setup_cpu_clockevents(void); > diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h > index 98188c8..8da788c 100644 > --- a/include/xen/interface/vcpu.h > +++ b/include/xen/interface/vcpu.h > @@ -178,4 +178,32 @@ DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info); > > /* Send an NMI to the specified VCPU. @extra_arg == NULL. */ > #define VCPUOP_send_nmi 11 > + > +/* > + * Register a memory location to get a secondary copy of the vcpu time > + * parameters. The master copy still exists as part of the vcpu shared > + * memory area, and this secondary copy is updated whenever the master copy > + * is updated (and using the same versioning scheme for synchronisation). > + * > + * The intent is that this copy may be mapped (RO) into userspace so > + * that usermode can compute system time using the time info and the > + * tsc. Usermode will see an array of vcpu_time_info structures, one > + * for each vcpu, and choose the right one by an existing mechanism > + * which allows it to get the current vcpu number (such as via a > + * segment limit). It can then apply the normal algorithm to compute > + * system time from the tsc. > + * > + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure. > + */ > +#define VCPUOP_register_vcpu_time_memory_area 13 > +DEFINE_GUEST_HANDLE_STRUCT(vcpu_time_info_t); > +struct vcpu_register_time_memory_area { > + union { > + GUEST_HANDLE(vcpu_time_info_t) h; > + struct pvclock_vcpu_time_info *v; > + uint64_t p; > + } addr; > +}; > +DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_time_memory_area_t); > + > #endif /* __XEN_PUBLIC_VCPU_H__ */ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page 2017-01-25 19:26 ` Boris Ostrovsky @ 2017-01-26 13:22 ` Joao Martins 0 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2017-01-26 13:22 UTC (permalink / raw) To: Boris Ostrovsky Cc: linux-kernel, xen-devel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Juergen Gross, Andy Lutomirski On 01/25/2017 07:26 PM, Boris Ostrovsky wrote: > On 01/25/2017 12:33 PM, Joao Martins wrote: >> In order to support pvclock vdso on xen we need to setup the time >> info page for vcpu 0 and register the page with Xen using the >> VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall >> will also forcefully update the pvti which will set some of the >> necessary flags for vdso. Afterwards we check if it supports the >> PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having >> vdso/vsyscall support. And if so, it will set the cpu 0 pvti that >> will be later on used when mapping the vdso image. >> >> The xen headers are also updated to include the new hypercall for >> registering the secondary vcpu_time_info struct. >> >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >> --- >> Changes since RFC: >> (Comments from Boris and David) >> * Remove Kconfig option >> * Use get_zeroed_page/free/page >> * Remove the hypercall availability check >> * Unregister pvti with arg.addr.v = NULL if stable bit isn't supported. >> (New) >> * Set secondary copy on restore such that it works on migration. >> * Drop global xen_clock variable and stash it locally on >> xen_setup_vsyscall_time_info. >> * WARN_ON(ret) if we fail to unregister the pvti. >> --- >> arch/x86/xen/enlighten.c | 2 ++ >> arch/x86/xen/time.c | 51 ++++++++++++++++++++++++++++++++++++++++++++ >> arch/x86/xen/xen-ops.h | 1 + >> include/xen/interface/vcpu.h | 28 ++++++++++++++++++++++++ >> 4 files changed, 82 insertions(+) >> >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >> index 51ef952..15d271d 100644 >> --- a/arch/x86/xen/enlighten.c >> +++ b/arch/x86/xen/enlighten.c >> @@ -270,6 +270,8 @@ void xen_vcpu_restore(void) > > This is called for PV only. What about HVM? The call is missing from xen_hvm_post_suspend(...), I will add it. > >> HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL)) >> BUG(); >> } >> + >> + xen_setup_vsyscall_time_info(0); > > Do we need to tear down time memory area on VCPU suspend? I also missed that; otherwise I am leaking a page. I could also rework this patch such that the initially allocated xen_clock page is reused and simply register/unregister in save/restore paths. This would probably mean adding one extra helper to register the vcpu_time info and perhaps make xen_setup_vsyscall_time_info a bit simpler. >> } >> >> static void __init xen_banner(void) >> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c >> index 1e69956..e90f703 100644 >> --- a/arch/x86/xen/time.c >> +++ b/arch/x86/xen/time.c >> @@ -367,6 +367,56 @@ static const struct pv_time_ops xen_time_ops __initconst = { >> .steal_clock = xen_steal_clock, >> }; >> >> +int xen_setup_vsyscall_time_info(int cpu) >> +{ >> + struct pvclock_vsyscall_time_info *xen_clock; >> + struct vcpu_register_time_memory_area t; >> + struct pvclock_vcpu_time_info *pvti; >> + unsigned long addr; >> + u8 flags; >> + int ret; >> + >> + addr = get_zeroed_page(GFP_KERNEL); >> + if (!addr) >> + return -ENOMEM; >> + >> + xen_clock = (struct pvclock_vsyscall_time_info *) addr; >> + memset(xen_clock, 0, PAGE_SIZE); > > You don't really need addr The reason I had this was to avoid to save one cast to unsigned long (on free_page paths). But maybe it's not worth it and looking at the rest of the x86/xen code, this doesn't seem to be the case. I will remove it. > and there is no reason to memset the page to > zero, given that you got it with get_zeroed_page(). Yeap, Fixed. > >> + >> + t.addr.v = &xen_clock->pvti; >> + >> + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, >> + cpu, &t); >> + >> + if (ret) { >> + pr_debug("xen: cannot register vcpu_time_info err %d\n", ret); > > pr_warn() would be more appropriate I think. You also have blank line > before 'if'. Fixed. >> + free_page(addr); >> + return ret; >> + } >> + >> + pvti = &xen_clock->pvti; >> + flags = pvti->flags; > > I don't think you need these, given that you only reference flags once > below. > Indeed, fixed as well. >> + >> + if (!(flags & PVCLOCK_TSC_STABLE_BIT)) { >> + t.addr.v = NULL; >> + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, >> + cpu, &t); >> + if (!ret) >> + free_page(addr); >> + >> + WARN_ON(ret); > > Did someone ask for WARN_ON()? (from your changelog it looks like it > could have been either David or me). I think pr_warn() is sufficient, > just like above. Looking again at previous comments no, this WARN_ON wasn't asked. I can changed it to the similar above as you are suggesting. > >> + pr_debug("xen: VCLOCK_PVCLOCK not supported\n"); > > pr_notice() Fixed it too. > > -boris > >> + return -ENOTSUPP; >> + } >> + >> + pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); >> + pvclock_set_pvti_cpu0_va(xen_clock); >> + >> + xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK; >> + >> + return 0; >> +} >> + >> static void __init xen_time_init(void) >> { >> int cpu = smp_processor_id(); >> @@ -393,6 +443,7 @@ static void __init xen_time_init(void) >> setup_force_cpu_cap(X86_FEATURE_TSC); >> >> xen_setup_runstate_info(cpu); >> + xen_setup_vsyscall_time_info(cpu); >> xen_setup_timer(cpu); >> xen_setup_cpu_clockevents(); >> >> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h >> index ac0a2b0..4036d15 100644 >> --- a/arch/x86/xen/xen-ops.h >> +++ b/arch/x86/xen/xen-ops.h >> @@ -66,6 +66,7 @@ void __init xen_vmalloc_p2m_tree(void); >> void xen_init_irq_ops(void); >> void xen_setup_timer(int cpu); >> void xen_setup_runstate_info(int cpu); >> +int xen_setup_vsyscall_time_info(int cpu); >> void xen_teardown_timer(int cpu); >> u64 xen_clocksource_read(void); >> void xen_setup_cpu_clockevents(void); >> diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h >> index 98188c8..8da788c 100644 >> --- a/include/xen/interface/vcpu.h >> +++ b/include/xen/interface/vcpu.h >> @@ -178,4 +178,32 @@ DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info); >> >> /* Send an NMI to the specified VCPU. @extra_arg == NULL. */ >> #define VCPUOP_send_nmi 11 >> + >> +/* >> + * Register a memory location to get a secondary copy of the vcpu time >> + * parameters. The master copy still exists as part of the vcpu shared >> + * memory area, and this secondary copy is updated whenever the master copy >> + * is updated (and using the same versioning scheme for synchronisation). >> + * >> + * The intent is that this copy may be mapped (RO) into userspace so >> + * that usermode can compute system time using the time info and the >> + * tsc. Usermode will see an array of vcpu_time_info structures, one >> + * for each vcpu, and choose the right one by an existing mechanism >> + * which allows it to get the current vcpu number (such as via a >> + * segment limit). It can then apply the normal algorithm to compute >> + * system time from the tsc. >> + * >> + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure. >> + */ >> +#define VCPUOP_register_vcpu_time_memory_area 13 >> +DEFINE_GUEST_HANDLE_STRUCT(vcpu_time_info_t); >> +struct vcpu_register_time_memory_area { >> + union { >> + GUEST_HANDLE(vcpu_time_info_t) h; >> + struct pvclock_vcpu_time_info *v; >> + uint64_t p; >> + } addr; >> +}; >> +DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_time_memory_area_t); >> + >> #endif /* __XEN_PUBLIC_VCPU_H__ */ > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page 2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins 2017-01-25 19:26 ` Boris Ostrovsky @ 2017-01-26 17:18 ` Andy Lutomirski 1 sibling, 0 replies; 12+ messages in thread From: Andy Lutomirski @ 2017-01-26 17:18 UTC (permalink / raw) To: Joao Martins Cc: linux-kernel, xen-devel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, X86 ML, Boris Ostrovsky, Juergen Gross On Wed, Jan 25, 2017 at 9:33 AM, Joao Martins <joao.m.martins@oracle.com> wrote: > In order to support pvclock vdso on xen we need to setup the time > info page for vcpu 0 and register the page with Xen using the > VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall > will also forcefully update the pvti which will set some of the > necessary flags for vdso. Afterwards we check if it supports the > PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having > vdso/vsyscall support. And if so, it will set the cpu 0 pvti that > will be later on used when mapping the vdso image. > > The xen headers are also updated to include the new hypercall for > registering the secondary vcpu_time_info struct. No objections from me assuming the code is correct. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes 2017-01-25 17:33 [PATCH v1 0/3] x86/xen: pvclock vdso support Joao Martins 2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins 2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins @ 2017-01-25 17:33 ` Joao Martins 2017-01-26 17:25 ` Andy Lutomirski 2017-01-27 14:54 ` Juergen Gross 2 siblings, 2 replies; 12+ messages in thread From: Joao Martins @ 2017-01-25 17:33 UTC (permalink / raw) To: linux-kernel, xen-devel, kvm Cc: Joao Martins, Boris Ostrovsky, Juergen Gross, Paolo Bonzini, Radim Krcmar, Andy Lutomirski This file defines an ABI shared between guest and hypervisor(s) (KVM, Xen) and as such there should be an correspondent entry in MAINTAINERS file. Notice that there's already a text notice at the top of the header file, hence this commit simply enforces it more explicitly and have both peers noticed when such changes happen. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- This was suggested by folks at xen-devel as we missed some of the ABI additions (e.g. flags field in pvti, TSC stable bit) - so this patch is to help preventing that from happening. Alternatively I could instead add a "PVCLOCK ABI" section in this file with the two mailing lists. --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 26edd83..c4315d1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7041,6 +7041,7 @@ F: Documentation/virtual/kvm/ F: arch/*/kvm/ F: arch/x86/kernel/kvm.c F: arch/x86/kernel/kvmclock.c +F: arch/x86/include/asm/pvclock-abi.h F: arch/*/include/asm/kvm* F: include/linux/kvm* F: include/uapi/linux/kvm* @@ -13483,6 +13484,7 @@ M: Juergen Gross <jgross@suse.com> L: xen-devel@lists.xenproject.org (moderated for non-subscribers) T: git git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git S: Supported +F: arch/x86/include/asm/pvclock-abi.h F: arch/x86/xen/ F: drivers/*/xen-*front.c F: drivers/xen/ -- 2.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes 2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins @ 2017-01-26 17:25 ` Andy Lutomirski 2017-01-26 20:08 ` Joao Martins 2017-01-27 14:54 ` Juergen Gross 1 sibling, 1 reply; 12+ messages in thread From: Andy Lutomirski @ 2017-01-26 17:25 UTC (permalink / raw) To: Joao Martins Cc: linux-kernel, xen-devel, kvm list, Boris Ostrovsky, Juergen Gross, Paolo Bonzini, Radim Krcmar On Wed, Jan 25, 2017 at 9:33 AM, Joao Martins <joao.m.martins@oracle.com> wrote: > This file defines an ABI shared between guest and hypervisor(s) > (KVM, Xen) and as such there should be an correspondent entry in > MAINTAINERS file. Notice that there's already a text notice at the > top of the header file, hence this commit simply enforces it more > explicitly and have both peers noticed when such changes happen. > > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > --- > This was suggested by folks at xen-devel as we missed some of the > ABI additions (e.g. flags field in pvti, TSC stable bit) - so this > patch is to help preventing that from happening. Alternatively I > could instead add a "PVCLOCK ABI" section in this file with the > two mailing lists. If you do the latter, please add me as an R:. --Andy ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes 2017-01-26 17:25 ` Andy Lutomirski @ 2017-01-26 20:08 ` Joao Martins 0 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2017-01-26 20:08 UTC (permalink / raw) To: Andy Lutomirski Cc: linux-kernel, xen-devel, kvm list, Boris Ostrovsky, Juergen Gross, Paolo Bonzini, Radim Krcmar On 01/26/2017 05:25 PM, Andy Lutomirski wrote: > On Wed, Jan 25, 2017 at 9:33 AM, Joao Martins <joao.m.martins@oracle.com> wrote: >> This file defines an ABI shared between guest and hypervisor(s) >> (KVM, Xen) and as such there should be an correspondent entry in >> MAINTAINERS file. Notice that there's already a text notice at the >> top of the header file, hence this commit simply enforces it more >> explicitly and have both peers noticed when such changes happen. >> >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >> --- >> This was suggested by folks at xen-devel as we missed some of the >> ABI additions (e.g. flags field in pvti, TSC stable bit) - so this >> patch is to help preventing that from happening. Alternatively I >> could instead add a "PVCLOCK ABI" section in this file with the >> two mailing lists. > > If you do the latter, please add me as an R:. OK, Thanks. Since the ABI is used on both hypervisors I'll leave/wait for maintainers to voice their preference. Joao ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes 2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins 2017-01-26 17:25 ` Andy Lutomirski @ 2017-01-27 14:54 ` Juergen Gross 1 sibling, 0 replies; 12+ messages in thread From: Juergen Gross @ 2017-01-27 14:54 UTC (permalink / raw) To: Joao Martins, linux-kernel, xen-devel, kvm Cc: Boris Ostrovsky, Paolo Bonzini, Radim Krcmar, Andy Lutomirski On 25/01/17 18:33, Joao Martins wrote: > This file defines an ABI shared between guest and hypervisor(s) > (KVM, Xen) and as such there should be an correspondent entry in > MAINTAINERS file. Notice that there's already a text notice at the > top of the header file, hence this commit simply enforces it more > explicitly and have both peers noticed when such changes happen. > > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > --- > This was suggested by folks at xen-devel as we missed some of the > ABI additions (e.g. flags field in pvti, TSC stable bit) - so this > patch is to help preventing that from happening. Alternatively I > could instead add a "PVCLOCK ABI" section in this file with the > two mailing lists. I don't mind either way. In case of an own section: make it more generic like "PARAVIRT ABIS" as there might be more than PVCLOCK (if not now maybe in future). For both variants: Acked-by: Juergen Gross <jgross@suse.com> Juergen > --- > MAINTAINERS | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 26edd83..c4315d1 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -7041,6 +7041,7 @@ F: Documentation/virtual/kvm/ > F: arch/*/kvm/ > F: arch/x86/kernel/kvm.c > F: arch/x86/kernel/kvmclock.c > +F: arch/x86/include/asm/pvclock-abi.h > F: arch/*/include/asm/kvm* > F: include/linux/kvm* > F: include/uapi/linux/kvm* > @@ -13483,6 +13484,7 @@ M: Juergen Gross <jgross@suse.com> > L: xen-devel@lists.xenproject.org (moderated for non-subscribers) > T: git git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git > S: Supported > +F: arch/x86/include/asm/pvclock-abi.h > F: arch/x86/xen/ > F: drivers/*/xen-*front.c > F: drivers/xen/ > ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-01-27 14:54 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-01-25 17:33 [PATCH v1 0/3] x86/xen: pvclock vdso support Joao Martins 2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins 2017-01-26 17:25 ` Andy Lutomirski 2017-01-26 19:58 ` Joao Martins 2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins 2017-01-25 19:26 ` Boris Ostrovsky 2017-01-26 13:22 ` Joao Martins 2017-01-26 17:18 ` Andy Lutomirski 2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins 2017-01-26 17:25 ` Andy Lutomirski 2017-01-26 20:08 ` Joao Martins 2017-01-27 14:54 ` Juergen Gross
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).