From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751758AbdAYM4u (ORCPT ); Wed, 25 Jan 2017 07:56:50 -0500 Received: from mail-wm0-f66.google.com ([74.125.82.66]:33297 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751391AbdAYM4s (ORCPT ); Wed, 25 Jan 2017 07:56:48 -0500 Subject: Re: [patch 4/4] PTP: add kvm PTP driver To: Marcelo Tosatti , kvm@vger.kernel.org, linux-kernel@vger.kernel.org References: <20170124170938.909652059@redhat.com> <20170124171020.537965390@redhat.com> Cc: Radim Krcmar , Richard Cochran , Miroslav Lichvar From: Paolo Bonzini Message-ID: <5e538bf2-8c30-edd1-1f90-1423cc645563@redhat.com> Date: Wed, 25 Jan 2017 13:56:45 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <20170124171020.537965390@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 24/01/2017 18:09, Marcelo Tosatti wrote: > Add a driver with gettime method returning hosts realtime clock. > This allows Chrony to synchronize host and guest clocks with > high precision (see results below). > > chronyc> sources > MS Name/IP address Stratum Poll Reach LastRx Last sample > =============================================================================== > #* PHC0 0 3 377 4 +162ns[ -683ns] +/- 11ns > > To configure Chronyd to use PHC refclock, add the > following line to its configuration file: > > refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0 > > Where /dev/ptpX is the kvmclock PTP clock. > > Signed-off-by: Marcelo Tosatti > > --- > drivers/ptp/Kconfig | 12 ++ > drivers/ptp/Makefile | 1 > drivers/ptp/ptp_kvm.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 215 insertions(+) > > v2: check for kvmclock (Radim) > initialize global variables before device registration (Radim) > v3: use cross timestamps callback (Paolo, Miroslav, Radim) > v4: remove gettime64 callback (Paolo) > v5: drop useless nsec assignment (Radim) > > Index: kvm-ptpdriver/drivers/ptp/Kconfig > =================================================================== > --- kvm-ptpdriver.orig/drivers/ptp/Kconfig 2017-01-24 09:03:47.604586098 -0200 > +++ kvm-ptpdriver/drivers/ptp/Kconfig 2017-01-24 09:05:06.321704124 -0200 > @@ -90,4 +90,16 @@ > To compile this driver as a module, choose M here: the module > will be called ptp_pch. > > +config PTP_1588_CLOCK_KVM > + tristate "KVM virtual PTP clock" > + depends on PTP_1588_CLOCK > + depends on KVM_GUEST > + default y > + help > + This driver adds support for using kvm infrastructure as a PTP > + clock. This clock is only useful if you are using KVM guests. > + > + To compile this driver as a module, choose M here: the module > + will be called ptp_kvm. > + > endmenu > Index: kvm-ptpdriver/drivers/ptp/Makefile > =================================================================== > --- kvm-ptpdriver.orig/drivers/ptp/Makefile 2017-01-24 09:03:47.604586098 -0200 > +++ kvm-ptpdriver/drivers/ptp/Makefile 2017-01-24 09:05:06.322704125 -0200 > @@ -6,3 +6,4 @@ > obj-$(CONFIG_PTP_1588_CLOCK) += ptp.o > obj-$(CONFIG_PTP_1588_CLOCK_IXP46X) += ptp_ixp46x.o > obj-$(CONFIG_PTP_1588_CLOCK_PCH) += ptp_pch.o > +obj-$(CONFIG_PTP_1588_CLOCK_KVM) += ptp_kvm.o > Index: kvm-ptpdriver/drivers/ptp/ptp_kvm.c > =================================================================== > --- /dev/null 1970-01-01 00:00:00.000000000 +0000 > +++ kvm-ptpdriver/drivers/ptp/ptp_kvm.c 2017-01-24 09:05:06.322704125 -0200 > @@ -0,0 +1,202 @@ > +/* > + * Virtual PTP 1588 clock for use with KVM guests > + * > + * Copyright (C) 2017 Red Hat Inc. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +struct kvm_ptp_clock { > + struct ptp_clock *ptp_clock; > + struct ptp_clock_info caps; > +}; > + > +DEFINE_SPINLOCK(kvm_ptp_lock); > + > +static struct pvclock_vsyscall_time_info *hv_clock; > + > +static struct kvm_clock_offset clock_off; > +static phys_addr_t clock_off_gpa; > + > +static int ptp_kvm_get_time_fn(ktime_t *device_time, > + struct system_counterval_t *system_counter, > + void *ctx) > +{ > + unsigned long ret; > + struct timespec64 tspec; > + unsigned version; > + u8 flags; > + int cpu; > + struct pvclock_vcpu_time_info *src; > + > + preempt_disable_notrace(); > + cpu = smp_processor_id(); > + src = &hv_clock[cpu].pvti; > + > + spin_lock(&kvm_ptp_lock); Any reason to take the spinlock in the non-preemptable region? This would be worse for RT kernels than taking it outside. > + do { > + /* > + * We are using a TSC value read in the hosts > + * kvm_hc_clock_pairing handling. > + * So any changes to tsc_to_system_mul > + * and tsc_shift or any other pvclock > + * data invalidate that measurement. > + */ > + version = pvclock_read_begin(src); > + > + ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING, > + clock_off_gpa, > + KVM_CLOCK_PAIRING_WALLCLOCK); > + if (ret != 0) { > + pr_err("clock offset hypercall ret %lu\n", ret); Need ratelimiting in case something goes wrong and the host stops using the TSC clocksource? > + spin_unlock(&kvm_ptp_lock); > + preempt_enable_notrace(); > + return -EOPNOTSUPP; > + } > + > + tspec.tv_sec = clock_off.sec; > + tspec.tv_nsec = clock_off.nsec; > + ret = __pvclock_read_cycles(src, clock_off.tsc); > + flags = src->flags; Flags written and not read. Also, I think I disagree with Radim, gettime64 should use the TSC correction. It's the right thing to do for the POSIX clock interface. It's a bit arbitrary for PTP_SYS_OFFSET (though you cannot really argue with better precision), but chrony is now using PTP_SYS_OFFSET_PRECISE (and again you cannot really argue with better precision anyway). I'll test this a bit and then push it to kvm/queue. Please take a look! Paolo