From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753317AbdATSbC (ORCPT <rfc822;w@1wt.eu>);
        Fri, 20 Jan 2017 13:31:02 -0500
Received: from mx1.redhat.com ([209.132.183.28]:43920 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752266AbdATSbA (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 20 Jan 2017 13:31:00 -0500
Date: Fri, 20 Jan 2017 19:30:57 +0100
From: Radim Krcmar <rkrcmar@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>, kvm@vger.kernel.org,
        linux-kernel@vger.kernel.org,
        Richard Cochran <richardcochran@gmail.com>,
        Miroslav Lichvar <mlichvar@redhat.com>
Subject: Re: [patch 4/5] PTP: add PTP_SYS_OFFSET emulation via cross
 timestamps infrastructure
Message-ID: <20170120183057.GC1365@potion>
References: <20170120122025.665985919@redhat.com>
 <20170120122503.746158230@redhat.com>
 <48bb2650-ed00-ec07-31bf-8780d3ab5568@redhat.com>
 <20170120130711.GA27440@amt.cnet>
 <2d213ad9-fa40-1f1e-90a9-404764969d35@redhat.com>
 <20170120140216.GA1358@potion>
 <66e145c8-0e7c-5ae4-486b-385a058f7e05@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <66e145c8-0e7c-5ae4-486b-385a058f7e05@redhat.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 20 Jan 2017 18:31:00 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

2017-01-20 15:23+0100, Paolo Bonzini:
> On 20/01/2017 15:02, Radim Krcmar wrote:
>> 2017-01-20 14:36+0100, Paolo Bonzini:
>>> On 20/01/2017 14:07, Marcelo Tosatti wrote:
>>>> On Fri, Jan 20, 2017 at 01:55:27PM +0100, Paolo Bonzini wrote:
>>>>>
>>>>>
>>>>> On 20/01/2017 13:20, Marcelo Tosatti wrote:
>>>>>>  kernel/time/timekeeping.c        |   79 +++++++++++++++++++++++++++++++++++++++
>>>>>
>>>>> Why not leave this in drivers/ptp/ptp_chardev.c?
>>>>
>>>> timekeeper_lock
>>>
>>> Why does emulate_ptp_sys_offset need it, if the current PTP_SYS_OFFSET
>>> code doesn't?  Is the latency acceptable (considering this is a raw spin
>>> lock) or is there a seqlock that we can use instead (such as tk_core.seq
>>> like in get_device_system_crosststamp)?
>> 
>> The spinlock prevents writers from taking tk_core.seq, which means that
>> time cannot be changed while we hold it.
>> 
>> The simplest alternative would be to use tk_core.seq for all our reads,
>> but that would increase the chance of re-reading, even infinitely.
> 
> How much?  If a hypercall takes 1 microsecond, and PTP_MAX_SAMPLES is
> 25, we should be done in less than 50 microseconds.  If update_wall_time
> is called with 250 Hz frequency (sounds like a lot), that's still 4000
> microseconds so the probability of even 3-4 consecutive retries should
> be very low.

You are right, I was overestimating the worst case.
Host/guest preemption (1000 Hz) will also force a re-read, but both of
these have diminishing probabilities and a tendency to align.
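
A rough back-of-the-envelope check of those numbers, as a small C sketch.
It assumes every attempt collides with the periodic event independently,
which is probably pessimistic given the tendency to align.  The function
names are made up for illustration and are not kernel code.

```c
#include <assert.h>

/*
 * Chance that a periodic event (e.g. a timekeeping update) lands inside
 * one read window.  Both arguments are in microseconds.
 */
static double collision_prob(double window_us, double period_us)
{
	double p;

	assert(window_us >= 0.0 && period_us > 0.0);
	p = window_us / period_us;
	return p > 1.0 ? 1.0 : p;
}

/*
 * Chance of k consecutive retries, ASSUMING independent attempts.
 * Real retries start right after the writer, so this overestimates.
 */
static double consecutive_retries_prob(double p, int k)
{
	double r = 1.0;
	int i;

	for (i = 0; i < k; i++)
		r *= p;
	return r;
}
```

With a 50 us window, collision_prob(50, 4000) is 1.25% for the 250 Hz
update and collision_prob(50, 1000) is 5% for 1000 Hz preemption; three
consecutive retries at 1.25% come out at roughly two in a million.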

>> But we don't need to read everything with the same time base -- if the
>> time is changed (by NTP/user/...) between our reads, then the value will
>> be off, but if writer took tk_core.seq just to accumulate current time,
>> then the time after accumulation stays the same and it would work as if
>> we had the tk_core.seq for the whole time.
> 
> You mean only check seqlock separately for each sample, but restart the
> entire loop upon changes to cs_was_changed_seq or clock_was_set_seq?
> That would work too.

I wanted to accept that our measurements can be imprecise and just have
the seqlock for each sample.  It should not make a difference without
misconfiguration and we can't do anything about a malicious root anyway.
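
To make the per-sample idea concrete, here is a single-threaded userspace
model of the retry logic.  It is only a sketch: struct seq_sketch is a
simplified stand-in for tk_core.seq, memory barriers are omitted, and
fake_dev_read() is a hypothetical device read that simulates one writer
update so the retry path can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a seqcount-protected time base (assumption). */
struct seq_sketch {
	unsigned int seq;	/* even: stable, odd: writer in progress */
	long long sys_ns;	/* system time protected by seq */
};

struct sample {
	long long sys_ns;
	long long dev_ns;
};

/* Hypothetical device whose read can race with a writer once. */
struct fake_dev {
	int calls;
	int disturb_at;		/* call index at which a writer "runs" */
	long long now_ns;
};

static long long fake_dev_read(struct seq_sketch *tk, void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->calls++ == d->disturb_at) {
		tk->seq += 2;		/* one complete writer update */
		tk->sys_ns += 1000;
	}
	d->now_ns += 100;
	return d->now_ns;
}

/* Odd counts are rounded down so a mid-update start always retries. */
static unsigned int read_begin(const struct seq_sketch *tk)
{
	return tk->seq & ~1u;
}

static int read_retry(const struct seq_sketch *tk, unsigned int start)
{
	return tk->seq != start;
}

/* Each sample has its own retry loop; earlier samples are kept as-is. */
static unsigned int take_samples(struct seq_sketch *tk,
				 long long (*read_dev)(struct seq_sketch *, void *),
				 void *ctx, struct sample *out, size_t n)
{
	unsigned int retries = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		for (;;) {
			unsigned int start = read_begin(tk);

			out[i].sys_ns = tk->sys_ns;
			out[i].dev_ns = read_dev(tk, ctx);
			if (!read_retry(tk, start))
				break;
			retries++;	/* redo only this sample */
		}
	}
	return retries;
}
```

A writer that hits during one sample costs a single extra device read
instead of restarting the whole loop; the price is that samples taken
before and after the update may use different time bases.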

Thanks.