All of lore.kernel.org
 help / color / mirror / Atom feed
* Improving accuracy of PHC readings
@ 2018-10-19  9:51 Miroslav Lichvar
  2018-10-19 16:52 ` Keller, Jacob E
  2018-10-22 22:48 ` Richard Cochran
  0 siblings, 2 replies; 6+ messages in thread
From: Miroslav Lichvar @ 2018-10-19  9:51 UTC (permalink / raw)
  To: netdev; +Cc: Richard Cochran, Keller, Jacob E

I think there might be a way how we could significantly improve
accuracy of synchronization between the system clock and a PTP
hardware clock, at least with some network drivers.

Currently, the PTP_SYS_OFFSET ioctl reads the system clock, reads the
PHC using the gettime64 function of the driver, and reads the system
clock again. The ioctl can repeat this to provide multiple readings to
the user space.

phc2sys (or another program synchronizing the system clock to the PHC)
assumes the PHC timestamps were captured in the middle between the two
closest system clock timestamps.

The trouble is that gettime64 typically reads multiple (2-3) registers
and the timestamp is latched on the first one, so the assumption about
middle point is wrong. There is an asymmetry, even if the delays on
the PCIe bus are perfectly symmetric.

A solution to this would be a new driver function that wraps the
latching register read with readings of the system clock and return
three timestamps instead of one. For example:

        ktime_get_real_ts64(&sys_ts1);
	IXGBE_READ_REG(hw, IXGBE_SYSTIMR);
	ktime_get_real_ts64(&sys_ts2);
	phc_ts.tv_nsec = IXGBE_READ_REG(hw, IXGBE_SYSTIML);
	phc_ts.tv_sec = IXGBE_READ_REG(hw, IXGBE_SYSTIMH);
 
The extra timestamp doesn't fit the API of the PTP_SYS_OFFSET ioctl,
so it would need to shift the timestamp it returns by the missing
intervals (assuming the frequency offset between the PHC and system
clock is small), or a new ioctl could be introduced that would return
all timestamps in an array looking like this:

	[sys, phc, sys, sys, phc, sys, ...]

This should significantly improve the accuracy of the synchronization,
reduce the uncertainty in the readings to less than a half or third,
and also reduce the jitter as there are fewer register reads sensitive
to the PCIe delay.

What do you think?

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Improving accuracy of PHC readings
  2018-10-19  9:51 Improving accuracy of PHC readings Miroslav Lichvar
@ 2018-10-19 16:52 ` Keller, Jacob E
  2018-10-22 22:48   ` Richard Cochran
  2018-10-23 16:48   ` Miroslav Lichvar
  2018-10-22 22:48 ` Richard Cochran
  1 sibling, 2 replies; 6+ messages in thread
From: Keller, Jacob E @ 2018-10-19 16:52 UTC (permalink / raw)
  To: Miroslav Lichvar, netdev; +Cc: Richard Cochran

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Miroslav Lichvar
> Sent: Friday, October 19, 2018 2:52 AM
> To: netdev@vger.kernel.org
> Cc: Richard Cochran <richardcochran@gmail.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>
> Subject: Improving accuracy of PHC readings
> 
> I think there might be a way how we could significantly improve
> accuracy of synchronization between the system clock and a PTP
> hardware clock, at least with some network drivers.
> 
> Currently, the PTP_SYS_OFFSET ioctl reads the system clock, reads the
> PHC using the gettime64 function of the driver, and reads the system
> clock again. The ioctl can repeat this to provide multiple readings to
> the user space.
> 
> phc2sys (or another program synchronizing the system clock to the PHC)
> assumes the PHC timestamps were captured in the middle between the two
> closest system clock timestamps.
> 
> The trouble is that gettime64 typically reads multiple (2-3) registers
> and the timestamp is latched on the first one, so the assumption about
> middle point is wrong. There is an asymmetry, even if the delays on
> the PCIe bus are perfectly symmetric.
> 

Right! I feel like this is obvious now that you said it, so I'm surprised no one thought of it before...

> A solution to this would be a new driver function that wraps the
> latching register read with readings of the system clock and return
> three timestamps instead of one. For example:
> 
>         ktime_get_real_ts64(&sys_ts1);
> 	IXGBE_READ_REG(hw, IXGBE_SYSTIMR);
> 	ktime_get_real_ts64(&sys_ts2);
> 	phc_ts.tv_nsec = IXGBE_READ_REG(hw, IXGBE_SYSTIML);
> 	phc_ts.tv_sec = IXGBE_READ_REG(hw, IXGBE_SYSTIMH);
> 
> The extra timestamp doesn't fit the API of the PTP_SYS_OFFSET ioctl,
> so it would need to shift the timestamp it returns by the missing
> intervals (assuming the frequency offset between the PHC and system
> clock is small), or a new ioctl could be introduced that would return
> all timestamps in an array looking like this:
> 
> 	[sys, phc, sys, sys, phc, sys, ...]
> 

I think the new ioctl is probably the better solution.

> This should significantly improve the accuracy of the synchronization,
> reduce the uncertainty in the readings to less than a half or third,
> and also reduce the jitter as there are fewer register reads sensitive
> to the PCIe delay.
> 
> What do you think?
> 

Nice! I think this is good. I'd love to see some data to back it up, but it makes sense to me.

Thanks,
Jake

> --
> Miroslav Lichvar

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Improving accuracy of PHC readings
  2018-10-19  9:51 Improving accuracy of PHC readings Miroslav Lichvar
  2018-10-19 16:52 ` Keller, Jacob E
@ 2018-10-22 22:48 ` Richard Cochran
  2018-10-23 14:07   ` Miroslav Lichvar
  1 sibling, 1 reply; 6+ messages in thread
From: Richard Cochran @ 2018-10-22 22:48 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: netdev, Keller, Jacob E

On Fri, Oct 19, 2018 at 11:51:37AM +0200, Miroslav Lichvar wrote:
> A solution to this would be a new driver function that wraps the
> latching register read with readings of the system clock and return
> three timestamps instead of one. For example:
> 
>         ktime_get_real_ts64(&sys_ts1);
> 	IXGBE_READ_REG(hw, IXGBE_SYSTIMR);
> 	ktime_get_real_ts64(&sys_ts2);
> 	phc_ts.tv_nsec = IXGBE_READ_REG(hw, IXGBE_SYSTIML);
> 	phc_ts.tv_sec = IXGBE_READ_REG(hw, IXGBE_SYSTIMH);

Makes sense...
  
> The extra timestamp doesn't fit the API of the PTP_SYS_OFFSET ioctl,
> so it would need to shift the timestamp it returns by the missing
> intervals (assuming the frequency offset between the PHC and system
> clock is small), or a new ioctl could be introduced that would return
> all timestamps in an array looking like this:
> 
> 	[sys, phc, sys, sys, phc, sys, ...]

How about a new ioctl with number of trials as input and single offset
as output?

Then it would be up to the driver to implement the device-specific
loop.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Improving accuracy of PHC readings
  2018-10-19 16:52 ` Keller, Jacob E
@ 2018-10-22 22:48   ` Richard Cochran
  2018-10-23 16:48   ` Miroslav Lichvar
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Cochran @ 2018-10-22 22:48 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: Miroslav Lichvar, netdev

On Fri, Oct 19, 2018 at 04:52:13PM +0000, Keller, Jacob E wrote:
> Nice! I think this is good. I'd love to see some data to back it up, but it makes sense to me.

+1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Improving accuracy of PHC readings
  2018-10-22 22:48 ` Richard Cochran
@ 2018-10-23 14:07   ` Miroslav Lichvar
  0 siblings, 0 replies; 6+ messages in thread
From: Miroslav Lichvar @ 2018-10-23 14:07 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, Keller, Jacob E

On Mon, Oct 22, 2018 at 03:48:02PM -0700, Richard Cochran wrote:
> On Fri, Oct 19, 2018 at 11:51:37AM +0200, Miroslav Lichvar wrote:
> > The extra timestamp doesn't fit the API of the PTP_SYS_OFFSET ioctl,
> > so it would need to shift the timestamp it returns by the missing
> > intervals (assuming the frequency offset between the PHC and system
> > clock is small), or a new ioctl could be introduced that would return
> > all timestamps in an array looking like this:
> > 
> > 	[sys, phc, sys, sys, phc, sys, ...]
> 
> How about a new ioctl with number of trials as input and single offset
> as output?

The difference between the system timestamps is important as it gives
an upper bound on the error in the offset, so I think the output
should be at least a pair of offset and delay.

The question is from which triplet should be the offset and delay
calculated. The one with the minimum delay is a good choice, but it's
not the only option. For instance, an average or median from all
triplets that have delay smaller than the minimum + 30 nanoseconds may
give a more stable offset.

This is not that different from an NTP client filtering measurements
made over network. I'm not sure if we should try to solve it in the
kernel or drivers. My preference would be to give the user space all
the data and process it there.

If the increased size of the array is an issue, we can reduce the
maximum number of readings.

Does that make sense?

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Improving accuracy of PHC readings
  2018-10-19 16:52 ` Keller, Jacob E
  2018-10-22 22:48   ` Richard Cochran
@ 2018-10-23 16:48   ` Miroslav Lichvar
  1 sibling, 0 replies; 6+ messages in thread
From: Miroslav Lichvar @ 2018-10-23 16:48 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: netdev, Richard Cochran

On Fri, Oct 19, 2018 at 04:52:13PM +0000, Keller, Jacob E wrote:
> > This should significantly improve the accuracy of the synchronization,
> > reduce the uncertainty in the readings to less than a half or third,
> > and also reduce the jitter as there are fewer register reads sensitive
> > to the PCIe delay.
> > 
> > What do you think?
> > 
> 
> Nice! I think this is good. I'd love to see some data to back it up, but it makes sense to me.

I tried a quick hack with an X550 and I219. The delay dropped from
about 2940 ns to 1040 ns on the first port of the X550, from 1920 ns
to 660 ns on the second port of the X550, and from 2500 ns to 1300 ns
on the I219.

The I219 supports the SYS_OFFSET_PRECISE ioctl (cross timestamping),
which we can use for comparison. The difference between the offsets
calculated using the two ioctls was about 500-600 ns before and now it
is about -50--150 ns.

I was not able to find any information on how accurate cross
timestamping on this HW is actually supposed to be, so I'm wondering
which of the two is closer to the truth.

Here is an output from phc2sys with the I219:

Before:
phc offset       -59 s2 freq     +40 delay   2527
phc offset        19 s2 freq    +101 delay   2526
phc offset       -23 s2 freq     +64 delay   2522
phc offset        46 s2 freq    +126 delay   2535
phc offset       -32 s2 freq     +62 delay   2530
phc offset       -10 s2 freq     +75 delay   2526
phc offset       102 s2 freq    +184 delay   2523

After:
phc offset        17 s2 freq    +105 delay   1298
phc offset        47 s2 freq    +140 delay   1299
phc offset       -42 s2 freq     +65 delay   1293
phc offset        -6 s2 freq     +88 delay   1299
phc offset        34 s2 freq    +127 delay   1300
phc offset       -14 s2 freq     +89 delay   1301
phc offset       -86 s2 freq     +13 delay   1296
phc offset       -21 s2 freq     +52 delay   1298

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-24  1:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-19  9:51 Improving accuracy of PHC readings Miroslav Lichvar
2018-10-19 16:52 ` Keller, Jacob E
2018-10-22 22:48   ` Richard Cochran
2018-10-23 16:48   ` Miroslav Lichvar
2018-10-22 22:48 ` Richard Cochran
2018-10-23 14:07   ` Miroslav Lichvar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.