On Tue, Oct 16, 2018 at 4:07 PM Keith Packard <keithp@keithp.com> wrote:
Jason Ekstrand <jason@jlekstrand.net> writes:

> I think what Bas is getting at is that there are two problems:
>
>  1) We are not sampling at exactly the same time
>  2) The two clocks may not tick at exactly the same time.

Thanks for the clarification.

> If we want to be conservative, I suspect Bas may be right that adding is
> the safer thing to do.

Yes, it's certainly safe to increase the value of
maxDeviation. Currently, the time it takes to sample all of the clocks
is far larger than the GPU tick, so adding that in would not have a huge
impact on the value returned to the application.

I'd like to dig in a little further and actually understand if the
current computation (which is derived directly from the Vulkan spec) is
wrong, and if so, whether the spec needs to be adjusted.

I think the question is what 'maxDeviation' is supposed to
represent. All the spec says is:

 * pMaxDeviation is a pointer to a 64-bit unsigned integer value in
   which the strictly positive maximum deviation, in nanoseconds, of the
   calibrated timestamp values is returned.

I interpret this as the maximum error in sampling the individual clocks,
which is to say that the clock values are guaranteed to have been
sampled within this interval of each other.
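
For reference, pMaxDeviation is the last parameter of the extension's
query entry point which, quoting the VK_EXT_calibrated_timestamps
header from memory (so double-check me), is declared roughly as:

        VkResult vkGetCalibratedTimestampsEXT(
            VkDevice                             device,
            uint32_t                             timestampCount,
            const VkCalibratedTimestampInfoEXT  *pTimestampInfos,
            uint64_t                            *pTimestamps,
            uint64_t                            *pMaxDeviation);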

So, if we have a monotonic clock and GPU clock:

          0 1 2 3 4 5 6 7 8 9 a b c d e f
Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-

          0         1         2         3
GPU       -----_____-----_____-----_____-----_____


gpu_period in this case is 5 ticks of the monotonic clock.

Now, I perform three operations:

        start = read(monotonic)
        gpu   = read(GPU)
        end   = read(monotonic)
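
In C this amounts to roughly the sketch below, where
read_gpu_timestamp() is a made-up placeholder for however the driver
actually reads the device's raw timestamp counter:

        #include <stdint.h>
        #include <time.h>

        /* placeholder: however the driver reads the raw GPU counter */
        extern uint64_t read_gpu_timestamp(void);

        /* CLOCK_MONOTONIC, converted to nanoseconds */
        static uint64_t monotonic_ns(void)
        {
                struct timespec ts;
                clock_gettime(CLOCK_MONOTONIC, &ts);
                return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
        }

        static void sample_clocks(uint64_t *start, uint64_t *gpu, uint64_t *end)
        {
                *start = monotonic_ns();         /* first monotonic sample */
                *gpu   = read_gpu_timestamp();   /* GPU sample in between */
                *end   = monotonic_ns();         /* second monotonic sample */
        }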

Let's say that:

        start = 2
        GPU = 1 * 5 = 5 monotonic equivalent ticks
        end = b

So, the question is only how large the error between GPU and start could
possibly be. Certainly the GPU clock was sampled some time between
when monotonic tick 2 started and monotonic tick b ended. But, we have
no idea what phase the GPU clock was in when sampled.

Let's imagine we manage to sample the GPU clock immediately after the
first monotonic sample. I'll shift the offset of the monotonic and GPU
clocks to retain the same values (start = 2, GPU = 1), but now
the GPU clock is being sampled immediately after monotonic time 2:

                w x y z 0 1 2 3 4 5 6 7 8 9 a b c d e f
Monotonic       -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-

          0         1         2         3
GPU       -----_____-----_____-----_____-----_____


In this case, the GPU tick started at monotonic time y, nearly 5
monotonic ticks earlier than the measured monotonic time, so the
deviation between GPU and monotonic would be 5 ticks.

If we sample the GPU clock immediately before the second monotonic
sample, then that GPU tick either starts earlier than the range, in
which case the above evaluation holds, or the GPU tick is entirely
contained within the range:

          0 1 2 3 4 5 6 7 8 9 a b c d e f
Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-

           z         0         1         2         3
GPU      __-----_____-----_____-----_____-----_____-----

In this case, the deviation between the first monotonic sample (the one
returned to the application as the monotonic time) and the GPU sample is
the whole interval of measurement (b - 2, i.e. 9 monotonic ticks here).

I think I've just managed to convince myself that Jason's first
suggestion (max2(sample interval, gpu interval)) is correct, although I
think we should add '1' to the interval to account for sampling phase
errors in the monotonic clock. As that's measured in ns, and I'm
currently getting values in the µs range, that's a small error in
comparison.
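
Concretely, with MAX2() as in Mesa's util/macros.h, end/start being the
two monotonic samples above, and gpu_period_ns just a name for the GPU
tick expressed in nanoseconds, that conclusion would look something like:

        uint64_t sample_interval = end - start;                         /* ns */
        uint64_t max_deviation   = MAX2(sample_interval, gpu_period_ns) + 1;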

You've got me almost convinced as well.  Thanks for the diagrams!  I
think, instead of adding 1, perhaps what we want is

max2(sample_interval_ns, gpu_tick_ns + monotonic_tick_ns)

Where monotonic_tick_ns is maybe as low as 1.  Am I following you correctly?
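
Spelled out in C (names here just mirror the formula; timestamp_period,
end and start are assumed to be in scope, with gpu_tick_ns presumably
coming from rounding up the device's
VkPhysicalDeviceLimits::timestampPeriod):

        /* timestamp_period: VkPhysicalDeviceLimits::timestampPeriod, ns per GPU tick */
        uint64_t gpu_tick_ns        = (uint64_t)ceilf(timestamp_period);
        uint64_t monotonic_tick_ns  = 1;    /* CLOCK_MONOTONIC reports in ns */
        uint64_t sample_interval_ns = end - start;

        uint64_t max_deviation =
                MAX2(sample_interval_ns, gpu_tick_ns + monotonic_tick_ns);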

--Jason