On Tue, Oct 16, 2018 at 2:35 PM Keith Packard wrote:
> Bas Nieuwenhuizen writes:
>
> >> +       end = radv_clock_gettime(CLOCK_MONOTONIC_RAW);
> >> +
> >> +       uint64_t clock_period = end - begin;
> >> +       uint64_t device_period = DIV_ROUND_UP(1000000, clock_crystal_freq);
> >> +
> >> +       *pMaxDeviation = MAX2(clock_period, device_period);
> >
> > Should this not be a sum? Those deviations can happen independently
> > from each other, so worst case both deviations happen in the same
> > direction, which causes the magnitudes to be combined.
>
> This use of MAX2 comes right from one of the issues raised during work
> on the extension:
>
>     8) Can the maximum deviation reported ever be zero?
>
>     RESOLVED: Unless the tick of each clock corresponding to the set of
>     time domains coincides and all clocks can literally be sampled
>     simultaneously, there isn’t really a possibility for the maximum
>     deviation to be zero, so by convention the maximum deviation is
>     always at least the maximum of the length of the ticks of the set
>     of time domains calibrated and thus can never be zero.
>
> I can't wrap my brain around this entirely, but I think it says that
> the deviation reported is supposed to reflect only the fact that we
> aren't sampling the clocks at the same time, and so there may be a
> 'tick' of error for any sampled clock.
>
> If you look at the previous issue in the spec, it actually has the
> pseudocode I used in this implementation for computing maxDeviation,
> which doesn't include anything about the time period of the GPU.
>
> Jason suggested using the GPU period as the minimum value for
> maxDeviation to make sure we never accidentally returned zero, as that
> is forbidden by the spec. We might be able to use 1 instead, but it
> won't matter in practice, as the time it takes to actually sample all
> of the clocks is far longer than a GPU tick.

I think what Bas is getting at is that there are two problems:

 1) We are not sampling at exactly the same time.
 2) The two clocks may not tick at exactly the same time.

Even if I can simultaneously sample the CPU and GPU clocks, their
oscillators are not aligned, and my sample may land at the beginning of
the CPU tick and the end of the GPU tick. If I had sampled 75ns earlier,
I could have gotten a lower CPU time but the same GPU time (most Intel
GPUs have about an 80ns tick).

If we want to be conservative, I suspect Bas may be right that adding is
the safer thing to do.

--Jason
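
[Editor's sketch] For concreteness, a minimal sketch of the conservative
"sum" alternative Bas and Jason describe, as opposed to the MAX2 in the
quoted patch. The helper name max_deviation_sum is hypothetical, and
clock_crystal_freq is assumed to be the GPU counter frequency in kHz so
that DIV_ROUND_UP(1000000, clock_crystal_freq) yields the GPU tick length
in nanoseconds, matching the quoted expression:

    /* Sketch only, not the actual radv code.  begin/end are assumed to be
     * CLOCK_MONOTONIC_RAW samples in nanoseconds taken just before and
     * after sampling all requested time domains. */

    #include <stdint.h>

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
    #define MAX2(a, b)         ((a) > (b) ? (a) : (b))

    static uint64_t
    max_deviation_sum(uint64_t begin, uint64_t end, uint64_t clock_crystal_freq)
    {
       /* Error source 1: the clocks were not sampled at the same instant. */
       uint64_t sample_interval = end - begin;

       /* Error source 2: the GPU clock only advances once per tick. */
       uint64_t gpu_tick = DIV_ROUND_UP(1000000, clock_crystal_freq);

       /* The patch as posted returns MAX2(sample_interval, gpu_tick).
        * Since the two errors are independent and can point in the same
        * direction, the conservative choice is to add them instead. */
       return sample_interval + gpu_tick;
    }

The sum is never smaller than the MAX2 result, so it still satisfies the
spec's requirement that the reported deviation can never be zero.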