From: Jason Ekstrand
Subject: Re: [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v4]
Date: Tue, 16 Oct 2018 16:33:18 -0500
In-Reply-To: <8736t58usr.fsf@keithp.com>
References: <20181015230515.3695-1-keithp@keithp.com> <20181016053150.11453-1-keithp@keithp.com> <87bm7t8z3k.fsf@keithp.com> <8736t58usr.fsf@keithp.com>
To: Keith Packard
Cc: ML mesa-dev, Mailing list - DRI developers <dri-devel@lists.freedesktop.org>

On Tue, Oct 16, 2018 at 4:07 PM Keith Packard wrote:

> Jason Ekstrand writes:
>
> > I think what Bas is getting at is that there are two problems:
> >
> >  1) We are not sampling at exactly the same time
> >  2) The two clocks may not tick at exactly the same time.
>
> Thanks for the clarification.
>
> > If we want to be conservative, I suspect Bas may be right that adding is
> > the safer thing to do.
>
> Yes, it's certainly safe to increase the value of
> maxDeviation. Currently, the time it takes to sample all of the clocks
> is far larger than the GPU tick, so adding that in would not have a huge
> impact on the value returned to the application.
>
> I'd like to dig in a little further and actually understand whether the
> current computation (which is derived directly from the Vulkan spec) is
> wrong, and if so, whether the spec needs to be adjusted.
>
> I think the question is what 'maxDeviation' is supposed to
> represent.
> All the spec says is:
>
>  * pMaxDeviation is a pointer to a 64-bit unsigned integer value in
>    which the strictly positive maximum deviation, in nanoseconds, of the
>    calibrated timestamp values is returned.
>
> I interpret this as the maximum error in sampling the individual clocks,
> which is to say that the clock values are guaranteed to have been
> sampled within this interval of each other.
>
> So, if we have a monotonic clock and GPU clock:
>
>           0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>           0         1         2         3
> GPU       -----_____-----_____-----_____-----_____
>
> gpu_period in this case is 5 ticks of the monotonic clock.
>
> Now, I perform three operations:
>
>         start = read(monotonic)
>         gpu   = read(GPU)
>         end   = read(monotonic)
>
> Let's say that:
>
>         start = 2
>         GPU = 1 * 5 = 5 monotonic equivalent ticks
>         end = b
>
> So, the question is only how large the error between GPU and start could
> possibly be. Certainly the GPU clock was sampled some time between
> when monotonic tick 2 started and monotonic tick b ended. But, we have
> no idea what phase the GPU clock was in when sampled.
>
> Let's imagine we manage to sample the GPU clock immediately after the
> first monotonic sample. I'll shift the offset of the monotonic and GPU
> clocks to retain the same values (start = 2, GPU = 1), but now
> the GPU clock is being sampled immediately after monotonic time 2:
>
>                 w x y z 0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic       -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>           0         1         2         3
> GPU       -----_____-----_____-----_____-----_____
>
> In this case, the GPU tick started at monotonic time y, nearly 5
> monotonic ticks earlier than the measured monotonic time, so the
> deviation between GPU and monotonic would be 5 ticks.
>
> If we sample the GPU clock immediately before the second monotonic
> sample, then that GPU tick either starts earlier than the range, in
> which case the above evaluation holds, or the GPU tick is entirely
> contained within the range:
>
>           0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>             z         0         1         2         3
> GPU       __-----_____-----_____-----_____-----_____-----
>
> In this case, the deviation between the first monotonic sample (that is
> returned to the application as the monotonic time) and the GPU sample is
> the whole interval of measurement (b - 2).
>
> I think I've just managed to convince myself that Jason's first
> suggestion (max2(sample interval, gpu interval)) is correct, although I
> think we should add '1' to the interval to account for sampling phase
> errors in the monotonic clock. As that's measured in ns, and I'm
> currently getting values in the µs range, that's a small error in
> comparison.

You've got me almost convinced as well.  Thanks for the diagrams!  I think
instead of adding 1 perhaps what we want is

    max2(sample_interval_ns, gpu_tick_ns + monotonic_tick_ns)

Where monotonic_tick_ns is maybe as low as 1.  Am I following you correctly?

--Jason