From: Jason Ekstrand
Subject: Re: [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v4]
Date: Tue, 16 Oct 2018 16:33:18 -0500
In-Reply-To: <8736t58usr.fsf@keithp.com>
References: <20181015230515.3695-1-keithp@keithp.com> <20181016053150.11453-1-keithp@keithp.com> <87bm7t8z3k.fsf@keithp.com> <8736t58usr.fsf@keithp.com>
To: Keith Packard
Cc: ML mesa-dev, Mailing list - DRI developers <dri-devel@lists.freedesktop.org>

On Tue, Oct 16, 2018 at 4:07 PM Keith Packard wrote:

> Jason Ekstrand writes:
>
> > I think what Bas is getting at is that there are two problems:
> >
> >  1) We are not sampling at exactly the same time
> >  2) The two clocks may not tick at exactly the same time.
>
> Thanks for the clarification.
>
> > If we want to be conservative, I suspect Bas may be right that adding is
> > the safer thing to do.
>
> Yes, it's certainly safe to increase the value of
> maxDeviation. Currently, the time it takes to sample all of the clocks
> is far larger than the GPU tick, so adding that in would not have a huge
> impact on the value returned to the application.
>
> I'd like to dig in a little further and actually understand whether the
> current computation (which is derived directly from the Vulkan spec) is
> wrong, and if so, whether the spec needs to be adjusted.
>
> I think the question is what 'maxDeviation' is supposed to
> represent.
> All the spec says is:
>
>  * pMaxDeviation is a pointer to a 64-bit unsigned integer value in
>    which the strictly positive maximum deviation, in nanoseconds, of the
>    calibrated timestamp values is returned.
>
> I interpret this as the maximum error in sampling the individual clocks,
> which is to say that the clock values are guaranteed to have been
> sampled within this interval of each other.
>
> So, if we have a monotonic clock and GPU clock:
>
>           0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>           0         1         2         3
> GPU       -----_____-----_____-----_____-----_____
>
> gpu_period in this case is 5 ticks of the monotonic clock.
>
> Now, I perform three operations:
>
>         start = read(monotonic)
>         gpu   = read(GPU)
>         end   = read(monotonic)
>
> Let's say that:
>
>         start = 2
>         GPU = 1 * 5 = 5 monotonic equivalent ticks
>         end = b
>
> So, the question is only how large the error between GPU and start could
> possibly be. Certainly the GPU clock was sampled some time between
> when monotonic tick 2 started and monotonic tick b ended. But, we have
> no idea what phase the GPU clock was in when sampled.
>
> Let's imagine we manage to sample the GPU clock immediately after the
> first monotonic sample. I'll shift the offset of the monotonic and GPU
> clocks to retain the same values (start = 2, GPU = 1), but now
> the GPU clock is being sampled immediately after monotonic time 2:
>
>                 w x y z 0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic       -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>           0         1         2         3
> GPU       -----_____-----_____-----_____-----_____
>
> In this case, the GPU tick started at monotonic time y, nearly 5
> monotonic ticks earlier than the measured monotonic time, so the
> deviation between GPU and monotonic would be 5 ticks.
>
> If we sample the GPU clock immediately before the second monotonic
> sample, then that GPU tick either starts earlier than the range, in
> which case the above evaluation holds, or the GPU tick is entirely
> contained within the range:
>
>           0 1 2 3 4 5 6 7 8 9 a b c d e f
> Monotonic -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
>             z         0         1         2         3
> GPU       __-----_____-----_____-----_____-----_____-----
>
> In this case, the deviation between the first monotonic sample (that is
> returned to the application as the monotonic time) and the GPU sample is
> the whole interval of measurement (b - 2).
>
> I think I've just managed to convince myself that Jason's first
> suggestion (max2(sample interval, gpu interval)) is correct, although I
> think we should add '1' to the interval to account for sampling phase
> errors in the monotonic clock. As that's measured in ns, and I'm
> currently getting values in the µs range, that's a small error in
> comparison.

You've got me almost convinced as well.  Thanks for the diagrams!  I think
instead of adding 1 perhaps what we want is

    max2(sample_interval_ns, gpu_tick_ns + monotonic_tick_ns)

Where monotonic_tick_ns is maybe as low as 1.  Am I following you correctly?

--Jason