From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Ekstrand
Subject: Re: [PATCH] vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v4]
Date: Tue, 16 Oct 2018 21:18:14 -0500
Message-ID:
References: <20181015230515.3695-1-keithp@keithp.com> <20181016053150.11453-1-keithp@keithp.com> <87bm7t8z3k.fsf@keithp.com> <8736t58usr.fsf@keithp.com> <87tvll7dha.fsf@keithp.com> <87r2gp7b6m.fsf@keithp.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0838996888=="
Return-path:
Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com [IPv6:2a00:1450:4864:20::52b]) by gabe.freedesktop.org (Postfix) with ESMTPS id 265686E2E6 for ; Wed, 17 Oct 2018 02:18:28 +0000 (UTC)
Received: by mail-ed1-x52b.google.com with SMTP id w19-v6so23308884eds.1 for ; Tue, 16 Oct 2018 19:18:27 -0700 (PDT)
In-Reply-To: <87r2gp7b6m.fsf@keithp.com>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: Keith Packard
Cc: ML mesa-dev , Maling list - DRI developers
List-Id: dri-devel@lists.freedesktop.org

--===============0838996888==
Content-Type: multipart/alternative; boundary="000000000000107cda0578634919"

--000000000000107cda0578634919
Content-Type: text/plain; charset="UTF-8"

On Tue, Oct 16, 2018 at 5:56 PM Keith Packard wrote:

> Bas Nieuwenhuizen writes:
>
> > You can make the monotonic case the same as the raw case if you make
> > sure to always sample the CPU first by e.g. splitting the loops into
> > two and doing CPU in the first and GPU in the second. That way you
> > make the case above impossible.
>
> Right, I had thought of that, and it probably solves the problem for
> us. If more time domains are added, things become 'more complicated'
> though.
>

Doing all of the CPU sampling on one side or the other of the GPU sampling
would probably reduce our window.
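
The ordering discussed above (all CPU clocks in one loop, the GPU clock in a second) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: `sample_cpu_first`, `cpu_clocks`, and `gpu_clock` are hypothetical names, and the GPU read is a stand-in callable rather than a real driver call.

```python
import time

def sample_cpu_first(cpu_clocks, gpu_clock):
    """Sample every CPU clock first, then the GPU clock.

    cpu_clocks: dict mapping a name to a zero-argument callable that
                returns a timestamp (hypothetical stand-ins for the
                monotonic and monotonic-raw reads).
    gpu_clock:  zero-argument callable returning a GPU timestamp
                (hypothetical stand-in for a register read).
    Returns the samples and the measured width of the sampling window.
    """
    begin = time.monotonic_ns()
    # First loop: CPU time domains only.
    samples = {name: read() for name, read in cpu_clocks.items()}
    # Second loop: the GPU domain, always after every CPU sample.
    samples["gpu"] = gpu_clock()
    end = time.monotonic_ns()
    # (end - begin) plays the role of the sampling interval I.
    return samples, end - begin
```

Keeping the CPU reads contiguous means the GPU read never lands between two CPU reads, which is the property the quoted suggestion relies on.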
> > That said "start of the interval of the tick" is kinda arbitrary and
> > you could pick random other points on that interval, so depending on
> > what requirements you put on it (i.e. can the chosen position be
> > different per call, consistent but implicit or explicitly picked which
> > might be leaked through the interface) the max deviation changes. So
> > depending on interpretation this thing can be very moot ...
>
> It doesn't really matter what phase you use; the timer increments
> periodically, and what really matters is the time when that happens
> relative to other clocks changing.
>

Agreed.

Thinking about this a bit more, I think it helps to consider each clock to
be a real number that's changing continuously in time, and what you
actually measure is floor(x / P(x)), where P(x) is the period of x in
nanoseconds (or ceil; it doesn't matter so long as you're consistent). At
any given point, the clocks do have an exact value relative to each other.
When you sample, you grab floor(M / P(M)), floor(G / P(G)), and
floor(R / P(R)), all in some interval of size I. The delta between the
real values sampled is at most I, but the sampling takes a floor operation,
so the actual value of any given clock C may be as much as P(C) greater
than what was sampled but it cannot be lower (assuming the floor
convention). This leaves us with a delta of I + max(P(M), P(R), P(G)). In
particular, any two real-number valued times are, instantaneously, within
that interval.

The next question becomes: if I sample again and assume zero clock drift,
what are the bounds on the next sampling? Above, we calculated the maximum
delta between real-valued clocks. However, because we're sampling again,
we may end up with more phase shift issues and any clock may, again, be
off by as much as P(C).
However, again assuming no drift, no clock is going to be off with
respect to itself; it is just sampled at a different phase, so I think the
largest delta you can see between two clocks in the two samplings is the
sum of their periods. So if the delta we're looking for is a delta for a
theoretical second sampling, I think it's I plus the maximum of the sums
of all pairs of periods.

Personally, I'm completely content to have the delta just be the first
one: a bound on the difference between any two real-valued times. At this
point, I can guarantee you that far more thought has been put into this
mesa-dev discussion than was put into the spec, and I think we're rapidly
getting to the point of diminishing returns. :-)

--Jason
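
The two candidate bounds described in the message can be written out as a small sketch. The function names and the numbers (a 1000 ns window, 1 ns CPU ticks, a 52 ns GPU tick) are assumptions chosen only for illustration; they do not come from the thread or from any driver.

```python
from itertools import combinations

def quantize(t_ns, period_ns):
    """Model a clock read: floor the real-valued time to a whole tick."""
    return (t_ns // period_ns) * period_ns

def single_sampling_bound(interval_ns, periods_ns):
    """I + max(P(C)): bound between any two real-valued times."""
    return interval_ns + max(periods_ns.values())

def resampling_bound(interval_ns, periods_ns):
    """I + max over clock pairs of P(A) + P(B): bound for a second sampling."""
    return interval_ns + max(
        a + b for a, b in combinations(periods_ns.values(), 2)
    )

# Hypothetical tick periods in nanoseconds for clocks M, R and G.
periods = {"M": 1, "R": 1, "G": 52}
I = 1000  # hypothetical sampling window in nanoseconds

print(single_sampling_bound(I, periods))  # 1052
print(resampling_bound(I, periods))       # 1053

# With the floor convention the real value is never below the sampled one
# and is less than one period above it.
t = 123_457
assert 0 <= t - quantize(t, periods["G"]) < periods["G"]
```

With one clock much coarser than the others, the two bounds differ only by the period of the next-coarsest clock, which is consistent with the view that the simpler first bound is good enough.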
--000000000000107cda0578634919-- --===============0838996888== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg== --===============0838996888==--