From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: BUG: NOW() seems to (sometimes) go backwards! Date: Wed, 8 Jun 2016 15:43:37 +0200 Message-ID: <1465393417.15816.206.camel@citrix.com> References: <1465314895.15816.165.camel@citrix.com> <575812BD02000078000F2F82@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============7223054000126524434==" Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bAdlu-0005eY-JD for xen-devel@lists.xenproject.org; Wed, 08 Jun 2016 13:43:50 +0000 In-Reply-To: <575812BD02000078000F2F82@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Jan Beulich Cc: Andrew Cooper , Joao Martins , Tim Deegan , George Dunlap , "xen-devel@lists.xenproject.org" List-Id: xen-devel@lists.xenproject.org --===============7223054000126524434== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-efDKiVy67YCKrF912hqQ" --=-efDKiVy67YCKrF912hqQ Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2016-06-08 at 04:42 -0600, Jan Beulich wrote: > > > > On 07.06.16 at 17:54, wrote: > > So, it looks to me that the TSC is actually ok, and it could be the > > local_tsc_stamp and scale_delta() magic done in get_s_time_fixed() > > (which, AFAICUI, is there to convert cycles to time) that does not > > guarantee the result to be monotonic, even if the input is... > > Thoughts? > Indeed this smells like an issue in the scaling. However, the scaling > values vary only when !CONSTANT_TSC. Can you check that this > flag is actually set on that system?=20 > Checked. I do have it. I instrumented the code to print stuff if it finds it, and it prints. Also: root@Zhaman:~# xl debug-keys s (XEN) [=C2=A0=C2=A0406.719464] TSC marked as reliable, warp =3D 0 (count=3D= 3) (XEN) [=C2=A0=C2=A0406.719467] dom1(hvm): mode=3D0,ofs=3D0xffffd9279716c276= ,khz=3D2394069,inc=3D4,vtsc count: 195367 kernel, 0 user > (I hope you're not running a > strange Dom0 setup with FREQCTL_dom0_kernel in effect.)=20 > I've no idea what this is. I've been running 4.1.0, built myself, and stock Debian unstable 4.5.0, and I'm seeing the issue in both cases. Looking FREQCTL_dom0_kernel up, I guess you mean what happens when one passes cpufreq=3D"dom0-kernel" to xen on boot command line. In which case, no, I'm not doing that. > And > at the same time that it's time_calibration_tsc_rendezvous() that > is in use? >=20 The code you're referring to should be this: =C2=A0 =C2=A0 /* If we have constant-rate TSCs then scale factor can be sha= red. */ =C2=A0=C2=A0=C2=A0=C2=A0if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ) =C2=A0=C2=A0=C2=A0=C2=A0{ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* If TSCs are not marked a= s 'reliable', re-sync during rendezvous. */ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( !boot_cpu_has(X86_FEAT= URE_TSC_RELIABLE) ) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0tim= e_calibration_rendezvous_fn =3D time_calibration_tsc_rendezvous; =C2=A0=C2=A0=C2=A0=C2=A0} And I have both X86_FEATURE_CONSTANT_TSC and X86_FEATURE_TSC_RELIABLE. I've again instrumented the code in order to check whether it is time_calibration_tsc_rendezvous() or time_calibration_std_rendezvous() that is being used, and it's the latter: (XEN) [=C2=A0=C2=A0=C2=A0=C2=A01.795916] TSC reliable. Yay!! Using ffff82d0= 80198362 for rendevousez [dario@Solace xen.git] $ objdump -D xen/xen-syms |grep ffff82d080198362 ffff82d080198362 : which looks correct to me. > Yet when the scaling values get set only once at boot, monotonic > (cross-CPU) TSC means monotonic (cross-CPU) returns from NOW(). >=20 Yep. And at this point, this is what needs to be verified, I guess... > In any event - could you try to exclude C- and P-state effects? Of > course that makes sense only if you can reasonably repro the > problem situation (and hence can tell from its absence over a certain > period of time that whatever change was done did have some > positive effect). >=20 It's actually quite hard *NOT* to reproduce the problem... it happens all the time, and if the load is high enough, I see the "Time went backwards?" printk several times per second! So, trying to do what you suggest in an online fashion, i.e., issueing: =C2=A0# xenpm=C2=A0set-max-cstate 0 =C2=A0# xenpm=C2=A0set-scaling-governor all performance did not change the situation (I still see the printks). I've then tried passing both cpufreq=3D"none" and=C2=A0max_cstate=3D0 to xe= n at boot, but they made no difference at all either. Most of the time, we're speaking of very small skews, e.g.: (XEN) [=C2=A0=C2=A0=C2=A059.999959] WARNING: __update_runq_load: Time went = backwards? now 59999946079 llu 59999946085 (XEN) [=C2=A0=C2=A0117.595508] WARNING: __update_runq_load: Time went backw= ards? now 117595495295 llu 117595495310 i.e., 6 nanoseconds or 15 nanoseconds! Then there are instances where the difference is bigger (microseconds time scale, like in the first email of the thread). > How big of system are we talking about? I'm asking to assess the > overhead of adding some cross-CPU checks to get_s_time_fixed() > (in a debugging patch), logging relevant values if non-monotonic > output gets detected. (On too big a system, the overhead here > might end up masking the problem.) >=20 Yeah, I sort of tried doing something like that already, but was logging the wrong thing (I was not yet suspecting a problem with scaling). I can try putting something together again. In any case, the system I use most for testing is a 16 cpus (in 2 NUMA nodes) Xeon. But I see the issue even within the same socket, so I can easily reduce to use a subset of the available processors. Thanks for your time. :-) Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-efDKiVy67YCKrF912hqQ Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXWCEKAAoJEBZCeImluHPuc7IP/RxMhxmEsSfEjSWXQaefpQYV CNftMMpvljoe1k4uYhZKl+00K/jliRc6GN3eExlYo2cp+X+/FdKljiGIT8OlmHS8 Uge8Xw+0TEJXC3hsNhTuYhQ4CxxIZpLA4baZ7gav+0Fq6068vBh23I+xUQo0rmOz IM/agyhWD9Dw54LvxjhbU52M62VGu3lswhGwsmETjrOQdxRCsuYaN+c1ZOGROJ6O ACy2F4V2ybHDVrGX8D3k0M02snHyH/U1w5VaVLRk3wr1bJIDDXS6bfG9APrizZRQ mLYtjQdZl1xFIdwDvEMpfDp0c2rmmK+4TvWHZNrJXbB6df8qpiHTbmzQ7Nwc+pbE NgPj7Qz0aAB9hoZDS0x+6mbCpCrsx4vDm0DW1leNUKIkRqhLotLwn88EcP+nBy9v /7nNPhcGtEcG7IU7Ujdj9zH7/q2EDKOqSyvL64t3AYimX3L5vrI/b7qYAp5CBSiU 1Wz6LskyUks8g67fmQbjc/D0w2yBOlLJB9s+1nBCJmkRKv2B4icMFF7W3bBx3NHr 0Tw2SPzUL7TJKMTyksEMtkxiTCbyZxAm3z32zUNsYSXo2gG7IFj1lQdxJbbW4CwW tEE5DWrNT7X+ybwPoYfY3B+zB1ltYBUoQtLjjuGjkVlqGAVw81woH3Vi1JVVaFTS 7dzzbLf82Ak2TMmKmXTG =m1UG -----END PGP SIGNATURE----- --=-efDKiVy67YCKrF912hqQ-- --===============7223054000126524434== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwOi8vbGlzdHMueGVuLm9y Zy94ZW4tZGV2ZWwK --===============7223054000126524434==--