From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: BUG: NOW() seems to (sometimes) go backwards! Date: Tue, 7 Jun 2016 17:54:55 +0200 Message-ID: <1465314895.15816.165.camel@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============7268147003227721746==" Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bAJM8-0000EB-SY for xen-devel@lists.xenproject.org; Tue, 07 Jun 2016 15:55:53 +0000 List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Joao Martins , Tim Deegan , George Dunlap , Jan Beulich List-Id: xen-devel@lists.xenproject.org --===============7268147003227721746== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-ZdNafubp8+Ua3khrIWh9" --=-ZdNafubp8+Ua3khrIWh9 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi, I've been fighting with this during the past few days, while testing and benchmarking some of the modification I'm doing to Credit2. In summary, what I see is NOW() going backwards. The issues shows up when I call NOW() on different pCPUs, and compare the results (which is something the Credit2 scheduler does a lot, and I would like it to be able to continue to do so. :-D). I can elaborate on why this is really really bad for scheduling in general, and even more so for Credit2. For now, let me just say it's pretty bad. Feel free to ask for details if interested. In any case, IMO, we need to fix this. It does not only happen cross-socket, i.e., it happens even if the two pCPUs are part of the same socket (in the example shown below, they're two SMT threads of the same core!!). I'm able to trigger this on more than just one box, all with invariant TSC support. (All Intel, though, I don't have any AMD test hardware available right now). In some more details. The scheduler (both in scheduler.c and in sched_credit{1,2}.c) uses NOW() for all the timing and time sampling. In the trace below, at time T1 (equal to 1.264582607), we do this: =C2=A0d0v9->start_time =3D NOW() and we are on pCPU 3. Then, at time T2, on pCPU2, there is this: =C2=A0if ( d0v9->start_time < NOW() ) =C2=A0{ =C2=A0 =C2=A0/* OMG!! */ =C2=A0 =C2=A0__trace("csched2:time_backwards ...") =C2=A0} and we indeed hit the __trace(). What the trace point says is that NOW() was 1972180296557 at time T1, on pCPU 3, and it is 1972180270044 at time T2 on pCPU2 (i.e., ~26.5 microsecs in the past). I speak about NOW() going backwards because, by looking at both the code and the trace, I am positive about the fact that T2 happens later than T1 (go to [1], if interested)... So, why NOW() is smaller at T2? :-( =C2=A0=C2=A0=C2=A01.264577304 ||x||||||||||||| d0v10 runstate_change d0v10 = running->runnable =C2=A0=C2=A0=C2=A01.264577611 ||x||||||||||||| d?v? runstate_change d0v5 ru= nnable->running =C2=A0=C2=A0=C2=A01.264580025 |||x||||||-||||| d0v0 vcpu_block d0v0 =C2=A0=C2=A0=C2=A01.264580795 |||x||||||-||||| d0v0 csched2:schedule curren= t =3D d0v0, now =3D 1972180296557 =C2=A0=C2=A0=C2=A01.264581014 |||x||||||-||||| d0v0 csched2:burn_credits d0= v0, credit =3D 9091956, delta =3D 39433, start_time =3D 1972180296557 =C2=A0=C2=A0=C2=A01.264581660 |||x||||||-||||| d0v0 sched_switch prev d0, r= un for 112.503us =C2=A0=C2=A0=C2=A01.264581809 |||x||||||-||||| d0v0 sched_switch next d0, w= as runnable for 54.532us, next slice 2000.0us =C2=A0=C2=A0=C2=A01.264582033 |||x||||||-||||| d0v0 sched_switch prev d0v0 = next d0v9 =C2=A0=C2=A0=C2=A01.264582269 |||x||||||-||||| d0v0 runstate_change d0v0 ru= nning->blocked T1=C2=A01.264582607 |||x||||||-||||| d?v? runstate_change d0v9 runnable->ru= nning =C2=A0=C2=A0=C2=A01.264583109 ||x|||||||-||||| d0v5 csched2:runq_insert d0v= 10, position 0 =C2=A0=C2=A0=C2=A01.264583341 ||x|||||||-||||| d0v5 csched2:runq_tickle_new= d0v10, credit =3D 5691963 =C2=A0=C2=A0=C2=A01.264583907 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v5, credit =3D 8575685, delta =3D 8241, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264584236 ||x|||||||-||||| d0v5 csched2:burn_credits d1= v1, credit =3D 8696613, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264584431 ||x|||||||-||||| d0v5 csched2:tickle_check d1= v1, credit =3D 8696613 =C2=A0=C2=A0=C2=A01.264584745 ||x|||||||-||||| d0v5 csched2:burn_credits d1= v3, credit =3D 10182550, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264584966 ||x|||||||-||||| d0v5 csched2:tickle_check d1= v3, credit =3D 10182550 T2=C2=A01.264585207 ||x|||||||-||||| d0v5 csched2:time_backwards d0v9: cur_= time =3D 1972180270044, cur_cpu =3D 2, old_time =3D 1972180296557, old_cpu = =3D 3 =C2=A0=C2=A0=C2=A01.264585364 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v9, credit =3D 5759499, delta =3D -26513, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264585537 ||x|||||||-||||| d0v5 csched2:tickle_check d0= v9, credit =3D 5759499 =C2=A0=C2=A0=C2=A01.264585876 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v7, credit =3D 8835314, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264586041 ||x|||||||-||||| d0v5 csched2:tickle_check d0= v7, credit =3D 8835314 =C2=A0=C2=A0=C2=A01.264586491 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v15, credit =3D 5822635, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264586655 ||x|||||||-||||| d0v5 csched2:tickle_check d0= v15, credit =3D 5822635 =C2=A0=C2=A0=C2=A01.264586952 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v13, credit =3D 7956174, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264587115 ||x|||||||-||||| d0v5 csched2:tickle_check d0= v13, credit =3D 7956174 =C2=A0=C2=A0=C2=A01.264587385 ||x|||||||-||||| d0v5 csched2:burn_credits d0= v3, credit =3D 7870338, delta =3D 12920, start_time =3D 1972180270044 =C2=A0=C2=A0=C2=A01.264587545 ||x|||||||-||||| d0v5 csched2:tickle_check d0= v3, credit =3D 7870338 =C2=A0=C2=A0=C2=A01.264595859 |||x||||||-||||| d0v9 csched2:schedule curren= t =3D d0v9, now =3D 1972180311345 =C2=A0=C2=A0=C2=A01.264596313 |||x||||||-||||| d0v9 csched2:burn_credits d0= v9, credit =3D 5718198, delta =3D 4877, start_time =3D 1972180311345 Interestingly enough, if we look at the same (well, sort of) set of events with just xentrace_format, we, after having sorted the records by their timestamps, get this: CPU2=C2=A0=C2=A05613124777311 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0507)=C2=A0= =C2=A0running_to_runnable [ dom:vcpu =3D 0x0000000a ] CPU2=C2=A0=C2=A05613124778046 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0735)=C2=A0= =C2=A0runnable_to_running [ dom:vcpu =3D 0x00000005 ] CPU3=C2=A0=C2=A05613124788657 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0537)=C2=A0= =C2=A0__enter_scheduler [ prev =3D 0x00000000, next =3D= 0x00000009 ] CPU3=C2=A0=C2=A05613124789224 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0567)=C2=A0= =C2=A0running_to_blocked=C2=A0=C2=A0[ dom:vcpu =3D 0x00000000 ] T1 CPU3=C2=A0=C2=A05613124790034 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0810)=C2=A0= =C2=A0runnable_to_running [ dom:vcpu =3D 0x00000009 ] CPU2=C2=A0=C2=A05613124791240 (+=C2=A0=C2=A0=C2=A013194)=C2=A0=C2=A0csch= ed2:runq_pos=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x0000= 000a, pos =3D 0] CPU2=C2=A0=C2=A05613124791795 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0555)=C2=A0= =C2=A0csched2:tickle_new=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x0000= 000a, credit =3D 0x5691963 ] CPU2=C2=A0=C2=A05613124793154 (+=C2=A0=C2=A0=C2=A0=C2=A01359)=C2=A0=C2= =A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc, cr= edit =3D 459, delta =3D 5 ] CPU2=C2=A0=C2=A05613124793943 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0789)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 65537 ] CPU2=C2=A0=C2=A05613124794411 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0468)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00010001, cred= it =3D 8696613 ] CPU2=C2=A0=C2=A05613124795164 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0753)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 65539 ] CPU2=C2=A0=C2=A05613124795695 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0531)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00010003, cred= it =3D 10182550 ] T2 CPU2=C2=A0=C2=A05613124796274 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0579)=C2=A0= =C2=A0unknown (0x000000000002221e)=C2=A0=C2=A0[ 0x2f1abbdc 0x000001cb 0x2f1= b236d 0x000001cb 0x00000009 0x00030002 0x00000000 ] CPU2=C2=A0=C2=A05613124796649 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0375)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 9 ] CPU2=C2=A0=C2=A05613124797066 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0417)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00000009, cred= it =3D 5759499 ] CPU2=C2=A0=C2=A05613124797879 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0813)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 7 ] CPU2=C2=A0=C2=A05613124798275 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0396)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00000007, cred= it =3D 8835314 ] CPU2=C2=A0=C2=A05613124799355 (+=C2=A0=C2=A0=C2=A0=C2=A01080)=C2=A0=C2= =A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc, cr= edit =3D 459, delta =3D 15 ] CPU2=C2=A0=C2=A05613124799748 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0393)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x0000000f, cred= it =3D 5822635 ] CPU2=C2=A0=C2=A05613124800459 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0711)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 13 ] CPU2=C2=A0=C2=A05613124800852 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0393)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x0000000d, cred= it =3D 7956174 ] CPU2=C2=A0=C2=A05613124801500 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0648)=C2=A0= =C2=A0csched2:credit burn=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x2f1abbdc,= credit =3D 459, delta =3D 3 ] CPU2=C2=A0=C2=A05613124801884 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0384)=C2=A0= =C2=A0csched2:tickle_check=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00000003, cred= it =3D 7870338 ] CPU3=C2=A0=C2=A05613124803177 (+=C2=A0=C2=A0=C2=A013143)=C2=A0=C2=A0csch= ed2:update_load CPU3=C2=A0=C2=A05613124803786 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0609)=C2=A0= =C2=A0unknown (0x0000000000022220)=C2=A0=C2=A0[ 0x0000d61e 0x00000000 0x000= 00000 0x00000000 0x00000000 0x00000000 0x00000000 ] CPU3=C2=A0=C2=A05613124804560 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0774)=C2=A0= =C2=A0csched2:updt_runq_load [ rq_load[16]:rq_id[8]:wshift[8] =3D 0x1200000= 9, rq_avgload =3D 0x0000000000216566, b_avgload =3D 0x0000000000216566, CPU3=C2=A0=C2=A05613124804929 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0369)=C2=A0= =C2=A0unknown (0x0000000000022220)=C2=A0=C2=A0[ 0x0001e8de 0x00000000 0x000= 00000 0x00000000 0x00000000 0x00000000 0x00000000 ] CPU3=C2=A0=C2=A05613124805256 (+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0327)=C2=A0= =C2=A0csched2:updt_vcpu_load [ dom:vcpu =3D 0x00000000, vcpuload =3D 0x0000= 000000026040, wshift =3D 18 ] CPU2=C2=A0=C2=A05613124975596 (+=C2=A0=C2=A0173712)=C2=A0=C2=A0domain_wa= ke=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0[ dom:vcpu =3D 0x00010005 ] Interestingly, the timestamp at T1 is 5613124790034, which looks _correctly_ smaller than the timestamp at T2 (5613124796274). NOW(), as it is called from the scheduling code, does this: =C2=A0s_time_t NOW(u64 at_tsc) =C2=A0//it's actually get_s_time_fixed() =C2=A0{ =C2=A0 =C2=A0 =C2=A0struct cpu_time *t =3D &this_cpu(cpu_time); =C2=A0 =C2=A0 =C2=A0u64 tsc, delta; =C2=A0 =C2=A0 =C2=A0s_time_t now; =C2=A0 =C2=A0 =C2=A0tsc =3D rdtsc(); =C2=A0 =C2=A0 =C2=A0delta =3D tsc - t->local_tsc_stamp; =C2=A0 =C2=A0 =C2=A0now =3D t->stime_local_stamp + scale_delta(delta, &t->t= sc_scale); =C2=A0 =C2=A0 =C2=A0return now; =C2=A0} While the timestamps in the trace are the result of get_cycles(): =C2=A0static inline cycles_t get_cycles(void) =C2=A0{ =C2=A0 =C2=A0 =C2=A0return rdtsc(); =C2=A0} So, it looks to me that the TSC is actually ok, and it could be the local_tsc_stamp and scale_delta() magic done in get_s_time_fixed() (which, AFAICUI, is there to convert cycles to time) that does not guarantee the result to be monotonic, even if the input is... Thoughts? Archeology says that the current shape of get_s_time_fixed() comes from: commit bbb0eb32255d1a3e97994d9c57ca5e3d50de4298 commit 8cf6cfe286aa6949f9bd546d03e275e21e9723db But commit messages are not particularly helpful... and I'm not familiar at all with all this. I'll keep investigating, but in the meanwhile, any thought and idea is more than welcome. Thanks and Regards, Dario [1] So. all the various functions/chunks of code of interest here are serialized by the same spinlock (there is a runqueue for the whole socket). Considering both the code and the trace, the flow of execution is this: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0CPU2 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0CPU3 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0csched2_schedule() =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_lock(rq->lock) =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0| 1.264577304 =C2=A0 d0v10 --> runnable =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264577611 =C2=A0 current <-- d0v5 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_unlock(rq->lock) =C2= =A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = | 1.264580025 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0vcpu_block(d0v0) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_lock(rq->lock) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_unlock(rq->lock) 1.264580795 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 csched2_schedule() =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_lock(rq->lock) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0current <-- d0v9 1.264582607 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0d0v9->start_time =3D NOW() =C2=A0(***T1) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_unlock(rq->lock) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0context_saved(d0v10) =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_lock(rq->lock); =C2= =A0 =C2=A0 =C2=A0 =C2=A0 | 1.264583109 =C2=A0 runq_insert(d0v10) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264583341 =C2=A0 runq_tickle_new(d0v10) =C2=A0 =C2=A0 =C2=A0 | 1.264583907 =C2=A0 burn_credits(d0v5) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264584236 =C2=A0 burn_credits(d1v1) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264584431 =C2=A0 tickle_check(d1v1) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264584745 =C2=A0 burn_credits(d1v3) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264584966 =C2=A0 tickle_check(d1v3) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264585207 =C2=A0 !!time_backwards(d0v9) =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(***T2) 1.264585364 =C2=A0 burn_credits(d0v9) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1.264585537 =C2=A0 tickle_check d0v9 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ... =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ... =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 spin_lock(rq->lock) =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0| NOW() is always sampled and checked *inside* a critical section protected by rq->lock, so it's not possible that the value gets stale waiting for a lock (I've fixed a couple of instances of this already). So, if this is what happens, it's correct to expect that NOW() at T2 is ahead of NOW() at T1 (and hence, what really happens, T2processor, and in this case, this would be d0v10 checking d0v5, at time 1.264583907. Afterwards, we check the vcpu running on each pCPU (of our runqueue), and that would be d1v1 on pCPU 0, d1v3 on pCPU 1, then we skip d0v5 on pCPU 2 (we already checked it outside of the loop), and then we check d0v9 on pCPU 3. And this is eventually why I say I'm sure about the execution flow: d0v9 is current on pCPU 3 _only_ if the call to=C2=A0csched2_schedule() tha= t put it in execution has happened already (still on pCPU 3), and that, thanks also to locking, means d0v9->start_time =3D NOW() at T1 already happened when we reach T2. --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-ZdNafubp8+Ua3khrIWh9 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXVu5QAAoJEBZCeImluHPu5lUQAMjGw4FzgFpvUSVcvdxM5wt7 WEax9bYDr2K0XqRDBWBISInisUWjxK0Uq9pRQNTQb6j4it84SeFnohb3mgypdt0P GiV08JCM3Pklzme2A4KCP/lMT8WON3nqyP+DDXXhXDH3owdIa/mVsqMNP/5Lw9cZ l+vfcEABX2+14diAhDQPaZ7F9xZAmq0hA/9X1xLTj2jumWYLuu+56JZXdsqww89n 4KzhRaXS+cbPIIiy8W5HZlCcb+ZlOx21c3Vo6XjWZR71n+bY2jjEhR0j8/kKkNrS LJ/B63cqZP6aDO4E5+3/3ieeQL5drEfyD3PHfx6mHAHy6FELrKr8c4uHMRuni0SD 4Q0Q34bQC0NKMM2n/va4mTUKIQi816MosvFUAIThU36wHIbStMbiZB/2gs7X/xFt 95sqmjfm/RQ+nel1eguLWedvIiZpj9D7G074/Pc2V7IZKzd021drHcEtBEVOZsBC eNUZMhCw8BJHlIJyI+NJiQ02dzUzMJ0bARrBucFUKxeutYd+HGlXfiqBp+zRvIXz 81/rx5DoTkcLnoxhyx+eSWulfY2Bd/yBBN7dH+cNMZLwFZwUZ6Ya92B5GGGTK8OR JLEkxysln61buRM2H5O+qJzt4BSkFjbIjMU/tkJf8IesPgZJA+Q0chAwiWqlf8ff 0pHTtc098V61XLgKzn9Q =/lD6 -----END PGP SIGNATURE----- --=-ZdNafubp8+Ua3khrIWh9-- --===============7268147003227721746== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwOi8vbGlzdHMueGVuLm9y Zy94ZW4tZGV2ZWwK --===============7268147003227721746==--