From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4DE396D4.40508@domain.hid>
Date: Mon, 30 May 2011 15:08:36 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <4DDFB780.4010009@domain.hid> <4DDFBDCD.4040809@domain.hid>
 <4DDFEDA2.40206@domain.hid> <4DDFF74E.2000400@domain.hid>
 <4DE1078D.3090503@domain.hid> <20110530070322.GA3248@domain.hid>
 <4DE34223.8030505@domain.hid> <4DE34AA3.2090500@domain.hid>
 <4DE34E02.6000206@domain.hid> <4DE371F4.5040304@domain.hid>
 <20110530103324.GA26311@domain.hid> <4DE38E79.20308@domain.hid>
In-Reply-To: <4DE38E79.20308@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-help] Huge clock drift
List-Id: Help regarding installation and common use of Xenomai
To: Jonas Witt
Cc: xenomai@xenomai.org

On 2011-05-30 14:32, Jonas Witt wrote:
> On 30.05.2011 12:33, Pavel Machek wrote:
>> On Mon 2011-05-30 12:31:16, Jonas Witt wrote:
>>> On 30.05.2011 09:57, Jan Kiszka wrote:
>>>> On 2011-05-30 09:43, Jonas Witt wrote:
>>>>> On 30.05.2011 09:07, Jan Kiszka wrote:
>>>>>> On 2011-05-30 09:03, Pavel Machek wrote:
>>>>>>> On Sat 2011-05-28 16:32:45, Jan Kiszka wrote:
>>>>>>>> On 2011-05-27 21:11, Gilles Chanteperdrix wrote:
>>>>>>>>> On 05/27/2011 08:29 PM, Jonas Witt wrote:
>>>>>>>>>> Sorry, I missed the NTP-part. I am not using NTP. Just plain timer
>>>>>>>>>> queries on a single system.
>>>>>>>>>>
>>>>>>>>>> My clock source is tsc which is the same for Xenomai I suppose.
>>>>>>>>>>
>>>>>>>>>> I wonder how a Xenomai task, even if it occupies 50% or even 90% of a 4
>>>>>>>>>> milliseconds time slice can interfere with the tsc. The tsc is not
>>>>>>>>>> incremented via an interrupt, is it? But I do not know much about the
>>>>>>>>>> inner workings of these functions.
>>>>>>>>> The problem is not the clocksource, the problem is the timer interrupt.
>>>>>>>>> The kernel expects 1 timer tick every millisecond.
>>>>>>>> Not on archs that are CONFIG_NO_HZ capable.
>>>>>>> Umm. NO_HZ is only active while system is idle. Kernel will still
>>>>>>> expect the periodic ticks when CPU is busy....
>>>>>>>
>>>>>>> (I'm not sure how the compensation works; perhaps it can compensate
>>>>>>> even while busy..)
>>>>>> See update_wall_time, the !CONFIG_ARCH_USES_GETTIMEOFFSET includes no
>>>>>> fixed tick length.
>>>>>>
>>>>>> Again, this is also important for Linux when running over hypervisors
>>>>>> which tend to miss ticks on overcommitment as well.
>>>>>>
>>>>>> Jan
>>>>> Thanks for the active discussion of the issue. I attached my config.
>>>>> CONFIG_NO_HZ is activated and I think I disabled all power management
>>>>> and frequency scaling correctly. Do you think it is worth trying a
>>>>> kernel with fixed Hz as Gilles suggested? Actually the 1ms Xenomai load
>>>>> seems to play at least some role in the issue.
>>>> For sure, I may also be proven wrong by plain reality.
>>>>
>>>> In addition, enable CONFIG_PM and ACPI with the exception of
>>>> ACPI_PROCESSOR. Who knows what your BIOS is doing in the absence of OS
>>>> support for this.
>>>>
>>>> Jan
>>> I just compiled another kernel with an alternate configuration as
>>> you and Gilles described (see the attached file). Now this is the
>>> result:
>>>
>>> # ./clocktest
>>> == Tested clock: 0 (CLOCK_REALTIME)
>>> CPU      ToD offset [us] ToD drift [us/s]      warps max delta [us]
>>> --- -------------------- ---------------- ---------- --------------
>>>   0           -1004111.0            0.026          0            0.0
>>>   1           -1004110.4            0.025          0            0.0
>>>
>>>
>>> Looks perfect now (even with 2500us processing of 4000us periods)! A
>>> big thank you to all of you. So either the 100Hz changed the
>>> situation or the ACPI changes. The secondary mode switches for my
>>> XenoQueue are still there, though. I will work on a minimal test
>>> program to reproduce this. Thanks again!
>>> Do you think this configuration advice should be put somewhere for
>>> others to read?
>> If you could verify config with 100Hz but no ACPI changes, that would
>> be great...
> I just built another kernel with power management completely disabled
> and got similar timing results. So it actually seems to be related to
> timer interrupts that are missed in the 1000Hz setting as Gilles suggested.

Weird and not explainable. I'm currently running an RT CPU hog that eats
500 ms of each second on a single-core x86-64 box, and clocktest reports
the very same drift as without any load.

I'll see if I can give your .config a try later on.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
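
For readers following the argument about missed ticks, here is a toy
Python model (not kernel code, and not a diagnosis of the drift reported
in this thread). It contrasts a clock that adds a fixed tick length per
handled timer interrupt with one that adds the elapsed delta of a
free-running counter such as the TSC, which is the behaviour Jan
attributes to update_wall_time. Everything in it is an assumption made
for illustration: the names simulate and Result, the parameters
blocked_us and period_us, and above all the rule that a real-time task
delays Linux tick handling until it finishes, with at most one pending
tick surviving per busy window.

```python
from typing import NamedTuple


class Result(NamedTuple):
    raised: int            # ticks generated by the (simulated) timer hardware
    delivered: int         # ticks the (simulated) kernel actually handled
    tick_clock_us: int     # clock adding a fixed tick length per handled tick
    counter_clock_us: int  # clock adding the elapsed free-running counter delta


def simulate(hz: int, duration_s: float, blocked_us: int, period_us: int) -> Result:
    """Toy model: a real-time task runs for the first `blocked_us` of every
    `period_us`, and tick handling is deferred until that window ends."""
    tick_us = 1_000_000 // hz
    total_us = round(duration_s * 1_000_000)

    raised = 0
    deliveries = set()
    for fired in range(tick_us, total_us + 1, tick_us):
        raised += 1
        handled = fired
        if fired % period_us < blocked_us:
            # Assumption: the tick stays pending until the busy window ends.
            # Only one pending tick is latched, so every further tick in the
            # same window maps to the same delivery time and is lost.
            handled = fired - fired % period_us + blocked_us
        deliveries.add(handled)

    tick_clock = counter_clock = last_counter = 0
    for now in sorted(deliveries):
        tick_clock += tick_us                # assumes exactly one tick elapsed
        counter_clock += now - last_counter  # measures what really elapsed
        last_counter = now
    return Result(raised, len(deliveries), tick_clock, counter_clock)


if __name__ == "__main__":
    # Load taken from the thread: 2500 us of processing in each 4000 us period.
    for hz in (1000, 100):
        print(hz, simulate(hz, 1.0, blocked_us=2500, period_us=4000))
```

Under these assumptions the 1000 Hz run handles only about half of the
raised ticks, so the tick-counting clock falls far behind, while the
100 Hz run handles every tick (late, but none lost). The counter-based
clock ends at the same value in both runs, which is why a kernel that
accumulates clocksource deltas should not drift just because ticks go
missing.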