Nicholas Mc Guire wrote:
>
>> Latencies are mainly due to cache refills on the P4. Have you already
>> put load onto your system? If not, worst case latencies will be even
>> longer.
>
> one possibility we found in RTLinux/GPL to reduce latency is to free up
> TLBs by flushing a few of the TLB hot spots, basically these flushpoints
> are something like:
>
>   __asm__ __volatile__("invlpg %0": :"m"
>                        (*(char*)__builtin_return_address(0)));
>
> put at places where we know we don't need those lines any more (i.e.
> after switching tasks or the like). By inserting only a few such
> flushpoints in hot code on the kernel side we found a clear reduction
> of the worst case jitter and interrupt response times.

Interesting. Are these flushpoints present in the latest kernel patches of
RTLinux/GPL? Sounds like a nice thing to play with on a rainy day. :)

>
> Aside from caches, BTB exhaustion in high load situations is also a
> problem that has not been addressed much in the realtime variants - with
> the P6 families having a botched BTB prediction unit, one can use some
> "strange" constructions to reduce branch penalties - i.e.:
>
>   if(!condition){slow_path();}
>   else{fast_path();}
>
> is more predictable than
>
>   if(condition){fast_path();}
>   else{slow_path();}

I think this is also what likely()/unlikely() teaches the compiler on
x86 (where there is no branch prediction predicate for the
instructions), isn't it?

>
> as in the first case the branch prediction is static, thus the worst case
> is that you are jumping over a few bytes of object code when the condition
> is not met. in the second case the default if the BTB does not yet know
> this branch is to guess not-taken and thus load the jump target of the
> slow path with the overhead of TLB/Cache penalties.
>
> Regarding the PPC numbers, the surprising thing for me is that the same
> archs are doing MUCH better with old RTAI/RTLinux versions, i.e. 2.4.4
> kernel on a 50MHz MPC860 shows a worst case of 57us - so I do question
> what is going wrong here in the 2.6.X branches of hard-realtime Linux -

You forget that the old stuff was kernel-only, lacking a lot of Linux
integration features. Recent I-pipe-based real-time via Xenomai normally
includes support for user-space RT (you can switch it off, but hardly
anyone does). So it's not a useful comparison, given that new real-time
projects almost always want full-featured user space these days. For a
fairer comparison, one should consider a simple I-pipe domain that
contains the real-time "application".

> my suspicion is that there is too much work being done on fast-hot CPUs
> and the low-end is being neglected - which is bad as the numbers you
> post here for ADEOS are numbers reachable with mainstream preemptive
> kernel by now as well (of course not on the low end systems though).

That's scenario-dependent. Simple setups like a plain timed task can
reach the dimension of I-pipe-based Xenomai, but more complex scenarios
suffer from the exploding complexity in mainstream Linux, even with -rt.
Just think of "simple" mutexes realised via futexes.

Jan
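
For reference, the branch-layout idea discussed above can be sketched with GCC's __builtin_expect, which is what the kernel's likely()/unlikely() macros wrap. This is only a minimal illustration, not code from RTLinux/GPL or Xenomai: fast_path(), slow_path() and handle_event() are made-up names, and whether the compiler really places the fast path as the fall-through depends on compiler version and optimisation level.

```c
/* Same shape as the Linux kernel macros; needs GCC or Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins for the real work (names are assumptions). */
static int fast_path(void) { return 0; }
static int slow_path(void) { return 1; }

/*
 * Hypothetical entry point. 'error' is assumed to be rare, so the
 * hint asks the compiler to keep fast_path() on the straight-line
 * path and move slow_path() out of the way. The hint only affects
 * code layout; the result is the same with or without it.
 */
int handle_event(int error)
{
    if (unlikely(error))
        return slow_path();
    return fast_path();
}
```

Comparing the output of "gcc -O2 -S" with and without the hint should show whether the block order actually changes on a given toolchain.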