* [Xenomai] latency spikes under load
@ 2013-12-03 11:38 Kurijn Buys
  2013-12-03 11:54 ` Gilles Chanteperdrix
  2013-12-03 12:31 ` Gilles Chanteperdrix
  0 siblings, 2 replies; 38+ messages in thread
From: Kurijn Buys @ 2013-12-03 11:38 UTC (permalink / raw)
  To: Xenomai

Hi all,

I'm new to Xenomai and have spent the last few weeks trying to set it up on a Pentium 4 (see below for hardware, software and config details). Something seems to be wrong, as I experience large latency spikes, especially under load.
I searched the archives and followed the troubleshooting guide (notably the part on high latencies), but with no success...
I ran the I-pipe tracer with latency -f (and dohell in another terminal) and pasted the frozen output at the end of this message.
I googled some of the terms in this log, but I can't figure out how to interpret it. I hope some of you have advice on that and/or see potential problems in my set-up...

Thanks in advance!
]{urijn

Some tests I performed:
-xenomai/bin/latency (priority 99) -> latency spikes from time to time; under load they can reach up to 400µs, but even without load they can sometimes reach 200µs.
-A similar test (from: http://www.blaess.fr/christophe/livres/solutions-temps-reel-sous-linux/) with a 3000 µs period allowed me to collect statistics. While running the xenomai/bin/dohell program in parallel, I obtained a normal distribution within the range [2913 - 3087]µs, except for about ten (out of 336688) measurements which fell between 1µs and 400µs! (also tested with /proc/sys/kernel/sched_rt_runtime_us set to -1). A minimal sketch of this kind of periodic test is shown below.
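
This is roughly what I mean (a minimal sketch only, not the exact program from the link above; the task name, the priority 99 and the 3000 µs period are just illustrative), assuming the Xenomai 2.6 native skin and a build against xeno-config --skin=native:

  #include <stdio.h>
  #include <sys/mman.h>
  #include <rtdk.h>
  #include <native/task.h>
  #include <native/timer.h>

  #define PERIOD_NS 3000000ULL          /* 3000 us period, as in the test above */

  static RT_TASK periodic_task;

  static void task_body(void *arg)
  {
          RTIME expected, now;

          /* make the current task periodic, starting now */
          rt_task_set_periodic(NULL, TM_NOW, PERIOD_NS);
          expected = rt_timer_read();

          for (;;) {
                  rt_task_wait_period(NULL);   /* block until the next period */
                  now = rt_timer_read();
                  expected += PERIOD_NS;
                  /* positive = woke up late, negative = woke up early */
                  rt_printf("jitter: %lld ns\n", (long long)(now - expected));
          }
  }

  int main(void)
  {
          mlockall(MCL_CURRENT | MCL_FUTURE);  /* avoid page faults in the RT path */
          rt_print_auto_init(1);               /* lock-free printing from RT context */
          rt_task_create(&periodic_task, "lat_sketch", 0, 99, T_JOINABLE);
          rt_task_start(&periodic_task, task_body, NULL);
          rt_task_join(&periodic_task);
          return 0;
  }

(A real test would of course aggregate the samples into a histogram rather than print each one.)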

The kernel log looks normal to me...:
[    1.973038] I-pipe: Domain Xenomai registered.
[    1.973559] Xenomai: hal/i386 started.
[    1.973949] Xenomai: scheduling class idle registered.
[    1.974135] Xenomai: scheduling class rt registered.
[    1.998987] Xenomai: real-time nucleus v2.6.3 (Lies and Truths) loaded.
[    1.999195] Xenomai: debug mode enabled.
[    2.000391] Xenomai: SMI-enabled chipset found
[    2.000605] Xenomai: SMI workaround enabled
[    2.001015] Xenomai: starting native API services.
[    2.001198] Xenomai: starting POSIX services.
[    2.001753] Xenomai: starting RTDM services.

And the files in /proc/ipipe look the way they should...

Hardware: Pentium IV (lspci: 3.2GHz, i686, 32/64-bit capable, 2 CPUs), 2GB RAM
Software: Ubuntu 10.04, kernel & I-pipe patch 2.6.38.8, Xenomai 2.6.3
Installation details:
-kernel configuration options (starting from the kernel config that my machine was already using)
-http://www.xenomai.org/index.php/Configuring_x86_kernels
-http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#kconf
-options for Analogy (as I want to use a National Instruments card): http://www.lara.unb.br/wiki/index.php/Data_Acquisition_Xenomai_Analogy
-to avoid conflicts, I deactivated the Comedi drivers (Device drivers/Staging drivers/Exclude Staging drivers from being built/Data acquisition support (comedi) -> disabled)
-as recommended by the book "Solutions temps réel sous Linux":
  -set "Processor type and features/Preemption Model" to "Preemptible Kernel (Low-Latency Desktop)"
  -enable "Real-time sub-system/Priority Coupling Support"
-I-pipe tracer options (http://www.xenomai.org/index.php/I-pipe:Tracer)
-xenomai configuration options: --enable-smp --enable-x86-sep --enable-x86-tsc --enable-debug
(I chose to enable SMP as I found it enabled in the kernel options that worked with the original Ubuntu install; however, I have also tried installing Xenomai without this option.)

Notes:
-My processor seems to support both 32 and 64 bit; I'm currently using 32 (but maybe I should have opted for 64?)
-I've set the SMI workaround by adding "xeno_hal.smi=-1" on the kernel command line (http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#SMI), but maybe it makes a difference when it is set in the kernel config instead, as proposed here (http://www.xenomai.org/index.php/Configuring_x86_kernels), via the CONFIG_XENO_HW_SMI_WORKAROUND variable. However, I couldn't find this variable in my kernel configuration...
-I disabled the legacy USB option in the BIOS configuration.
-After the dpkg command to install the Linux image I got a post-installation script error that seems to be due to the fact that the .deb still ships a vmlinuz file but no bzImage file (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561287); I successfully followed the instructions proposed on that page, correcting the relevant .postinst file and running apt-get -f install.


I-pipe frozen back-tracing service on 2.6.38.8-xenomai-2.6.3-2nd/ipipe-2.11-03
------------------------------------------------------------
CPU: 0, Freeze: 1977279815608 cycles, Trace Points: 128 (+10)
Calibrated minimum trace-point overhead: 0.222 us

 +----- Hard IRQs ('|': locked)
 |+---- <unused>
 ||+--- <unused>
 |||+-- Xenomai
 ||||+- Linux ('*': domain stalled, '+': current, '#': current+stalled)
 |||||                        +---------- Delay flag ('+': > 1 us, '!': > 10 us)
 |||||                        |        +- NMI noise ('N')
 |||||                        |        |
      Type    User Val.   Time    Delay  Function (Parent)
:    #func                -227    0.502  hid_dump_input+0xf [hid] (hid_process_event+0x25 [hid])
:    #func                -226    0.469  hid_resolv_usage+0x9 [hid] (hid_dump_input+0x23 [hid])
:    #func                -226    0.474  kmem_cache_alloc_trace+0xf (hid_resolv_usage+0x261 [hid])
:    #func                -225    0.432  ipipe_check_context+0xf (kmem_cache_alloc_trace+0x63)
:|   #begin   0x80000001  -225    0.557  ipipe_check_context+0xa3 (kmem_cache_alloc_trace+0x63)
:|   #end     0x80000001  -224    0.582  ipipe_check_context+0x87 (kmem_cache_alloc_trace+0x63)
:    #func                -224    0.509  __ipipe_restore_root+0x4 (kmem_cache_alloc_trace+0xca)
:    #func                -223    0.444  ipipe_check_context+0xf (__ipipe_restore_root+0x15)
:|   #begin   0x80000001  -223    0.524  ipipe_check_context+0xa3 (__ipipe_restore_root+0x15)
:|   #end     0x80000001  -222    0.639  ipipe_check_context+0x87 (__ipipe_restore_root+0x15)
:    #func                -221    0.954  memset+0xd (kmem_cache_alloc_trace+0xdf)
:    #func                -220    0.912  strnlen+0x3 (string+0x38)
:    #func                -220+   1.359  strlen+0x4 (hid_resolv_usage+0x1b0 [hid])
:    #func                -218    0.664  memcpy+0x11 (vsnprintf+0x2da)
:    #func                -218    0.559  strnlen+0x3 (string+0x38)
:    #func                -217    0.639  strlen+0x4 (hid_dump_input+0x2e [hid])
:    #func                -216    0.694  memcpy+0x11 (vsnprintf+0x2da)
:    #func                -216    0.522  memcpy+0x11 (vsnprintf+0x2da)
:    #func                -215    0.492  hid_debug_event+0x9 [hid] (hid_dump_input+0x59 [hid])
:    #func                -215    0.514  __wake_up+0xf (hid_debug_event+0xb5 [hid])
:    #func                -214    0.549  _raw_spin_lock_irqsave+0xd (__wake_up+0x20)
:    #func                -214    0.494  ipipe_check_context+0xf (_raw_spin_lock_irqsave+0x3e)
:|   #begin   0x80000001  -213    0.642  ipipe_check_context+0xa3 (_raw_spin_lock_irqsave+0x3e)
:|   #end     0x80000001  -212    0.544  ipipe_check_context+0x87 (_raw_spin_lock_irqsave+0x3e)
:    #func                -212    0.459  ipipe_check_context+0xf (add_preempt_count+0x15)
:|   #begin   0x80000001  -211    0.567  ipipe_check_context+0xa3 (add_preempt_count+0x15)
:|   #end     0x80000001  -211+   1.039  ipipe_check_context+0x87 (add_preempt_count+0x15)
:    #func                -210    0.759  __wake_up_common+0x9 (__wake_up+0x3c)
:    #func                -209    0.672  __ipipe_spin_unlock_debug+0x3 (__wake_up+0x43)
:    #func                -208    0.404  _raw_spin_unlock_irqrestore+0x3 (__wake_up+0x4c)
:    #func                -208    0.422  __ipipe_restore_root+0x4 (_raw_spin_unlock_irqrestore+0x19)
:    #func                -208    0.432  ipipe_check_context+0xf (__ipipe_restore_root+0x15)
:|   #begin   0x80000001  -207    0.582  ipipe_check_context+0xa3 (__ipipe_restore_root+0x15)
:|   #end     0x80000001  -207    0.584  ipipe_check_context+0x87 (__ipipe_restore_root+0x15)
:    #func                -206    0.472  ipipe_check_context+0xf (sub_preempt_count+0x15)
:|   #begin   0x80000001  -205    0.592  ipipe_check_context+0xa3 (sub_preempt_count+0x15)
:|   #end     0x80000001  -205    0.717  ipipe_check_context+0x87 (sub_preempt_count+0x15)
:    #func                -204    0.542  kfree+0x9 (hid_dump_input+0x60 [hid])
:    #func                -204    0.434  ipipe_check_context+0xf (kfree+0x87)
:|   #begin   0x80000001  -203    0.552  ipipe_check_context+0xa3 (kfree+0x87)
:|   #end     0x80000001  -203    0.532  ipipe_check_context+0x87 (kfree+0x87)
:    #func                -202    0.469  __ipipe_restore_root+0x4 (kfree+0xda)
:    #func                -202    0.419  ipipe_check_context+0xf (__ipipe_restore_root+0x15)
:|   #begin   0x80000001  -201    0.739  ipipe_check_context+0xa3 (__ipipe_restore_root+0x15)
:|   #end     0x80000001  -200    0.822  ipipe_check_context+0x87 (__ipipe_restore_root+0x15)
:    #func                -200    0.502  __wake_up+0xf (hid_dump_input+0x7c [hid])
:    #func                -199    0.484  _raw_spin_lock_irqsave+0xd (__wake_up+0x20)
:    #func                -199    0.462  ipipe_check_context+0xf (_raw_spin_lock_irqsave+0x3e)
:|   #begin   0x80000001  -198    0.654  ipipe_check_context+0xa3 (_raw_spin_lock_irqsave+0x3e)
:|   #end     0x80000001  -198    0.527  ipipe_check_context+0x87 (_raw_spin_lock_irqsave+0x3e)
:    #func                -197    0.484  ipipe_check_context+0xf (add_preempt_count+0x15)
:|   #begin   0x80000001  -197    0.679  ipipe_check_context+0xa3 (add_preempt_count+0x15)
:|   #end     0x80000001  -196    0.604  ipipe_check_context+0x87 (add_preempt_count+0x15)
:    #func                -195    0.479  __wake_up_common+0x9 (__wake_up+0x3c)
:    #func                -195    0.464  __ipipe_spin_unlock_debug+0x3 (__wake_up+0x43)
:    #func                -194    0.437  _raw_spin_unlock_irqrestore+0x3 (__wake_up+0x4c)
:    #func                -194    0.427  __ipipe_restore_root+0x4 (_raw_spin_unlock_irqrestore+0x19)
:    #func                -193    0.514  ipipe_check_context+0xf (__ipipe_restore_root+0x15)
:|   #begin   0x80000001  -193    0.634  ipipe_check_context+0xa3 (__ipipe_restore_root+0x15)
:|   #end     0x80000001  -192    0.607  ipipe_check_context+0x87 (__ipipe_restore_root+0x15)
:    #func                -192    0.672  ipipe_check_context+0xf (sub_preempt_count+0x15)
:|   #begin   0x80000001  -191+   2.031  ipipe_check_context+0xa3 (sub_preempt_count+0x15)
:|   #end     0x80000001  -189    0.584  ipipe_check_context+0x87 (sub_preempt_count+0x15)
:    #func                -188    0.444  hidinput_hid_event+0xf [hid] (hid_process_event+0xf2 [hid])
:    #func                -188    0.457  input_event+0xf (hidinput_hid_event+0x152 [hid])
:    #func                -187    0.519  _raw_spin_lock_irqsave+0xd (input_event+0x46)
:    #func                -187    0.459  ipipe_check_context+0xf (_raw_spin_lock_irqsave+0x3e)
:|   #begin   0x80000001  -187    0.584  ipipe_check_context+0xa3 (_raw_spin_lock_irqsave+0x3e)
:|   #end     0x80000001  -186    0.467  ipipe_check_context+0x87 (_raw_spin_lock_irqsave+0x3e)
:    #func                -185    0.392  ipipe_check_context+0xf (add_preempt_count+0x15)
:|   #begin   0x80000001  -185    0.549  ipipe_check_context+0xa3 (add_preempt_count+0x15)
:|   #end     0x80000001  -185    0.462  ipipe_check_context+0x87 (add_preempt_count+0x15)
:    #func                -184    0.447  add_input_randomness+0x4 (input_event+0x55)
:    #func                -184    0.414  add_timer_randomness+0x8 (add_input_randomness+0x32)
:    #func                -183    0.324  ipipe_check_context+0xf (add_preempt_count+0x15)
:|   #begin   0x80000001  -183    0.437  ipipe_check_context+0xa3 (add_preempt_count+0x15)
:|   #end     0x80000001  -182    0.454  ipipe_check_context+0x87 (add_preempt_count+0x15)
:    #func                -182    0.352  mix_pool_bytes_extract+0x9 (add_timer_randomness+0x8c)
:    #func                -182    0.364  _raw_spin_lock_irqsave+0xd (mix_pool_bytes_extract+0x48)
:    #func                -181    0.329  ipipe_check_context+0xf (_raw_spin_lock_irqsave+0x3e)
:|   #begin   0x80000001  -181    0.437  ipipe_check_context+0xa3 (_raw_spin_lock_irqsave+0x3e)
:|   #end     0x80000001  -180    0.407  ipipe_check_context+0x87 (_raw_spin_lock_irqsave+0x3e)
:    #func                -180    0.322  ipipe_check_context+0xf (add_preempt_count+0x15)
:|   #begin   0x80000001  -180    0.439  ipipe_check_context+0xa3 (add_preempt_count+0x15)
:|   #end     0x80000001  -179! 149.235  ipipe_check_context+0x87 (add_preempt_count+0x15)
:|   #begin   0xffffff0c   -30    0.707  ipipe_ipi0+0x30 (mix_pool_bytes_extract+0x105)
:|   #func                 -29    0.739  __ipipe_handle_irq+0x9 (ipipe_ipi0+0x37)
:|   #func                 -29    0.547  irq_to_desc+0x3 (__ipipe_handle_irq+0x250)
:|   #func                 -28    0.542  __ipipe_ack_apic+0x3 (__ipipe_handle_irq+0x257)
:|   #func                 -28    0.542  native_apic_mem_write+0x3 (__ipipe_ack_apic+0x1b)
:|   #func                 -27    0.597  __ipipe_dispatch_wired+0x11 (__ipipe_handle_irq+0x1ed)
:|   #func                 -26    0.789  __ipipe_dispatch_wired_nocheck+0x9 (__ipipe_dispatch_wired+0x48)
:|  #*func                 -26    0.594  xnintr_clock_handler+0x9 (__ipipe_dispatch_wired_nocheck+0x91)
:|  #*func                 -25    0.529  xntimer_tick_aperiodic+0x9 (xnintr_clock_handler+0xba)
:|  #*func                 -24    0.424  xnthread_periodic_handler+0x3 (xntimer_tick_aperiodic+0xce)
:|  #*func                 -24    0.579  xnpod_resume_thread+0x9 (xnthread_periodic_handler+0x25)
:|  #*[ 1226] -<?>-   99   -23    0.499  xnpod_resume_thread+0xe5 (xnthread_periodic_handler+0x25)
:|  #*func                 -23    0.807  T.832+0xf (xnpod_resume_thread+0x1bf)
:|  #*func                 -22    0.467  xntimer_next_local_shot+0xf (xntimer_tick_aperiodic+0x86)
:|  #*event   tick@33      -22    0.454  xntimer_next_local_shot+0x9f (xntimer_tick_aperiodic+0x86)
:|  #*func                 -21    0.517  native_apic_mem_read+0x3 (xntimer_next_local_shot+0xe3)
:|  #*func                 -21    0.584  native_apic_mem_write+0x3 (xntimer_next_local_shot+0xf6)
:|  #*func                 -20    0.709  __xnpod_schedule+0x9 (xnintr_clock_handler+0x245)
:|  #*[ 1625] -<?>-   -1   -19    0.459  __xnpod_schedule+0x132 (xnintr_clock_handler+0x245)
:|  #*func                 -19+   1.592  xnsched_pick_next+0x11 (__xnpod_schedule+0x1d2)
:|  #*[ 1226] -<?>-   99   -17    0.832  __xnpod_schedule+0x4e0 (xnpod_schedule+0x3d)
:|  #*func                 -17    0.624  T.832+0xf (__xnpod_schedule+0x175)
:|  #*func                 -16    0.594  T.832+0xf (xnpod_suspend_thread+0x160)
:|  #*func                 -15    0.614  xntimer_get_overruns+0xf (xnpod_wait_thread_period+0x14a)
:|  #*func                 -15    0.672  T.832+0xf (xnpod_wait_thread_period+0x162)
:|  #*func                 -14    0.614  __ipipe_restore_pipeline_head+0xc (T.832+0xb8)
:|  +*end     0x80000000   -13    0.787  __ipipe_restore_pipeline_head+0x80 (T.832+0xb8)
:|  +*begin   0x80000001   -13    0.762  __ipipe_dispatch_event+0x20b (__ipipe_syscall_root+0x51)
:|  +*end     0x80000001   -12    0.569  __ipipe_dispatch_event+0x297 (__ipipe_syscall_root+0x51)
:|  +*begin   0x80000001   -11+   3.271  __ipipe_syscall_root+0x1b3 (sysenter_past_esp+0x55)
:   +*func                  -8    0.494  __ipipe_syscall_root+0xf (sysenter_past_esp+0x55)
:   +*func                  -8    0.457  __ipipe_dispatch_event+0x9 (__ipipe_syscall_root+0x51)
:|  +*begin   0x80000001    -7    0.682  __ipipe_dispatch_event+0x2cb (__ipipe_syscall_root+0x51)
:|  +*end     0x80000001    -6    0.494  __ipipe_dispatch_event+0x1fa (__ipipe_syscall_root+0x51)
:   +*func                  -6    0.967  hisyscall_event+0xf (__ipipe_dispatch_event+0xc0)
:   +*func                  -5    0.854  xnshadow_sys_trace+0x11 (hisyscall_event+0x1db)
:   +*func                  -4    0.557  ipipe_trace_frozen_reset+0x9 (xnshadow_sys_trace+0xd9)
:   +*func                  -4    0.442  __ipipe_global_path_lock+0x9 (ipipe_trace_frozen_reset+0x1d)
:   +*func                  -3    0.477  __ipipe_spin_lock_irqsave+0xf (__ipipe_global_path_lock+0x22)
:|  +*begin   0x80000001    -3+   2.046  __ipipe_spin_lock_irqsave+0xa3 (__ipipe_global_path_lock+0x22)
:|  #*func                  -1    0.564  __ipipe_spin_unlock_irqcomplete+0xf (__ipipe_global_path_unlock+0x67)
:|  +*end     0x80000001     0    0.514  __ipipe_spin_unlock_irqcomplete+0x6c (__ipipe_global_path_unlock+0x67)
<   +*freeze  0x0002515c     0    0.594  xnshadow_sys_trace+0xe2 (hisyscall_event+0x1db)
 |  +*begin   0x80000001     0    0.654  __ipipe_dispatch_event+0x20b (__ipipe_syscall_root+0x51)
 |  +*end     0x80000001     1    0.669  __ipipe_dispatch_event+0x297 (__ipipe_syscall_root+0x51)
 |  +*begin   0x80000001     1    0.884  __ipipe_syscall_root+0x1b3 (sysenter_past_esp+0x55)
    +*func                   2    0.509  __ipipe_syscall_root+0xf (sysenter_past_esp+0x55)
    +*func                   3    0.412  __ipipe_dispatch_event+0x9 (__ipipe_syscall_root+0x51)
 |  +*begin   0x80000001     3    0.609  __ipipe_dispatch_event+0x2cb (__ipipe_syscall_root+0x51)
 |  +*end     0x80000001     4    0.482  __ipipe_dispatch_event+0x1fa (__ipipe_syscall_root+0x51)
    +*func                   4    0.492  hisyscall_event+0xf (__ipipe_dispatch_event+0xc0)
    +*func                   5    0.449  __rt_task_wait_period+0x7 (hisyscall_event+0x1db)
    +*func                   5    0.000  rt_task_wait_period+0x3 (__rt_task_wait_period+0x16)




* Re: [Xenomai] latency spikes under load
  2013-12-03 11:38 [Xenomai] latency spikes under load Kurijn Buys
@ 2013-12-03 11:54 ` Gilles Chanteperdrix
  2013-12-03 12:31 ` Gilles Chanteperdrix
  1 sibling, 0 replies; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-03 11:54 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/03/2013 12:38 PM, Kurijn Buys wrote:
> Hi all,
>
> New to Xenomai, and having passed the last few weeks on attempting to set it up on a Pentium 4 (see lower for hard-, software and config details), there seems to be something wrong, as I experience large latency spikes, especially under load.
> I searched in the archives, I followed the troubleshooting guide (notably the part on high latencies), but with no success...
> I ran the I-pipe tracer with latency -f (and
>   dohell in another terminal) and I pasted the frozen output at the end of this message.
> I googled for some terms to understand this log, but I can't figure out how to interpret it. I hope someone of you has some advice on that and/or sees potential problems in my set-up...
>
> Thanks in advance!
> ]{urijn
>
> Some tests I performed:
> -xenomai/bin/latency (priority 99) -> latency spike from time to time, especially under load they can be up to 400µs, but also without load they can be up to 200µs sometimes.
> -A similar test (from: http://www.blaess.fr/christophe/livres/solutions-temps-reel-sous-linux/) with a 3000 µs period, allowed to observe statistical output. While running the xenomai/bin/dohell program in parallel, I obtained a normal distribution, within the range [2913 - 3087]µs, except for about ten (out of 336688) measurements who occurred between 1µs and 400µs! (also tested with /proc/sys/kernel/sched_rt_runtime_us on -1)
>
> The kernel log looks normal to me...:
> [    1.973038] I-pipe: Domain Xenomai registered.
> [    1.973559] Xenomai: hal/i386 started.
> [    1.973949] Xenomai: scheduling class idle registered.
> [    1.974135] Xenomai: scheduling class rt registered.
> [    1.998987] Xenomai: real-time nucleus v2.6.3 (Lies and Truths) loaded.
> [    1.999195] Xenomai: debug mode enabled.
> [    2.000391] Xenomai: SMI-enabled chipset found
> [    2.000605] Xenomai: SMI workaround enabled
> [    2.001015] Xenomai: starting native API services.
> [    2.001198] Xenomai: starting POSIX services.
> [    2.001753] Xenomai: starting RTDM services.
>
> And the files in /proc/ipipe look how they should...
>
> Hardware: Pentium IV (lspci: 3,2GHz, i686, 32,64bit, 2 cpu's), 2Gb RAM
> Software: Ubuntu 10.04, kernel&patch 2.6.38.8, Xenomai 2.6.3
> Installation details:
> -kernel configuration options (starting from the kernel config that my machine was already using)
> -http://www.xenomai.org/index.php/Configuring_x86_kernels
> -http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#kconf
> -options for Analogy (as I want to use a National Instruments card): http://www.lara.unb.br/wiki/index.php/Data_Acquisition_Xenomai_Analogy
> -to avoid conflicts, I deactivated the Comedi drivers (Device drivers/Staging drivers/Exclude Staging drivers from being built/Data acquisition support (comedi) -> disabled)
> -as recommended by the book "Solutions temps réel sous Linux",
> -set "Processor type and features/Preemption Model" to "Preemptible Kernel (Low-Latency Desktop)
> -enable "Real-time sub-system/Priority Coupling Support"
> -I-pipe tracer options (http://www.xenomai.org/index.php/I-pipe:Tracer)
> -xenomai configuration options:  --enable-smp --enable-x86-sep --enable-x86-tsc --enable-debug
> (I chose to enable smp as I found smp enabled in the kernel options that worked with the original ubuntu install, however, I've tried also to install xenomai without this option here)
>
> Notes:
> -My processor seems to support both 32 as 64 bit, now I'm using 32 (but maybe I should have opted for 64?)
> -I've set the SMI workaround by adding "xeno_hal.smi=-1" on the kernel command line (http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#SMI), but maybe it makes a difference when it is set in the kernel config, as proposed here (http://www.xenomai.org/index.php/Configuring_x86_kernels), by setting the CONFIG_XENO_HW_SMI_WORKAROUND variable. However, I didn't find this variable in my kernel configuration...
> -I disabled the legacy USB switch at BIOS configuration level.

The least we can say is that it seems you have done your homework before 
posting to the list, congratulations, that is rather unusual. A few 
things to check:
- do you have ACPI enabled? If not, enable it;
- xeno_hal.smi=-1 normally disables the SMI workaround; the right option 
is xeno_hal.smi=1 to enable it, but this may be a typo, since the boot 
log you posted shows that the workaround is enabled;
- if your processor has hyper-threading, have you tried disabling it?

Regards.

-- 
					    Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 11:38 [Xenomai] latency spikes under load Kurijn Buys
  2013-12-03 11:54 ` Gilles Chanteperdrix
@ 2013-12-03 12:31 ` Gilles Chanteperdrix
  2013-12-03 13:07   ` Kurijn Buys
  1 sibling, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-03 12:31 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/03/2013 12:38 PM, Kurijn Buys wrote:
> Hardware: Pentium IV (lspci: 3,2GHz, i686, 32,64bit, 2 cpu's), 2Gb
> RAM Software: Ubuntu 10.04, kernel&patch 2.6.38.8, Xenomai 2.6.3

2.6.38 is rather old; could you try a more recent kernel like 3.8.13 to
see if you get the same problem?

> -I've set the SMI workaround by adding "xeno_hal.smi=-1" on the
> kernel command line
> (http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#SMI),
> but maybe it makes a difference when it is set in the kernel config,
> as proposed here
> (http://www.xenomai.org/index.php/Configuring_x86_kernels), by
> setting the CONFIG_XENO_HW_SMI_WORKAROUND variable. However, I didn't
> find this variable in my kernel configuration...

Yes, the SMI workaround has changed from a compile-time option to a
run-time option, in order to avoid having to recompile the kernel to
change the option.

-- 
                                                                Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 12:31 ` Gilles Chanteperdrix
@ 2013-12-03 13:07   ` Kurijn Buys
  2013-12-03 13:23     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-03 13:07 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

Thanks for the quick response,
ACPI is enabled; I only disabled "Processor" in there...
-1 was a typo indeed, it is set to 1...
I see SCHED_SMT [=y] in my kernel config... shall I recompile the kernel with this disabled, then? Or are there other things to try first/at the same time?

I realized that the test with sched_rt_runtime_us set to -1 was performed with an earlier set-up. When I set it to -1 now, I get better performance, but:
1) still spikes of up to 87µs under load with ./latency
2) still some completely shifted occurrences with the other latency test, with a 1000µs period (but now only 2 out of 890814), and the rest of the distribution lies in [861-1139]µs, which is also rather large, I suppose.

The ipipe trace after test (1) was similar to the one I posted; this line seems to be the problem, I suppose:
:|   #end     0x80000001  -179! 149.235  ipipe_check_context+0x87 (add_preempt_count+0x15)

Merci!
]{urijn

On 3 Dec 2013, at 12:31, Gilles Chanteperdrix wrote:

> On 12/03/2013 12:38 PM, Kurijn Buys wrote:
>> Hardware: Pentium IV (lspci: 3,2GHz, i686, 32,64bit, 2 cpu's), 2Gb
>> RAM Software: Ubuntu 10.04, kernel&patch 2.6.38.8, Xenomai 2.6.3
>
> 2.6.38 is rather old, could you try a more recent kernel like 3.8.13 to
> see if you get the same problem?

I tried with 3.8.13 first, but with the same results.
As I read on the internet that it is not advised to run too new a kernel with an older Ubuntu release, I switched to 2.6.38, which is close to the original Ubuntu 10.04 one: 2.6.32...

>> -I've set the SMI workaround by adding "xeno_hal.smi=-1" on the
>> kernel command line
>> (http://www.xenomai.org/documentation/xenomai-2.6/html/TROUBLESHOOTING/index.html#SMI),
>> but maybe it makes a difference when it is set in the kernel config,
>> as proposed here
>> (http://www.xenomai.org/index.php/Configuring_x86_kernels), by
>> setting the CONFIG_XENO_HW_SMI_WORKAROUND variable. However, I didn't
>> find this variable in my kernel configuration...
>
> Yes, the SMI workaround has changed from a compile-time option to a
> run-time option, in order to avoid having to recompile the kernel to
> change the option.
>
> --
>                                                                Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 13:07   ` Kurijn Buys
@ 2013-12-03 13:23     ` Gilles Chanteperdrix
  2013-12-03 15:31       ` Kurijn Buys
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-03 13:23 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/03/2013 02:07 PM, Kurijn Buys wrote:
> Thanks for the quick response, ACPI is enabled, I only disabled
> "Processor" in there... -1 was a typo indeed, it is at 1... I see
> SCHED_SMT [=y] in my kernel config... shall I recompile the kernel
> with this disabled then... no other things to try first/at the same
> time?

To remove hyperthreading, either:
- disable it in the BIOS configuration;
- or disable CONFIG_SMP (not SCHED_SMT) in the kernel configuration.

>
> I realized that the test with sched_rt_runtime_us on -1 I performed
> was with an earlier set-up. When I set it now to -1, I have better
> performance, but: 1) still spikes of up to 87us under load with
> ./latency 2) still some completely shifted occurrences with the other
> latency test, with a 1000µs period (but now only 2 out of 890814),
> and the rest of the distribution lies in [861-1139]µs, which is also
> rather large I suppose.

sched_rt_runtime_us should not make any difference.

Something else you should try is to disable root thread priority coupling.

>
> The ipipe trace after test (1) was similar to the one I posted, where
> this line seems to be the problem I suppose: :|   #end     0x80000001
> -179! 149.235  ipipe_check_context+0x87 (add_preempt_count+0x15)
>
> Merci! ]{urijn

You are welcome. Please avoid top-posting.

Regards.

-- 
					    Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 13:23     ` Gilles Chanteperdrix
@ 2013-12-03 15:31       ` Kurijn Buys
  2013-12-03 15:54         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-03 15:31 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:

> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>> Thanks for the quick response, ACPI is enabled, I only disabled
>> "Processor" in there... -1 was a typo indeed, it is at 1... I see
>> SCHED_SMT [=y] in my kernel config... shall I recompile the kernel
>> with this disabled then... no other things to try first/at the same
>> time?
>
> To remove hyperthreading, either:
> - disable it in the BIOS configuration;
> - or disable CONFIG_SMP (not SCHED_SMPT) in the kernel configuration.
>
Ah I see, CONFIG_SMP is also enabled...
I've disabled it in the BIOS, but no success (tell me if it is worth trying to disable it in the kernel config instead).

>>
>> I realized that the test with sched_rt_runtime_us on -1 I performed
>> was with an earlier set-up. When I set it now to -1, I have better
>> performance, but: 1) still spikes of up to 87us under load with
>> ./latency 2) still some completely shifted occurrences with the other
>> latency test, with a 1000µs period (but now only 2 out of 890814),
>> and the rest of the distribution lies in [861-1139]µs, which is also
>> rather large I suppose.
>
> sched_rt_runtime_us should not make any difference.
>
> Something else you should try is to disable root thread priority coupling.
>
I have tried a config with priority coupling support disabled before, but then the system was even more vulnerable to such latency peaks (however, the mean latency was a little lower!)
(I still have that kernel, but unfortunately the I-pipe tracer isn't installed there)

>>
>> The ipipe trace after test (1) was similar to the one I posted, where
>> this line seems to be the problem I suppose: :|   #end     0x80000001
>> -179! 149.235  ipipe_check_context+0x87 (add_preempt_count+0x15)
>>
...I hoped the I-pipe trace would help..?

>> Merci! ]{urijn
>
> You are welcome. Please avoid top-posting.
>
> Regards.
>
> --
>                                           Gilles.




* Re: [Xenomai] latency spikes under load
  2013-12-03 15:31       ` Kurijn Buys
@ 2013-12-03 15:54         ` Gilles Chanteperdrix
  2013-12-03 16:49           ` Kurijn Buys
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-03 15:54 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/03/2013 04:31 PM, Kurijn Buys wrote:
> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>
>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>> kernel with this disabled then... no other things to try first/at
>>> the same time?
>>
>> To remove hyperthreading, either: - disable it in the BIOS
>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>> kernel configuration.
>>
> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
> no success (tell me if it is worth trying to disable it in the kernel
> config in stead).

When you say "no success", do you mean you still have 2 CPUs? Or that you
still have latency spikes? If the former, then yes, try without CONFIG_SMP,
or pass nr_cpus=1 on the command line. If the latter, then no, testing
without CONFIG_SMP is useless.

>
>>>
>>> I realized that the test with sched_rt_runtime_us on -1 I
>>> performed was with an earlier set-up. When I set it now to -1, I
>>> have better performance, but: 1) still spikes of up to 87us under
>>> load with ./latency 2) still some completely shifted occurrences
>>> with the other latency test, with a 1000µs period (but now only 2
>>> out of 890814), and the rest of the distribution lies in
>>> [861-1139]µs, which is also rather large I suppose.
>>
>> sched_rt_runtime_us should not make any difference.
>>
>> Something else you should try is to disable root thread priority
>> coupling.
>>
> I have tried a config with priority coupling support disabled before,
> but then the system was even more vulnerable for such latency peaks
> (however the mean latency was a little lower!) (I still have the
> kernel, but unfortunately the I-pipe tracer isn't installed there)

Please keep priority coupling disabled in further tests.

>
>>>
>>> The ipipe trace after test (1) was similar to the one I posted,
>>> where this line seems to be the problem I suppose: :|   #end
>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>> (add_preempt_count+0x15)
>>>
> ...I hoped the I-pipe trace would help..?

Unfortunately the trace is not helping much. Either something happens on
the other CPU, or you have an SMI. Disabling SMP should avoid the first
issue, and for the SMI, I may have broken the SMI workaround code when
replacing the compile-time option with the run-time option, so you may
want to try Xenomai 2.6.2.1 to see if the SMI workaround really works in
that case.

Are you still running a 2.6 kernel, or did you upgrade to 3.8.13?

Regards.

-- 
					    Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 15:54         ` Gilles Chanteperdrix
@ 2013-12-03 16:49           ` Kurijn Buys
  2013-12-03 18:50             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-03 16:49 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 3 Dec 2013, at 15:54, Gilles Chanteperdrix wrote:

> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>>
>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>> kernel with this disabled then... no other things to try first/at
>>>> the same time?
>>>
>>> To remove hyperthreading, either: - disable it in the BIOS
>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>> kernel configuration.
>>>
>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>> no success (tell me if it is worth trying to disable it in the kernel
>> config in stead).
>
> When you say "no success", you mean you still have 2 cpus ? Or you still
> have latency pikes? If the former, then yes, try without CONFIG_SMP, or
> pass nr_cpus=1 on the command line. If the latter, then no, testing
> without CONFIG_SMP is useless.

the second: still latency spikes...
(lscpu says there is only 1 CPU now)

>
>>
>>>>
>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>> have better performance, but: 1) still spikes of up to 87us under
>>>> load with ./latency 2) still some completely shifted occurrences
>>>> with the other latency test, with a 1000µs period (but now only 2
>>>> out of 890814), and the rest of the distribution lies in
>>>> [861-1139]µs, which is also rather large I suppose.
>>>
>>> sched_rt_runtime_us should not make any difference.
>>>
>>> Something else you should try is to disable root thread priority
>>> coupling.
>>>
>> I have tried a config with priority coupling support disabled before,
>> but then the system was even more vulnerable for such latency peaks
>> (however the mean latency was a little lower!) (I still have the
>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>
> Please keep priority coupling disabled in further tests.
>
>>
>>>>
>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>> where this line seems to be the problem I suppose: :|   #end
>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>> (add_preempt_count+0x15)
>>>>
>> ...I hoped the I-pipe trace would help..?
>
> Unfortunately the trace is not helping much.

In case it helps, I've attached another trace (as a .txt file) where the following line seems to indicate a problem:
:    +func                -141! 117.825  i915_gem_flush_ring+0x9 [i915] (i915_gem_do_execbuffer+0xb46 [i915])
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: latencytest_ipipetrace_highload_root2.txt
URL: <http://www.xenomai.org/pipermail/xenomai/attachments/20131203/4fad1efe/attachment.txt>
-------------- next part --------------


> Either something happens on
> the other CPU, or you have an SMI. Disabling SMP should avoid the first
> issue, and for the SMI, I may have broke the SMI workaround code when
> replacing the compile-time option with the run-time option, so, you may
> want to try Xenomai 2.6.2.1 to see if the SMI workaround really works in
> that case.

I have a colleague who I think succeeded with the SMI workaround in 2.6.3, but I'm not sure; I'm waiting for his response.

> 
> Are you still running a 2.6 kernel, or did you upgrade to 3.8.13?

Still on 2.6; I started with 3.8.13 but it had the same problem.
Maybe I could start over with a newer Ubuntu as well; initially I didn't, as I tried to keep the system as close as possible to that of this colleague...

> 
> Regards.
> 
> -- 
> 					    Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 16:49           ` Kurijn Buys
@ 2013-12-03 18:50             ` Gilles Chanteperdrix
  2013-12-04  8:44               ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-03 18:50 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/03/2013 05:49 PM, Kurijn Buys wrote:
> On 3 Dec 2013, at 15:54, Gilles Chanteperdrix wrote:
> 
>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>>>
>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>> kernel with this disabled then... no other things to try first/at
>>>>> the same time?
>>>>
>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>> kernel configuration.
>>>>
>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>>> no success (tell me if it is worth trying to disable it in the kernel
>>> config in stead).
>>
>> When you say "no success", you mean you still have 2 cpus ? Or you still
>> have latency pikes? If the former, then yes, try without CONFIG_SMP, or
>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>> without CONFIG_SMP is useless.
> 
> the second: still latency...
> (lscpu says there is only 1 cpu now)
> 
>>
>>>
>>>>>
>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>> out of 890814), and the rest of the distribution lies in
>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>
>>>> sched_rt_runtime_us should not make any difference.
>>>>
>>>> Something else you should try is to disable root thread priority
>>>> coupling.
>>>>
>>> I have tried a config with priority coupling support disabled before,
>>> but then the system was even more vulnerable for such latency peaks
>>> (however the mean latency was a little lower!) (I still have the
>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>
>> Please keep priority coupling disabled in further tests.
>>
>>>
>>>>>
>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>> (add_preempt_count+0x15)
>>>>>
>>> ...I hoped the I-pipe trace would help..?
>>
>> Unfortunately the trace is not helping much.
> 
> If it would help, I've another trace (joint as txt) where the following line seems to indicate a problem:
> :    +func                -141! 117.825  i915_gem_flush_ring+0x9 [i915] (i915_gem_do_execbuffer+0xb46 [i915])
> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).

Ah, this is a known issue then. I traced this issue back some time ago,
and from what I understood on the rt-users mailing list it is fixed in
more recent kernels. So, I would advise updating to the 3.10.18 branch,
available via git:

git://git.xenomai.org/ipipe.git branch ipipe-3.10.18

Alternatively, you can disable graphics acceleration by using the
XFree/Xorg fbdev driver instead of i915.

Regards.

-- 
                                                                Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-03 18:50             ` Gilles Chanteperdrix
@ 2013-12-04  8:44               ` Philippe Gerum
  2013-12-04  8:51                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04  8:44 UTC (permalink / raw)
  To: Gilles Chanteperdrix, Kurijn Buys; +Cc: Xenomai

On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>> On 3 Dec 2013, at 15:54, Gilles Chanteperdrix wrote:
>>
>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>>>>
>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>> the same time?
>>>>>
>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>> kernel configuration.
>>>>>
>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>>>> no success (tell me if it is worth trying to disable it in the kernel
>>>> config in stead).
>>>
>>> When you say "no success", you mean you still have 2 cpus ? Or you still
>>> have latency pikes? If the former, then yes, try without CONFIG_SMP, or
>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>> without CONFIG_SMP is useless.
>>
>> the second: still latency...
>> (lscpu says there is only 1 cpu now)
>>
>>>
>>>>
>>>>>>
>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>
>>>>> sched_rt_runtime_us should not make any difference.
>>>>>
>>>>> Something else you should try is to disable root thread priority
>>>>> coupling.
>>>>>
>>>> I have tried a config with priority coupling support disabled before,
>>>> but then the system was even more vulnerable for such latency peaks
>>>> (however the mean latency was a little lower!) (I still have the
>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>
>>> Please keep priority coupling disabled in further tests.
>>>
>>>>
>>>>>>
>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>> (add_preempt_count+0x15)
>>>>>>
>>>> ...I hoped the I-pipe trace would help..?
>>>
>>> Unfortunately the trace is not helping much.
>>
>> If it would help, I've another trace (joint as txt) where the following line seems to indicate a problem:
>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9 [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).
>
> Ah this is a known issue then. I traced back this issue some time ago,
> and from what I understood on the rt-users mailing list it is fixed on
> more recent kernels. So, I would advise to update to 3.10.18 branch,
> available here by git:

Incidentally, I've been chasing a latency issue on x86 involving the 
i915 chipset recently on 3.10, and it turned out that we were still 
badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in 
the GEM control code, when the LLC cache is present.

The jitter incurred by invalidating all internal caches exceeds 300 us 
in my test case, so it seems that we are not there yet.
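
To illustrate the mechanism for readers following the thread, here is a rough sketch of that kind of cross-CPU flush (this is not the literal i915 call path, just the general pattern of issuing wbinvd on every core via an IPI; the function names here are made up):

  #include <linux/smp.h>
  #include <asm/special_insns.h>

  /* runs on each CPU in IPI context */
  static void flush_caches_ipi(void *info)
  {
          wbinvd();   /* write back and invalidate this CPU's internal caches */
  }

  static void flush_caches_everywhere(void)
  {
          /*
           * Interrupt every online CPU and wait for the flush to complete.
           * Each core stalls, with interrupts off, for the whole duration
           * of its wbinvd, which is where the hundreds of microseconds of
           * jitter come from: not even the I-pipe can preempt that.
           */
          on_each_cpu(flush_caches_ipi, NULL, 1);
  }

(x86 kernels of this era provide wbinvd_on_all_cpus(), which does essentially this.)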

>
> git://git.xenomai.org/ipipe.git branch ipipe-3.10.18
>
> Alternatively, you can disable graphic acceleration by using XFree/Xorg
> fbdev driver instead of i915.
>

Yes, this is still the safest option with some chipsets unfortunately.

> Regards.
>


-- 
Philippe.



* Re: [Xenomai] latency spikes under load
  2013-12-04  8:44               ` Philippe Gerum
@ 2013-12-04  8:51                 ` Gilles Chanteperdrix
  2013-12-04  9:27                   ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04  8:51 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 09:44 AM, Philippe Gerum wrote:
> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>> On 3 Dec 2013, at 15:54, Gilles Chanteperdrix wrote:
>>>
>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>>>>>
>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>>> the same time?
>>>>>>
>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>> kernel configuration.
>>>>>>
>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>>>>> no success (tell me if it is worth trying to disable it in the kernel
>>>>> config in stead).
>>>>
>>>> When you say "no success", you mean you still have 2 cpus ? Or you still
>>>> have latency pikes? If the former, then yes, try without CONFIG_SMP, or
>>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>>> without CONFIG_SMP is useless.
>>>
>>> the second: still latency...
>>> (lscpu says there is only 1 cpu now)
>>>
>>>>
>>>>>
>>>>>>>
>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>
>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>
>>>>>> Something else you should try is to disable root thread priority
>>>>>> coupling.
>>>>>>
>>>>> I have tried a config with priority coupling support disabled before,
>>>>> but then the system was even more vulnerable for such latency peaks
>>>>> (however the mean latency was a little lower!) (I still have the
>>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>>
>>>> Please keep priority coupling disabled in further tests.
>>>>
>>>>>
>>>>>>>
>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>> (add_preempt_count+0x15)
>>>>>>>
>>>>> ...I hoped the I-pipe trace would help..?
>>>>
>>>> Unfortunately the trace is not helping much.
>>>
>>> If it would help, I've another trace (joint as txt) where the following line seems to indicate a problem:
>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9 [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).
>>
>> Ah this is a known issue then. I traced back this issue some time ago,
>> and from what I understood on the rt-users mailing list it is fixed on
>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>> available here by git:
>
> Incidentally, I've been chasing a latency issue on x86 involving the
> i915 chipset recently on 3.10,

was it 3.10 or 3.10.18?

> and it turned out that we were still
> badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in
> the GEM control code, when the LLC cache is present.
>
> The jitter incurred by invalidating all internal caches exceeds 300 us
> in my test case, so it seems that we are not there yet.

Ok, maybe the preempt_rt workaround is only enabled for 
CONFIG_PREEMPT_RT? In which case we can try and import the patch into 
the I-pipe.

-- 
					    Gilles.



* Re: [Xenomai] latency spikes under load
  2013-12-04  8:51                 ` Gilles Chanteperdrix
@ 2013-12-04  9:27                   ` Philippe Gerum
  2013-12-04  9:31                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04  9:27 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>> On 3 Dec 2013, at 15:54, Gilles Chanteperdrix wrote:
>>>>
>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>> On 3 Dec 2013, at 13:23, Gilles Chanteperdrix wrote:
>>>>>>
>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>>>> the same time?
>>>>>>>
>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>> kernel configuration.
>>>>>>>
>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>>>>>> no success (tell me if it is worth trying to disable it in the kernel
>>>>>> config in stead).
>>>>>
>>>>> When you say "no success", you mean you still have 2 cpus ? Or you
>>>>> still
>>>>> have latency pikes? If the former, then yes, try without
>>>>> CONFIG_SMP, or
>>>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>>>> without CONFIG_SMP is useless.
>>>>
>>>> the second: still latency...
>>>> (lscpu says there is only 1 cpu now)
>>>>
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>
>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>
>>>>>>> Something else you should try is to disable root thread priority
>>>>>>> coupling.
>>>>>>>
>>>>>> I have tried a config with priority coupling support disabled before,
>>>>>> but then the system was even more vulnerable for such latency peaks
>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>>>
>>>>> Please keep priority coupling disabled in further tests.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>
>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>
>>>>> Unfortunately the trace is not helping much.
>>>>
>>>> If it would help, I've another trace (joint as txt) where the
>>>> following line seems to indicate a problem:
>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>> -- The Open University is incorporated by Royal Charter (RC 000391),
>>>> an exempt charity in England & Wales and a charity registered in
>>>> Scotland (SC 038302).
>>>
>>> Ah this is a known issue then. I traced back this issue some time ago,
>>> and from what I understood on the rt-users mailing list it is fixed on
>>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>>> available here by git:
>>
>> Incidentally, I've been chasing a latency issue on x86 involving the
>> i915 chipset recently on 3.10,
>
> was it 3.10 or 3.10.18 ?
>

http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10

which is currently 3.10.18.

>> and it turned out that we were still
>> badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in
>> the GEM control code, when the LLC cache is present.
>>
>> The jitter incurred by invalidating all internal caches exceeds 300 us
>> in my test case, so it seems that we are not there yet.
>
> Ok, maybe the preempt_rt workaround is only enabled for
> CONFIG_PREEMPT_RT? In which case we can try and import the patch in the
> I-pipe.
>

Looking at the comment in the GEM code, this invalidation is required to 
flush transactions before updating the fence register.

-- 
Philippe.



* Re: [Xenomai] latency spikes under load
  2013-12-04  9:27                   ` Philippe Gerum
@ 2013-12-04  9:31                     ` Gilles Chanteperdrix
  2013-12-04  9:40                       ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04  9:31 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 10:27 AM, Philippe Gerum wrote:
> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>> geschreven:
>>>>>
>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het volgende
>>>>>>> geschreven:
>>>>>>>
>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>>>>> the same time?
>>>>>>>>
>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>> kernel configuration.
>>>>>>>>
>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in BIOS, but
>>>>>>> no success (tell me if it is worth trying to disable it in the kernel
>>>>>>> config in stead).
>>>>>>
>>>>>> When you say "no success", you mean you still have 2 cpus ? Or you
>>>>>> still
>>>>>> have latency pikes? If the former, then yes, try without
>>>>>> CONFIG_SMP, or
>>>>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>>>>> without CONFIG_SMP is useless.
>>>>>
>>>>> the second: still latency...
>>>>> (lscpu says there is only 1 cpu now)
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>
>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>
>>>>>>>> Something else you should try is to disable root thread priority
>>>>>>>> coupling.
>>>>>>>>
>>>>>>> I have tried a config with priority coupling support disabled before,
>>>>>>> but then the system was even more vulnerable for such latency peaks
>>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>>>>
>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>
>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>
>>>>>> Unfortunately the trace is not helping much.
>>>>>
>>>>> If it would help, I've another trace (joint as txt) where the
>>>>> following line seems to indicate a problem:
>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>
>>>> Ah this is a known issue then. I traced back this issue some time ago,
>>>> and from what I understood on the rt-users mailing list it is fixed on
>>>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>>>> available here by git:
>>>
>>> Incidentally, I've been chasing a latency issue on x86 involving the
>>> i915 chipset recently on 3.10,
>>
>> was it 3.10 or 3.10.18 ?
>>
>
> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>
> which is currently 3.10.18.
>
>>> and it turned out that we were still
>>> badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in
>>> the GEM control code, when the LLC cache is present.
>>>
>>> The jitter incurred by invalidating all internal caches exceeds 300 us
>>> in my test case, so it seems that we are not there yet.
>>
>> Ok, maybe the preempt_rt workaround is only enabled for
>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in the
>> I-pipe.
>>
>
> Looking at the comment in the GEM code, this invalidation is required to
> flush transactions before updating the fence register.
>

From what I understood, the preempt_rt patch asks users to pin the X 
server to one cpu and disables the IPI, so the invalidation only runs 
on that cpu. That said, if that had solved the issue, Kurijn would not 
have observed the latency spikes when running with only one cpu.
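
Pinning is typically done from userland with something like 
"taskset -pc 0 <pid of the X server>". As a minimal illustration of the 
generic mechanism (this is not part of the -rt patch, just how any 
process can be pinned), the equivalent in C is:

	/* Pin a process to CPU 0; pass the target pid as argv[1]. */
	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>
	#include <stdlib.h>

	int main(int argc, char *argv[])
	{
		cpu_set_t set;
		pid_t pid = argc > 1 ? atoi(argv[1]) : 0; /* 0 = this process */

		CPU_ZERO(&set);
		CPU_SET(0, &set);	/* restrict to CPU 0 only */

		if (sched_setaffinity(pid, sizeof(set), &set)) {
			perror("sched_setaffinity");
			return 1;
		}
		return 0;
	}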

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04  9:31                     ` Gilles Chanteperdrix
@ 2013-12-04  9:40                       ` Philippe Gerum
  2013-12-04  9:51                         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04  9:40 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>> geschreven:
>>>>>>
>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het volgende
>>>>>>>> geschreven:
>>>>>>>>
>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>>>>>> the same time?
>>>>>>>>>
>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>> kernel configuration.
>>>>>>>>>
>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>> BIOS, but
>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>> kernel
>>>>>>>> config in stead).
>>>>>>>
>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or you
>>>>>>> still
>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>> CONFIG_SMP, or
>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>>>>>> without CONFIG_SMP is useless.
>>>>>>
>>>>>> the second: still latency...
>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>
>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>
>>>>>>>>> Something else you should try is to disable root thread priority
>>>>>>>>> coupling.
>>>>>>>>>
>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>> before,
>>>>>>>> but then the system was even more vulnerable for such latency peaks
>>>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>>>>>
>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>
>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>
>>>>>>> Unfortunately the trace is not helping much.
>>>>>>
>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>> following line seems to indicate a problem:
>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>
>>>>> Ah this is a known issue then. I traced back this issue some time ago,
>>>>> and from what I understood on the rt-users mailing list it is fixed on
>>>>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>>>>> available here by git:
>>>>
>>>> Incidentally, I've been chasing a latency issue on x86 involving the
>>>> i915 chipset recently on 3.10,
>>>
>>> was it 3.10 or 3.10.18 ?
>>>
>>
>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>
>> which is currently 3.10.18.
>>
>>>> and it turned out that we were still
>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in
>>>> the GEM control code, when the LLC cache is present.
>>>>
>>>> The jitter incurred by invalidating all internal caches exceeds 300 us
>>>> in my test case, so it seems that we are not there yet.
>>>
>>> Ok, maybe the preempt_rt workaround is only enabled for
>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in the
>>> I-pipe.
>>>
>>
>> Looking at the comment in the GEM code, this invalidation is required to
>> flush transactions before updating the fence register.
>>
>
>  From what I understood, the preempt_rt patch asks users to pin the X
> server on one cpu and disables the IPI, so the invalidation can be run
> on only one cpu. That said, if that had solved the issue, Kurijn would
> not have observed the latency spikes when running with only one cpu.
>

	if (HAS_LLC(obj->base.dev))
		on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);

So this runs synchronously on every CPU, regardless of how many CPUs 
there are. In addition, this code runs with interrupts enabled. Some of 
my tests were conducted in UP mode, to make sure we were not inheriting 
a locking latency from another core like we did with the APIC madness 
in the early days, and the jitter was still right there. I don't see 
much hope.
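
For clarity, the helper's contract (as I recall it on 3.x kernels; see 
<linux/smp.h> for the authoritative version) is:

	/*
	 * Runs func(info) on every online CPU: on SMP through a cross-CPU
	 * IPI plus a local call, on UP simply on the local CPU.
	 */
	int on_each_cpu(void (*func)(void *info), void *info, int wait);
	/*
	 * With wait != 0 the caller does not return until func() has
	 * completed everywhere, and func() preempts whatever was running
	 * on each core. So even a UP box pays for the local wbinvd, and an
	 * SMP box additionally waits for the slowest core to finish.
	 */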

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04  9:40                       ` Philippe Gerum
@ 2013-12-04  9:51                         ` Gilles Chanteperdrix
  2013-12-04 10:29                           ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04  9:51 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 10:40 AM, Philippe Gerum wrote:
> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>>> geschreven:
>>>>>>>
>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het volgende
>>>>>>>>> geschreven:
>>>>>>>>>
>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>>>>>> kernel with this disabled then... no other things to try first/at
>>>>>>>>>>> the same time?
>>>>>>>>>>
>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>>> kernel configuration.
>>>>>>>>>>
>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>> BIOS, but
>>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>>> kernel
>>>>>>>>> config in stead).
>>>>>>>>
>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or you
>>>>>>>> still
>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>> CONFIG_SMP, or
>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no, testing
>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>
>>>>>>> the second: still latency...
>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>> performed was with an earlier set-up. When I set it now to -1, I
>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us under
>>>>>>>>>>> load with ./latency 2) still some completely shifted occurrences
>>>>>>>>>>> with the other latency test, with a 1000µs period (but now only 2
>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>
>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>
>>>>>>>>>> Something else you should try is to disable root thread priority
>>>>>>>>>> coupling.
>>>>>>>>>>
>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>> before,
>>>>>>>>> but then the system was even more vulnerable for such latency peaks
>>>>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed there)
>>>>>>>>
>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>
>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>
>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>
>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>> following line seems to indicate a problem:
>>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>
>>>>>> Ah this is a known issue then. I traced back this issue some time ago,
>>>>>> and from what I understood on the rt-users mailing list it is fixed on
>>>>>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>>>>>> available here by git:
>>>>>
>>>>> Incidentally, I've been chasing a latency issue on x86 involving the
>>>>> i915 chipset recently on 3.10,
>>>>
>>>> was it 3.10 or 3.10.18 ?
>>>>
>>>
>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>
>>> which is currently 3.10.18.
>>>
>>>>> and it turned out that we were still
>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an IPI in
>>>>> the GEM control code, when the LLC cache is present.
>>>>>
>>>>> The jitter incurred by invalidating all internal caches exceeds 300 us
>>>>> in my test case, so it seems that we are not there yet.
>>>>
>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in the
>>>> I-pipe.
>>>>
>>>
>>> Looking at the comment in the GEM code, this invalidation is required to
>>> flush transactions before updating the fence register.
>>>
>>
>>   From what I understood, the preempt_rt patch asks users to pin the X
>> server on one cpu and disables the IPI, so the invalidation can be run
>> on only one cpu. That said, if that had solved the issue, Kurijn would
>> not have observed the latency spikes when running with only one cpu.
>>
>
> 	if (HAS_LLC(obj->base.dev))
> 		on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>
> So this will run on every CPU regardless of the number of CPUs, in sync
> mode. In addition, this section is interrupt-enabled. Some of my tests
> were conducted in UP mode to make sure we did not face a locking latency
> inherited from another core, like we had with the APIC madness in the
> early days, and the jitter was still right there. I don't see much hope.
>

I have not read the preempt_rt patch itself, only the release 
announcements. For instance, in the 3.8.13-rt12 announcement, I read:

- added an option to the i915 driver to disable the expensive wbinvd. A
   warning is printed once on RT if wbinvd is not disabled to let the
   user know about this problem. This problem was decoded by Carsten Emde.
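
Presumably it boils down to something along these lines. This is only a 
guessed sketch: the option name and the message are invented, not taken 
from the actual -rt code:

	static bool i915_disable_wbinvd;	/* hypothetical opt-out knob */
	module_param_named(disable_wbinvd, i915_disable_wbinvd, bool, 0600);

	/* ...then, in the fence update path: */
	if (HAS_LLC(obj->base.dev) && !i915_disable_wbinvd) {
		/* Warn once so RT users know where the spikes come from. */
		printk_once(KERN_WARNING
			    "i915: wbinvd IPI may cause large latencies\n");
		on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
	}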


-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04  9:51                         ` Gilles Chanteperdrix
@ 2013-12-04 10:29                           ` Philippe Gerum
  2013-12-04 10:33                             ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 10:29 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>>>> geschreven:
>>>>>>>>
>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het volgende
>>>>>>>>>> geschreven:
>>>>>>>>>>
>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only disabled
>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile the
>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>> first/at
>>>>>>>>>>>> the same time?
>>>>>>>>>>>
>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>
>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>> BIOS, but
>>>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>>>> kernel
>>>>>>>>>> config in stead).
>>>>>>>>>
>>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or you
>>>>>>>>> still
>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>> CONFIG_SMP, or
>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>> testing
>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>
>>>>>>>> the second: still latency...
>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>> -1, I
>>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us
>>>>>>>>>>>> under
>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>> occurrences
>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>> only 2
>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>
>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>
>>>>>>>>>>> Something else you should try is to disable root thread priority
>>>>>>>>>>> coupling.
>>>>>>>>>>>
>>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>>> before,
>>>>>>>>>> but then the system was even more vulnerable for such latency
>>>>>>>>>> peaks
>>>>>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>> there)
>>>>>>>>>
>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I posted,
>>>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>
>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>
>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>
>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>> following line seems to indicate a problem:
>>>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>
>>>>>>> Ah this is a known issue then. I traced back this issue some time
>>>>>>> ago,
>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>> fixed on
>>>>>>> more recent kernels. So, I would advise to update to 3.10.18 branch,
>>>>>>> available here by git:
>>>>>>
>>>>>> Incidentally, I've been chasing a latency issue on x86 involving the
>>>>>> i915 chipset recently on 3.10,
>>>>>
>>>>> was it 3.10 or 3.10.18 ?
>>>>>
>>>>
>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>
>>>> which is currently 3.10.18.
>>>>
>>>>>> and it turned out that we were still
>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>> IPI in
>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>
>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>> 300 us
>>>>>> in my test case, so it seems that we are not there yet.
>>>>>
>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in
>>>>> the
>>>>> I-pipe.
>>>>>
>>>>
>>>> Looking at the comment in the GEM code, this invalidation is
>>>> required to
>>>> flush transactions before updating the fence register.
>>>>
>>>
>>>   From what I understood, the preempt_rt patch asks users to pin the X
>>> server on one cpu and disables the IPI, so the invalidation can be run
>>> on only one cpu. That said, if that had solved the issue, Kurijn would
>>> not have observed the latency spikes when running with only one cpu.
>>>
>>
>>     if (HAS_LLC(obj->base.dev))
>>         on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>
>> So this will run on every CPU regardless of the number of CPUs, in sync
>> mode. In addition, this section is interrupt-enabled. Some of my tests
>> were conducted in UP mode to make sure we did not face a locking latency
>> inherited from another core, like we had with the APIC madness in the
>> early days, and the jitter was still right there. I don't see much hope.
>>
>
> I have not read the preempt_rt patch, only the announces. But for
> instance, in the 3.8.13-rt12 patch announce, I read:
>
> - added an option to the i915 driver to disable the expensive wbinvd. A
>    warning is printed once on RT if wbinvd is not disabled to let the
>    user know about this problem. This problem was decoded by Carsten Emde.
>
>

This is documented as a plain revert of the former change that aimed at 
fixing incoherence issues with fence updates:

From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed, 10 Jul 2013 13:36:24 +0100
Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence between
  fences and LLC across multiple CPUs"

This reverts commit 25ff119 and the follow on for Valleyview commit 2dc8aae.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 10:29                           ` Philippe Gerum
@ 2013-12-04 10:33                             ` Philippe Gerum
  2013-12-04 11:04                               ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 10:33 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 11:29 AM, Philippe Gerum wrote:
> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>>>>> geschreven:
>>>>>>>>>
>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het volgende
>>>>>>>>>>> geschreven:
>>>>>>>>>>>
>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>> disabled
>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile
>>>>>>>>>>>>> the
>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>> first/at
>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>
>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>
>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>> BIOS, but
>>>>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>>>>> kernel
>>>>>>>>>>> config in stead).
>>>>>>>>>>
>>>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or
>>>>>>>>>> you
>>>>>>>>>> still
>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>> testing
>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>
>>>>>>>>> the second: still latency...
>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us
>>>>>>>>>>>>> under
>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>> only 2
>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>
>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>
>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>> priority
>>>>>>>>>>>> coupling.
>>>>>>>>>>>>
>>>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>>>> before,
>>>>>>>>>>> but then the system was even more vulnerable for such latency
>>>>>>>>>>> peaks
>>>>>>>>>>> (however the mean latency was a little lower!) (I still have the
>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>> there)
>>>>>>>>>>
>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>> posted,
>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>
>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>
>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>
>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>
>>>>>>>> Ah this is a known issue then. I traced back this issue some time
>>>>>>>> ago,
>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>> fixed on
>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>> branch,
>>>>>>>> available here by git:
>>>>>>>
>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving the
>>>>>>> i915 chipset recently on 3.10,
>>>>>>
>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>
>>>>>
>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>
>>>>> which is currently 3.10.18.
>>>>>
>>>>>>> and it turned out that we were still
>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>> IPI in
>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>
>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>> 300 us
>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>
>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in
>>>>>> the
>>>>>> I-pipe.
>>>>>>
>>>>>
>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>> required to
>>>>> flush transactions before updating the fence register.
>>>>>
>>>>
>>>>   From what I understood, the preempt_rt patch asks users to pin the X
>>>> server on one cpu and disables the IPI, so the invalidation can be run
>>>> on only one cpu. That said, if that had solved the issue, Kurijn would
>>>> not have observed the latency spikes when running with only one cpu.
>>>>
>>>
>>>     if (HAS_LLC(obj->base.dev))
>>>         on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>
>>> So this will run on every CPU regardless of the number of CPUs, in sync
>>> mode. In addition, this section is interrupt-enabled. Some of my tests
>>> were conducted in UP mode to make sure we did not face a locking latency
>>> inherited from another core, like we had with the APIC madness in the
>>> early days, and the jitter was still right there. I don't see much hope.
>>>
>>
>> I have not read the preempt_rt patch, only the announces. But for
>> instance, in the 3.8.13-rt12 patch announce, I read:
>>
>> - added an option to the i915 driver to disable the expensive wbinvd. A
>>    warning is printed once on RT if wbinvd is not disabled to let the
>>    user know about this problem. This problem was decoded by Carsten
>> Emde.
>>
>>
>
> This is documented as a plain reversal of the former change aimed at
> fixing non-coherence issues with fence updates:
>
>  From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00 2001
> From: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed, 10 Jul 2013 13:36:24 +0100
> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence between
>   fences and LLC across multiple CPUs"
>
> This reverts commit 25ff119 and the follow on for Valleyview commit
> 2dc8aae.
>

That one seems to be suggested as a cheaper replacement for the ugly 
wbinvd; we should have a look at it:

drm/i915: Fix incoherence with fence updates on Sandybridge+

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 10:33                             ` Philippe Gerum
@ 2013-12-04 11:04                               ` Philippe Gerum
  2013-12-04 11:10                                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 11:04 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 11:33 AM, Philippe Gerum wrote:
> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>>>>>> geschreven:
>>>>>>>>>>
>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>> volgende
>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>
>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>
>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>>>>>> kernel
>>>>>>>>>>>> config in stead).
>>>>>>>>>>>
>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or
>>>>>>>>>>> you
>>>>>>>>>>> still
>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>> testing
>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>
>>>>>>>>>> the second: still latency...
>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us
>>>>>>>>>>>>>> under
>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>
>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>> priority
>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>
>>>>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>>>>> before,
>>>>>>>>>>>> but then the system was even more vulnerable for such latency
>>>>>>>>>>>> peaks
>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>> the
>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>> there)
>>>>>>>>>>>
>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>
>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>
>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>
>>>>>>>>> Ah this is a known issue then. I traced back this issue some time
>>>>>>>>> ago,
>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>> fixed on
>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>> branch,
>>>>>>>>> available here by git:
>>>>>>>>
>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>> the
>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>
>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>
>>>>>>
>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>
>>>>>> which is currently 3.10.18.
>>>>>>
>>>>>>>> and it turned out that we were still
>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>> IPI in
>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>
>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>> 300 us
>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>
>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in
>>>>>>> the
>>>>>>> I-pipe.
>>>>>>>
>>>>>>
>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>> required to
>>>>>> flush transactions before updating the fence register.
>>>>>>
>>>>>
>>>>>   From what I understood, the preempt_rt patch asks users to pin the X
>>>>> server on one cpu and disables the IPI, so the invalidation can be run
>>>>> on only one cpu. That said, if that had solved the issue, Kurijn would
>>>>> not have observed the latency spikes when running with only one cpu.
>>>>>
>>>>
>>>>     if (HAS_LLC(obj->base.dev))
>>>>         on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>
>>>> So this will run on every CPU regardless of the number of CPUs, in sync
>>>> mode. In addition, this section is interrupt-enabled. Some of my tests
>>>> were conducted in UP mode to make sure we did not face a locking
>>>> latency
>>>> inherited from another core, like we had with the APIC madness in the
>>>> early days, and the jitter was still right there. I don't see much
>>>> hope.
>>>>
>>>
>>> I have not read the preempt_rt patch, only the announces. But for
>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>
>>> - added an option to the i915 driver to disable the expensive wbinvd. A
>>>    warning is printed once on RT if wbinvd is not disabled to let the
>>>    user know about this problem. This problem was decoded by Carsten
>>> Emde.
>>>
>>>
>>
>> This is documented as a plain reversal of the former change aimed at
>> fixing non-coherence issues with fence updates:
>>
>>  From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00 2001
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence between
>>   fences and LLC across multiple CPUs"
>>
>> This reverts commit 25ff119 and the follow on for Valleyview commit
>> 2dc8aae.
>>
>
> That one seems to be suggested as a cheaper replacement for the ugly
> wbinvd, we should have a look at it:
>
> drm/i915: Fix incoherence with fence updates on Sandybridge+
>

We do have this one in 3.10.18, but not the reversal of the former 
workaround, which is what produces the jitter.

http://www.spinics.net/lists/stable-commits/msg27025.html

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 11:04                               ` Philippe Gerum
@ 2013-12-04 11:10                                 ` Gilles Chanteperdrix
  2013-12-04 11:36                                   ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04 11:10 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 12:04 PM, Philippe Gerum wrote:
> On 12/04/2013 11:33 AM, Philippe Gerum wrote:
>> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het volgende
>>>>>>>>>>> geschreven:
>>>>>>>>>>>
>>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>> volgende
>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at 1... I
>>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT) in the
>>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in the
>>>>>>>>>>>>> kernel
>>>>>>>>>>>>> config in stead).
>>>>>>>>>>>>
>>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or
>>>>>>>>>>>> you
>>>>>>>>>>>> still
>>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>>> testing
>>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>>
>>>>>>>>>>> the second: still latency...
>>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us
>>>>>>>>>>>>>>> under
>>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>>>>>> before,
>>>>>>>>>>>>> but then the system was even more vulnerable for such latency
>>>>>>>>>>>>> peaks
>>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>>> the
>>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>>> there)
>>>>>>>>>>>>
>>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|   #end
>>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>>
>>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>>> :    +func                -141! 117.825  i915_gem_flush_ring+0x9
>>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>>
>>>>>>>>>> Ah this is a known issue then. I traced back this issue some time
>>>>>>>>>> ago,
>>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>>> fixed on
>>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>>> branch,
>>>>>>>>>> available here by git:
>>>>>>>>>
>>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>>> the
>>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>>
>>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>>
>>>>>>>
>>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>>
>>>>>>> which is currently 3.10.18.
>>>>>>>
>>>>>>>>> and it turned out that we were still
>>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>>> IPI in
>>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>>
>>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>>> 300 us
>>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>>
>>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the patch in
>>>>>>>> the
>>>>>>>> I-pipe.
>>>>>>>>
>>>>>>>
>>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>>> required to
>>>>>>> flush transactions before updating the fence register.
>>>>>>>
>>>>>>
>>>>>>    From what I understood, the preempt_rt patch asks users to pin the X
>>>>>> server on one cpu and disables the IPI, so the invalidation can be run
>>>>>> on only one cpu. That said, if that had solved the issue, Kurijn would
>>>>>> not have observed the latency spikes when running with only one cpu.
>>>>>>
>>>>>
>>>>>      if (HAS_LLC(obj->base.dev))
>>>>>          on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>>
>>>>> So this will run on every CPU regardless of the number of CPUs, in sync
>>>>> mode. In addition, this section is interrupt-enabled. Some of my tests
>>>>> were conducted in UP mode to make sure we did not face a locking
>>>>> latency
>>>>> inherited from another core, like we had with the APIC madness in the
>>>>> early days, and the jitter was still right there. I don't see much
>>>>> hope.
>>>>>
>>>>
>>>> I have not read the preempt_rt patch, only the announces. But for
>>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>>
>>>> - added an option to the i915 driver to disable the expensive wbinvd. A
>>>>     warning is printed once on RT if wbinvd is not disabled to let the
>>>>     user know about this problem. This problem was decoded by Carsten
>>>> Emde.
>>>>
>>>>
>>>
>>> This is documented as a plain reversal of the former change aimed at
>>> fixing non-coherence issues with fence updates:
>>>
>>>   From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00 2001
>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence between
>>>    fences and LLC across multiple CPUs"
>>>
>>> This reverts commit 25ff119 and the follow on for Valleyview commit
>>> 2dc8aae.
>>>
>>
>> That one seems to be suggested as a cheaper replacement for the ugly
>> wbinvd, we should have a look at it:
>>
>> drm/i915: Fix incoherence with fence updates on Sandybridge+
>>
>
> We do have this one in 3.10.18, but not the reversal of the former
> workaround which produces jitter.
>
> http://www.spinics.net/lists/stable-commits/msg27025.html
>
From here:
http://www.osadl.org/Examples-of-latency-regressions.latest-stable-test-latency.0.html

It seems this patch even introduced a latency regression.

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 11:10                                 ` Gilles Chanteperdrix
@ 2013-12-04 11:36                                   ` Philippe Gerum
  2013-12-04 11:59                                     ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 11:36 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 12:10 PM, Gilles Chanteperdrix wrote:
> On 12/04/2013 12:04 PM, Philippe Gerum wrote:
>> On 12/04/2013 11:33 AM, Philippe Gerum wrote:
>>> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>>>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het
>>>>>>>>>>>> volgende
>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at
>>>>>>>>>>>>>>>> 1... I
>>>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I recompile
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT)
>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> kernel
>>>>>>>>>>>>>> config in stead).
>>>>>>>>>>>>>
>>>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus ? Or
>>>>>>>>>>>>> you
>>>>>>>>>>>>> still
>>>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>>>> testing
>>>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>>>
>>>>>>>>>>>> the second: still latency...
>>>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to 87us
>>>>>>>>>>>>>>>> under
>>>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tried a config with priority coupling support disabled
>>>>>>>>>>>>>> before,
>>>>>>>>>>>>>> but then the system was even more vulnerable for such latency
>>>>>>>>>>>>>> peaks
>>>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>>>> there)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|
>>>>>>>>>>>>>>>> #end
>>>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>>>
>>>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>>>> :    +func                -141! 117.825
>>>>>>>>>>>> i915_gem_flush_ring+0x9
>>>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>>>
>>>>>>>>>>> Ah this is a known issue then. I traced back this issue some
>>>>>>>>>>> time
>>>>>>>>>>> ago,
>>>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>>>> fixed on
>>>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>>>> branch,
>>>>>>>>>>> available here by git:
>>>>>>>>>>
>>>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>>>> the
>>>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>>>
>>>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>>>
>>>>>>>>
>>>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>>>
>>>>>>>> which is currently 3.10.18.
>>>>>>>>
>>>>>>>>>> and it turned out that we were still
>>>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>>>> IPI in
>>>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>>>
>>>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>>>> 300 us
>>>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>>>
>>>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the
>>>>>>>>> patch in
>>>>>>>>> the
>>>>>>>>> I-pipe.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>>>> required to
>>>>>>>> flush transactions before updating the fence register.
>>>>>>>>
>>>>>>>
>>>>>>>    From what I understood, the preempt_rt patch asks users to pin
>>>>>>> the X
>>>>>>> server on one cpu and disables the IPI, so the invalidation can
>>>>>>> be run
>>>>>>> on only one cpu. That said, if that had solved the issue, Kurijn
>>>>>>> would
>>>>>>> not have observed the latency spikes when running with only one cpu.
>>>>>>>
>>>>>>
>>>>>>      if (HAS_LLC(obj->base.dev))
>>>>>>          on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>>>
>>>>>> So this will run on every CPU regardless of the number of CPUs, in
>>>>>> sync
>>>>>> mode. In addition, this section is interrupt-enabled. Some of my
>>>>>> tests
>>>>>> were conducted in UP mode to make sure we did not face a locking
>>>>>> latency
>>>>>> inherited from another core, like we had with the APIC madness in the
>>>>>> early days, and the jitter was still right there. I don't see much
>>>>>> hope.
>>>>>>
>>>>>
>>>>> I have not read the preempt_rt patch, only the announces. But for
>>>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>>>
>>>>> - added an option to the i915 driver to disable the expensive
>>>>> wbinvd. A
>>>>>     warning is printed once on RT if wbinvd is not disabled to let the
>>>>>     user know about this problem. This problem was decoded by Carsten
>>>>> Emde.
>>>>>
>>>>>
>>>>
>>>> This is documented as a plain reversal of the former change aimed at
>>>> fixing non-coherence issues with fence updates:
>>>>
>>>>   From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00
>>>> 2001
>>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>>>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence
>>>> between
>>>>    fences and LLC across multiple CPUs"
>>>>
>>>> This reverts commit 25ff119 and the follow on for Valleyview commit
>>>> 2dc8aae.
>>>>
>>>
>>> That one seems to be suggested as a cheaper replacement for the ugly
>>> wbinvd, we should have a look at it:
>>>
>>> drm/i915: Fix incoherence with fence updates on Sandybridge+
>>>
>>
>> We do have this one in 3.10.18, but not the reversal of the former
>> workaround which produces jitter.
>>
>> http://www.spinics.net/lists/stable-commits/msg27025.html
>>
>  From here:
> http://www.osadl.org/Examples-of-latency-regressions.latest-stable-test-latency.0.html
>
>
> It seems this patch is even creating a regression.
>

Yes; in addition, according to Chris Wilson, it did not actually fix the 
root issue, but only papered over it, making the bug less likely to 
happen when serializing the fence register updates among CPUs. It looks 
like we really want to drop it in ipipe-3.8, unless it is queued in 
-stable there. I did not check.
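
(For whoever wants to check: assuming the usual branch naming in the stable 
tree, something like

   $ git log --oneline linux-3.8.y --grep='Fix incoherence with fence updates on Sandybridge'

run against a clone of linux-stable should tell whether that fix is already 
queued for 3.8 -- branch name to be double-checked.)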

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 11:36                                   ` Philippe Gerum
@ 2013-12-04 11:59                                     ` Philippe Gerum
  2013-12-04 12:00                                       ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 11:59 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 12:36 PM, Philippe Gerum wrote:
> On 12/04/2013 12:10 PM, Gilles Chanteperdrix wrote:
>> On 12/04/2013 12:04 PM, Philippe Gerum wrote:
>>> On 12/04/2013 11:33 AM, Philippe Gerum wrote:
>>>> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>>>>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>>>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>> volgende
>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at
>>>>>>>>>>>>>>>>> 1... I
>>>>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I
>>>>>>>>>>>>>>>>> recompile
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT)
>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> kernel
>>>>>>>>>>>>>>> config in stead).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus
>>>>>>>>>>>>>> ? Or
>>>>>>>>>>>>>> you
>>>>>>>>>>>>>> still
>>>>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>>>>> testing
>>>>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>>>>
>>>>>>>>>>>>> the second: still latency...
>>>>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to
>>>>>>>>>>>>>>>>> 87us
>>>>>>>>>>>>>>>>> under
>>>>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have tried a config with priority coupling support
>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>> before,
>>>>>>>>>>>>>>> but then the system was even more vulnerable for such
>>>>>>>>>>>>>>> latency
>>>>>>>>>>>>>>> peaks
>>>>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>>>>> there)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|
>>>>>>>>>>>>>>>>> #end
>>>>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>>>>> :    +func                -141! 117.825
>>>>>>>>>>>>> i915_gem_flush_ring+0x9
>>>>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>>>>> -- The Open University is incorporated by Royal Charter (RC
>>>>>>>>>>>>> 000391),
>>>>>>>>>>>>> an exempt charity in England & Wales and a charity
>>>>>>>>>>>>> registered in
>>>>>>>>>>>>> Scotland (SC 038302).
>>>>>>>>>>>>
>>>>>>>>>>>> Ah this is a known issue then. I traced back this issue some
>>>>>>>>>>>> time
>>>>>>>>>>>> ago,
>>>>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>>>>> fixed on
>>>>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>>>>> branch,
>>>>>>>>>>>> available here by git:
>>>>>>>>>>>
>>>>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>>>>> the
>>>>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>>>>
>>>>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>>>>
>>>>>>>>> which is currently 3.10.18.
>>>>>>>>>
>>>>>>>>>>> and it turned out that we were still
>>>>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>>>>> IPI in
>>>>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>>>>
>>>>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>>>>> 300 us
>>>>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>>>>
>>>>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the
>>>>>>>>>> patch in
>>>>>>>>>> the
>>>>>>>>>> I-pipe.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>>>>> required to
>>>>>>>>> flush transactions before updating the fence register.
>>>>>>>>>
>>>>>>>>
>>>>>>>>    From what I understood, the preempt_rt patch asks users to pin
>>>>>>>> the X
>>>>>>>> server on one cpu and disables the IPI, so the invalidation can
>>>>>>>> be run
>>>>>>>> on only one cpu. That said, if that had solved the issue, Kurijn
>>>>>>>> would
>>>>>>>> not have observed the latency spikes when running with only one
>>>>>>>> cpu.
>>>>>>>>
>>>>>>>
>>>>>>>      if (HAS_LLC(obj->base.dev))
>>>>>>>          on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>>>>
>>>>>>> So this will run on every CPU regardless of the number of CPUs, in
>>>>>>> sync
>>>>>>> mode. In addition, this section is interrupt-enabled. Some of my
>>>>>>> tests
>>>>>>> were conducted in UP mode to make sure we did not face a locking
>>>>>>> latency
>>>>>>> inherited from another core, like we had with the APIC madness in
>>>>>>> the
>>>>>>> early days, and the jitter was still right there. I don't see much
>>>>>>> hope.
>>>>>>>
>>>>>>
>>>>>> I have not read the preempt_rt patch, only the announces. But for
>>>>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>>>>
>>>>>> - added an option to the i915 driver to disable the expensive
>>>>>> wbinvd. A
>>>>>>     warning is printed once on RT if wbinvd is not disabled to let
>>>>>> the
>>>>>>     user know about this problem. This problem was decoded by Carsten
>>>>>> Emde.
>>>>>>
>>>>>>
>>>>>
>>>>> This is documented as a plain reversal of the former change aimed at
>>>>> fixing non-coherence issues with fence updates:
>>>>>
>>>>>   From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00
>>>>> 2001
>>>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>>>>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence
>>>>> between
>>>>>    fences and LLC across multiple CPUs"
>>>>>
>>>>> This reverts commit 25ff119 and the follow on for Valleyview commit
>>>>> 2dc8aae.
>>>>>
>>>>
>>>> That one seems to be suggested as a cheaper replacement for the ugly
>>>> wbinvd, we should have a look at it:
>>>>
>>>> drm/i915: Fix incoherence with fence updates on Sandybridge+
>>>>
>>>
>>> We do have this one in 3.10.18, but not the reversal of the former
>>> workaround which produces jitter.
>>>
>>> http://www.spinics.net/lists/stable-commits/msg27025.html
>>>
>>  From here:
>> http://www.osadl.org/Examples-of-latency-regressions.latest-stable-test-latency.0.html
>>
>>
>>
>> It seems this patch is even creating a regression.
>>
>
> Yes, in addition according to Chris Wilson, it did not actually fix the
> root issue, but only papered over it, making the bug less likely to
> happen when serializing the fence register updates among CPUs. It looks
> like we really want to drop it in ipipe-3.8, unless it is queued in
> -stable there. Did not check.
>

I have a smoke test running over a patched kernel implementing the right 
fixup instead of the former workaround. Latency is ok so far. I'll leave 
this running a few hours more and see what happens.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 11:59                                     ` Philippe Gerum
@ 2013-12-04 12:00                                       ` Gilles Chanteperdrix
  2013-12-04 13:19                                         ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04 12:00 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 12:59 PM, Philippe Gerum wrote:
> On 12/04/2013 12:36 PM, Philippe Gerum wrote:
>> On 12/04/2013 12:10 PM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 12:04 PM, Philippe Gerum wrote:
>>>> On 12/04/2013 11:33 AM, Philippe Gerum wrote:
>>>>> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>>>>>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>>>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at
>>>>>>>>>>>>>>>>>> 1... I
>>>>>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I
>>>>>>>>>>>>>>>>>> recompile
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT)
>>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> kernel
>>>>>>>>>>>>>>>> config in stead).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus
>>>>>>>>>>>>>>> ? Or
>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>>>>>> testing
>>>>>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the second: still latency...
>>>>>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to
>>>>>>>>>>>>>>>>>> 87us
>>>>>>>>>>>>>>>>>> under
>>>>>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have tried a config with priority coupling support
>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>> before,
>>>>>>>>>>>>>>>> but then the system was even more vulnerable for such
>>>>>>>>>>>>>>>> latency
>>>>>>>>>>>>>>>> peaks
>>>>>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>>>>>> there)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|
>>>>>>>>>>>>>>>>>> #end
>>>>>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>>>>>> :    +func                -141! 117.825
>>>>>>>>>>>>>> i915_gem_flush_ring+0x9
>>>>>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>>>>>> -- The Open University is incorporated by Royal Charter (RC
>>>>>>>>>>>>>> 000391),
>>>>>>>>>>>>>> an exempt charity in England & Wales and a charity
>>>>>>>>>>>>>> registered in
>>>>>>>>>>>>>> Scotland (SC 038302).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ah this is a known issue then. I traced back this issue some
>>>>>>>>>>>>> time
>>>>>>>>>>>>> ago,
>>>>>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>>>>>> fixed on
>>>>>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>>>>>> branch,
>>>>>>>>>>>>> available here by git:
>>>>>>>>>>>>
>>>>>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>>>>>> the
>>>>>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>>>>>
>>>>>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>>>>>
>>>>>>>>>> which is currently 3.10.18.
>>>>>>>>>>
>>>>>>>>>>>> and it turned out that we were still
>>>>>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>>>>>> IPI in
>>>>>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>>>>>
>>>>>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>>>>>> 300 us
>>>>>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>>>>>
>>>>>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the
>>>>>>>>>>> patch in
>>>>>>>>>>> the
>>>>>>>>>>> I-pipe.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>>>>>> required to
>>>>>>>>>> flush transactions before updating the fence register.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    From what I understood, the preempt_rt patch asks users to pin
>>>>>>>>> the X
>>>>>>>>> server on one cpu and disables the IPI, so the invalidation can
>>>>>>>>> be run
>>>>>>>>> on only one cpu. That said, if that had solved the issue, Kurijn
>>>>>>>>> would
>>>>>>>>> not have observed the latency spikes when running with only one
>>>>>>>>> cpu.
>>>>>>>>>
>>>>>>>>
>>>>>>>>      if (HAS_LLC(obj->base.dev))
>>>>>>>>          on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>>>>>
>>>>>>>> So this will run on every CPU regardless of the number of CPUs, in
>>>>>>>> sync
>>>>>>>> mode. In addition, this section is interrupt-enabled. Some of my
>>>>>>>> tests
>>>>>>>> were conducted in UP mode to make sure we did not face a locking
>>>>>>>> latency
>>>>>>>> inherited from another core, like we had with the APIC madness in
>>>>>>>> the
>>>>>>>> early days, and the jitter was still right there. I don't see much
>>>>>>>> hope.
>>>>>>>>
>>>>>>>
>>>>>>> I have not read the preempt_rt patch, only the announces. But for
>>>>>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>>>>>
>>>>>>> - added an option to the i915 driver to disable the expensive
>>>>>>> wbinvd. A
>>>>>>>     warning is printed once on RT if wbinvd is not disabled to let
>>>>>>> the
>>>>>>>     user know about this problem. This problem was decoded by Carsten
>>>>>>> Emde.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> This is documented as a plain reversal of the former change aimed at
>>>>>> fixing non-coherence issues with fence updates:
>>>>>>
>>>>>>   From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00
>>>>>> 2001
>>>>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>>>>>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence
>>>>>> between
>>>>>>    fences and LLC across multiple CPUs"
>>>>>>
>>>>>> This reverts commit 25ff119 and the follow on for Valleyview commit
>>>>>> 2dc8aae.
>>>>>>
>>>>>
>>>>> That one seems to be suggested as a cheaper replacement for the ugly
>>>>> wbinvd, we should have a look at it:
>>>>>
>>>>> drm/i915: Fix incoherence with fence updates on Sandybridge+
>>>>>
>>>>
>>>> We do have this one in 3.10.18, but not the reversal of the former
>>>> workaround which produces jitter.
>>>>
>>>> http://www.spinics.net/lists/stable-commits/msg27025.html
>>>>
>>>  From here:
>>> http://www.osadl.org/Examples-of-latency-regressions.latest-stable-test-latency.0.html
>>>
>>>
>>>
>>> It seems this patch is even creating a regression.
>>>
>>
>> Yes, in addition according to Chris Wilson, it did not actually fix the
>> root issue, but only papered over it, making the bug less likely to
>> happen when serializing the fence register updates among CPUs. It looks
>> like we really want to drop it in ipipe-3.8, unless it is queued in
>> -stable there. Did not check.
>>
> 
> I have a smoke test running over a patched kernel implementing the right 
> fixup instead of the former workaround. Latency is ok so far. I'll leave 
> this running a few hours more and see what happens.
> 
Ok, could you push the branch somewhere so that I can try it?

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 12:00                                       ` Gilles Chanteperdrix
@ 2013-12-04 13:19                                         ` Philippe Gerum
  2013-12-04 16:03                                           ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 13:19 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
> On 12/04/2013 12:59 PM, Philippe Gerum wrote:
>> On 12/04/2013 12:36 PM, Philippe Gerum wrote:
>>> On 12/04/2013 12:10 PM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 12:04 PM, Philippe Gerum wrote:
>>>>> On 12/04/2013 11:33 AM, Philippe Gerum wrote:
>>>>>> On 12/04/2013 11:29 AM, Philippe Gerum wrote:
>>>>>>> On 12/04/2013 10:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 12/04/2013 10:40 AM, Philippe Gerum wrote:
>>>>>>>>> On 12/04/2013 10:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 12/04/2013 10:27 AM, Philippe Gerum wrote:
>>>>>>>>>>> On 12/04/2013 09:51 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>>>> On 12/04/2013 09:44 AM, Philippe Gerum wrote:
>>>>>>>>>>>>> On 12/03/2013 07:50 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>> On 12/03/2013 05:49 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>> Op 3-dec.-2013, om 15:54 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/03/2013 04:31 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>>> Op 3-dec.-2013, om 13:23 heeft Gilles Chanteperdrix het
>>>>>>>>>>>>>>>>> volgende
>>>>>>>>>>>>>>>>> geschreven:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 12/03/2013 02:07 PM, Kurijn Buys wrote:
>>>>>>>>>>>>>>>>>>> Thanks for the quick response, ACPI is enabled, I only
>>>>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>>>>> "Processor" in there... -1 was a typo indeed, it is at
>>>>>>>>>>>>>>>>>>> 1... I
>>>>>>>>>>>>>>>>>>> see SCHED_SMT [=y] in my kernel config... shall I
>>>>>>>>>>>>>>>>>>> recompile
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> kernel with this disabled then... no other things to try
>>>>>>>>>>>>>>>>>>> first/at
>>>>>>>>>>>>>>>>>>> the same time?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> To remove hyperthreading, either: - disable it in the BIOS
>>>>>>>>>>>>>>>>>> configuration; - or disable CONFIG_SMP (not SCHED_SMPT)
>>>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>>>> kernel configuration.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ah I see, CONFIG_SMP is also enabled... I've disabled it in
>>>>>>>>>>>>>>>>> BIOS, but
>>>>>>>>>>>>>>>>> no success (tell me if it is worth trying to disable it in
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> kernel
>>>>>>>>>>>>>>>>> config in stead).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When you say "no success", you mean you still have 2 cpus
>>>>>>>>>>>>>>>> ? Or
>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>> have latency pikes? If the former, then yes, try without
>>>>>>>>>>>>>>>> CONFIG_SMP, or
>>>>>>>>>>>>>>>> pass nr_cpus=1 on the command line. If the latter, then no,
>>>>>>>>>>>>>>>> testing
>>>>>>>>>>>>>>>> without CONFIG_SMP is useless.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the second: still latency...
>>>>>>>>>>>>>>> (lscpu says there is only 1 cpu now)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I realized that the test with sched_rt_runtime_us on -1 I
>>>>>>>>>>>>>>>>>>> performed was with an earlier set-up. When I set it now to
>>>>>>>>>>>>>>>>>>> -1, I
>>>>>>>>>>>>>>>>>>> have better performance, but: 1) still spikes of up to
>>>>>>>>>>>>>>>>>>> 87us
>>>>>>>>>>>>>>>>>>> under
>>>>>>>>>>>>>>>>>>> load with ./latency 2) still some completely shifted
>>>>>>>>>>>>>>>>>>> occurrences
>>>>>>>>>>>>>>>>>>> with the other latency test, with a 1000µs period (but now
>>>>>>>>>>>>>>>>>>> only 2
>>>>>>>>>>>>>>>>>>> out of 890814), and the rest of the distribution lies in
>>>>>>>>>>>>>>>>>>> [861-1139]µs, which is also rather large I suppose.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> sched_rt_runtime_us should not make any difference.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Something else you should try is to disable root thread
>>>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>> coupling.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have tried a config with priority coupling support
>>>>>>>>>>>>>>>>> disabled
>>>>>>>>>>>>>>>>> before,
>>>>>>>>>>>>>>>>> but then the system was even more vulnerable for such
>>>>>>>>>>>>>>>>> latency
>>>>>>>>>>>>>>>>> peaks
>>>>>>>>>>>>>>>>> (however the mean latency was a little lower!) (I still have
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> kernel, but unfortunately the I-pipe tracer isn't installed
>>>>>>>>>>>>>>>>> there)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please keep priority coupling disabled in further tests.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The ipipe trace after test (1) was similar to the one I
>>>>>>>>>>>>>>>>>>> posted,
>>>>>>>>>>>>>>>>>>> where this line seems to be the problem I suppose: :|
>>>>>>>>>>>>>>>>>>> #end
>>>>>>>>>>>>>>>>>>> 0x80000001 -179! 149.235  ipipe_check_context+0x87
>>>>>>>>>>>>>>>>>>> (add_preempt_count+0x15)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ...I hoped the I-pipe trace would help..?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Unfortunately the trace is not helping much.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If it would help, I've another trace (joint as txt) where the
>>>>>>>>>>>>>>> following line seems to indicate a problem:
>>>>>>>>>>>>>>> :    +func                -141! 117.825
>>>>>>>>>>>>>>> i915_gem_flush_ring+0x9
>>>>>>>>>>>>>>> [i915] (i915_gem_do_execbuffer+0xb46 [i915])
>>>>>>>>>>>>>>> -- The Open University is incorporated by Royal Charter (RC
>>>>>>>>>>>>>>> 000391),
>>>>>>>>>>>>>>> an exempt charity in England & Wales and a charity
>>>>>>>>>>>>>>> registered in
>>>>>>>>>>>>>>> Scotland (SC 038302).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah this is a known issue then. I traced back this issue some
>>>>>>>>>>>>>> time
>>>>>>>>>>>>>> ago,
>>>>>>>>>>>>>> and from what I understood on the rt-users mailing list it is
>>>>>>>>>>>>>> fixed on
>>>>>>>>>>>>>> more recent kernels. So, I would advise to update to 3.10.18
>>>>>>>>>>>>>> branch,
>>>>>>>>>>>>>> available here by git:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Incidentally, I've been chasing a latency issue on x86 involving
>>>>>>>>>>>>> the
>>>>>>>>>>>>> i915 chipset recently on 3.10,
>>>>>>>>>>>>
>>>>>>>>>>>> was it 3.10 or 3.10.18 ?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> http://git.xenomai.org/ipipe.git/log/?h=ipipe-3.10
>>>>>>>>>>>
>>>>>>>>>>> which is currently 3.10.18.
>>>>>>>>>>>
>>>>>>>>>>>>> and it turned out that we were still
>>>>>>>>>>>>> badly hit by wbinvd instructions, emitted on _all_ cores via an
>>>>>>>>>>>>> IPI in
>>>>>>>>>>>>> the GEM control code, when the LLC cache is present.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The jitter incurred by invalidating all internal caches exceeds
>>>>>>>>>>>>> 300 us
>>>>>>>>>>>>> in my test case, so it seems that we are not there yet.
>>>>>>>>>>>>
>>>>>>>>>>>> Ok, maybe the preempt_rt workaround is only enabled for
>>>>>>>>>>>> CONFIG_PREEMPT_RT? In which case we can try and import the
>>>>>>>>>>>> patch in
>>>>>>>>>>>> the
>>>>>>>>>>>> I-pipe.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Looking at the comment in the GEM code, this invalidation is
>>>>>>>>>>> required to
>>>>>>>>>>> flush transactions before updating the fence register.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     From what I understood, the preempt_rt patch asks users to pin
>>>>>>>>>> the X
>>>>>>>>>> server on one cpu and disables the IPI, so the invalidation can
>>>>>>>>>> be run
>>>>>>>>>> on only one cpu. That said, if that had solved the issue, Kurijn
>>>>>>>>>> would
>>>>>>>>>> not have observed the latency spikes when running with only one
>>>>>>>>>> cpu.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>       if (HAS_LLC(obj->base.dev))
>>>>>>>>>           on_each_cpu(i915_gem_write_fence__ipi, NULL, 1);
>>>>>>>>>
>>>>>>>>> So this will run on every CPU regardless of the number of CPUs, in
>>>>>>>>> sync
>>>>>>>>> mode. In addition, this section is interrupt-enabled. Some of my
>>>>>>>>> tests
>>>>>>>>> were conducted in UP mode to make sure we did not face a locking
>>>>>>>>> latency
>>>>>>>>> inherited from another core, like we had with the APIC madness in
>>>>>>>>> the
>>>>>>>>> early days, and the jitter was still right there. I don't see much
>>>>>>>>> hope.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I have not read the preempt_rt patch, only the announces. But for
>>>>>>>> instance, in the 3.8.13-rt12 patch announce, I read:
>>>>>>>>
>>>>>>>> - added an option to the i915 driver to disable the expensive
>>>>>>>> wbinvd. A
>>>>>>>>      warning is printed once on RT if wbinvd is not disabled to let
>>>>>>>> the
>>>>>>>>      user know about this problem. This problem was decoded by Carsten
>>>>>>>> Emde.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> This is documented as a plain reversal of the former change aimed at
>>>>>>> fixing non-coherence issues with fence updates:
>>>>>>>
>>>>>>>    From 22d61b535bbb5f2b65bfe564d16b0d2b4413535a Mon Sep 17 00:00:00
>>>>>>> 2001
>>>>>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>>> Date: Wed, 10 Jul 2013 13:36:24 +0100
>>>>>>> Subject: [PATCH 003/293] Revert "drm/i915: Workaround incoherence
>>>>>>> between
>>>>>>>     fences and LLC across multiple CPUs"
>>>>>>>
>>>>>>> This reverts commit 25ff119 and the follow on for Valleyview commit
>>>>>>> 2dc8aae.
>>>>>>>
>>>>>>
>>>>>> That one seems to be suggested as a cheaper replacement for the ugly
>>>>>> wbinvd, we should have a look at it:
>>>>>>
>>>>>> drm/i915: Fix incoherence with fence updates on Sandybridge+
>>>>>>
>>>>>
>>>>> We do have this one in 3.10.18, but not the reversal of the former
>>>>> workaround which produces jitter.
>>>>>
>>>>> http://www.spinics.net/lists/stable-commits/msg27025.html
>>>>>
>>>>   From here:
>>>> http://www.osadl.org/Examples-of-latency-regressions.latest-stable-test-latency.0.html
>>>>
>>>>
>>>>
>>>> It seems this patch is even creating a regression.
>>>>
>>>
>>> Yes, in addition according to Chris Wilson, it did not actually fix the
>>> root issue, but only papered over it, making the bug less likely to
>>> happen when serializing the fence register updates among CPUs. It looks
>>> like we really want to drop it in ipipe-3.8, unless it is queued in
>>> -stable there. Did not check.
>>>
>>
>> I have a smoke test running over a patched kernel implementing the right
>> fixup instead of the former workaround. Latency is ok so far. I'll leave
>> this running a few hours more and see what happens.
>>
> Ok, could you push the branch somewhere so that I can try it?
>

testing/ipipe-3.8-i915-fix
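
(Assuming the branch sits in the same ipipe.git tree linked earlier in this 
thread, pulling it would look roughly like this -- URL and branch path to be 
double-checked:

   $ git clone git://git.xenomai.org/ipipe.git
   $ cd ipipe
   $ git checkout -b i915-fix origin/testing/ipipe-3.8-i915-fix
)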

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 13:19                                         ` Philippe Gerum
@ 2013-12-04 16:03                                           ` Gilles Chanteperdrix
  2013-12-04 17:43                                             ` Philippe Gerum
  0 siblings, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-04 16:03 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 02:19 PM, Philippe Gerum wrote:
> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>> Ok, could you push the branch somewhere so that I can try it?
>>
>
> testing/ipipe-3.8-i915-fix

I could test it, I no longer get high latencies while moving a large 
opengl window. So, it looks good.


-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 16:03                                           ` Gilles Chanteperdrix
@ 2013-12-04 17:43                                             ` Philippe Gerum
  2013-12-05  0:44                                               ` Kurijn Buys
  0 siblings, 1 reply; 38+ messages in thread
From: Philippe Gerum @ 2013-12-04 17:43 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Kurijn Buys, Xenomai

On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>> Ok, could you push the branch somewhere so that I can try it?
>>>
>>
>> testing/ipipe-3.8-i915-fix
>
> I could test it, I no longer get high latencies while moving a large
> opengl window. So, it looks good.
>
>

Same here.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-04 17:43                                             ` Philippe Gerum
@ 2013-12-05  0:44                                               ` Kurijn Buys
  2013-12-05 10:28                                                 ` Kurijn Buys
  2013-12-05 11:09                                                 ` Gilles Chanteperdrix
  0 siblings, 2 replies; 38+ messages in thread
From: Kurijn Buys @ 2013-12-05  0:44 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai


Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:

> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>
>>>
>>> testing/ipipe-3.8-i915-fix
>>
>> I could test it, I no longer get high latencies while moving a large
>> opengl window. So, it looks good.

I have it running as well (with the priority coupling option disabled this time). The latency test results in a worst value of 52us (no idea if this is normal; the lat max globally stays between 10 and 30us), but in any case I have no higher spikes under load...
The other test I mentioned before still has a few measurements that occur at 10us instead of 1000us... maybe it's an issue with that test...

I have the same oddity as the other time I installed a 3.8 kernel on Ubuntu 10.04:
-the /proc/ipipe files don't look the way they should, I guess. The Linux file is full of lines with "__ipipe_do_IRQ" and, near the end, a line with "__ipipe_do_critical_sync"...
-the version file only states "1"
-the xenomai file mainly has lines with "..." and also this critical sync line near the end.

Also, I tried to enable the ipipe tracer but without the "tracing on boot" option, and I can't get it to work now. When I do "echo < 1 /proc/ipipe/trace/enable" (as root), the flag stays at 0 and the frozen file remains empty.

>>
>>
>
> Same here.
>
> --
> Philippe.



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-05  0:44                                               ` Kurijn Buys
@ 2013-12-05 10:28                                                 ` Kurijn Buys
  2013-12-05 11:05                                                   ` Philippe Gerum
  2013-12-05 11:09                                                 ` Gilles Chanteperdrix
  1 sibling, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-05 10:28 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai


Op 5-dec.-2013, om 00:44 heeft Kurijn Buys het volgende geschreven:

>
> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>
>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>
>>>>
>>>> testing/ipipe-3.8-i915-fix
>>>
>>> I could test it, I no longer get high latencies while moving a large
>>> opengl window. So, it looks good.
>
> I've it running as well (with the priority coupling option disabled this time). The latency test results in a worst value of 52us (no idea if this is normal, the lat max globally stays between 10 and 30us), but I have no higher pikes under load in each case...
> The other test I mentioned before still has a few measurements that occur at 10us in stead of 1000us... maybe it's an issue with this test...
>
> I have the same oddity as the other time I installed a 3.8 kernel on Ubuntu 10.04:
> -the /proc/ipipe files don't look how they should I guess. The Linux file is full of lines with "__ipipe_do_IRQ" and near the end a line with "__ipipe_do_critical_sync"...
> -the version file only states "1"
> -the xenomai file mainly has lines with "..." and also this critical sync line near the end.
>
> Also, I tried to enable the ipipe tracer but without the "tracing on boot" option, but I can't get it to work now. When I do "echo < 1 /proc/ipipe/trace/enable" (as root), the flag stays on 0. and the frozen file remains empty.

For some unknown reason the tracer works now (I didn't even reboot in the meantime), and now I have difficulties setting trace/enable back to 0.

Another latency test had a spike of 62us (but spikes are very rare now, and no longer dependent on load), which seems related to this line in the ipipe frozen trace:
-56!    44.220  __ipipe_spin_lock_irqsave+0x5 (mask_and_ack_8259A+0x22)

>
>>>
>>>
>>
>> Same here.
>>
>> --
>> Philippe.
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-05 10:28                                                 ` Kurijn Buys
@ 2013-12-05 11:05                                                   ` Philippe Gerum
  0 siblings, 0 replies; 38+ messages in thread
From: Philippe Gerum @ 2013-12-05 11:05 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/05/2013 11:28 AM, Kurijn Buys wrote:
>
> Op 5-dec.-2013, om 00:44 heeft Kurijn Buys het volgende geschreven:
>
>>
>> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>>
>>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>>
>>>>>
>>>>> testing/ipipe-3.8-i915-fix
>>>>
>>>> I could test it, I no longer get high latencies while moving a large
>>>> opengl window. So, it looks good.
>>
>> I've it running as well (with the priority coupling option disabled this time). The latency test results in a worst value of 52us (no idea if this is normal, the lat max globally stays between 10 and 30us), but I have no higher pikes under load in each case...
>> The other test I mentioned before still has a few measurements that occur at 10us in stead of 1000us... maybe it's an issue with this test...
>>
>> I have the same oddity as the other time I installed a 3.8 kernel on Ubuntu 10.04:
>> -the /proc/ipipe files don't look how they should I guess. The Linux file is full of lines with "__ipipe_do_IRQ" and near the end a line with "__ipipe_do_critical_sync"...
>> -the version file only states "1"

Which is expected, the ipipe now reports its core release number. It 
looks like we forgot to bump it a couple of times though.

>> -the xenomai file mainly has lines with "..." and also this critical sync line near the end.
>>
>> Also, I tried to enable the ipipe tracer but without the "tracing on boot" option, but I can't get it to work now. When I do "echo < 1 /proc/ipipe/trace/enable" (as root), the flag stays on 0. and the frozen file remains empty.
>

You likely mean:

# echo 1 > /proc/ipipe/trace/enable

> For some unknown reason, now the tracer works (I didn't even do a reboot en the mean time), and now I have difficulties setting trace/enable to 0.
>
> Another latency test had a pike of 62us (but pikes are very rare now, and not dependent on load anymore), which seems related to the ipipe frozen line:
> -56!    44.220  __ipipe_spin_lock_irqsave+0x5 (mask_and_ack_8259A+0x22)
>

Accessing the legacy 8259 PIC is slow.

# cat /proc/interrupts

?
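
(For reference, the chip column of that listing shows whether interrupts are 
still routed through the legacy 8259 or through the IO-APIC; illustrative 
lines only, not taken from this machine:

    0:     123456    XT-PIC        timer
    0:     123456    IO-APIC-edge  timer

The first form means the legacy PIC path is in use, the second the APIC path.)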

-- 
Philippe.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-05  0:44                                               ` Kurijn Buys
  2013-12-05 10:28                                                 ` Kurijn Buys
@ 2013-12-05 11:09                                                 ` Gilles Chanteperdrix
  2013-12-09 15:19                                                   ` Kurijn Buys
  1 sibling, 1 reply; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-05 11:09 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/05/2013 01:44 AM, Kurijn Buys wrote:
>
> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>
>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>
>>>>
>>>> testing/ipipe-3.8-i915-fix
>>>
>>> I could test it, I no longer get high latencies while moving a large
>>> opengl window. So, it looks good.
>
> I've it running as well (with the priority coupling option disabled this time).

The options I asked you to change were meant to eliminate any possible 
source of high latencies in case of a bug. You can return to a normal 
configuration now: re-enable SMP (or the APIC if you prefer to remain in 
single-processor mode), disable the I-pipe tracer if you want to avoid 
the overhead, and re-enable priority coupling if you prefer to use it.
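
As a rough map of the switches being discussed -- exact symbol names can 
differ slightly between kernel and Xenomai versions, so treat this as a 
sketch only:

   CONFIG_SMP=y                # or boot an SMP kernel with nr_cpus=1 for single-cpu mode
   CONFIG_X86_LOCAL_APIC=y     # the APIC...
   CONFIG_X86_IO_APIC=y        # ...and the IO-APIC
   CONFIG_IPIPE_TRACE=n        # avoid the tracer overhead
   CONFIG_XENO_OPT_PRIOCPL=y   # root thread priority coupling (Xenomai nucleus option)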

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-05 11:09                                                 ` Gilles Chanteperdrix
@ 2013-12-09 15:19                                                   ` Kurijn Buys
  2013-12-09 15:27                                                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-09 15:19 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

Op 5-dec.-2013, om 11:09 heeft Gilles Chanteperdrix het volgende geschreven:

> On 12/05/2013 01:44 AM, Kurijn Buys wrote:
>>
>> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>>
>>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>>
>>>>>
>>>>> testing/ipipe-3.8-i915-fix
>>>>
>>>> I could test it, I no longer get high latencies while moving a large
>>>> opengl window. So, it looks good.
>>
>> I've it running as well (with the priority coupling option disabled this time).
>
> The options I asked you to change were meant to eliminate any possible
> source for high latencies in case of bug. You can return to a normal
> configuration now: re-enable SMP or APIC if you prefer to remain in
> single processor mode, disable the I-pipe tracer if you want to avoid
> the overhead, and re-enable priority coupling if you prefer to use it.

Re-enabling SMP is fine, but doesn't necessarily seem to improve the latency.
While I still have some small spikes, they stay below 60us, so that is fine by me.

However, now I move on to the next problem:
when trying to attach the 6052E NI card (which is mentioned in ni_pcimio.c), using lspci and
/usr/xenomai/sbin/analogy_config analogy0 analogy_ni_pci 0x4,1
the system either:
-more or less freezes
-outputs the error "analogy_config: attach failed err=-22"
but mostly it's the former reaction...

I enabled the Analogy drivers (as explained here: http://www.lara.unb.br/wiki/index.php/Data_Acquisition_Xenomai_Analogy)
and disabled the Comedi drivers (Data acquisition support (comedi)).

I tried another PCI slot, and even another PC.

The kernel boot log seems to have some output related to an IRQ problem, but I don't really understand it... Maybe this has to do with an incompatibility between Ubuntu 10.04 and a 3.10 kernel? (Also, given the strange /proc/ipipe/Linux file, as I mentioned before.)

I find this in dmesg:
[   21.793387] irq 9: nobody cared (try booting with the "irqpoll" option)
[   21.793451] CPU: 0 PID: 266 Comm: modprobe Not tainted 3.10.0-xenomai-2.6.3+ #1
[   21.793459] Hardware name: Viglen D945GTP/D945GTP, BIOS NT94510J.86A.3309.2006.0109.1312 01/09/2006
[   21.793466]  00000000 f5409f50 c06c6e81 f5409f70 c01b1fd0 c0853a28 00000009 00000000
[   21.793503]  f5407900 00000000 00000009 f5409f94 c01b2183 43578610 00000009 c0467dc6
[   21.793538]  00000000 f5407900 00000000 f5437b80 f5409fd4 c01afe86 c09f8fd8 00000200
[   21.793574] Call Trace:
[   21.793594]  [<c06c6e81>] dump_stack+0x16/0x1d
[   21.793609]  [<c01b1fd0>] __report_bad_irq+0x30/0xd0
[   21.793624]  [<c01b2183>] note_interrupt+0x113/0x1c0
[   21.793639]  [<c0467dc6>] ? acpi_os_read_port+0x24/0x4c
[   21.793653]  [<c01afe86>] handle_irq_event_percpu+0xd6/0x250
[   21.793670]  [<c01b003c>] handle_irq_event+0x3c/0x60
[   21.793682]  [<c01b2c10>] ? unmask_irq+0x70/0x70
[   21.793694]  [<c01b2c54>] handle_level_irq+0x44/0xa0
[   21.793702]  <IRQ>  [<c06d152a>] ? do_IRQ+0x4a/0xd0
[   21.793726]  [<c01400fa>] ? irq_exit+0x6a/0xd0
[   21.793739]  [<c06d15f5>] ? smp_apic_timer_interrupt+0x45/0x74
[   21.793752]  [<c06d14e0>] ? return_to_handler+0xf/0xf
[   21.793766]  [<c0121a0a>] ? __ipipe_do_IRQ+0x4a/0x60
[   21.793779]  [<c06d14e0>] ? return_to_handler+0xf/0xf
[   21.793793]  [<c01a007b>] ? cgroup_addrm_files+0x1db/0x2e0
[   21.793805]  [<c01b0033>] ? handle_irq_event+0x33/0x60
[   21.793818]  [<c0121a0f>] ? __ipipe_do_IRQ+0x4f/0x60
[   21.793833]  [<c01bbe62>] ? __ipipe_do_sync_stage+0x1c2/0x200
[   21.793849]  [<c01bd1cc>] ? ipipe_unstall_root+0x5c/0x90
[   21.793861]  [<c013fe4a>] ? __do_softirq+0x6a/0x260
[   21.793878]  [<c06cdbf3>] ? sub_preempt_count+0x13/0xd0
[   21.793891]  [<c06d15b0>] ? do_IRQ+0xd0/0xd0
[   21.793903]  [<c0140155>] ? irq_exit+0xc5/0xd0
[   21.793916]  [<c06d15f5>] ? smp_apic_timer_interrupt+0x45/0x74
[   21.793929]  [<c06d14e0>] ? return_to_handler+0xf/0xf
[   21.793941]  [<c0121a0a>] ? __ipipe_do_IRQ+0x4a/0x60
[   21.793954]  [<c06d15b0>] ? do_IRQ+0xd0/0xd0
[   21.793968]  [<c01a007b>] ? cgroup_addrm_files+0x1db/0x2e0
[   21.793980]  [<c01b0033>] ? handle_irq_event+0x33/0x60
[   21.793992]  [<c0121a0f>] ? __ipipe_do_IRQ+0x4f/0x60
[   21.794006]  [<c01bbe62>] ? __ipipe_do_sync_stage+0x1c2/0x200
[   21.794022]  [<c01bbf7d>] ? __ipipe_do_sync_pipeline+0xdd/0x1b0
[   21.794036]  [<c01bd48e>] ? __ipipe_dispatch_irq+0x24e/0x550
[   21.794051]  [<c02a857e>] ? T.1997+0xae/0x6d0
[   21.794065]  [<c0121c3d>] ? __ipipe_handle_irq+0x6d/0x230
[   21.794090]  [<f835c8b7>] ? snd_device_new+0x57/0xb0 [snd]
[   21.794105]  [<c06d1440>] ? common_interrupt+0x40/0x60
[   21.794122]  [<c01bce5a>] ? ipipe_root_only+0x8a/0x150
[   21.794137]  [<c06cdbf3>] ? sub_preempt_count+0x13/0xd0
[   21.794152]  [<c0426c2a>] ? delay_tsc+0x3a/0xb0
[   21.794166]  [<c0426beb>] ? __const_udelay+0x1b/0x20
[   21.794182]  [<f80ef03a>] ? azx_get_response+0x11a/0x290 [snd_hda_intel]
[   21.794202]  [<f80ef4b7>] ? azx_probe_continue+0x107/0x3c0 [snd_hda_intel]
[   21.794216]  [<c0425d95>] ? vsnprintf+0xb5/0x430
[   21.794233]  [<f80ef1b0>] ? azx_get_response+0x290/0x290 [snd_hda_intel]
[   21.794248]  [<f80eef20>] ? azx_pcm_open+0x380/0x380 [snd_hda_intel]
[   21.794262]  [<f80ef830>] ? azx_halt+0x30/0x30 [snd_hda_intel]
[   21.794277]  [<f80edd90>] ? azx_resume+0x100/0x100 [snd_hda_intel]
[   21.794291]  [<f80ee200>] ? azx_dev_free+0x20/0x20 [snd_hda_intel]
[   21.794307]  [<f80f0955>] ? azx_probe+0x2c5/0xbe4 [snd_hda_intel]
[   21.794327]  [<c0443cf8>] ? local_pci_probe+0x38/0x70
[   21.794341]  [<c0444dc0>] ? pci_device_probe+0x60/0x80
[   21.794357]  [<c04d5138>] ? driver_probe_device+0x78/0x200
[   21.794371]  [<c04d5341>] ? __driver_attach+0x81/0x90
[   21.794384]  [<c04d3ab8>] ? bus_for_each_dev+0x68/0x90
[   21.794398]  [<c04d4fae>] ? driver_attach+0x1e/0x20
[   21.794410]  [<c04d52c0>] ? driver_probe_device+0x200/0x200
[   21.794422]  [<c04d4b0f>] ? bus_add_driver+0x1bf/0x220
[   21.794436]  [<c0444c80>] ? pci_dev_put+0x20/0x20
[   21.794449]  [<c0444c80>] ? pci_dev_put+0x20/0x20
[   21.794461]  [<c04d599a>] ? driver_register+0x6a/0x140
[   21.794474]  [<c01c2d01>] ? tracepoint_module_notify+0x121/0x180
[   21.794489]  [<c0444e83>] ? __pci_register_driver+0x33/0x40
[   21.794504]  [<f8300199>] ? azx_driver_init+0x17/0x19 [snd_hda_intel]
[   21.794517]  [<c0100111>] ? do_one_initcall+0x31/0x160
[   21.794534]  [<f8300182>] ? ftrace_define_fields_azx_get_position+0xdd/0xdd [snd_hda_intel]
[   21.794549]  [<c0198801>] ? load_module+0x1421/0x1700
[   21.794562]  [<c01953f0>] ? free_notes_attrs+0x50/0x50
[   21.794589]  [<c0198c13>] ? SyS_init_module+0xa3/0xc0
[   21.794613]  [<c06d0f4f>] ? sysenter_do_call+0x12/0x16
[   21.794625] handlers:
[   21.794649] [<c0467d74>] acpi_irq
[   21.794683] Disabling IRQ #9


>
> --
>                                           Gilles.



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-09 15:19                                                   ` Kurijn Buys
@ 2013-12-09 15:27                                                     ` Gilles Chanteperdrix
  2013-12-11 14:23                                                       ` Kurijn Buys
  2013-12-11 16:44                                                       ` [Xenomai] Analogy NI 6052E Kurijn Buys
  0 siblings, 2 replies; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-09 15:27 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On 12/09/2013 04:19 PM, Kurijn Buys wrote:
> Op 5-dec.-2013, om 11:09 heeft Gilles Chanteperdrix het volgende geschreven:
>
>> On 12/05/2013 01:44 AM, Kurijn Buys wrote:
>>>
>>> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>>>
>>>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>>>
>>>>>>
>>>>>> testing/ipipe-3.8-i915-fix
>>>>>
>>>>> I could test it, I no longer get high latencies while moving a large
>>>>> opengl window. So, it looks good.
>>>
>>> I've it running as well (with the priority coupling option disabled this time).
>>
>> The options I asked you to change were meant to eliminate any possible
>> source for high latencies in case of bug. You can return to a normal
>> configuration now: re-enable SMP or APIC if you prefer to remain in
>> single processor mode, disable the I-pipe tracer if you want to avoid
>> the overhead, and re-enable priority coupling if you prefer to use it.
>
> Re-enabling SMP is fine, but doesn't necessarily seem to improve the latency.
> While I'm still having some small pikes, they keep below 60us, so that is fine to me.

If your processor has hyper-threading, you may indeed get much better 
results in single-processor mode, but enable the APIC and IO-APIC to avoid 
the spikes you got when using the legacy PIC.

Experiments on an Atom processor:
http://www.xenomai.org/~gch/core-3.4-latencies/atom.png
show that disabling hyper-threading divides the latency by 2.

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-09 15:27                                                     ` Gilles Chanteperdrix
@ 2013-12-11 14:23                                                       ` Kurijn Buys
  2013-12-11 14:51                                                         ` Lennart Sorensen
  2013-12-11 16:44                                                       ` [Xenomai] Analogy NI 6052E Kurijn Buys
  1 sibling, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-11 14:23 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai


Op 9-dec.-2013, om 15:27 heeft Gilles Chanteperdrix het volgende geschreven:

> On 12/09/2013 04:19 PM, Kurijn Buys wrote:
>> Op 5-dec.-2013, om 11:09 heeft Gilles Chanteperdrix het volgende geschreven:
>>
>>> On 12/05/2013 01:44 AM, Kurijn Buys wrote:
>>>>
>>>> Op 4-dec.-2013, om 17:43 heeft Philippe Gerum het volgende geschreven:
>>>>
>>>>> On 12/04/2013 05:03 PM, Gilles Chanteperdrix wrote:
>>>>>> On 12/04/2013 02:19 PM, Philippe Gerum wrote:
>>>>>>> On 12/04/2013 01:00 PM, Gilles Chanteperdrix wrote:
>>>>>>>> Ok, could you push the branch somewhere so that I can try it?
>>>>>>>>
>>>>>>>
>>>>>>> testing/ipipe-3.8-i915-fix
>>>>>>
>>>>>> I could test it, I no longer get high latencies while moving a large
>>>>>> opengl window. So, it looks good.
>>>>
>>>> I've it running as well (with the priority coupling option disabled this time).
>>>
>>> The options I asked you to change were meant to eliminate any possible
>>> source for high latencies in case of bug. You can return to a normal
>>> configuration now: re-enable SMP or APIC if you prefer to remain in
>>> single processor mode, disable the I-pipe tracer if you want to avoid
>>> the overhead, and re-enable priority coupling if you prefer to use it.
>>
>> Re-enabling SMP is fine, but doesn't necessarily seem to improve the latency.
>> While I'm still having some small pikes, they keep below 60us, so that is fine to me.
>
> If your processor has hyper-threading, you may have much better results
> indeed in single processor mode, but enabling APIC and IO/APIC to avoid
> the spikes you got when using the legacy PIC.

I enabled APIC, and the latency now peaks at only 16us, and even less under load.
This also had an effect on my Analogy problem, though it didn't fix it completely... I will start a new thread for that...

Thanks a lot already for resolving this issue!
]{urijn



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-11 14:23                                                       ` Kurijn Buys
@ 2013-12-11 14:51                                                         ` Lennart Sorensen
  2013-12-11 16:04                                                           ` Tobias Luksch
  0 siblings, 1 reply; 38+ messages in thread
From: Lennart Sorensen @ 2013-12-11 14:51 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

On Wed, Dec 11, 2013 at 02:23:38PM +0000, Kurijn Buys wrote:
> I enabled APIC, and the latency peaks at 16us only now, and even less with load.
> This also had an effect on my analogy problem, but not completely... I will start a new thread for that...

If your latency drops under load, it sounds as if your CPU is slowing
down when idle and taking a while to speed back up.  Perhaps changing
the CPU governor to performance or user-controlled and setting a fixed
CPU speed would keep the latency low all the time (although at the cost
of more power consumption).
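
For instance, something along these lines should pin the clock speed (just a
sketch: the sysfs path assumes a cpufreq driver such as acpi-cpufreq is
loaded; cpufreq-set -g performance from cpufrequtils does the same job):

  for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
      echo performance > $g
  done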

-- 
Len Sorensen


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-11 14:51                                                         ` Lennart Sorensen
@ 2013-12-11 16:04                                                           ` Tobias Luksch
  2013-12-11 17:21                                                             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Tobias Luksch @ 2013-12-11 16:04 UTC (permalink / raw)
  To: Lennart Sorensen, Kurijn Buys; +Cc: Xenomai

> > On Wed, Dec 11, 2013 at 02:23:38PM +0000, Kurijn Buys wrote:
> > I enabled APIC, and the latency peaks at 16us only now, and even less with
> load.
> > This also had an effect on my analogy problem, but not completely... I will
> start a new thread for that...
> 
> If your latency drops under load, it sounds as if your CPU is slowing
> down when idle and taking a while to speed back up.  Perhaps changing
> the CPU governor to performance or user-controlled and setting a fixed
> CPU speed would keep the latency low all the time (although at the cost
> of more power consumption).

I had a similar problem where the latency behavior changed depending on the CPU load on an Intel CPU (see the "Problems with running Xenomai on Core i5" thread on this list). It turned out to be a C1E "feature" that I could not influence in the BIOS. But clearing the second bit (bit 1) of the MSR_IA32_POWER_CTL register did help. I used the wrmsr command from the msr-tools package.
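
For the record, roughly what I did with msr-tools (just a sketch: I am assuming
MSR_IA32_POWER_CTL is register 0x1fc on this CPU family and that bit 1 is the
C1E enable bit, so please verify against Intel's documentation for your model
before writing anything):

  modprobe msr
  val=$(rdmsr -p 0 -c 0x1fc)            # read MSR_IA32_POWER_CTL on CPU 0
  wrmsr -a 0x1fc $(( val & ~(1 << 1) )) # clear bit 1 (C1E) on all CPUs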

-Tobias



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Xenomai] Analogy NI 6052E
  2013-12-09 15:27                                                     ` Gilles Chanteperdrix
  2013-12-11 14:23                                                       ` Kurijn Buys
@ 2013-12-11 16:44                                                       ` Kurijn Buys
  2014-03-21 14:33                                                         ` Erhart Robert (CC-DA/ESR3)
  1 sibling, 1 reply; 38+ messages in thread
From: Kurijn Buys @ 2013-12-11 16:44 UTC (permalink / raw)
  To: Xenomai

Hi,

Now that I have got rid of my latency issue, I'm dealing with an Analogy problem. (My system specs are repeated below.)

After running analogy_config, the NI card (6052E) seems to attach to the Analogy device layer:
[  128.594303] Analogy: analogy_ni_pcimio: pcimio_attach: found pci-6052e board
[  128.594358] Analogy: analogy_ni_pcimio: pcimio_attach: found irq 21
and
--  Analogy devices --
| idx | status | driver
|  00 | Linked | analogy_ni_pcimio

But when I try to inspect the subdevices with cat /proc/analogy/00-analogy_ni_pcimio,
it says "Killed" and the X server crashes.
Then the kernel log states:
[  138.273837] BUG: unable to handle kernel NULL pointer dereference at 00000001
[  138.273910] IP: [<f85f7a72>] a4l_rdproc_transfer+0x32/0x160 [xeno_analogy]
[  138.273972] *pde = 00000000
[  138.274002] Oops: 0000 [#1] PREEMPT SMP
[  138.274050] Modules linked in: binfmt_misc snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy i915 snd_seq_oss fbcon analogy_ni_pcimio tileblit snd_seq_midi font bitblit snd_rawmidi analogy_ni_mio analogy_ni_tio analogy_8255 analogy_ni_mite softcursor snd_seq_midi_event xeno_analogy drm_kms_helper snd_seq drm snd_timer microcode ppdev psmouse snd_seq_device parport_pc serio_raw snd intel_agp i2c_algo_bit video lp intel_gtt agpgart lpc_ich soundcore parport snd_page_alloc e1000e ptp pps_core
[  138.274601] CPU: 0 PID: 1273 Comm: cat Not tainted 3.10.0-xenomai-2.6.3+ #1
[  138.274649] Hardware name: Viglen D945GTP/D945GTP, BIOS NT94510J.86A.3309.2006.0109.1312 01/09/2006
[  138.274707] task: e0e5c060 ti: e0efc000 task.ti: e0efc000
[  138.274746] EIP: 0060:[<f85f7a72>] EFLAGS: 00010246 CPU: 0
[  138.274789] EIP is at a4l_rdproc_transfer+0x32/0x160 [xeno_analogy]
[  138.274831] EAX: 00000000 EBX: e13ffe40 ECX: e0efdeec EDX: f85f896a
[  138.274874] ESI: e13ffe40 EDI: 00000001 EBP: e0efdf00 ESP: e0efdee4
[  138.274916]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  138.274954] CR0: 8005003b CR2: 00000001 CR3: 20d85000 CR4: 000007d0
[  138.274997] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  138.275039] DR6: ffff0ff0 DR7: 00000400
[  138.275067] I-pipe domain Linux
[  138.275091] Stack:
[  138.275110]  e13ffe40 f85f896a 000000d0 e0e87000 00000000 e13ffe40 00000001 e0efdf40
[  138.276018]  c02cc825 00020000 e0efdf30 09b0b000 e13ffe68 00008000 00000000 e0c6c200
[  138.276018]  e0c6c200 00008000 00000000 00000000 e0c6c200 e0d94600 c02cc730 e0efdf64
[  138.276018] Call Trace:
[  138.276018]  [<c02cc825>] seq_read+0xf5/0x390
[  138.276018]  [<c02cc730>] ? seq_lseek+0x170/0x170
[  138.276018]  [<c02fdc83>] proc_reg_read+0x53/0x90
[  138.276018]  [<c02fdc30>] ? proc_reg_write+0x90/0x90
[  138.276018]  [<c02ae3d7>] vfs_read+0x97/0x110
[  138.276018]  [<c02aed46>] SyS_read+0x56/0x90
[  138.276018]  [<c06d0f4f>] sysenter_do_call+0x12/0x16
[  138.276018] Code: ec 10 e8 12 9a 0d c8 89 d7 89 c3 c7 44 24 04 56 89 5f f8 89 04 24 e8 fe 46 cd c7 c7 44 24 04 6a 89 5f f8 89 1c 24 e8 ee 46 cd c7 <8b> 07 85 c0 0f 84 0c 01 00 00 31 c0 31 f6 eb 5b 8d b6 00 00 00
[  138.276018] EIP: [<f85f7a72>] a4l_rdproc_transfer+0x32/0x160 [xeno_analogy] SS:ESP 0068:e0efdee4
[  138.276018] CR2: 0000000000000001
[  138.320577] ---[ end trace a8594ddaa188276f ]---


It seems to me that this issue is related to this thread: http://www.mail-archive.com/xenomai-help@gna.org/msg10134.html
However, in my case the attach itself seems to work, and I suppose the solution in that thread was an update of the kernel patch, so maybe I'm dealing with a new issue specific to my card...?

I enabled the Analogy drivers (as explained here http://www.lara.unb.br/wiki/index.php/Data_Acquisition_Xenomai_Analogy),
and disabled the Comedi drivers (Data acquisition support (comedi)).

I checked the specs of the card against its description in "ni_pcimio.c" and, as far as I understand it, everything looks all right (I'm not sure about these: caldac, ai_speed).
I tried booting with pci=routeirq.
I tried another PCI slot, and even another PC.

My system:
Hardware: Pentium 4 (per lspci: 3.2 GHz, i686, 32/64-bit, 2 CPUs), 2 GB RAM
Software: Ubuntu 10.04, kernel & patch 3.10 (also tried with 2.6.38.8 and 3.8), Xenomai 2.6.3


Thanks!
]{urijn


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] latency spikes under load
  2013-12-11 16:04                                                           ` Tobias Luksch
@ 2013-12-11 17:21                                                             ` Gilles Chanteperdrix
  0 siblings, 0 replies; 38+ messages in thread
From: Gilles Chanteperdrix @ 2013-12-11 17:21 UTC (permalink / raw)
  To: Tobias Luksch; +Cc: Xenomai, Kurijn Buys

On 12/11/2013 05:04 PM, Tobias Luksch wrote:
>>> On Wed, Dec 11, 2013 at 02:23:38PM +0000, Kurijn Buys wrote: I
>>> enabled APIC, and the latency peaks at 16us only now, and even
>>> less with
>> load.
>>> This also had an effect on my analogy problem, but not
>>> completely... I will
>> start a new thread for that...
>>
>> If your latency drops under load, it sounds as if your CPU is
>> slowing down when idle and taking a while to speed back up.
>> Perhaps changing the CPU governor to performance or user-controlled
>> and setting a fixed CPU speed would keep the latency low all the
>> time (although at the cost of more power consumption).
>
> I had a similar problem where the latency behavior changed depending
> on the CPU load on an Intel CPU (see " Problems with running Xenomai
> on Core i5" thread of this list). It turned out to be a C1E "feature"
> that I could not influence in the BIOS. But clearing the second bit
> of the MSR_IA32_POWER_CTL register did help. I used the wrmsr command
> of the msr-tools package.

OK, we could integrate this C1E workaround alongside the SMI workaround and the
AMD C1E workaround. Do you have any pointer that would explain how to detect
that this MSR is available (and that C1E is enabled, although I guess that is
just a matter of reading the MSR)? The second question is: doesn't this change
have any repercussion on processor temperature?
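
In user space, I suppose the check boils down to something like this sketch
(again assuming the register really is 0x1fc: a faulting read would mean the
MSR is not implemented, and bit 1 would then tell whether C1E is enabled), but
a documented detection method would be better for the nucleus:

  modprobe msr
  rdmsr -p 0 -c 0x1fc || echo "MSR_IA32_POWER_CTL not readable on this CPU"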

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] Analogy NI 6052E
  2013-12-11 16:44                                                       ` [Xenomai] Analogy NI 6052E Kurijn Buys
@ 2014-03-21 14:33                                                         ` Erhart Robert (CC-DA/ESR3)
  2014-03-22 16:58                                                           ` Gilles Chanteperdrix
  0 siblings, 1 reply; 38+ messages in thread
From: Erhart Robert (CC-DA/ESR3) @ 2014-03-21 14:33 UTC (permalink / raw)
  To: Kurijn Buys; +Cc: Xenomai

Servus,

I had the same problem on my system:
xenomai 2.6.3
linux 3.8.13
PCIe-6251

Here is a small patch for this problem.
The only relevant change is the line <transfer = (a4l_trf_t *) p->private;>; the rest just silences compiler warnings. Presumably the proc entry is created through seq_file's single_open(), so the private pointer ends up in p->private, while the show callback's data argument is only the seq iterator token, which would match the faulting address 00000001 in the oops.

Good Luck!

Servus
Robert

----------------------- ksrc/drivers/analogy/transfer.c -----------------------
index add4414..6ba7223 100644
@@ -216,14 +216,14 @@ unsigned int a4l_get_irq(a4l_dev_t * dev)
 int a4l_rdproc_transfer(struct seq_file *p, void *data)
 {
 	int i;
-	a4l_trf_t *transfer = (a4l_trf_t *) data;
-
+	char *type;
+	a4l_trf_t *transfer;
+	transfer = (a4l_trf_t *) p->private;
 	seq_printf(p, "--  Subdevices --\n\n");
 	seq_printf(p, "| idx | type\n");
 
 	/* Gives the subdevice type's name */
 	for (i = 0; i < transfer->nb_subd; i++) {
-		char *type;
 		switch (transfer->subds[i]->flags & A4L_SUBD_TYPES) {
 		case A4L_SUBD_UNUSED:
 			type = "Unused subdevice";



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xenomai] Analogy NI 6052E
  2014-03-21 14:33                                                         ` Erhart Robert (CC-DA/ESR3)
@ 2014-03-22 16:58                                                           ` Gilles Chanteperdrix
  0 siblings, 0 replies; 38+ messages in thread
From: Gilles Chanteperdrix @ 2014-03-22 16:58 UTC (permalink / raw)
  To: Erhart Robert (CC-DA/ESR3); +Cc: Kurijn Buys, Xenomai

On 03/21/2014 03:33 PM, Erhart Robert (CC-DA/ESR3) wrote:
> Servus,
> 
> I had the same problem on my System:
> xenomai 2.6.3
> linux 3.8.13
> PCIe-6251
> 
> Here is a small patch for this problem. 
> Relevant is only the line  <transfer = (a4l_trf_t *) p->private;>. The other stuff removes only the compiler warnings.

Merged, thanks.


-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2014-03-22 16:58 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-03 11:38 [Xenomai] latency spikes under load Kurijn Buys
2013-12-03 11:54 ` Gilles Chanteperdrix
2013-12-03 12:31 ` Gilles Chanteperdrix
2013-12-03 13:07   ` Kurijn Buys
2013-12-03 13:23     ` Gilles Chanteperdrix
2013-12-03 15:31       ` Kurijn Buys
2013-12-03 15:54         ` Gilles Chanteperdrix
2013-12-03 16:49           ` Kurijn Buys
2013-12-03 18:50             ` Gilles Chanteperdrix
2013-12-04  8:44               ` Philippe Gerum
2013-12-04  8:51                 ` Gilles Chanteperdrix
2013-12-04  9:27                   ` Philippe Gerum
2013-12-04  9:31                     ` Gilles Chanteperdrix
2013-12-04  9:40                       ` Philippe Gerum
2013-12-04  9:51                         ` Gilles Chanteperdrix
2013-12-04 10:29                           ` Philippe Gerum
2013-12-04 10:33                             ` Philippe Gerum
2013-12-04 11:04                               ` Philippe Gerum
2013-12-04 11:10                                 ` Gilles Chanteperdrix
2013-12-04 11:36                                   ` Philippe Gerum
2013-12-04 11:59                                     ` Philippe Gerum
2013-12-04 12:00                                       ` Gilles Chanteperdrix
2013-12-04 13:19                                         ` Philippe Gerum
2013-12-04 16:03                                           ` Gilles Chanteperdrix
2013-12-04 17:43                                             ` Philippe Gerum
2013-12-05  0:44                                               ` Kurijn Buys
2013-12-05 10:28                                                 ` Kurijn Buys
2013-12-05 11:05                                                   ` Philippe Gerum
2013-12-05 11:09                                                 ` Gilles Chanteperdrix
2013-12-09 15:19                                                   ` Kurijn Buys
2013-12-09 15:27                                                     ` Gilles Chanteperdrix
2013-12-11 14:23                                                       ` Kurijn Buys
2013-12-11 14:51                                                         ` Lennart Sorensen
2013-12-11 16:04                                                           ` Tobias Luksch
2013-12-11 17:21                                                             ` Gilles Chanteperdrix
2013-12-11 16:44                                                       ` [Xenomai] Analogy NI 6052E Kurijn Buys
2014-03-21 14:33                                                         ` Erhart Robert (CC-DA/ESR3)
2014-03-22 16:58                                                           ` Gilles Chanteperdrix

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.