Long latencies during disk-io

From: Martin.Wirth @ 2019-09-03  7:23 UTC
  To: linux-rt-users; +Cc: bigeasy

Hi,

I'm operating a data acquisition system with several digitizer cards and an Adlink cPCI-6530
(i7-4700EQ / Intel QM87 chipset) CPU card. The digitizer cards are hardware-triggered by
an external low-jitter clock at 200 Hz. I monitor the wakeup latencies, and with older
rt kernels (4.14) I get maximum latencies of about 24 microseconds most of the time,
with extremely rare spikes of 50 microseconds under moderate load from the rest of the system.

I have now switched to 4.19 and 5.2 rt kernels and suddenly see latencies of 500 microseconds,
and once even 1.5 ms, whenever there is concurrent SATA disk traffic (not
from the DAQ process itself, but unrelated file copies, for example). I experimented with different
configurations and found that the long latencies only occur if the ahci interrupt thread runs on
the same processor as the interrupt thread of the digitizer card. If I separate them via the
/proc/irq interface, the latencies are gone. The real-time priority of the digitizer IRQ handler
is set to 53 and that of the corresponding user-space thread to 51. The ahci interrupt
thread is at the default priority of 50.
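
A minimal sketch of this kind of separation via /proc/irq. The helper names
(cpu_to_mask, pin_irq), the PROC_IRQ_DIR override and any IRQ or CPU numbers are
hypothetical and not taken from this system; it also assumes CPU indexes small
enough that a single hex word is a valid mask:

```sh
#!/bin/sh
# Sketch only: pin an interrupt to a single CPU through
# /proc/irq/<irq>/smp_affinity. Helper names are illustrative.

# Convert a 0-based CPU index into the hex bitmask that smp_affinity expects
# (CPU 0 -> 1, CPU 1 -> 2, CPU 2 -> 4, ...).
cpu_to_mask() {
    printf '%x\n' $((1 << $1))
}

# Write the mask for CPU $2 into the smp_affinity file of IRQ $1.
# PROC_IRQ_DIR is an assumed override so the helper can be tried on a
# scratch directory instead of the real /proc/irq (which needs root).
pin_irq() {
    irq=$1
    cpu=$2
    dir=${PROC_IRQ_DIR:-/proc/irq}
    cpu_to_mask "$cpu" > "$dir/$irq/smp_affinity"
}
```

For example, `pin_irq 30 2` run as root would confine a hypothetical IRQ 30 to
CPU 2; the real IRQ numbers of the ahci and digitizer interrupts can be looked up
in /proc/interrupts.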

Although moving the ahci handler to a different processor solves the problem for me, I would
like to know whether it is expected behavior that the ahci interrupt handler blocks
higher-priority interrupt threads for more than one millisecond.

Another observation, which I cannot explain, is that a concurrent
cyclictest -m -Sp98 -i200 -h400 -n
run does not show the latencies...

Does anyone have an explanation for this? As mentioned above, the problem is absent on 4.14-rt;
I observed it with 4.19.59-rt24, 5.2.9-rt3 and 5.2.10-rt5.

Cheers,

Martin
