linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <Martin.Wirth@dlr.de>
To: <bigeasy@linutronix.de>
Cc: <linux-rt-users@vger.kernel.org>
Subject: AW: Long latencies during disk-io
Date: Fri, 6 Sep 2019 13:23:24 +0000	[thread overview]
Message-ID: <1A2AF5F768264E479963C4D8C015EB696C51159F@DLDEFFMIMP02EXC.intra.dlr.de> (raw)
In-Reply-To: <20190905154115.mnid6rue65x6vvt6@linutronix.de>

> Von: Sebastian Andrzej Siewior [mailto:bigeasy@linutronix.de]


> So based on that, the "digitizer IRQ handler" should run first, followed
> by the user space thread, followed by the AHCI interrupt.

Yes, that's how I understood it. But that is actually not happening. The ahci interrupt handler 
seems not to be interrupted by the higher prio DAQ tasks (both interrupt and user space thread).
I added a timestamp readout to the digitizer cards irq routine and it turned out that the 
interrupt thread is already delayed, not only the user land thread...

Just to be sure, this is a ps of the rt tasks (with some unrelated irq/threads at rtprio 50 removed)

tid,  pid,  cpu, cls, rtpri, pri, nice,  wchan, task
   10    10   0  FF      1  41   - -      rcuc/0
   11    11   1  FF      1  41   - -      rcu_preempt
   12    12   0  FF      1  41   - -      rcub/0
   13    13   0  FF     99 139   - -      posixcputmr/0
   14    14   0  FF     99 139   - -      migration/0
   17    17   1  FF     99 139   - -      migration/1
   18    18   1  FF     99 139   - -      posixcputmr/1
   19    19   1  FF      1  41   - -      rcuc/1
   24    24   2  FF     99 139   - -      migration/2
   25    25   2  FF     99 139   - -      posixcputmr/2
   26    26   2  FF      1  41   - -      rcuc/2
   31    31   3  FF     99 139   - -      migration/3
   32    32   3  FF     99 139   - -      posixcputmr/3
   33    33   3  FF      1  41   - -      rcuc/3
  102   102   1  FF     50  90   - -      irq/9-acpi
  108   108   1  FF     99 139   - -      watchdogd
  120   120   1  FF     50  90   - -      irq/34-i915
  122   122   2  FF     50  90   - -      irq/35-ahci[000
  300   300   2  FF     53  93   - -      irq/21-spec_
 7027  7027   0  FF      1  41   - -      lidaqserver
 7035  7027   3  FF     51  91   - -      ADCControl0

irq/21-spec_ is the digitizer irq thread, which does a simple hardware register read over 
the pci bus and then does a wake_up. No locking involved. ADCControl0 is the user land 
task which is woken up by the interrupt thread.

cat /proc/interrupt looks like this (without the separation of irqs of course...):

           CPU0       CPU1       CPU2       CPU3
  0:          5          0          0          0   IO-APIC   2-edge      timer
  8:          1          0          0          0   IO-APIC   8-edge      rtc0
  9:          0          5          0          0   IO-APIC   9-fasteoi   acpi
 16:         30          0          0          0   IO-APIC  16-fasteoi   ehci_hcd:usb1
 18:          0          0          0          0   IO-APIC  18-fasteoi   i801_smbus
 21:          0          0     447302          0   IO-APIC  21-fasteoi   spec_
 34:          0       7758          0          0   PCI-MSI 32768-edge      i915
 35:          0          0     254225          0   PCI-MSI 512000-edge      ahci[0000:00:1f.2]
 36:          0          0          0       4211   PCI-MSI 409600-edge      eth5
70:          0          0    2206737          0   PCI-MSI 327680-edge      xhci_hcd
 71:          0          0          0     109935   PCI-MSI 2097152-edge      xhci_hcd
 72:          0          0          0          0   PCI-MSI 2097153-edge      xhci_hcd
 73:          0          0          0          0   PCI-MSI 2097154-edge      xhci_hcd
 74:          0          0          0          0   PCI-MSI 2097155-edge      xhci_hcd
 75:          0          0          0          0   PCI-MSI 2097156-edge      xhci_hcd
NMI:          0          0          0          0   Non-maskable interrupts
LOC:    1190870    1000137     935198     947995   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:         93         66         34          4   IRQ work interrupts
RTR:          3          0          0          0   APIC ICR read retries
RES:     160318     125789     106359     110046   Rescheduling interrupts
CAL:       5801       5217       4330       4275   Function call interrupts
TLB:       1610        415        244        335   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:         28         29         29         29   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
NPI:          0          0          0          0   Nested posted-interrupt event
PIW:          0          0          0          0   Posted-interrupt wakeup event  

(I removed a bunch of interrupts from the on board Ethernet controllers)

Interestingly disk-io to an usb-disk handled by the xhci_hcd at irq  70 does not cause latencies, 
although it is handled also by CPU2, like the ahci and the digitizer (spec_).
So I think it is some specific to the ahci!

> > Another observation which I cannot understand though, is that a
> concurrent
> > cyclictest -m -Sp98 -i200 -h400 -n
> > run does not show the latencies...
> 
> you have here "-p98" while the other two threads have priority 53 and
> 51. So that is a difference unless there is nothing between 53 and 51.
> But if you disable the "digitizer card(s)" then with -p51 you should be
> close to that setup with cyclictest.
> 
> If there are no spikes in cyclictest then there must be something
> after the threaded handler runs that delays the user space task. Some
> tracing should help to figure out what happens between the hardware
> interrupt and the context switch to the user task.

If I run cyclictest at prio 51 there are no latencies. Only if I go to 50 I immediately 
see the 1 - 1.5 ms on CPU2 if I start a file copy  -- which is of course as expected.

For the tracing, I at the moment have no idea how to set it up. I only did some 
tracing which was triggered by cyclictest. But that won't help here...

Martin

  reply	other threads:[~2019-09-06 13:23 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-03  7:23 Long latencies during disk-io Martin.Wirth
2019-09-05 15:41 ` Sebastian Andrzej Siewior
2019-09-06 13:23   ` Martin.Wirth [this message]
2019-09-06 22:06     ` Austin Schuh
2019-09-10 12:04       ` AW: " Martin.Wirth
2019-09-13 13:36         ` Sebastian Andrzej Siewior
2019-09-13 14:09           ` AW: " Martin.Wirth
2019-09-13 15:06             ` Sebastian Andrzej Siewior
2019-09-16 11:30               ` AW: " Martin.Wirth
2019-09-17 10:52                 ` Sebastian Andrzej Siewior
2019-09-17 11:09                   ` AW: " Martin.Wirth
2019-09-17 11:20                     ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1A2AF5F768264E479963C4D8C015EB696C51159F@DLDEFFMIMP02EXC.intra.dlr.de \
    --to=martin.wirth@dlr.de \
    --cc=bigeasy@linutronix.de \
    --cc=linux-rt-users@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).