From: <Martin.Wirth@dlr.de>
To: <bigeasy@linutronix.de>
Cc: <linux-rt-users@vger.kernel.org>
Subject: AW: Long latencies during disk-io
Date: Fri, 6 Sep 2019 13:23:24 +0000 [thread overview]
Message-ID: <1A2AF5F768264E479963C4D8C015EB696C51159F@DLDEFFMIMP02EXC.intra.dlr.de> (raw)
In-Reply-To: <20190905154115.mnid6rue65x6vvt6@linutronix.de>
> Von: Sebastian Andrzej Siewior [mailto:bigeasy@linutronix.de]
> So based on that, the "digitizer IRQ handler" should run first, followed
> by the user space thread, followed by the AHCI interrupt.
Yes, that's how I understood it. But that is actually not happening. The ahci interrupt handler
seems not to be interrupted by the higher prio DAQ tasks (both interrupt and user space thread).
I added a timestamp readout to the digitizer cards irq routine and it turned out that the
interrupt thread is already delayed, not only the user land thread...
Just to be sure, this is a ps of the rt tasks (with some unrelated irq/threads at rtprio 50 removed)
tid, pid, cpu, cls, rtpri, pri, nice, wchan, task
10 10 0 FF 1 41 - - rcuc/0
11 11 1 FF 1 41 - - rcu_preempt
12 12 0 FF 1 41 - - rcub/0
13 13 0 FF 99 139 - - posixcputmr/0
14 14 0 FF 99 139 - - migration/0
17 17 1 FF 99 139 - - migration/1
18 18 1 FF 99 139 - - posixcputmr/1
19 19 1 FF 1 41 - - rcuc/1
24 24 2 FF 99 139 - - migration/2
25 25 2 FF 99 139 - - posixcputmr/2
26 26 2 FF 1 41 - - rcuc/2
31 31 3 FF 99 139 - - migration/3
32 32 3 FF 99 139 - - posixcputmr/3
33 33 3 FF 1 41 - - rcuc/3
102 102 1 FF 50 90 - - irq/9-acpi
108 108 1 FF 99 139 - - watchdogd
120 120 1 FF 50 90 - - irq/34-i915
122 122 2 FF 50 90 - - irq/35-ahci[000
300 300 2 FF 53 93 - - irq/21-spec_
7027 7027 0 FF 1 41 - - lidaqserver
7035 7027 3 FF 51 91 - - ADCControl0
irq/21-spec_ is the digitizer irq thread, which does a simple hardware register read over
the pci bus and then does a wake_up. No locking involved. ADCControl0 is the user land
task which is woken up by the interrupt thread.
cat /proc/interrupt looks like this (without the separation of irqs of course...):
CPU0 CPU1 CPU2 CPU3
0: 5 0 0 0 IO-APIC 2-edge timer
8: 1 0 0 0 IO-APIC 8-edge rtc0
9: 0 5 0 0 IO-APIC 9-fasteoi acpi
16: 30 0 0 0 IO-APIC 16-fasteoi ehci_hcd:usb1
18: 0 0 0 0 IO-APIC 18-fasteoi i801_smbus
21: 0 0 447302 0 IO-APIC 21-fasteoi spec_
34: 0 7758 0 0 PCI-MSI 32768-edge i915
35: 0 0 254225 0 PCI-MSI 512000-edge ahci[0000:00:1f.2]
36: 0 0 0 4211 PCI-MSI 409600-edge eth5
70: 0 0 2206737 0 PCI-MSI 327680-edge xhci_hcd
71: 0 0 0 109935 PCI-MSI 2097152-edge xhci_hcd
72: 0 0 0 0 PCI-MSI 2097153-edge xhci_hcd
73: 0 0 0 0 PCI-MSI 2097154-edge xhci_hcd
74: 0 0 0 0 PCI-MSI 2097155-edge xhci_hcd
75: 0 0 0 0 PCI-MSI 2097156-edge xhci_hcd
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 1190870 1000137 935198 947995 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 Performance monitoring interrupts
IWI: 93 66 34 4 IRQ work interrupts
RTR: 3 0 0 0 APIC ICR read retries
RES: 160318 125789 106359 110046 Rescheduling interrupts
CAL: 5801 5217 4330 4275 Function call interrupts
TLB: 1610 415 244 335 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 28 29 29 29 Machine check polls
ERR: 0
MIS: 0
PIN: 0 0 0 0 Posted-interrupt notification event
NPI: 0 0 0 0 Nested posted-interrupt event
PIW: 0 0 0 0 Posted-interrupt wakeup event
(I removed a bunch of interrupts from the on board Ethernet controllers)
Interestingly disk-io to an usb-disk handled by the xhci_hcd at irq 70 does not cause latencies,
although it is handled also by CPU2, like the ahci and the digitizer (spec_).
So I think it is some specific to the ahci!
> > Another observation which I cannot understand though, is that a
> concurrent
> > cyclictest -m -Sp98 -i200 -h400 -n
> > run does not show the latencies...
>
> you have here "-p98" while the other two threads have priority 53 and
> 51. So that is a difference unless there is nothing between 53 and 51.
> But if you disable the "digitizer card(s)" then with -p51 you should be
> close to that setup with cyclictest.
>
> If there are no spikes in cyclictest then there must be something
> after the threaded handler runs that delays the user space task. Some
> tracing should help to figure out what happens between the hardware
> interrupt and the context switch to the user task.
If I run cyclictest at prio 51 there are no latencies. Only if I go to 50 I immediately
see the 1 - 1.5 ms on CPU2 if I start a file copy -- which is of course as expected.
For the tracing, I at the moment have no idea how to set it up. I only did some
tracing which was triggered by cyclictest. But that won't help here...
Martin
next prev parent reply other threads:[~2019-09-06 13:23 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-03 7:23 Long latencies during disk-io Martin.Wirth
2019-09-05 15:41 ` Sebastian Andrzej Siewior
2019-09-06 13:23 ` Martin.Wirth [this message]
2019-09-06 22:06 ` Austin Schuh
2019-09-10 12:04 ` AW: " Martin.Wirth
2019-09-13 13:36 ` Sebastian Andrzej Siewior
2019-09-13 14:09 ` AW: " Martin.Wirth
2019-09-13 15:06 ` Sebastian Andrzej Siewior
2019-09-16 11:30 ` AW: " Martin.Wirth
2019-09-17 10:52 ` Sebastian Andrzej Siewior
2019-09-17 11:09 ` AW: " Martin.Wirth
2019-09-17 11:20 ` Sebastian Andrzej Siewior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1A2AF5F768264E479963C4D8C015EB696C51159F@DLDEFFMIMP02EXC.intra.dlr.de \
--to=martin.wirth@dlr.de \
--cc=bigeasy@linutronix.de \
--cc=linux-rt-users@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).