* Re: Lacking instrumentation for threaded IRQs
       [not found] <e6194c95-9580-f24f-e8e2-1df8e67f804b@versatic.net>
@ 2018-10-02 14:23 ` Julien Desfossez
       [not found] ` <b1c31153-da2f-0d40-a678-4428c78dd11e@klipix.org>
  1 sibling, 0 replies; 6+ messages in thread
From: Julien Desfossez @ 2018-10-02 14:23 UTC (permalink / raw)
  To: Geneviève Bastien, lttng-dev

On 10/01/2018 01:25 PM, Geneviève Bastien wrote:
> Hi,
> 
> 
> I was writing a tutorial on distributed critical path in Trace Compass.
> I was tracing a client (my laptop on wifi) and a server. And to my
> surprise, it didn't work! After further investigation, here's the issue:
> 
> 
> The critical path assumes that a network packet is received in a softirq
> that wakes up the process for which the packet is meant. This is right
> for ethernet networks, but not for wifi. Wifi packets are handled by a
> threaded IRQ. Here's the flow:
> 
> 
> 1- There's an irq that is handled with a return value of 2 (IRQ_WAKE_THREAD)
> 
> 2- The 2 means that a process is to be woken up _after_ the
> irq_handler_exit tracepoint is hit
> 
> 3- The IRQ thread is woken up, scheduled in and is running
> 
> 4- The packet is received during that time
> 
> 5- The packet reception is hence not on the critical path!
> 
> 
> The issue we have here with the critical path for the network will also
> be an issue for all threaded irqs, and some additional instrumentation
> is needed to take those into account automatically in analyses.
>
> Additional instrumentation is needed:
> 
> 1- To associate the wakeup of the IRQ thread with the actual IRQ. The
> way it is now, the wakeup happens in the context of whatever else was
> running before the IRQ handler.
> 
> 2- To make it obvious that the IRQ thread is really an IRQ thread, so
> whatever is running there should be considered in the context of a
> threaded IRQ and not as a simple running process.
> 
> 
> Here are some options for each, not mutually exclusive:
> 
> For 1-
> 
> * Instrument the kernel: Add tracepoints to the beginning and end of
> do_IRQ to show the full IRQ duration, not just the handler's; all events
> happening during that time can then be associated with the IRQ.
> 
> * Instrument the kernel: Add tracepoints before and after waking
> processes if the return value of the handler is IRQ_WAKE_THREAD. Those
> tracepoints would be hit only if a process needs to be woken and would
> not add overhead otherwise.
> 
> * At analysis time: Look at the return value of irq_handler_exit and, if
> it is 2, wait for the next waking on that CPU and associate it with the
> last IRQ. But could there be more than one process to wake up? Looking
> at the kernel code, it seems not. Still, this approach is more error-prone.
> 
> 
> For 2-
> 
> * Instrument the kernel: Add tracepoints to the beginning and end of the
> irq_thread_fn function. Those events would indicate the context of a
> threaded IRQ, so all events inside can be considered the result of an IRQ.
> 
> * Instrument LTTng: Add a field to the lttng_statedump_interrupt
> tracepoint to record the thread handling this IRQ, if available. That
> would give this thread a special state: instead of running, whatever
> happens there would be considered the result of an IRQ. But only LTTng
> traces would benefit from this, and only if there's a statedump (so not
> in snapshot mode). It's sad... but it would be a quick win for now.
> 
> 
> Here's a pad showing those possible options:
> https://hebdo.framapad.org/p/threadedirqlttnglinux8gjkh51yhfha
> 
> What do you think? I'm mostly interested in knowing whether there's a
> case for adding instrumentation to the kernel, and where.

For #1, I think adding the do_IRQ begin/end instrumentation would be
very valuable. We did it with a kretprobe in the latency-tracker, but I
think this deserves a real tracepoint. In your example trace, we can see
with the do_IRQ event that the whole interruption lasted 5.2
microseconds, whereas the interrupt handler only lasted 482 ns, so
without this event we are clearly missing long periods of interruption.
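
For reference, the kretprobe approach amounts to something like this (a
simplified, untested sketch of a standalone probe module, not the actual
latency-tracker code; the event emission is left abstract):

#include <linux/module.h>
#include <linux/kprobes.h>

/* runs when do_IRQ is entered */
static int do_irq_entry(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	/* emit an "IRQ begin" event here (tracepoint, ring buffer, ...) */
	return 0;
}

/* runs when do_IRQ returns */
static int do_irq_return(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	/* emit an "IRQ end" event here */
	return 0;
}

static struct kretprobe do_irq_krp = {
	.kp.symbol_name = "do_IRQ",
	.entry_handler  = do_irq_entry,
	.handler        = do_irq_return,
};

static int __init do_irq_probe_init(void)
{
	return register_kretprobe(&do_irq_krp);
}

static void __exit do_irq_probe_exit(void)
{
	unregister_kretprobe(&do_irq_krp);
}

module_init(do_irq_probe_init);
module_exit(do_irq_probe_exit);
MODULE_LICENSE("GPL");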

Using the "ret = 2" heuristic would work as a temporary fix, but there
are corner cases where it could lead us to wrong assumptions (e.g. the
process to be woken up is exiting somehow, and the interrupt arrived
during a syscall that needed to wake another process), so I think the
real fix is the do_IRQ instrumentation.

For #2, I am not exactly sure why we need special events for those
kernel threads. With the fix for #1, we know they are interrupt-related
threads, but they are not interrupts in the sense of "doing work while
the current task is interrupted". To me, everything such a thread does
is related to the previous source of wakeup, and I would imagine every
chunk of work it does falls between an irq_thread_fn_entry/return pair.
I am not against adding this information (or the statedump approach, if
we are sure there is a 1-to-1 mapping), but I would like to understand
more precisely what this information would add to our understanding of
the trace.

Thanks,

Julien


* Re: Lacking instrumentation for threaded IRQs
       [not found] ` <b1c31153-da2f-0d40-a678-4428c78dd11e@klipix.org>
@ 2018-10-02 15:40   ` Geneviève Bastien
       [not found]   ` <a3fe526e-710a-715f-05ce-36eb92602e5a@versatic.net>
  1 sibling, 0 replies; 6+ messages in thread
From: Geneviève Bastien @ 2018-10-02 15:40 UTC (permalink / raw)
  To: Julien Desfossez, lttng-dev



Hi,

On 2018-10-02 10:23 a.m., Julien Desfossez wrote:
> On 10/01/2018 01:25 PM, Geneviève Bastien wrote:
>> Hi,
>>
>>
>> I was writing a tutorial on distributed critical path in Trace Compass.
>> I was tracing a client (my laptop on wifi) and a server. And to my
>> surprise, it didn't work! After further investigation, here's the issue:
>>
>>
>> The critical path assumes that a network packet is received in a softirq
>> that wakes up the process for which the packet is meant. This is right
>> for ethernet networks, but not for wifi. Wifi packets are handled by a
>> threaded IRQ. Here's the flow:
>>
>>
>> 1- There's an irq that is handled with a return value of 2 (IRQ_WAKE_THREAD)
>>
>> 2- The 2 means that a process is to be woken up _after_ the
>> irq_handler_exit tracepoint is hit
>>
>> 3- The IRQ thread is woken up, scheduled in and is running
>>
>> 4- The packet is received during that time
>>
>> 5- The packet reception is hence not on the critical path!
>>
>>
>> The issue we have here with the critical path for the network will also
>> be an issue for all threaded irqs, and some additional instrumentation
>> is needed to take those into account automatically in analyses.
>>
>> Additional instrumentation is needed:
>>
>> 1- To associate the wakeup of the IRQ thread with the actual IRQ. The
>> way it is now, the wakeup happens in the context of whatever else was
>> running before the IRQ handler.
>>
>> 2- To make it obvious that the IRQ thread is really an IRQ thread, so
>> whatever is running there should be considered in the context of a
>> threaded IRQ and not as a simple running process.
>>
>>
>> Here are some options for each, not mutually exclusive:
>>
>> For 1-
>>
>> * Instrument the kernel: Add tracepoints to the beginning and end of
>> do_IRQ to show the full IRQ duration, not just the handler's; all events
>> happening during that time can then be associated with the IRQ.
>>
>> * Instrument the kernel: Add tracepoints before and after waking
>> processes if the return value of the handler is IRQ_WAKE_THREAD. Those
>> tracepoints would be hit only if a process needs to be woken and would
>> not add overhead otherwise.
>>
>> * At analysis time: Look at the return value of irq_handler_exit and, if
>> it is 2, wait for the next waking on that CPU and associate it with the
>> last IRQ. But could there be more than one process to wake up? Looking
>> at the kernel code, it seems not. Still, this approach is more error-prone.
>>
>>
>> For 2-
>>
>> * Instrument the kernel: Add tracepoints to the beginning and end of the
>> irq_thread_fn function. Those events would indicate the context of a
>> threaded IRQ, so all events inside can be considered the result of an IRQ.
>>
>> * Instrument LTTng: Add a field to the lttng_statedump_interrupt
>> tracepoint to record the thread handling this IRQ, if available. That
>> would give this thread a special state: instead of running, whatever
>> happens there would be considered the result of an IRQ. But only LTTng
>> traces would benefit from this, and only if there's a statedump (so not
>> in snapshot mode). It's sad... but it would be a quick win for now.
>>
>>
>> Here's a pad showing those possible options:
>> https://hebdo.framapad.org/p/threadedirqlttnglinux8gjkh51yhfha
>>
>> What do you think? I'm mostly interested in knowing whether there's a
>> case for adding instrumentation to the kernel, and where.
> For #1, I think adding the do_IRQ begin/end instrumentation would be
> very valuable. We did it with a kretprobe in the latency-tracker, but I
> think this deserves a real tracepoint. In your example trace, we can see
> with the do_IRQ event that the whole interruption lasted 5.2
> microseconds, whereas the interrupt handler only lasted 482 ns, so
> without this event we are clearly missing long periods of interruption.
>
> Using the "ret = 2" heuristic would work as a temporary fix, but there
> are corner cases where it could lead us to wrong assumptions (e.g. the
> process to be woken up is exiting somehow, and the interrupt arrived
> during a syscall that needed to wake another process), so I think the
> real fix is the do_IRQ instrumentation.
I agree
> For #2, I am not exactly sure why we need special events for those
> kernel threads. With the fix for #1, we know they are interrupt-related
> threads, but they are not interrupts in the sense of "doing work while
> the current task is interrupted". To me, everything such a thread does
> is related to the previous source of wakeup, and I would imagine every
> chunk of work it does falls between an irq_thread_fn_entry/return pair.
> I am not against adding this information (or the statedump approach, if
> we are sure there is a 1-to-1 mapping), but I would like to understand
> more precisely what this information would add to our understanding of
> the trace.
>
We need to know that those threads are really interrupt threads for the
dependency analysis. Look at the following sequence of events:

[12:02:17.779984311] (+0.000002224) dorsallaptop *sched_waking*: { cpu_id = 3 }, { comm = "irq/130-iwlwifi", tid = *639*, prio = -51, target_cpu = 3 }
[12:02:17.779985668] (+0.000000565) dorsallaptop *do_IRQ_return*: { cpu_id = 3 }, { ip = 0xFFFFFFFF89A01C30, parent_ip = 0xFFFFFFFF89A0098F }
[12:02:17.779986575] (+0.000000907) dorsallaptop *sched_switch*: { cpu_id = 3 }, { prev_comm = "swapper/3", prev_tid = 0, prev_prio = 20, prev_state = 0, next_comm = "irq/130-iwlwifi", next_tid = *639*, next_prio = -51 }
[12:02:17.779998608] (+0.000011174) dorsallaptop net_if_receive_skb: { cpu_id = 3 }, { ... }
[12:02:17.780019468] (+0.000020860) dorsallaptop sched_waking: { cpu_id = 3 }, { comm = "wget", tid = 2430, prio = 20, target_cpu = 0 }

If we do the critical path of the wget process, with the irq/130-iwlwifi
thread considered a normal thread, we see it is woken up by a hardware
interrupt, which we know is iwlwifi, but we don't know anything else
about it, so the critical path ends up following the swapper process...
Now if we consider the irq/130-iwlwifi thread as an interrupt-running
process, we see that wget is actually woken up by the net_if_receive_skb
event, and we can follow this to the packet sent by another machine.
Of course this can all be done at analysis time: if (ret = 2 and a
wakeup inside the do_IRQ but outside the irq_handler) { assume the woken
thread will be handling the interrupt }. But it's the "assume" part that
I don't like. With instrumentation, we can make this explicit.
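
Spelled out, that heuristic would be something like this (C-style
pseudocode for illustration only; the event accessors and
mark_as_irq_thread() are hypothetical helpers, and the real analysis
lives in Trace Compass):

/* per-CPU state kept by the analysis */
struct cpu_state {
	bool in_do_irq;    /* between do_IRQ entry and return */
	bool wake_thread;  /* last irq_handler_exit returned 2 (IRQ_WAKE_THREAD) */
	int  last_irq;     /* irq number of that handler */
};

static void on_event(struct cpu_state *cpu, const struct event *ev)
{
	if (is_do_irq_entry(ev)) {
		cpu->in_do_irq = true;
	} else if (is_do_irq_return(ev)) {
		cpu->in_do_irq = false;
		cpu->wake_thread = false;
	} else if (is_irq_handler_exit(ev)) {
		cpu->wake_thread = (event_ret(ev) == 2); /* IRQ_WAKE_THREAD */
		cpu->last_irq = event_irq(ev);
	} else if (is_sched_waking(ev) && cpu->in_do_irq && cpu->wake_thread) {
		/* assume the woken task is the thread handling last_irq */
		mark_as_irq_thread(event_tid(ev), cpu->last_irq);
		cpu->wake_thread = false;
	}
}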


* Re: Lacking instrumentation for threaded IRQs
       [not found]   ` <a3fe526e-710a-715f-05ce-36eb92602e5a@versatic.net>
@ 2018-10-02 16:15     ` Julien Desfossez
       [not found]     ` <62b948bf-8d3d-62a9-6a66-526d74330371@klipix.org>
  1 sibling, 0 replies; 6+ messages in thread
From: Julien Desfossez @ 2018-10-02 16:15 UTC (permalink / raw)
  To: Geneviève Bastien, lttng-dev

[...]

>>> For 2-
>>>
>>> * Instrument the kernel: Add tracepoints to the beginning and end of the
>>> irq_thread_fn function. Those events would indicate the context of a
>>> threaded IRQ, so all events inside can be considered the result of an IRQ.
>>>
>>> * Instrument LTTng: Add a field to the lttng_statedump_interrupt
>>> tracepoint to record the thread handling this IRQ, if available. That
>>> would give this thread a special state: instead of running, whatever
>>> happens there would be considered the result of an IRQ. But only LTTng
>>> traces would benefit from this, and only if there's a statedump (so not
>>> in snapshot mode). It's sad... but it would be a quick win for now.
>>>
>>>
>>> Here's a pad showing those possible options:
>>> https://hebdo.framapad.org/p/threadedirqlttnglinux8gjkh51yhfha
[...]

>> For #2, I am not exactly sure why we need special events for those
>> kernel threads. With the fix for #1, we know they are interrupt-related
>> threads, but they are not interrupts in the sense of "doing work while
>> the current task is interrupted". To me, everything such a thread does
>> is related to the previous source of wakeup, and I would imagine every
>> chunk of work it does falls between an irq_thread_fn_entry/return pair.
>> I am not against adding this information (or the statedump approach, if
>> we are sure there is a 1-to-1 mapping), but I would like to understand
>> more precisely what this information would add to our understanding of
>> the trace.
>>
> We need to know that those threads are really interrupt threads for the
> dependency analysis. Look at the following sequence of events:
> 
> [12:02:17.779984311] (+0.000002224) dorsallaptop *sched_waking*: { cpu_id = 3 }, { comm = "irq/130-iwlwifi", tid = *639*, prio = -51, target_cpu = 3 }
> [12:02:17.779985668] (+0.000000565) dorsallaptop *do_IRQ_return*: { cpu_id = 3 }, { ip = 0xFFFFFFFF89A01C30, parent_ip = 0xFFFFFFFF89A0098F }
> [12:02:17.779986575] (+0.000000907) dorsallaptop *sched_switch*: { cpu_id = 3 }, { prev_comm = "swapper/3", prev_tid = 0, prev_prio = 20, prev_state = 0, next_comm = "irq/130-iwlwifi", next_tid = *639*, next_prio = -51 }
> [12:02:17.779998608] (+0.000011174) dorsallaptop net_if_receive_skb: { cpu_id = 3 }, { ... }
> [12:02:17.780019468] (+0.000020860) dorsallaptop sched_waking: { cpu_id = 3 }, { comm = "wget", tid = 2430, prio = 20, target_cpu = 0 }
> 
> If we do the critical path of the wget process, with the irq/130-iwlwifi
> thread considered a normal thread, we see it is woken up by a hardware
> interrupt, which we know is iwlwifi, but we don't know anything else
> about it, so the critical path ends up following the swapper process...
> Now if we consider the irq/130-iwlwifi thread as an interrupt-running
> process, we see that wget is actually woken up by the net_if_receive_skb
> event, and we can follow this to the packet sent by another machine.
> Of course this can all be done at analysis time: if (ret = 2 and a
> wakeup inside the do_IRQ but outside the irq_handler) { assume the woken
> thread will be handling the interrupt }. But it's the "assume" part that
> I don't like. With instrumentation, we can make this explicit.

In this example, assuming we have the do_IRQ event, the critical path
seems clear to me:
- irq wakes up task 639
- task 639 runs, receives a packet and wakes up another task

So wget has been woken up by a thread that was itself woken up by an
interrupt; I don't see how swapper could end up in this chain if we have
the do_IRQ event.

I think the distinction with your explanation is that I see the
receive_skb as context around the wakeup, but the source of the wakeup
(the transition in the dependency graph) is the wakeup occurring while
process 639 is running, and that traces back to the IRQ.

Julien

* Re: Lacking instrumentation for threaded IRQs
       [not found]     ` <62b948bf-8d3d-62a9-6a66-526d74330371@klipix.org>
@ 2018-10-02 17:17       ` Geneviève Bastien
       [not found]       ` <22e04d9c-bddd-dd45-ff39-8c15bd0fada4@versatic.net>
  1 sibling, 0 replies; 6+ messages in thread
From: Geneviève Bastien @ 2018-10-02 17:17 UTC (permalink / raw)
  To: Julien Desfossez, lttng-dev


[...]

>> We need to know that those threads are really interrupt threads for the
>> dependency analysis. Look at the following sequence of events:
>>
>> [12:02:17.779984311] (+0.000002224) dorsallaptop *sched_waking*: { cpu_id = 3 }, { comm = "irq/130-iwlwifi", tid = *639*, prio = -51, target_cpu = 3 }
>> [12:02:17.779985668] (+0.000000565) dorsallaptop *do_IRQ_return*: { cpu_id = 3 }, { ip = 0xFFFFFFFF89A01C30, parent_ip = 0xFFFFFFFF89A0098F }
>> [12:02:17.779986575] (+0.000000907) dorsallaptop *sched_switch*: { cpu_id = 3 }, { prev_comm = "swapper/3", prev_tid = 0, prev_prio = 20, prev_state = 0, next_comm = "irq/130-iwlwifi", next_tid = *639*, next_prio = -51 }
>> [12:02:17.779998608] (+0.000011174) dorsallaptop net_if_receive_skb: { cpu_id = 3 }, { ... }
>> [12:02:17.780019468] (+0.000020860) dorsallaptop sched_waking: { cpu_id = 3 }, { comm = "wget", tid = 2430, prio = 20, target_cpu = 0 }
>>
>> If we do the critical path of the wget process, with the irq/130-iwlwifi
>> thread considered a normal thread, we see it is woken up by a hardware
>> interrupt, which we know is iwlwifi, but we don't know anything else
>> about it, so the critical path ends up following the swapper process...
>> Now if we consider the irq/130-iwlwifi thread as an interrupt-running
>> process, we see that wget is actually woken up by the net_if_receive_skb
>> event, and we can follow this to the packet sent by another machine.
>> Of course this can all be done at analysis time: if (ret = 2 and a
>> wakeup inside the do_IRQ but outside the irq_handler) { assume the woken
>> thread will be handling the interrupt }. But it's the "assume" part that
>> I don't like. With instrumentation, we can make this explicit.
> In this example, assuming we have the do_IRQ event, the critical path
> seems clear to me:
> - irq wakes up task 639
> - task 639 runs, receives a packet and wakes up another task
>
> So wget has been woken up by a thread that was itself woken up by an
> interrupt; I don't see how swapper could end up in this chain if we have
> the do_IRQ event.

That's just because it's the thread that was running when the interrupt
occurred, and that's how the critical path handles it now. But that can
be fixed.

But more importantly, task 639 was woken up so that it could receive the
packet and wake up the other task, so that other task was really woken
up by the packet reception, which leads the critical path to the process
sending it. Compare this with what happens on eth interfaces, where the
packet is received in a softirq:

[15:17:55.861738353] (+0.000000642) wilbrod irq_softirq_entry: { cpu_id = 1 }, { vec = 3 }
[15:17:55.861739968] (+0.000001615) wilbrod net_if_receive_skb: { cpu_id = 1 }, { ... }
[15:17:55.861746816] (+0.000006848) wilbrod sched_waking: { cpu_id = 1 }, { comm = "sshd", tid = 6053, prio = 20, target_cpu = 5 }
[15:17:55.861757051] (+0.000005023) wilbrod irq_softirq_exit: { cpu_id = 1 }, { vec = 3 }

There, since the packet is received inside an interrupt context, the
packet is really the cause of the wakeup, not the hardware interrupt
preceding it. In the threaded IRQ example, it would be the same if we
could assign some kind of irq context to task 639. And for that, we want
to make sure task 639 is an interrupt thread.

Geneviève



* Re: Lacking instrumentation for threaded IRQs
       [not found]       ` <22e04d9c-bddd-dd45-ff39-8c15bd0fada4@versatic.net>
@ 2018-10-02 17:51         ` Mathieu Desnoyers
  0 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2018-10-02 17:51 UTC (permalink / raw)
  To: Geneviève Bastien; +Cc: Julien Desfossez, lttng-dev

----- On Oct 2, 2018, at 1:17 PM, Geneviève Bastien gbastien@versatic.net wrote:

> [...]
> 
>>> We need to know that those threads are really interrupt threads for the
>>> dependency analysis. Look at the following sequence of events:
>>>
>>> [12:02:17.779984311] (+0.000002224) dorsallaptop *sched_waking*: { cpu_id = 3 }, { comm = "irq/130-iwlwifi", tid = *639*, prio = -51, target_cpu = 3 }
>>> [12:02:17.779985668] (+0.000000565) dorsallaptop *do_IRQ_return*: { cpu_id = 3 }, { ip = 0xFFFFFFFF89A01C30, parent_ip = 0xFFFFFFFF89A0098F }
>>> [12:02:17.779986575] (+0.000000907) dorsallaptop *sched_switch*: { cpu_id = 3 }, { prev_comm = "swapper/3", prev_tid = 0, prev_prio = 20, prev_state = 0, next_comm = "irq/130-iwlwifi", next_tid = *639*, next_prio = -51 }
>>> [12:02:17.779998608] (+0.000011174) dorsallaptop net_if_receive_skb: { cpu_id = 3 }, { ... }
>>> [12:02:17.780019468] (+0.000020860) dorsallaptop sched_waking: { cpu_id = 3 }, { comm = "wget", tid = 2430, prio = 20, target_cpu = 0 }
>>>
>>> If we do the critical path of the wget process, with the irq/130-iwlwifi
>>> thread considered a normal thread, we see it is woken up by a hardware
>>> interrupt, which we know is iwlwifi, but we don't know anything else
>>> about it, so the critical path ends up following the swapper process...

No, that's wrong. The dependency chain should take into account execution
contexts (irq, softirq, thread), not just the current thread. It should then
follow the wakeups.

The sched_waking here should belong to the interrupt handler context, which
would be made possible by adding the do_IRQ entry/exit instrumentation.

I don't see any valid reason to distinguish between interrupt threads and
normal threads once this is solved.

>>> Now if we consider the irq/130-iwlwifi thread as an interrupt-running
>>> process, we see that wget is actually woken up by the net_if_receive_skb
>>> event, and we can follow this to the packet sent by another machine.
>>> Of course this can all be done at analysis time: if (ret = 2 and a
>>> wakeup inside the do_IRQ but outside the irq_handler) { assume the woken
>>> thread will be handling the interrupt }. But it's the "assume" part that
>>> I don't like. With instrumentation, we can make this explicit.
>> In this example, assuming we have the do_IRQ event, the critical path
>> seems clear to me:
>> - irq wakes up task 639
>> - task 639 runs, receives a packet and wakes up another task
>>
>> So wget has been woken up by a thread that was itself woken up by an
>> interrupt; I don't see how swapper could end up in this chain if we have
>> the do_IRQ event.
> 
> That's just because it's the thread that was running when the interrupt
> occurred, and that's how the critical path handles it now. But that can
> be fixed.

It seems to be an issue with the critical path analysis model being inaccurate.
AFAIK those cases are handled correctly by the model implemented within
Julien's latency-tracker. Modeling the interrupt context surrounded by do_IRQ
entry/exit separately from the thread it nests upon should fix that.

Thanks,

Mathieu


> 
> But more importantly, task 639 was woken up so that it could receive the
> packet and wake up the other task, so that other task was really woken
> up by the packet reception, which leads the critical path to the process
> sending it. Compare this with what happens on eth interfaces, where the
> packet is received in a softirq:
> 
> [15:17:55.861738353] (+0.000000642) wilbrod irq_softirq_entry: { cpu_id = 1 }, { vec = 3 }
> [15:17:55.861739968] (+0.000001615) wilbrod net_if_receive_skb: { cpu_id = 1 }, { ... }
> [15:17:55.861746816] (+0.000006848) wilbrod sched_waking: { cpu_id = 1 }, { comm = "sshd", tid = 6053, prio = 20, target_cpu = 5 }
> [15:17:55.861757051] (+0.000005023) wilbrod irq_softirq_exit: { cpu_id = 1 }, { vec = 3 }
> 
> There, since the packet is received inside an interrupt context, the
> packet is really the cause of the wakeup, not the hardware interrupt
> preceding it. In the threaded IRQ example, it would be the same if we
> could assign some kind of irq context to task 639. And for that, we want
> to make sure task 639 is an interrupt thread.
> 
> Geneviève

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Lacking instrumentation for threaded IRQs
@ 2018-10-01 17:25 Geneviève Bastien
  0 siblings, 0 replies; 6+ messages in thread
From: Geneviève Bastien @ 2018-10-01 17:25 UTC (permalink / raw)
  To: lttng-dev; +Cc: ju

Hi,


I was writing a tutorial on distributed critical path in Trace Compass.
I was tracing a client (my laptop on wifi) and a server. And to my
surprise, it didn't work! After further investigation, here's the issue:


The critical path assumes that a network packet is received in a softirq
that wakes up the process for which the packet is meant. This is right
for ethernet networks, but not for wifi. Wifi packets are handled by a
threaded IRQ. Here's the flow:


1- There's an irq that is handled with a return value of 2 (IRQ_WAKE_THREAD)

2- The 2 means that a process is to be woken up _after_ the
irq_handler_exit tracepoint is hit

3- The IRQ thread is woken up, scheduled in and is running

4- The packet is received during that time

5- The packet reception is hence not on the critical path!


The issue we have here with the critical path for the network will also
be an issue for all threaded irqs, and some additional instrumentation
is needed to take those into account automatically in analyses.

Additional instrumentation is needed:

1- To associate the wakeup of the IRQ thread with the actual IRQ. The
way it is now, the wakeup happens in the context of whatever else was
running before the IRQ handler.

2- To make it obvious that the IRQ thread is really an IRQ thread, so
whatever is running there should be considered in the context of a
threaded IRQ and not as a simple running process.


Here are some options for each, not mutually exclusive:

For 1-

* Instrument the kernel: Add tracepoints to the beginning and end of
do_IRQ to show the full IRQ duration, not just the handler's; all events
happening during that time can then be associated with the IRQ (see the
sketch after this list).

* Instrument the kernel: Add tracepoints before and after waking
processes if the return value of the handler is IRQ_WAKE_THREAD. Those
tracepoints would be hit only if a process needs to be woken and would
not add overhead otherwise.

* At analysis time: Look at the return value of irq_handler_exit and, if
it is 2, wait for the next waking on that CPU and associate it with the
last IRQ. But could there be more than one process to wake up? Looking
at the kernel code, it seems not. Still, this approach is more error-prone.
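
To give an idea of the first option, here is a rough sketch of what the
tracepoint definition could look like (the irq_entry/irq_exit names and
the vector field are hypothetical, not existing kernel tracepoints):

/* hypothetical events, in include/trace/events/irq.h */
TRACE_EVENT(irq_entry,

	TP_PROTO(unsigned int vector),

	TP_ARGS(vector),

	TP_STRUCT__entry(
		__field(unsigned int, vector)
	),

	TP_fast_assign(
		__entry->vector = vector;
	),

	TP_printk("vector=%u", __entry->vector)
);

/* a matching irq_exit event would be defined the same way, with
 * trace_irq_entry()/trace_irq_exit() calls placed at the beginning
 * and end of do_IRQ() */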


For 2-

* Instrument the kernel: Add tracepoints to the beginning and end of the
irq_thread_fn function. Those events would indicate the context of a
threaded IRQ, so all events inside can be considered the result of an
IRQ (see the sketch after this list).

* Instrument LTTng: Add a field to the lttng_statedump_interrupt
tracepoint to record the thread handling this IRQ, if available. That
would give this thread a special state: instead of running, whatever
happens there would be considered the result of an IRQ. But only LTTng
traces would benefit from this, and only if there's a statedump (so not
in snapshot mode). It's sad... but it would be a quick win for now.
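
For the irq_thread_fn option, the change could look roughly like this
(a sketch against kernel/irq/manage.c as I understand it; the
irq_thread_fn_entry/return tracepoints are hypothetical):

static irqreturn_t irq_thread_fn(struct irq_desc *desc,
				 struct irqaction *action)
{
	irqreturn_t ret;

	/* hypothetical: mark the start of the threaded handler */
	trace_irq_thread_fn_entry(action->irq);

	ret = action->thread_fn(action->irq, action->dev_id);
	if (ret == IRQ_HANDLED)
		atomic_inc(&desc->threads_handled);

	irq_finalize_oneshot(desc, action);

	/* hypothetical: mark the end, with the handler's return value */
	trace_irq_thread_fn_return(action->irq, ret);

	return ret;
}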


Here's a pad showing those possible options:
https://hebdo.framapad.org/p/threadedirqlttnglinux8gjkh51yhfha

What do you think? I'm mostly interested in knowing whether there's a
case for adding instrumentation to the kernel, and where.


Thanks,

Geneviève


