All of lore.kernel.org
 help / color / mirror / Atom feed
* Kernel ISR Latency
@ 2017-04-10 19:12 Brian Wrenn
  2017-04-10 19:15 ` Brian Wrenn
  2017-04-10 19:55 ` Julia Cartwright
  0 siblings, 2 replies; 9+ messages in thread
From: Brian Wrenn @ 2017-04-10 19:12 UTC (permalink / raw)
  To: linux-rt-users

Hi, I'm trying to track down the source of some ISR latency that I'm
observing on a logic analyzer.

Just for some background, my setup is the following.  I'm running
Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
My kernel has the following PREEMPT RT options.

CONFIG_PREEMPT_RCU=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RT_BASE=y
CONFIG_HAVE_PREEMPT_LAZY=y
CONFIG_PREEMPT_LAZY=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT__LL is not set
# CONFIG_PREEMPT_RTB is not set
CONFIG_PREEMPT_RT_FULL=y
CONFIG_PREEMPT_COUNT=y
# CONFIG_DEBUG_PREEMPT is not set

I have a micro-controller attached to a GPIO pin and two separate SPI
ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
data to each of 2 SPI ports (total of 36 Kbytes between the two) and
issues an interrupt on the GPIO.  A kernel module has an ISR
registered to the GPIO pin.  When the ISR executes, it copies 18
Kbytes of data from each SPI.  There are two RT threads executing in
user space as SCHED_RR with identical priorities less than (i.e. lower
than) 50, where as I understand the kernel runs at priority 50.  Each
of those user space threads copies data from some shared memory within
the kernel module and performs some basic data integrity checks, i.e.
a SHA1.  A third thread runs non-RT.  It doesn't do anything other
than spawn the two user space RT threads and waits for them to
complete.

The vast majority of interrupts incur a delay in range of tens of
MICROseconds, meaning the ISR executes within such amount of time of
the micro-controller raising the GPIO pin of the ISR.  However,
occasionally, more than six MILLIseconds passes between the time of
the micro-controller issuing the GPIO interrupt and the ISR beginning
to execute.

I'm trying to track down the cause of the 6+ MILLIsecond latency,
which I would not expect to happen.  Are there any well practiced
techniques for determining what factor(s) may be introducing this
latency?  I looked at the description for RT Watchdog, but I don't
think it would help me much because the only two RT threads other than
the kernel have a lower priority.

Thanks for any guidance.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-10 19:12 Kernel ISR Latency Brian Wrenn
@ 2017-04-10 19:15 ` Brian Wrenn
  2017-04-10 19:55 ` Julia Cartwright
  1 sibling, 0 replies; 9+ messages in thread
From: Brian Wrenn @ 2017-04-10 19:15 UTC (permalink / raw)
  To: linux-rt-users

P.S. I attempted to apply the IRQF_NO_THREAD flag to the ISR, but that
increased the frequency of the high latency ISR iterations.

On Mon, Apr 10, 2017 at 3:12 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
> Hi, I'm trying to track down the source of some ISR latency that I'm
> observing on a logic analyzer.
>
> Just for some background, my setup is the following.  I'm running
> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
> My kernel has the following PREEMPT RT options.
>
> CONFIG_PREEMPT_RCU=y
> CONFIG_PREEMPT_NOTIFIERS=y
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_RT_BASE=y
> CONFIG_HAVE_PREEMPT_LAZY=y
> CONFIG_PREEMPT_LAZY=y
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT__LL is not set
> # CONFIG_PREEMPT_RTB is not set
> CONFIG_PREEMPT_RT_FULL=y
> CONFIG_PREEMPT_COUNT=y
> # CONFIG_DEBUG_PREEMPT is not set
>
> I have a micro-controller attached to a GPIO pin and two separate SPI
> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
> issues an interrupt on the GPIO.  A kernel module has an ISR
> registered to the GPIO pin.  When the ISR executes, it copies 18
> Kbytes of data from each SPI.  There are two RT threads executing in
> user space as SCHED_RR with identical priorities less than (i.e. lower
> than) 50, where as I understand the kernel runs at priority 50.  Each
> of those user space threads copies data from some shared memory within
> the kernel module and performs some basic data integrity checks, i.e.
> a SHA1.  A third thread runs non-RT.  It doesn't do anything other
> than spawn the two user space RT threads and waits for them to
> complete.
>
> The vast majority of interrupts incur a delay in range of tens of
> MICROseconds, meaning the ISR executes within such amount of time of
> the micro-controller raising the GPIO pin of the ISR.  However,
> occasionally, more than six MILLIseconds passes between the time of
> the micro-controller issuing the GPIO interrupt and the ISR beginning
> to execute.
>
> I'm trying to track down the cause of the 6+ MILLIsecond latency,
> which I would not expect to happen.  Are there any well practiced
> techniques for determining what factor(s) may be introducing this
> latency?  I looked at the description for RT Watchdog, but I don't
> think it would help me much because the only two RT threads other than
> the kernel have a lower priority.
>
> Thanks for any guidance.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-10 19:12 Kernel ISR Latency Brian Wrenn
  2017-04-10 19:15 ` Brian Wrenn
@ 2017-04-10 19:55 ` Julia Cartwright
       [not found]   ` <CAEYdEeR4Y138B9NinTso3hvJ6trEUa5OprsNEecDHFFrmP8j4Q@mail.gmail.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Julia Cartwright @ 2017-04-10 19:55 UTC (permalink / raw)
  To: Brian Wrenn; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 2919 bytes --]

On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
> Hi, I'm trying to track down the source of some ISR latency that I'm
> observing on a logic analyzer.
> 
> Just for some background, my setup is the following.  I'm running
> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
> My kernel has the following PREEMPT RT options.
[..]
> 
> I have a micro-controller attached to a GPIO pin and two separate SPI
> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
> issues an interrupt on the GPIO.  A kernel module has an ISR
> registered to the GPIO pin.  When the ISR executes, it copies 18
> Kbytes of data from each SPI.  There are two RT threads executing in
> user space as SCHED_RR with identical priorities less than (i.e. lower
> than) 50, where as I understand the kernel runs at priority 50.

It's a bit more complicated than "the kernel runs at priority 50".

Perhaps what you're getting at is that by default, with irq forced
threading enabled (which is the default with -rt build kernels) the
irqthreads have the default scheduling policy FIFO with priority 50.
It's up to the administrator/system integrator to appropriately
prioritize these irqthreads to suit the application.

For your usecase, this means that likely _all_ of the peripheral
interrupts, including the low priority stuff you might not care about
for your application (like networking or uart or whatever) are being
prioritized right alongside your latency sensitive gpio handler.

> Each of those user space threads copies data from some shared memory
> within the kernel module and performs some basic data integrity
> checks, i.e.  a SHA1.

Just to better understand: how are your RT threads signalled by your
gpio handler?

> The vast majority of interrupts incur a delay in range of tens of
> MICROseconds, meaning the ISR executes within such amount of time of
> the micro-controller raising the GPIO pin of the ISR.  However,
> occasionally, more than six MILLIseconds passes between the time of
> the micro-controller issuing the GPIO interrupt and the ISR beginning
> to execute.

Also: how are you measuring this latency?  In particular, where is your
test point for the "ISR beginning"?
>
> I'm trying to track down the cause of the 6+ MILLIsecond latency,
> which I would not expect to happen.  Are there any well practiced
> techniques for determining what factor(s) may be introducing this
> latency?  I looked at the description for RT Watchdog, but I don't
> think it would help me much because the only two RT threads other than
> the kernel have a lower priority.

I don't think the RT watchdog will help you here, however, I would
recommend that you get acquainted with using ftrace/trace-cmd and the
irq and sched traceevents.

   Julia

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
       [not found]   ` <CAEYdEeR4Y138B9NinTso3hvJ6trEUa5OprsNEecDHFFrmP8j4Q@mail.gmail.com>
@ 2017-04-10 21:31     ` Brian Wrenn
  2017-04-11  3:36       ` Brian Wrenn
  2017-04-11 22:06       ` Clark Williams
  0 siblings, 2 replies; 9+ messages in thread
From: Brian Wrenn @ 2017-04-10 21:31 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

I imagine that it would be a bad idea to prioritize my ISR ahead of
any one of these?

   12  FF     99 139   - [posixcputmr/0]
   14  FF     99 139   - [migration/0]
   15  FF     99 139   - [watchdog/0]
   16  FF     99 139   - [watchdog/1]
   17  FF     99 139   - [migration/1]
   21  FF     99 139   - [posixcputmr/1]
   24  FF     99 139   - [watchdog/2]
   25  FF     99 139   - [migration/2]
   29  FF     99 139   - [posixcputmr/2]
   32  FF     99 139   - [watchdog/3]
   33  FF     99 139   - [migration/3]
   37  FF     99 139   - [posixcputmr/3]

At this stage, that's all that prioritized higher.

On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
> On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:
>> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
>>> Hi, I'm trying to track down the source of some ISR latency that I'm
>>> observing on a logic analyzer.
>>>
>>> Just for some background, my setup is the following.  I'm running
>>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
>>> My kernel has the following PREEMPT RT options.
>> [..]
>>>
>>> I have a micro-controller attached to a GPIO pin and two separate SPI
>>> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
>>> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
>>> issues an interrupt on the GPIO.  A kernel module has an ISR
>>> registered to the GPIO pin.  When the ISR executes, it copies 18
>>> Kbytes of data from each SPI.  There are two RT threads executing in
>>> user space as SCHED_RR with identical priorities less than (i.e. lower
>>> than) 50, where as I understand the kernel runs at priority 50.
>>
>> It's a bit more complicated than "the kernel runs at priority 50".
>
> Yes, better stated, my kernel space IRQ gets a priority of 50.  (Most
> of the others seem to have 50 as well.)
>
>>
>> Perhaps what you're getting at is that by default, with irq forced
>> threading enabled (which is the default with -rt build kernels) the
>> irqthreads have the default scheduling policy FIFO with priority 50.
>> It's up to the administrator/system integrator to appropriately
>> prioritize these irqthreads to suit the application.
>>
>
> Pursuant to that, I'm reviewing this to gain a better understanding of
> such things.  Please forgive me; I hadn't found this link prior to
> messaging the listserve.
>
> http://subversion.ffado.org/wiki/IrqPriorities
>
>> For your usecase, this means that likely _all_ of the peripheral
>> interrupts, including the low priority stuff you might not care about
>> for your application (like networking or uart or whatever) are being
>> prioritized right alongside your latency sensitive gpio handler.
>>
>>> Each of those user space threads copies data from some shared memory
>>> within the kernel module and performs some basic data integrity
>>> checks, i.e.  a SHA1.
>>
>> Just to better understand: how are your RT threads signalled by your
>> gpio handler?
>
> The user space RT threads have mmap'ed memory shared with the kernel
> module.  The kernel module writes to a designated memory location the
> current read offset so that the user space RT threads know where to
> read from iteration to iteration.
>
>>
>>> The vast majority of interrupts incur a delay in range of tens of
>>> MICROseconds, meaning the ISR executes within such amount of time of
>>> the micro-controller raising the GPIO pin of the ISR.  However,
>>> occasionally, more than six MILLIseconds passes between the time of
>>> the micro-controller issuing the GPIO interrupt and the ISR beginning
>>> to execute.
>>
>> Also: how are you measuring this latency?  In particular, where is your
>> test point for the "ISR beginning"?
>>>
>
> Upon detection of a bad CRC or bad sequence number, the user space
> threads raise another GPIO, not the GPIO of the ISR, another one that
> triggers a logic analyzer to dump a forward captured buffer.  Then, I
> got back through the capture and identify the high latency ISR
> execution.  (See attached image.)
>
>>> I'm trying to track down the cause of the 6+ MILLIsecond latency,
>>> which I would not expect to happen.  Are there any well practiced
>>> techniques for determining what factor(s) may be introducing this
>>> latency?  I looked at the description for RT Watchdog, but I don't
>>> think it would help me much because the only two RT threads other than
>>> the kernel have a lower priority.
>>
>> I don't think the RT watchdog will help you here, however, I would
>> recommend that you get acquainted with using ftrace/trace-cmd and the
>> irq and sched traceevents.
>>
>>    Julia
>
> It sounds like to me that maybe you've confirmed that I'm looking in
> the right direction w.r.t. modifying the kernel ISRs with something
> like 'chrt' and 'rtiq'.  Yes?  But I also will look into
> ftrace/trace-cmd and irq and sched trace events.
>
> Thanks,
> Brian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-10 21:31     ` Brian Wrenn
@ 2017-04-11  3:36       ` Brian Wrenn
  2017-04-11 12:47         ` Brian Wrenn
  2017-04-11 22:06       ` Clark Williams
  1 sibling, 1 reply; 9+ messages in thread
From: Brian Wrenn @ 2017-04-11  3:36 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

Update: I tried running the scheme described earlier in this thread
with one modification: no CRC, just sequence checking.  (Each SPI
transmission increments a count held in the first bytes.)  In this
case, no ISR had latency long enough to set the error GPIO (output
from the SOM's perspective) high.  The highest priority ISR, after
those with priority 99, is the SPI read ISR at priority 80.  I set the
two RT threads in user space to priority 10.  So at this point, I'm
trying to determine why the SPI read ISR doesn't preempt those user
space threads in less than 6 milliseconds.  This failure to preempt
happens for between 0.2%-0.01% of the micro-controller issued
interrupts.  The error percentages vary some per run.  FWIW, any
non-RT load I place on the system (generic network and processor
busywork) don't seem to have much impact.

Also, in answering how I'm measuring latency, I reviewed what I wrote
and I think I didn't explain something clearly.  A bad CRC or bad
sequence number makes the the error indicator GPIO pin go high (a GPIO
output from the SOM's perspective/input from the logic analyzer's
perspective).  A high latency (~6 milliseconds after the
micro-controller issued interrupt) invocation of the kernel module ISR
that performs the two SPI transfers results in the subsequent
micro-controller's interrupt occurring during the SPI transfers, which
causes the following SPI transfer to have bad data and hence a bad
CRC.  The .tiff I attached earlier illustrates this.  If you look at
the sad face SPI transfer (it only shows one of the SPIs, not both),
you can see the ISR for the next transfer overlap with the very end of
the sad face transfer.

Thanks,
Brian

On Mon, Apr 10, 2017 at 5:31 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
> I imagine that it would be a bad idea to prioritize my ISR ahead of
> any one of these?
>
>    12  FF     99 139   - [posixcputmr/0]
>    14  FF     99 139   - [migration/0]
>    15  FF     99 139   - [watchdog/0]
>    16  FF     99 139   - [watchdog/1]
>    17  FF     99 139   - [migration/1]
>    21  FF     99 139   - [posixcputmr/1]
>    24  FF     99 139   - [watchdog/2]
>    25  FF     99 139   - [migration/2]
>    29  FF     99 139   - [posixcputmr/2]
>    32  FF     99 139   - [watchdog/3]
>    33  FF     99 139   - [migration/3]
>    37  FF     99 139   - [posixcputmr/3]
>
> At this stage, that's all that prioritized higher.
>
> On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
>> On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:
>>> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
>>>> Hi, I'm trying to track down the source of some ISR latency that I'm
>>>> observing on a logic analyzer.
>>>>
>>>> Just for some background, my setup is the following.  I'm running
>>>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
>>>> My kernel has the following PREEMPT RT options.
>>> [..]
>>>>
>>>> I have a micro-controller attached to a GPIO pin and two separate SPI
>>>> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
>>>> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
>>>> issues an interrupt on the GPIO.  A kernel module has an ISR
>>>> registered to the GPIO pin.  When the ISR executes, it copies 18
>>>> Kbytes of data from each SPI.  There are two RT threads executing in
>>>> user space as SCHED_RR with identical priorities less than (i.e. lower
>>>> than) 50, where as I understand the kernel runs at priority 50.
>>>
>>> It's a bit more complicated than "the kernel runs at priority 50".
>>
>> Yes, better stated, my kernel space IRQ gets a priority of 50.  (Most
>> of the others seem to have 50 as well.)
>>
>>>
>>> Perhaps what you're getting at is that by default, with irq forced
>>> threading enabled (which is the default with -rt build kernels) the
>>> irqthreads have the default scheduling policy FIFO with priority 50.
>>> It's up to the administrator/system integrator to appropriately
>>> prioritize these irqthreads to suit the application.
>>>
>>
>> Pursuant to that, I'm reviewing this to gain a better understanding of
>> such things.  Please forgive me; I hadn't found this link prior to
>> messaging the listserve.
>>
>> http://subversion.ffado.org/wiki/IrqPriorities
>>
>>> For your usecase, this means that likely _all_ of the peripheral
>>> interrupts, including the low priority stuff you might not care about
>>> for your application (like networking or uart or whatever) are being
>>> prioritized right alongside your latency sensitive gpio handler.
>>>
>>>> Each of those user space threads copies data from some shared memory
>>>> within the kernel module and performs some basic data integrity
>>>> checks, i.e.  a SHA1.
>>>
>>> Just to better understand: how are your RT threads signalled by your
>>> gpio handler?
>>
>> The user space RT threads have mmap'ed memory shared with the kernel
>> module.  The kernel module writes to a designated memory location the
>> current read offset so that the user space RT threads know where to
>> read from iteration to iteration.
>>
>>>
>>>> The vast majority of interrupts incur a delay in range of tens of
>>>> MICROseconds, meaning the ISR executes within such amount of time of
>>>> the micro-controller raising the GPIO pin of the ISR.  However,
>>>> occasionally, more than six MILLIseconds passes between the time of
>>>> the micro-controller issuing the GPIO interrupt and the ISR beginning
>>>> to execute.
>>>
>>> Also: how are you measuring this latency?  In particular, where is your
>>> test point for the "ISR beginning"?
>>>>
>>
>> Upon detection of a bad CRC or bad sequence number, the user space
>> threads raise another GPIO, not the GPIO of the ISR, another one that
>> triggers a logic analyzer to dump a forward captured buffer.  Then, I
>> got back through the capture and identify the high latency ISR
>> execution.  (See attached image.)
>>
>>>> I'm trying to track down the cause of the 6+ MILLIsecond latency,
>>>> which I would not expect to happen.  Are there any well practiced
>>>> techniques for determining what factor(s) may be introducing this
>>>> latency?  I looked at the description for RT Watchdog, but I don't
>>>> think it would help me much because the only two RT threads other than
>>>> the kernel have a lower priority.
>>>
>>> I don't think the RT watchdog will help you here, however, I would
>>> recommend that you get acquainted with using ftrace/trace-cmd and the
>>> irq and sched traceevents.
>>>
>>>    Julia
>>
>> It sounds like to me that maybe you've confirmed that I'm looking in
>> the right direction w.r.t. modifying the kernel ISRs with something
>> like 'chrt' and 'rtiq'.  Yes?  But I also will look into
>> ftrace/trace-cmd and irq and sched trace events.
>>
>> Thanks,
>> Brian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-11  3:36       ` Brian Wrenn
@ 2017-04-11 12:47         ` Brian Wrenn
  0 siblings, 0 replies; 9+ messages in thread
From: Brian Wrenn @ 2017-04-11 12:47 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

Update 2: I ran with a different CRC implementation.  Previously I was
using a kernel based CRC, which uses file descriptors.  So I suspect
that may have been interfering with the preemption process as for
preempting the user space RT threads (priority 10).  I switched to a
boost based CRC, which seems not to interfere with the preemption
model.  The only problem, the user space RT threads take longer to
perform the CRCs in some cases than the scheduled interval.  So it
looks like it comes down to pick my poison, latent ISR calls or CRC
check (which happens on a batch of SPI transfers collected over an
interval, not instantly upon the arrival of each one) interval
overruns.

At this point, I've only done a 5 minute run to investigate this, so I
may uncover something else after doing a longer run (~2 hours), but I
wanted to update the thread in case there are any curious onlookers.
I follow up with some results later today.

On Mon, Apr 10, 2017 at 11:36 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
> Update: I tried running the scheme described earlier in this thread
> with one modification: no CRC, just sequence checking.  (Each SPI
> transmission increments a count held in the first bytes.)  In this
> case, no ISR had latency long enough to set the error GPIO (output
> from the SOM's perspective) high.  The highest priority ISR, after
> those with priority 99, is the SPI read ISR at priority 80.  I set the
> two RT threads in user space to priority 10.  So at this point, I'm
> trying to determine why the SPI read ISR doesn't preempt those user
> space threads in less than 6 milliseconds.  This failure to preempt
> happens for between 0.2%-0.01% of the micro-controller issued
> interrupts.  The error percentages vary some per run.  FWIW, any
> non-RT load I place on the system (generic network and processor
> busywork) don't seem to have much impact.
>
> Also, in answering how I'm measuring latency, I reviewed what I wrote
> and I think I didn't explain something clearly.  A bad CRC or bad
> sequence number makes the the error indicator GPIO pin go high (a GPIO
> output from the SOM's perspective/input from the logic analyzer's
> perspective).  A high latency (~6 milliseconds after the
> micro-controller issued interrupt) invocation of the kernel module ISR
> that performs the two SPI transfers results in the subsequent
> micro-controller's interrupt occurring during the SPI transfers, which
> causes the following SPI transfer to have bad data and hence a bad
> CRC.  The .tiff I attached earlier illustrates this.  If you look at
> the sad face SPI transfer (it only shows one of the SPIs, not both),
> you can see the ISR for the next transfer overlap with the very end of
> the sad face transfer.
>
> Thanks,
> Brian
>
> On Mon, Apr 10, 2017 at 5:31 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
>> I imagine that it would be a bad idea to prioritize my ISR ahead of
>> any one of these?
>>
>>    12  FF     99 139   - [posixcputmr/0]
>>    14  FF     99 139   - [migration/0]
>>    15  FF     99 139   - [watchdog/0]
>>    16  FF     99 139   - [watchdog/1]
>>    17  FF     99 139   - [migration/1]
>>    21  FF     99 139   - [posixcputmr/1]
>>    24  FF     99 139   - [watchdog/2]
>>    25  FF     99 139   - [migration/2]
>>    29  FF     99 139   - [posixcputmr/2]
>>    32  FF     99 139   - [watchdog/3]
>>    33  FF     99 139   - [migration/3]
>>    37  FF     99 139   - [posixcputmr/3]
>>
>> At this stage, that's all that prioritized higher.
>>
>> On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
>>> On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:
>>>> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
>>>>> Hi, I'm trying to track down the source of some ISR latency that I'm
>>>>> observing on a logic analyzer.
>>>>>
>>>>> Just for some background, my setup is the following.  I'm running
>>>>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
>>>>> My kernel has the following PREEMPT RT options.
>>>> [..]
>>>>>
>>>>> I have a micro-controller attached to a GPIO pin and two separate SPI
>>>>> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
>>>>> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
>>>>> issues an interrupt on the GPIO.  A kernel module has an ISR
>>>>> registered to the GPIO pin.  When the ISR executes, it copies 18
>>>>> Kbytes of data from each SPI.  There are two RT threads executing in
>>>>> user space as SCHED_RR with identical priorities less than (i.e. lower
>>>>> than) 50, where as I understand the kernel runs at priority 50.
>>>>
>>>> It's a bit more complicated than "the kernel runs at priority 50".
>>>
>>> Yes, better stated, my kernel space IRQ gets a priority of 50.  (Most
>>> of the others seem to have 50 as well.)
>>>
>>>>
>>>> Perhaps what you're getting at is that by default, with irq forced
>>>> threading enabled (which is the default with -rt build kernels) the
>>>> irqthreads have the default scheduling policy FIFO with priority 50.
>>>> It's up to the administrator/system integrator to appropriately
>>>> prioritize these irqthreads to suit the application.
>>>>
>>>
>>> Pursuant to that, I'm reviewing this to gain a better understanding of
>>> such things.  Please forgive me; I hadn't found this link prior to
>>> messaging the listserve.
>>>
>>> http://subversion.ffado.org/wiki/IrqPriorities
>>>
>>>> For your usecase, this means that likely _all_ of the peripheral
>>>> interrupts, including the low priority stuff you might not care about
>>>> for your application (like networking or uart or whatever) are being
>>>> prioritized right alongside your latency sensitive gpio handler.
>>>>
>>>>> Each of those user space threads copies data from some shared memory
>>>>> within the kernel module and performs some basic data integrity
>>>>> checks, i.e.  a SHA1.
>>>>
>>>> Just to better understand: how are your RT threads signalled by your
>>>> gpio handler?
>>>
>>> The user space RT threads have mmap'ed memory shared with the kernel
>>> module.  The kernel module writes to a designated memory location the
>>> current read offset so that the user space RT threads know where to
>>> read from iteration to iteration.
>>>
>>>>
>>>>> The vast majority of interrupts incur a delay in range of tens of
>>>>> MICROseconds, meaning the ISR executes within such amount of time of
>>>>> the micro-controller raising the GPIO pin of the ISR.  However,
>>>>> occasionally, more than six MILLIseconds passes between the time of
>>>>> the micro-controller issuing the GPIO interrupt and the ISR beginning
>>>>> to execute.
>>>>
>>>> Also: how are you measuring this latency?  In particular, where is your
>>>> test point for the "ISR beginning"?
>>>>>
>>>
>>> Upon detection of a bad CRC or bad sequence number, the user space
>>> threads raise another GPIO, not the GPIO of the ISR, another one that
>>> triggers a logic analyzer to dump a forward captured buffer.  Then, I
>>> got back through the capture and identify the high latency ISR
>>> execution.  (See attached image.)
>>>
>>>>> I'm trying to track down the cause of the 6+ MILLIsecond latency,
>>>>> which I would not expect to happen.  Are there any well practiced
>>>>> techniques for determining what factor(s) may be introducing this
>>>>> latency?  I looked at the description for RT Watchdog, but I don't
>>>>> think it would help me much because the only two RT threads other than
>>>>> the kernel have a lower priority.
>>>>
>>>> I don't think the RT watchdog will help you here, however, I would
>>>> recommend that you get acquainted with using ftrace/trace-cmd and the
>>>> irq and sched traceevents.
>>>>
>>>>    Julia
>>>
>>> It sounds like to me that maybe you've confirmed that I'm looking in
>>> the right direction w.r.t. modifying the kernel ISRs with something
>>> like 'chrt' and 'rtiq'.  Yes?  But I also will look into
>>> ftrace/trace-cmd and irq and sched trace events.
>>>
>>> Thanks,
>>> Brian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-10 21:31     ` Brian Wrenn
  2017-04-11  3:36       ` Brian Wrenn
@ 2017-04-11 22:06       ` Clark Williams
  2017-04-12 23:50         ` Brian Wrenn
  1 sibling, 1 reply; 9+ messages in thread
From: Clark Williams @ 2017-04-11 22:06 UTC (permalink / raw)
  To: Brian Wrenn; +Cc: Julia Cartwright, linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 6617 bytes --]

On Mon, 10 Apr 2017 17:31:52 -0400
Brian Wrenn <dcbrianw@gmail.com> wrote:

> I imagine that it would be a bad idea to prioritize my ISR ahead of
> any one of these?
> 
>    12  FF     99 139   - [posixcputmr/0]
>    14  FF     99 139   - [migration/0]
>    15  FF     99 139   - [watchdog/0]
>    16  FF     99 139   - [watchdog/1]
>    17  FF     99 139   - [migration/1]
>    21  FF     99 139   - [posixcputmr/1]
>    24  FF     99 139   - [watchdog/2]
>    25  FF     99 139   - [migration/2]
>    29  FF     99 139   - [posixcputmr/2]
>    32  FF     99 139   - [watchdog/3]
>    33  FF     99 139   - [migration/3]
>    37  FF     99 139   - [posixcputmr/3]
> 
> At this stage, that's all that prioritized higher.

Yeah, it's never a good idea to use FIFO:99 and compete with these guys. 

> 
> On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
> > On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:  
> >> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:  
> >>> Hi, I'm trying to track down the source of some ISR latency that I'm
> >>> observing on a logic analyzer.
> >>>
> >>> Just for some background, my setup is the following.  I'm running
> >>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
> >>> My kernel has the following PREEMPT RT options.  
> >> [..]  
> >>>
> >>> I have a micro-controller attached to a GPIO pin and two separate SPI
> >>> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
> >>> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
> >>> issues an interrupt on the GPIO.  A kernel module has an ISR
> >>> registered to the GPIO pin.  When the ISR executes, it copies 18
> >>> Kbytes of data from each SPI.  There are two RT threads executing in
> >>> user space as SCHED_RR with identical priorities less than (i.e. lower
> >>> than) 50, where as I understand the kernel runs at priority 50.  
> >>
> >> It's a bit more complicated than "the kernel runs at priority 50".  
> >
> > Yes, better stated, my kernel space IRQ gets a priority of 50.  (Most
> > of the others seem to have 50 as well.)

The RT kernel assigns all threaded IRQs  default priority of fifo:50. 

> >  
> >>
> >> Perhaps what you're getting at is that by default, with irq forced
> >> threading enabled (which is the default with -rt build kernels) the
> >> irqthreads have the default scheduling policy FIFO with priority 50.
> >> It's up to the administrator/system integrator to appropriately
> >> prioritize these irqthreads to suit the application.
> >>  
> >
> > Pursuant to that, I'm reviewing this to gain a better understanding of
> > such things.  Please forgive me; I hadn't found this link prior to
> > messaging the listserve.
> >
> > http://subversion.ffado.org/wiki/IrqPriorities
> >  
> >> For your usecase, this means that likely _all_ of the peripheral
> >> interrupts, including the low priority stuff you might not care about
> >> for your application (like networking or uart or whatever) are being
> >> prioritized right alongside your latency sensitive gpio handler.
> >>  
> >>> Each of those user space threads copies data from some shared memory
> >>> within the kernel module and performs some basic data integrity
> >>> checks, i.e.  a SHA1.  
> >>
> >> Just to better understand: how are your RT threads signalled by your
> >> gpio handler?  
> >
> > The user space RT threads have mmap'ed memory shared with the kernel
> > module.  The kernel module writes to a designated memory location the
> > current read offset so that the user space RT threads know where to
> > read from iteration to iteration.
> >  
> >>  
> >>> The vast majority of interrupts incur a delay in range of tens of
> >>> MICROseconds, meaning the ISR executes within such amount of time of
> >>> the micro-controller raising the GPIO pin of the ISR.  However,
> >>> occasionally, more than six MILLIseconds passes between the time of
> >>> the micro-controller issuing the GPIO interrupt and the ISR beginning
> >>> to execute.  
> >>
> >> Also: how are you measuring this latency?  In particular, where is your
> >> test point for the "ISR beginning"?  
> >>>  
> >
> > Upon detection of a bad CRC or bad sequence number, the user space
> > threads raise another GPIO, not the GPIO of the ISR, another one that
> > triggers a logic analyzer to dump a forward captured buffer.  Then, I
> > got back through the capture and identify the high latency ISR
> > execution.  (See attached image.)
> >  
> >>> I'm trying to track down the cause of the 6+ MILLIsecond latency,
> >>> which I would not expect to happen.  Are there any well practiced
> >>> techniques for determining what factor(s) may be introducing this
> >>> latency?  I looked at the description for RT Watchdog, but I don't
> >>> think it would help me much because the only two RT threads other than
> >>> the kernel have a lower priority.  
> >>
> >> I don't think the RT watchdog will help you here, however, I would
> >> recommend that you get acquainted with using ftrace/trace-cmd and the
> >> irq and sched traceevents.
> >>
> >>    Julia  
> >
> > It sounds like to me that maybe you've confirmed that I'm looking in
> > the right direction w.r.t. modifying the kernel ISRs with something
> > like 'chrt' and 'rtiq'.  Yes?  But I also will look into
> > ftrace/trace-cmd and irq and sched trace events.

I'd suggest using chrt to boost your irq(s) up above the default 50 to say 51. That would insure that they're scheduled in before the other interrupt threads are processed. 

Of course that just means your interrupt gets serviced first and does nothing to help if some other interrupt (or other entity) is delaying the scheduling of your user-space threads. As Julia said, you probably want to look at ftrace and run your application with some tracepoints enabled. That will let you look at the timestamps when various events are hit and hopefully give you an idea of what's happening when your latency spike hits. 

If you want to get fancy, look at what cyclictest (from the rt-tests package) does when the --breaktrace option is used. It takes a threshold value and stops processing when the threshold is exceeded. In addition it stops any active trace and writes a message to the trace buffers, so it's quite obvious when you've hit a spike and you get a nice event trace leading up to the spike. 

Clark

-- 
The United States Coast Guard
Ruining Natural Selection since 1790

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel ISR Latency
  2017-04-11 22:06       ` Clark Williams
@ 2017-04-12 23:50         ` Brian Wrenn
  2017-04-26  8:29           ` Piotr Gregor
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Wrenn @ 2017-04-12 23:50 UTC (permalink / raw)
  To: Clark Williams; +Cc: Julia Cartwright, linux-rt-users

Hi All,

thank you for all the suggestions.  I made some findings, so I thought
I would share them with anyone watching this thread. The delays shown
on the scope screen capture I shared in an earlier email resulted form
either of the lower priority threads using some kernel calls into some
optimized CRC routines.  The kernel operating at a higher priority
than the ISR, I believe invoked a priority inversion upon the user
space threads causing them to periodically preempt the ISR.  When I
swap those CRC routines with ones not embedded in the kernel, the ISR
does not incur the delay as seen in the scope.

One problem I've had; however, is I can't verify this with ftrace.
After building the kernel with ftrace support, attempts to use the
kernel optimized CRC calls cause the entire system to crash.  I don't
know why.  To use these optimized calls requires establishing a socket
between the user space thread and the kernel.  Calling bind() upon
that socket seems to cause the crash.  So I can only guess that ftrace
somehow obstructs creating some kinds of sockets between user space
and kernel space.

The outstanding problem for me to solve is that the user space threads
don't perform the CRCs fast enough before needing to read the next
batch of SPI transfer data from shared memory space, that is the
memory shared with the kernel module of the ISR that does the SPI
transfers.  I've tried running the ISR at priority 80 (nothing else
higher except for the kernel threads that run at priority 99) and the
user space threads that perform the CRCs at priority 60 (nothing else
higher other than the aforementioned ISR and again the kernel threads
that run at 99).

I attempted to use processor affinity with cpuset to isolate the user
space threads.  I must have been using it improperly because I had to
shield 3 (of 4) cores for the executable that spawns the two user
space RT threads to run just one of them to completion.  Any number of
cores less, and the executable just hangs.  Nonetheless, the single
user space RT thread that did run to completion didn't seem to get any
speedup in its CRC processing, so by my observations PREEMPT RT is
doing just fine giving it what it needs from a priority standpoint.

It may simply be that I've squeezed all I can out of this processor.

On Tue, Apr 11, 2017 at 6:06 PM, Clark Williams <williams@redhat.com> wrote:
> On Mon, 10 Apr 2017 17:31:52 -0400
> Brian Wrenn <dcbrianw@gmail.com> wrote:
>
>> I imagine that it would be a bad idea to prioritize my ISR ahead of
>> any one of these?
>>
>>    12  FF     99 139   - [posixcputmr/0]
>>    14  FF     99 139   - [migration/0]
>>    15  FF     99 139   - [watchdog/0]
>>    16  FF     99 139   - [watchdog/1]
>>    17  FF     99 139   - [migration/1]
>>    21  FF     99 139   - [posixcputmr/1]
>>    24  FF     99 139   - [watchdog/2]
>>    25  FF     99 139   - [migration/2]
>>    29  FF     99 139   - [posixcputmr/2]
>>    32  FF     99 139   - [watchdog/3]
>>    33  FF     99 139   - [migration/3]
>>    37  FF     99 139   - [posixcputmr/3]
>>
>> At this stage, that's all that prioritized higher.
>
> Yeah, it's never a good idea to use FIFO:99 and compete with these guys.
>
>>
>> On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
>> > On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:
>> >> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
>> >>> Hi, I'm trying to track down the source of some ISR latency that I'm
>> >>> observing on a logic analyzer.
>> >>>
>> >>> Just for some background, my setup is the following.  I'm running
>> >>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
>> >>> My kernel has the following PREEMPT RT options.
>> >> [..]
>> >>>
>> >>> I have a micro-controller attached to a GPIO pin and two separate SPI
>> >>> ports.  Every 10 milliseconds the micro-controller writes 18 Kbytes of
>> >>> data to each of 2 SPI ports (total of 36 Kbytes between the two) and
>> >>> issues an interrupt on the GPIO.  A kernel module has an ISR
>> >>> registered to the GPIO pin.  When the ISR executes, it copies 18
>> >>> Kbytes of data from each SPI.  There are two RT threads executing in
>> >>> user space as SCHED_RR with identical priorities less than (i.e. lower
>> >>> than) 50, where as I understand the kernel runs at priority 50.
>> >>
>> >> It's a bit more complicated than "the kernel runs at priority 50".
>> >
>> > Yes, better stated, my kernel space IRQ gets a priority of 50.  (Most
>> > of the others seem to have 50 as well.)
>
> The RT kernel assigns all threaded IRQs  default priority of fifo:50.
>
>> >
>> >>
>> >> Perhaps what you're getting at is that by default, with irq forced
>> >> threading enabled (which is the default with -rt build kernels) the
>> >> irqthreads have the default scheduling policy FIFO with priority 50.
>> >> It's up to the administrator/system integrator to appropriately
>> >> prioritize these irqthreads to suit the application.
>> >>
>> >
>> > Pursuant to that, I'm reviewing this to gain a better understanding of
>> > such things.  Please forgive me; I hadn't found this link prior to
>> > messaging the listserve.
>> >
>> > http://subversion.ffado.org/wiki/IrqPriorities
>> >
>> >> For your usecase, this means that likely _all_ of the peripheral
>> >> interrupts, including the low priority stuff you might not care about
>> >> for your application (like networking or uart or whatever) are being
>> >> prioritized right alongside your latency sensitive gpio handler.
>> >>
>> >>> Each of those user space threads copies data from some shared memory
>> >>> within the kernel module and performs some basic data integrity
>> >>> checks, i.e.  a SHA1.
>> >>
>> >> Just to better understand: how are your RT threads signalled by your
>> >> gpio handler?
>> >
>> > The user space RT threads have mmap'ed memory shared with the kernel
>> > module.  The kernel module writes to a designated memory location the
>> > current read offset so that the user space RT threads know where to
>> > read from iteration to iteration.
>> >
>> >>
>> >>> The vast majority of interrupts incur a delay in range of tens of
>> >>> MICROseconds, meaning the ISR executes within such amount of time of
>> >>> the micro-controller raising the GPIO pin of the ISR.  However,
>> >>> occasionally, more than six MILLIseconds passes between the time of
>> >>> the micro-controller issuing the GPIO interrupt and the ISR beginning
>> >>> to execute.
>> >>
>> >> Also: how are you measuring this latency?  In particular, where is your
>> >> test point for the "ISR beginning"?
>> >>>
>> >
>> > Upon detection of a bad CRC or bad sequence number, the user space
>> > threads raise another GPIO, not the GPIO of the ISR, another one that
>> > triggers a logic analyzer to dump a forward captured buffer.  Then, I
>> > got back through the capture and identify the high latency ISR
>> > execution.  (See attached image.)
>> >
>> >>> I'm trying to track down the cause of the 6+ MILLIsecond latency,
>> >>> which I would not expect to happen.  Are there any well practiced
>> >>> techniques for determining what factor(s) may be introducing this
>> >>> latency?  I looked at the description for RT Watchdog, but I don't
>> >>> think it would help me much because the only two RT threads other than
>> >>> the kernel have a lower priority.
>> >>
>> >> I don't think the RT watchdog will help you here, however, I would
>> >> recommend that you get acquainted with using ftrace/trace-cmd and the
>> >> irq and sched traceevents.
>> >>
>> >>    Julia
>> >
>> > It sounds like to me that maybe you've confirmed that I'm looking in
>> > the right direction w.r.t. modifying the kernel ISRs with something
>> > like 'chrt' and 'rtiq'.  Yes?  But I also will look into
>> > ftrace/trace-cmd and irq and sched trace events.
>
> I'd suggest using chrt to boost your irq(s) up above the default 50 to say 51. That would insure that they're scheduled in before the other interrupt threads are processed.
>
> Of course that just means your interrupt gets serviced first and does nothing to help if some other interrupt (or other entity) is delaying the scheduling of your user-space threads. As Julia said, you probably want to look at ftrace and run your application with some tracepoints enabled. That will let you look at the timestamps when various events are hit and hopefully give you an idea of what's happening when your latency spike hits.
>
> If you want to get fancy, look at what cyclictest (from the rt-tests package) does when the --breaktrace option is used. It takes a threshold value and stops processing when the threshold is exceeded. In addition it stops any active trace and writes a message to the trace buffers, so it's quite obvious when you've hit a spike and you get a nice event trace leading up to the spike.
>
> Clark
>
> --
> The United States Coast Guard
> Ruining Natural Selection since 1790

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Kernel ISR Latency
  2017-04-12 23:50         ` Brian Wrenn
@ 2017-04-26  8:29           ` Piotr Gregor
  0 siblings, 0 replies; 9+ messages in thread
From: Piotr Gregor @ 2017-04-26  8:29 UTC (permalink / raw)
  To: Brian Wrenn, Clark Williams; +Cc: Julia Cartwright, linux-rt-users

Hi Brian,

Maybe you could consider the rolling checksum if CRC cannot be calculated fast enough by your threads.

cheers,
Piotr

-----Original Message-----
From: linux-rt-users-owner@vger.kernel.org [mailto:linux-rt-users-owner@vger.kernel.org] On Behalf Of Brian Wrenn
Sent: 13 April 2017 00:51
To: Clark Williams <williams@redhat.com>
Cc: Julia Cartwright <julia@ni.com>; linux-rt-users@vger.kernel.org
Subject: Re: Kernel ISR Latency

Hi All,

thank you for all the suggestions.  I made some findings, so I thought I would share them with anyone watching this thread. The delays shown on the scope screen capture I shared in an earlier email resulted form either of the lower priority threads using some kernel calls into some optimized CRC routines.  The kernel operating at a higher priority than the ISR, I believe invoked a priority inversion upon the user space threads causing them to periodically preempt the ISR.  When I swap those CRC routines with ones not embedded in the kernel, the ISR does not incur the delay as seen in the scope.

One problem I've had; however, is I can't verify this with ftrace.
After building the kernel with ftrace support, attempts to use the kernel optimized CRC calls cause the entire system to crash.  I don't know why.  To use these optimized calls requires establishing a socket between the user space thread and the kernel.  Calling bind() upon that socket seems to cause the crash.  So I can only guess that ftrace somehow obstructs creating some kinds of sockets between user space and kernel space.

The outstanding problem for me to solve is that the user space threads don't perform the CRCs fast enough before needing to read the next batch of SPI transfer data from shared memory space, that is the memory shared with the kernel module of the ISR that does the SPI transfers.  I've tried running the ISR at priority 80 (nothing else higher except for the kernel threads that run at priority 99) and the user space threads that perform the CRCs at priority 60 (nothing else higher other than the aforementioned ISR and again the kernel threads that run at 99).

I attempted to use processor affinity with cpuset to isolate the user space threads.  I must have been using it improperly because I had to shield 3 (of 4) cores for the executable that spawns the two user space RT threads to run just one of them to completion.  Any number of cores less, and the executable just hangs.  Nonetheless, the single user space RT thread that did run to completion didn't seem to get any speedup in its CRC processing, so by my observations PREEMPT RT is doing just fine giving it what it needs from a priority standpoint.

It may simply be that I've squeezed all I can out of this processor.

On Tue, Apr 11, 2017 at 6:06 PM, Clark Williams <williams@redhat.com> wrote:
> On Mon, 10 Apr 2017 17:31:52 -0400
> Brian Wrenn <dcbrianw@gmail.com> wrote:
>
>> I imagine that it would be a bad idea to prioritize my ISR ahead of 
>> any one of these?
>>
>>    12  FF     99 139   - [posixcputmr/0]
>>    14  FF     99 139   - [migration/0]
>>    15  FF     99 139   - [watchdog/0]
>>    16  FF     99 139   - [watchdog/1]
>>    17  FF     99 139   - [migration/1]
>>    21  FF     99 139   - [posixcputmr/1]
>>    24  FF     99 139   - [watchdog/2]
>>    25  FF     99 139   - [migration/2]
>>    29  FF     99 139   - [posixcputmr/2]
>>    32  FF     99 139   - [watchdog/3]
>>    33  FF     99 139   - [migration/3]
>>    37  FF     99 139   - [posixcputmr/3]
>>
>> At this stage, that's all that prioritized higher.
>
> Yeah, it's never a good idea to use FIFO:99 and compete with these guys.
>
>>
>> On Mon, Apr 10, 2017 at 4:48 PM, Brian Wrenn <dcbrianw@gmail.com> wrote:
>> > On Mon, Apr 10, 2017 at 3:55 PM, Julia Cartwright <julia@ni.com> wrote:
>> >> On Mon, Apr 10, 2017 at 03:12:28PM -0400, Brian Wrenn wrote:
>> >>> Hi, I'm trying to track down the source of some ISR latency that 
>> >>> I'm observing on a logic analyzer.
>> >>>
>> >>> Just for some background, my setup is the following.  I'm running 
>> >>> Linaro 4.4.9 with PREEMPT RT Patch 17 on a Variscite SD410 SOM devkit.
>> >>> My kernel has the following PREEMPT RT options.
>> >> [..]
>> >>>
>> >>> I have a micro-controller attached to a GPIO pin and two separate 
>> >>> SPI ports.  Every 10 milliseconds the micro-controller writes 18 
>> >>> Kbytes of data to each of 2 SPI ports (total of 36 Kbytes between 
>> >>> the two) and issues an interrupt on the GPIO.  A kernel module 
>> >>> has an ISR registered to the GPIO pin.  When the ISR executes, it 
>> >>> copies 18 Kbytes of data from each SPI.  There are two RT threads 
>> >>> executing in user space as SCHED_RR with identical priorities 
>> >>> less than (i.e. lower
>> >>> than) 50, where as I understand the kernel runs at priority 50.
>> >>
>> >> It's a bit more complicated than "the kernel runs at priority 50".
>> >
>> > Yes, better stated, my kernel space IRQ gets a priority of 50.  
>> > (Most of the others seem to have 50 as well.)
>
> The RT kernel assigns all threaded IRQs  default priority of fifo:50.
>
>> >
>> >>
>> >> Perhaps what you're getting at is that by default, with irq forced 
>> >> threading enabled (which is the default with -rt build kernels) 
>> >> the irqthreads have the default scheduling policy FIFO with priority 50.
>> >> It's up to the administrator/system integrator to appropriately 
>> >> prioritize these irqthreads to suit the application.
>> >>
>> >
>> > Pursuant to that, I'm reviewing this to gain a better understanding 
>> > of such things.  Please forgive me; I hadn't found this link prior 
>> > to messaging the listserve.
>> >
>> > http://subversion.ffado.org/wiki/IrqPriorities
>> >
>> >> For your usecase, this means that likely _all_ of the peripheral 
>> >> interrupts, including the low priority stuff you might not care 
>> >> about for your application (like networking or uart or whatever) 
>> >> are being prioritized right alongside your latency sensitive gpio handler.
>> >>
>> >>> Each of those user space threads copies data from some shared 
>> >>> memory within the kernel module and performs some basic data 
>> >>> integrity checks, i.e.  a SHA1.
>> >>
>> >> Just to better understand: how are your RT threads signalled by 
>> >> your gpio handler?
>> >
>> > The user space RT threads have mmap'ed memory shared with the 
>> > kernel module.  The kernel module writes to a designated memory 
>> > location the current read offset so that the user space RT threads 
>> > know where to read from iteration to iteration.
>> >
>> >>
>> >>> The vast majority of interrupts incur a delay in range of tens of 
>> >>> MICROseconds, meaning the ISR executes within such amount of time 
>> >>> of the micro-controller raising the GPIO pin of the ISR.  
>> >>> However, occasionally, more than six MILLIseconds passes between 
>> >>> the time of the micro-controller issuing the GPIO interrupt and 
>> >>> the ISR beginning to execute.
>> >>
>> >> Also: how are you measuring this latency?  In particular, where is 
>> >> your test point for the "ISR beginning"?
>> >>>
>> >
>> > Upon detection of a bad CRC or bad sequence number, the user space 
>> > threads raise another GPIO, not the GPIO of the ISR, another one 
>> > that triggers a logic analyzer to dump a forward captured buffer.  
>> > Then, I got back through the capture and identify the high latency 
>> > ISR execution.  (See attached image.)
>> >
>> >>> I'm trying to track down the cause of the 6+ MILLIsecond latency, 
>> >>> which I would not expect to happen.  Are there any well practiced 
>> >>> techniques for determining what factor(s) may be introducing this 
>> >>> latency?  I looked at the description for RT Watchdog, but I 
>> >>> don't think it would help me much because the only two RT threads 
>> >>> other than the kernel have a lower priority.
>> >>
>> >> I don't think the RT watchdog will help you here, however, I would 
>> >> recommend that you get acquainted with using ftrace/trace-cmd and 
>> >> the irq and sched traceevents.
>> >>
>> >>    Julia
>> >
>> > It sounds like to me that maybe you've confirmed that I'm looking 
>> > in the right direction w.r.t. modifying the kernel ISRs with 
>> > something like 'chrt' and 'rtiq'.  Yes?  But I also will look into 
>> > ftrace/trace-cmd and irq and sched trace events.
>
> I'd suggest using chrt to boost your irq(s) up above the default 50 to say 51. That would insure that they're scheduled in before the other interrupt threads are processed.
>
> Of course that just means your interrupt gets serviced first and does nothing to help if some other interrupt (or other entity) is delaying the scheduling of your user-space threads. As Julia said, you probably want to look at ftrace and run your application with some tracepoints enabled. That will let you look at the timestamps when various events are hit and hopefully give you an idea of what's happening when your latency spike hits.
>
> If you want to get fancy, look at what cyclictest (from the rt-tests package) does when the --breaktrace option is used. It takes a threshold value and stops processing when the threshold is exceeded. In addition it stops any active trace and writes a message to the trace buffers, so it's quite obvious when you've hit a spike and you get a nice event trace leading up to the spike.
>
> Clark
>
> --
> The United States Coast Guard
> Ruining Natural Selection since 1790
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-04-26  8:29 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-10 19:12 Kernel ISR Latency Brian Wrenn
2017-04-10 19:15 ` Brian Wrenn
2017-04-10 19:55 ` Julia Cartwright
     [not found]   ` <CAEYdEeR4Y138B9NinTso3hvJ6trEUa5OprsNEecDHFFrmP8j4Q@mail.gmail.com>
2017-04-10 21:31     ` Brian Wrenn
2017-04-11  3:36       ` Brian Wrenn
2017-04-11 12:47         ` Brian Wrenn
2017-04-11 22:06       ` Clark Williams
2017-04-12 23:50         ` Brian Wrenn
2017-04-26  8:29           ` Piotr Gregor

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.