linux-kernel.vger.kernel.org archive mirror
* High interrupt latency with low power idle mode on i.MX6
@ 2020-05-27 10:39 Schrempf Frieder
  2020-05-27 11:53 ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 4+ messages in thread
From: Schrempf Frieder @ 2020-05-27 10:39 UTC (permalink / raw)
  To: Russell King, Shawn Guo, Sascha Hauer, Rafael J. Wysocki,
	Daniel Lezcano, Kate Stewart, Enrico Weigelt, Thomas Gleixner
  Cc: linux-arm-kernel, linux-kernel, linux-pm

Hi,

On our i.MX6UL/ULL boards running mainline kernels, we see RS485
collisions on the bus. They are caused by the RTS signal being reset
too late after each transmission: the TXDC interrupt takes several
milliseconds to trigger, and by then the slave on the bus has already
started to send its reply.

We found out that these delays only happen when the CPU is in "low power 
idle" mode (ARM power off). When we disable cpuidle state 2 or put some 
background load on the CPU everything works fine and the delays are gone.

echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state2/disable
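
For anyone reproducing this, a minimal sketch for checking which state
index is which before disabling one. It reads the name, latency and
disable attributes that sit next to each other in the cpuidle sysfs
directory. The function name list_cpuidle_states and its optional
directory argument are made up for illustration; the state names and
latencies printed depend on the board and kernel.

```shell
# Print index, name, declared exit latency (us) and disable flag for
# each cpuidle state of one CPU. The optional argument overrides the
# sysfs directory (hypothetical convenience, handy for dry runs).
list_cpuidle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for s in "$dir"/state*; do
        # Skip the unexpanded glob if no state directories exist.
        [ -d "$s" ] || continue
        printf '%s %s latency=%sus disable=%s\n' \
            "${s##*/}" "$(cat "$s/name")" \
            "$(cat "$s/latency")" "$(cat "$s/disable")"
    done
}
```

Calling list_cpuidle_states without arguments inspects cpu0; the
"disable" column should read 1 for state2 once the echo above has run.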

Other interfaces (I2C, etc.) might also be affected by these increased
latencies, but we haven't investigated this more closely.

We currently apply a patch to our kernel that disables low power idle
mode by default, but I'm wondering whether there is a way to fix this
properly. Any ideas?

Thanks,
Frieder

* Re: High interrupt latency with low power idle mode on i.MX6
  2020-05-27 10:39 High interrupt latency with low power idle mode on i.MX6 Schrempf Frieder
@ 2020-05-27 11:53 ` Russell King - ARM Linux admin
  2020-05-27 12:50   ` Schrempf Frieder
  0 siblings, 1 reply; 4+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-27 11:53 UTC (permalink / raw)
  To: Schrempf Frieder
  Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano,
	Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel,
	linux-kernel, linux-pm

Let's examine a basic fact about power management:

The deeper the PM mode the system enters, the higher the latency to
resume operation.

So, I'm not surprised that you have higher latency when you allow the
system to enter lower power modes.  Does that mean that the kernel
should not permit entering lower power modes?  No, it's policy and
application dependent.

If the hardware is designed to use software to manage the RTS signal
to control the RS485 receiver, then I'm afraid that your report really
does not surprise me - throwing that at software to manage is a really
stupid idea, but it seems lots of people do this.  I've held this view
since I worked on a safety critical system that used RS485 back in the
1990s (London Underground Jubilee Line Extension public address system.)

So, what we have here is several things that come together to create a
problem:

1) higher power savings come with higher resume latency
2) lack of hardware support for RS485 half duplex communication, so
   software has to do the job
3) an application that makes use of RS485 half duplex communication
   without disabling the higher latency power saving modes

The question is, who should disable those higher latency power saving
modes - the kernel, or userspace?

The kernel knows whether it needs to provide software control of the
RTS signal or not, but the kernel does not know the maximum permissible
latency (which is application specific.)  So, the kernel doesn't have
all the information it needs.  However, there is a QoS subsystem which
may help you.

There are also tweaks available via
/sys/devices/system/cpu/cpu*/power/pm_qos_resume_latency_us

which can be poked to configure the latency that is required, and will
prevent the deeper PM states being entered.
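
A minimal sketch of poking that attribute for every CPU from a boot
script. The function name set_cpu_resume_latency_us and the optional
root argument are hypothetical, and the right limit is application
specific. As far as I recall the ABI documentation, writing 0 removes
the constraint again and "n/a" requests no resume latency at all;
treat that as an assumption and check it against your kernel.

```shell
# Write one resume-latency limit (in microseconds) to every CPU found
# below the given root. Needs root privileges on a real system.
set_cpu_resume_latency_us() {
    limit="$1"
    root="${2:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/power/pm_qos_resume_latency_us; do
        # Skip CPUs without the attribute (or an unexpanded glob).
        [ -w "$f" ] || continue
        printf '%s\n' "$limit" > "$f"
    done
}
```

For example, "set_cpu_resume_latency_us 200" asks for at most 200 us on
all CPUs; 200 is an arbitrary illustration, not a recommendation.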

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

* Re: High interrupt latency with low power idle mode on i.MX6
  2020-05-27 11:53 ` Russell King - ARM Linux admin
@ 2020-05-27 12:50   ` Schrempf Frieder
  2020-05-27 13:23     ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 4+ messages in thread
From: Schrempf Frieder @ 2020-05-27 12:50 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano,
	Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel,
	linux-kernel, linux-pm

Thanks for the detailed explanation. This all makes perfect sense to me.
I will keep in mind that we need to consider this aspect of power saving 
vs. latency when designing systems and also that we need to provide the 
information for the kernel to decide which of the two is more important.

Also thanks for pointing out the QoS subsystem. I'm not quite sure if it 
would work for us to use pm_qos_resume_latency_us in our specific case. 
The actual latency we observe is something like 2 to 3 milliseconds 
longer with low power idle than without, but the exit_latency for low 
power idle specified in the cpuidle driver is only 300 us.

So as far as I can see, even if we set pm_qos_resume_latency_us to
1000 us (which should be fast enough for the RS485 to work properly),
low power idle wouldn't be disabled, because its declared exit latency
of 300 us is still below that limit.
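
To illustrate the comparison, here is a simplified sketch that lists
the states whose declared exit latency does not exceed a given limit.
It is only a model of the latency check: the real governors also weigh
the predicted idle duration and the target residency. The function name
states_within_limit and its optional directory argument are
hypothetical.

```shell
# List the cpuidle states whose declared exit latency (us) is within
# the given limit, i.e. the states a latency constraint alone would
# still permit.
states_within_limit() {
    limit="$1"
    dir="${2:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for s in "$dir"/state*; do
        [ -r "$s/latency" ] || continue
        lat=$(cat "$s/latency")
        if [ "$lat" -le "$limit" ]; then
            printf '%s (%s us)\n' "${s##*/}" "$lat"
        fi
    done
}
```

With a declared 300 us for low power idle, "states_within_limit 1000"
still lists that state; only a limit below 300 would drop it.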

It's rather this discrepancy between the latency set in the driver and 
what we see in reality which makes me wonder if there's something I'm 
missing.

* Re: High interrupt latency with low power idle mode on i.MX6
  2020-05-27 12:50   ` Schrempf Frieder
@ 2020-05-27 13:23     ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 4+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-27 13:23 UTC (permalink / raw)
  To: Schrempf Frieder
  Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano,
	Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel,
	linux-kernel, linux-pm

On Wed, May 27, 2020 at 12:50:01PM +0000, Schrempf Frieder wrote:
> Thanks for the detailed explanation. This all makes perfect sense to me.
> I will keep in mind that we need to consider this aspect of power saving 
> vs. latency when designing systems and also that we need to provide the 
> information for the kernel to decide which of the two is more important.
> 
> Also thanks for pointing out the QoS subsystem. I'm not quite sure if it 
> would work for us to use pm_qos_resume_latency_us in our specific case. 
> The actual latency we observe is something like 2 to 3 milliseconds 
> longer with low power idle than without, but the exit_latency for low 
> power idle specified in the cpuidle driver is only 300 us.

I wonder whether the exit latencies are correct in that case.
From the comments, it seems 80us is allowed for the software overhead
of entering/leaving the idle state vs 220us for the hardware.
It may be a good idea for someone to add some tracing points in there
to try and measure the minimum software latencies.
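
Until such tracing points exist, the existing trace events may give a
first approximation: power:cpu_idle marks idle entry and exit, and
irq:irq_handler_entry marks when a handler starts running, so the gap
between the two timestamps bounds the wake-up cost. The sketch below
just switches both events on. The function name enable_idle_irq_events
is hypothetical, and the tracefs mount point is an assumption (older
setups use /sys/kernel/debug/tracing).

```shell
# Enable the cpu_idle and irq_handler_entry trace events and turn
# tracing on. The optional argument overrides the tracefs mount point.
enable_idle_irq_events() {
    t="${1:-/sys/kernel/tracing}"
    for e in power/cpu_idle irq/irq_handler_entry; do
        f="$t/events/$e/enable"
        # Bail out if the event is not available in this kernel.
        [ -w "$f" ] || { echo "missing: $f" >&2; return 1; }
        echo 1 > "$f"
    done
    echo 1 > "$t/tracing_on"
}
```

After running it as root and reproducing the RS485 traffic, the
timestamps can be read from the "trace" file in the same directory.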

> So as far as I can see, even if we set pm_qos_resume_latency_us to
> 1000 us (which should be fast enough for the RS485 to work properly),
> low power idle wouldn't be disabled, because its declared exit latency
> of 300 us is still below that limit.
> 
> It's rather this discrepancy between the latency set in the driver and 
> what we see in reality which makes me wonder if there's something I'm 
> missing.

It's possible that there's something missing from the kernel's
estimation of the latency required for entering / exiting those
states.

There is an amount of cache flushing that is required when entering
those lower states, and I wonder if that has been accounted for.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up
