* High interrupt latency with low power idle mode on i.MX6
From: Schrempf Frieder @ 2020-05-27 10:39 UTC
To: Russell King, Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano, Kate Stewart, Enrico Weigelt, Thomas Gleixner
Cc: linux-arm-kernel, linux-kernel, linux-pm

Hi,

on our i.MX6UL/ULL boards running mainline kernels, we see an issue with
RS485 collisions on the bus. These are caused by the resetting of the
RTS signal being delayed after each transmission. The TXDC interrupt
takes several milliseconds to trigger and the slave on the bus already
starts to send a reply in the meantime.

We found out that these delays only happen when the CPU is in "low power
idle" mode (ARM power off). When we disable cpuidle state 2 or put some
background load on the CPU everything works fine and the delays are gone.

  echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state2/disable

It seems like also other interfaces (I2C, etc.) might be affected by
these increased latencies, we haven't investigated this more closely,
though.

We currently apply a patch to our kernel, that disables low power idle
mode by default, but I'm wondering if there's a way to fix this
properly? Any ideas?

Thanks,
Frieder

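The workaround in the message above writes to one state on one CPU. As a rough sketch only, the same sysfs files can be used to list every idle state with its advertised exit latency and to toggle a state on all CPUs at once. The `name`, `latency` and `disable` files are standard cpuidle sysfs attributes; the function names and the `CPU_SYSFS` override variable are made up for this illustration and are not kernel interfaces.

```shell
#!/bin/sh
# Sketch: inspect and toggle cpuidle states through sysfs.
# CPU_SYSFS is a hypothetical override so the functions can be pointed
# at a scratch directory instead of the real sysfs tree.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Print "cpu state name exit_latency_us disabled" for every idle state.
list_idle_states() {
    for dir in "$CPU_SYSFS"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -d "$dir" ] || continue
        cpu=${dir#"$CPU_SYSFS"/}   # strip the sysfs prefix
        cpu=${cpu%%/*}             # keep only the "cpuN" component
        printf '%s %s %s %s %s\n' "$cpu" "${dir##*/}" \
            "$(cat "$dir/name")" "$(cat "$dir/latency")" \
            "$(cat "$dir/disable")"
    done
}

# Write 1 (disable) or 0 (enable) to one state index on all CPUs.
# Usage: set_idle_state_disabled <state-index> <0|1>
set_idle_state_disabled() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpuidle/state"$1"/disable; do
        [ -e "$f" ] || continue
        echo "$2" > "$f"
    done
}
```

Run as root, `set_idle_state_disabled 2 1` would be the all-CPU equivalent of the `echo` command above, and `list_idle_states` shows what the driver advertises for each state. The setting does not survive a reboot.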
* Re: High interrupt latency with low power idle mode on i.MX6
From: Russell King - ARM Linux admin @ 2020-05-27 11:53 UTC
To: Schrempf Frieder
Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano, Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel, linux-kernel, linux-pm

On Wed, May 27, 2020 at 10:39:12AM +0000, Schrempf Frieder wrote:
> Hi,
>
> on our i.MX6UL/ULL boards running mainline kernels, we see an issue with
> RS485 collisions on the bus. These are caused by the resetting of the
> RTS signal being delayed after each transmission. The TXDC interrupt
> takes several milliseconds to trigger and the slave on the bus already
> starts to send a reply in the meantime.
>
> We found out that these delays only happen when the CPU is in "low power
> idle" mode (ARM power off). When we disable cpuidle state 2 or put some
> background load on the CPU everything works fine and the delays are gone.
>
> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state2/disable
>
> It seems like also other interfaces (I2C, etc.) might be affected by
> these increased latencies, we haven't investigated this more closely,
> though.
>
> We currently apply a patch to our kernel, that disables low power idle
> mode by default, but I'm wondering if there's a way to fix this
> properly? Any ideas?

Let's examine a basic fact about power management:

The deeper PM modes that the system enters, the higher the latency to
resume operation.

So, I'm not surprised that you have higher latency when you allow the
system to enter lower power modes. Does that mean that the kernel
should not permit entering lower power modes - no, it's policy and
application dependent.

If the hardware is designed to use software to manage the RTS signal
to control the RS485 receiver, then I'm afraid that your report really
does not surprise me - throwing that at software to manage is a really
stupid idea, but it seems lots of people do this. I've held this view
since I worked on a safety critical system that used RS485 back in the
1990s (London Underground Jubilee Line Extension public address system.)

So, what we have here is several things that come together to create a
problem:

1) higher power savings produce higher latency to resume from
2) lack of hardware support for RS485 half duplex communication needing
   software support
3) an application that makes use of RS485 half duplex communication
   without disabling the higher latency power saving modes

The question is, who should disable those higher latency power saving
modes - the kernel, or userspace?

The kernel knows whether it needs to provide software control of the
RTS signal or not, but the kernel does not know the maximum permissible
latency (which is application specific.) So, the kernel doesn't have
all the information it needs. However, there is a QoS subsystem which
may help you.

There's also tweaks available via
/sys/devices/system/cpu/cpu*/power/pm_qos_resume_latency_us

which can be poked to configure the latency that is required, and will
prevent the deeper PM states being entered.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

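As an illustration of the per-CPU tweak mentioned in the reply above, the following sketch writes one resume latency limit to every CPU's `pm_qos_resume_latency_us` file. The sysfs path is the one quoted in the message; the function names and the `CPU_SYSFS` override are hypothetical helpers for this example. As far as the kernel ABI documentation describes it, writing `0` removes the constraint and `n/a` means no resume latency is acceptable at all, but that is worth checking against the kernel version in use.

```shell
#!/bin/sh
# Sketch: set a per-CPU PM QoS resume latency limit via sysfs.
# CPU_SYSFS is a hypothetical override used only to make this testable.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Usage: set_resume_latency_limit <microseconds | n/a>
set_resume_latency_limit() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/power/pm_qos_resume_latency_us; do
        [ -e "$f" ] || continue
        echo "$1" > "$f"
    done
}

# Print the current limit for each CPU as "cpuN value".
show_resume_latency_limit() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/power/pm_qos_resume_latency_us; do
        [ -e "$f" ] || continue
        cpu=${f#"$CPU_SYSFS"/}
        printf '%s %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}
```

For example, `set_resume_latency_limit 100` (as root) asks the idle governor to avoid states whose advertised exit latency exceeds 100 us. The limit only helps if the driver's advertised latencies are realistic.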
* Re: High interrupt latency with low power idle mode on i.MX6
From: Schrempf Frieder @ 2020-05-27 12:50 UTC
To: Russell King - ARM Linux admin
Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano, Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel, linux-kernel, linux-pm

On 27.05.20 13:53, Russell King - ARM Linux admin wrote:
> Let's examine a basic fact about power management:
>
> The deeper PM modes that the system enters, the higher the latency to
> resume operation.
>
> So, I'm not surprised that you have higher latency when you allow the
> system to enter lower power modes. Does that mean that the kernel
> should not permit entering lower power modes - no, it's policy and
> application dependent.
>
> If the hardware is designed to use software to manage the RTS signal
> to control the RS485 receiver, then I'm afraid that your report really
> does not surprise me - throwing that at software to manage is a really
> stupid idea, but it seems lots of people do this. I've held this view
> since I worked on a safety critical system that used RS485 back in the
> 1990s (London Underground Jubilee Line Extension public address system.)
>
> So, what we have here is several things that come together to create a
> problem:
>
> 1) higher power savings produce higher latency to resume from
> 2) lack of hardware support for RS485 half duplex communication needing
>    software support
> 3) an application that makes use of RS485 half duplex communication
>    without disabling the higher latency power saving modes
>
> The question is, who should disable those higher latency power saving
> modes - the kernel, or userspace?
>
> The kernel knows whether it needs to provide software control of the
> RTS signal or not, but the kernel does not know the maximum permissible
> latency (which is application specific.) So, the kernel doesn't have
> all the information it needs. However, there is a QoS subsystem which
> may help you.
>
> There's also tweaks available via
> /sys/devices/system/cpu/cpu*/power/pm_qos_resume_latency_us
>
> which can be poked to configure the latency that is required, and will
> prevent the deeper PM states being entered.

Thanks for the detailed explanation. This all makes perfect sense to me.
I will keep in mind that we need to consider this aspect of power saving
vs. latency when designing systems and also that we need to provide the
information for the kernel to decide which of the two is more important.

Also thanks for pointing out the QoS subsystem. I'm not quite sure if it
would work for us to use pm_qos_resume_latency_us in our specific case.
The actual latency we observe is something like 2 to 3 milliseconds
longer with low power idle than without, but the exit_latency for low
power idle specified in the cpuidle driver is only 300 us.

So as far as I can see with this difference even if we would set
pm_qos_resume_latency_us to 1000 us (which should be fast enough for the
RS485 to work properly), the low power idle wouldn't be disabled.

It's rather this discrepancy between the latency set in the driver and
what we see in reality which makes me wonder if there's something I'm
missing.

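The point made in the message above (a 1000 us limit does not exclude a state that advertises 300 us) can be checked with a small sketch. It assumes, as a simplification of what the cpuidle governors do, that a state stays eligible whenever its advertised exit latency does not exceed the limit. The function name and the `CPU_SYSFS` override are hypothetical; the `latency` and `name` files are the standard cpuidle sysfs attributes.

```shell
#!/bin/sh
# Sketch: show which idle states a given latency limit would still allow,
# based only on the exit latency each state advertises in sysfs.
# Simplified model: allowed if advertised latency <= limit.
# CPU_SYSFS is a hypothetical override used only to make this testable.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Usage: states_allowed_under <cpuN> <limit_us>
# limit_us is treated as a plain number here; it does not model the
# special meaning of 0 in pm_qos_resume_latency_us.
states_allowed_under() {
    for dir in "$CPU_SYSFS/$1"/cpuidle/state[0-9]*; do
        [ -d "$dir" ] || continue
        lat=$(cat "$dir/latency")
        if [ "$lat" -le "$2" ]; then
            verdict=allowed
        else
            verdict=excluded
        fi
        printf '%s %s %sus %s\n' "${dir##*/}" "$(cat "$dir/name")" \
            "$lat" "$verdict"
    done
}
```

With a state advertising 300 us, `states_allowed_under cpu0 1000` reports it as allowed, and only a limit below 300 would exclude it. That matches the concern raised above: the limit is compared against the advertised figure, not against the 2 to 3 ms actually observed.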
* Re: High interrupt latency with low power idle mode on i.MX6
From: Russell King - ARM Linux admin @ 2020-05-27 13:23 UTC
To: Schrempf Frieder
Cc: Shawn Guo, Sascha Hauer, Rafael J. Wysocki, Daniel Lezcano, Kate Stewart, Enrico Weigelt, Thomas Gleixner, linux-arm-kernel, linux-kernel, linux-pm

On Wed, May 27, 2020 at 12:50:01PM +0000, Schrempf Frieder wrote:
> Thanks for the detailed explanation. This all makes perfect sense to me.
> I will keep in mind that we need to consider this aspect of power saving
> vs. latency when designing systems and also that we need to provide the
> information for the kernel to decide which of the two is more important.
>
> Also thanks for pointing out the QoS subsystem. I'm not quite sure if it
> would work for us to use pm_qos_resume_latency_us in our specific case.
> The actual latency we observe is something like 2 to 3 milliseconds
> longer with low power idle than without, but the exit_latency for low
> power idle specified in the cpuidle driver is only 300 us.

I wonder whether the exit latencies are correct in that case. From the
comments, it seems 80us is allowed for the software overhead of
entering/leaving the idle state vs 220us for the hardware.

It may be a good idea for someone to add some tracing points in there
to try and measure the minimum software latencies.

> So as far as I can see with this difference even if we would set
> pm_qos_resume_latency_us to 1000 us (which should be fast enough for the
> RS485 to work properly), the low power idle wouldn't be disabled.
>
> It's rather this discrepancy between the latency set in the driver and
> what we see in reality which makes me wonder if there's something I'm
> missing.

It's possible that there's something missing from the kernel's
estimation of the latency required for entering / exiting those states.
There is an amount of cache flushing that is required when entering
those lower states, and I wonder if that has been accounted for.

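Dedicated trace points inside the i.MX idle code, as suggested above, would need a kernel change. As a first approximation only, the existing `power:cpu_idle` and `irq:irq_handler_entry` trace events can be enabled from userspace to see idle entry/exit and interrupt handler timestamps side by side. This sketch assumes tracefs is mounted at `/sys/kernel/tracing` and that the kernel has these events built in; the function names and the `TRACEFS` override are hypothetical. It shows only the software side of the wake-up, not the time the hardware needs to power the core back up.

```shell
#!/bin/sh
# Sketch: capture idle and IRQ handler events with existing tracepoints.
# TRACEFS is a hypothetical override; older setups may use
# /sys/kernel/debug/tracing instead.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Clear the buffer, enable both events, then start tracing.
start_idle_irq_trace() {
    echo 0 > "$TRACEFS/tracing_on"
    echo > "$TRACEFS/trace"
    echo 1 > "$TRACEFS/events/power/cpu_idle/enable"
    echo 1 > "$TRACEFS/events/irq/irq_handler_entry/enable"
    echo 1 > "$TRACEFS/tracing_on"
}

# Stop tracing, disable the events, and print what was captured.
stop_idle_irq_trace() {
    echo 0 > "$TRACEFS/tracing_on"
    echo 0 > "$TRACEFS/events/power/cpu_idle/enable"
    echo 0 > "$TRACEFS/events/irq/irq_handler_entry/enable"
    cat "$TRACEFS/trace"
}
```

A typical run would call `start_idle_irq_trace` as root, trigger a few RS485 transmissions, then call `stop_idle_irq_trace` and compare the timestamps of the idle exit events with the following UART interrupt handler entries.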