* [PATCH 0/2] sc16is7xx interrupt fixes @ 2018-09-12 14:31 Phil Elwell 2018-09-12 14:31 ` [PATCH 1/2] sc16is7xx: Fix for multi-channel stall Phil Elwell 2018-09-12 14:31 ` [PATCH 2/2] sc16is7xx: Fix for "Unexpected interrupt: 8" Phil Elwell 0 siblings, 2 replies; 9+ messages in thread From: Phil Elwell @ 2018-09-12 14:31 UTC (permalink / raw) To: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren Cc: Phil Elwell The interrupt handling of the sc16is7xx driver is broken in a number of ways, as observed by multiple Raspberry Pi users. The attached patches attempt to address its failings, with apparent success. The first is a workaround for a side-effect of the switch away from using a thread IRQ, a change which has necessitated using what is actually a level-triggered interrupt as if it were edge-triggered. Doing so is fraught with potential race conditions, but the patch makes them much less likely. The second is a workaround for a bug in the design of the SC16IS752 which requires mutual exclusion between the interrupt handler and access to the Enhanced Features Register. Phil Elwell (2): sc16is7xx: Fix for multi-channel stall sc16is7xx: Fix for "Unexpected interrupt: 8" drivers/tty/serial/sc16is7xx.c | 50 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 44 insertions(+), 6 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-12 14:31 [PATCH 0/2] sc16is7xx interrupt fixes Phil Elwell @ 2018-09-12 14:31 ` Phil Elwell 2018-09-18 13:02 ` Greg Kroah-Hartman 2018-09-29 14:04 ` Andreas Färber 2018-09-12 14:31 ` [PATCH 2/2] sc16is7xx: Fix for "Unexpected interrupt: 8" Phil Elwell 1 sibling, 2 replies; 9+ messages in thread From: Phil Elwell @ 2018-09-12 14:31 UTC (permalink / raw) To: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren Cc: Phil Elwell The SC16IS752 is a dual-channel device. The two channels are largely independent, but the IRQ signals are wired together as an open-drain, active low signal which will be driven low while either of the channels requires attention, which can be for significant periods of time until operations complete and the interrupt can be acknowledged. In that respect it is should be treated as a true level-sensitive IRQ. The kernel, however, needs to be able to exit interrupt context in order to use I2C or SPI to access the device registers (which may involve sleeping). Therefore the interrupt needs to be masked out or paused in some way. The usual way to manage sleeping from within an interrupt handler is to use a threaded interrupt handler - a regular interrupt routine does the minimum amount of work needed to triage the interrupt before waking the interrupt service thread. If the threaded IRQ is marked as IRQF_ONESHOT the kernel will automatically mask out the interrupt until the thread runs to completion. The sc16is7xx driver used to use a threaded IRQ, but a patch switched to using a kthread_worker in order to set realtime priorities on the handler thread and for other optimisations. The end result is non-threaded IRQ that schedules some work then returns IRQ_HANDLED, making the kernel think that all IRQ processing has completed. The work-around to prevent a constant stream of interrupts is to mark the interrupt as edge-sensitive rather than level-sensitive, but interpreting an active-low source as a falling-edge source requires care to prevent a total cessation of interrupts. Whereas an edge-triggering source will generate a new edge for every interrupt condition a level-triggering source will keep the signal at the interrupting level until it no longer requires attention; in other words, the host won't see another edge until all interrupt conditions are cleared. It is therefore vital that the interrupt handler does not exit with an outstanding interrupt condition, otherwise the kernel will not receive another interrupt unless some other operation causes the interrupt state on the device to be cleared. The existing sc16is7xx driver has a very simple interrupt "thread" (kthread_work job) that processes interrupts on each channel in turn until there are no more. If both channels are active and the first channel starts interrupting while the handler for the second channel is running then it will not be detected and an IRQ stall ensues. This could be handled easily if there was a shared IRQ status register, or a convenient way to determine if the IRQ had been deasserted for any length of time, but both appear to be lacking. Avoid this problem (or at least make it much less likely to happen) by reducing the granularity of per-channel interrupt processing to one condition per iteration, only exiting the overall loop when both channels are no longer interrupting. Signed-off-by: Phil Elwell <phil@raspberrypi.org> --- drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index 243c960..47b4115 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -657,7 +657,7 @@ static void sc16is7xx_handle_tx(struct uart_port *port) uart_write_wakeup(port); } -static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) +static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) { struct uart_port *port = &s->p[portno].port; @@ -666,7 +666,7 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG); if (iir & SC16IS7XX_IIR_NO_INT_BIT) - break; + return false; iir &= SC16IS7XX_IIR_ID_MASK; @@ -688,16 +688,23 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) port->line, iir); break; } - } while (1); + } while (0); + return true; } static void sc16is7xx_ist(struct kthread_work *ws) { struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work); - int i; - for (i = 0; i < s->devtype->nr_uart; ++i) - sc16is7xx_port_irq(s, i); + while (1) { + bool keep_polling = false; + int i; + + for (i = 0; i < s->devtype->nr_uart; ++i) + keep_polling |= sc16is7xx_port_irq(s, i); + if (!keep_polling) + break; + } } static irqreturn_t sc16is7xx_irq(int irq, void *dev_id) -- 2.7.4 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-12 14:31 ` [PATCH 1/2] sc16is7xx: Fix for multi-channel stall Phil Elwell @ 2018-09-18 13:02 ` Greg Kroah-Hartman 2018-09-18 13:13 ` Phil Elwell 2018-09-29 14:04 ` Andreas Färber 1 sibling, 1 reply; 9+ messages in thread From: Greg Kroah-Hartman @ 2018-09-18 13:02 UTC (permalink / raw) To: Phil Elwell Cc: Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren On Wed, Sep 12, 2018 at 03:31:55PM +0100, Phil Elwell wrote: > The SC16IS752 is a dual-channel device. The two channels are largely > independent, but the IRQ signals are wired together as an open-drain, > active low signal which will be driven low while either of the > channels requires attention, which can be for significant periods of > time until operations complete and the interrupt can be acknowledged. > In that respect it is should be treated as a true level-sensitive IRQ. > > The kernel, however, needs to be able to exit interrupt context in > order to use I2C or SPI to access the device registers (which may > involve sleeping). Therefore the interrupt needs to be masked out or > paused in some way. > > The usual way to manage sleeping from within an interrupt handler > is to use a threaded interrupt handler - a regular interrupt routine > does the minimum amount of work needed to triage the interrupt before > waking the interrupt service thread. If the threaded IRQ is marked as > IRQF_ONESHOT the kernel will automatically mask out the interrupt > until the thread runs to completion. The sc16is7xx driver used to > use a threaded IRQ, but a patch switched to using a kthread_worker > in order to set realtime priorities on the handler thread and for > other optimisations. The end result is non-threaded IRQ that > schedules some work then returns IRQ_HANDLED, making the kernel > think that all IRQ processing has completed. > > The work-around to prevent a constant stream of interrupts is to > mark the interrupt as edge-sensitive rather than level-sensitive, > but interpreting an active-low source as a falling-edge source > requires care to prevent a total cessation of interrupts. Whereas > an edge-triggering source will generate a new edge for every interrupt > condition a level-triggering source will keep the signal at the > interrupting level until it no longer requires attention; in other > words, the host won't see another edge until all interrupt conditions > are cleared. It is therefore vital that the interrupt handler does not > exit with an outstanding interrupt condition, otherwise the kernel > will not receive another interrupt unless some other operation causes > the interrupt state on the device to be cleared. > > The existing sc16is7xx driver has a very simple interrupt "thread" > (kthread_work job) that processes interrupts on each channel in turn > until there are no more. If both channels are active and the first > channel starts interrupting while the handler for the second channel > is running then it will not be detected and an IRQ stall ensues. This > could be handled easily if there was a shared IRQ status register, or > a convenient way to determine if the IRQ had been deasserted for any > length of time, but both appear to be lacking. > > Avoid this problem (or at least make it much less likely to happen) > by reducing the granularity of per-channel interrupt processing > to one condition per iteration, only exiting the overall loop when > both channels are no longer interrupting. > > Signed-off-by: Phil Elwell <phil@raspberrypi.org> > --- > drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c > index 243c960..47b4115 100644 > --- a/drivers/tty/serial/sc16is7xx.c > +++ b/drivers/tty/serial/sc16is7xx.c > @@ -657,7 +657,7 @@ static void sc16is7xx_handle_tx(struct uart_port *port) > uart_write_wakeup(port); > } > > -static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > +static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > { > struct uart_port *port = &s->p[portno].port; > > @@ -666,7 +666,7 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > > iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG); > if (iir & SC16IS7XX_IIR_NO_INT_BIT) > - break; > + return false; > > iir &= SC16IS7XX_IIR_ID_MASK; > > @@ -688,16 +688,23 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > port->line, iir); > break; > } > - } while (1); > + } while (0); > + return true; > } > > static void sc16is7xx_ist(struct kthread_work *ws) > { > struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work); > - int i; > > - for (i = 0; i < s->devtype->nr_uart; ++i) > - sc16is7xx_port_irq(s, i); > + while (1) { > + bool keep_polling = false; > + int i; > + > + for (i = 0; i < s->devtype->nr_uart; ++i) > + keep_polling |= sc16is7xx_port_irq(s, i); > + if (!keep_polling) > + break; This makes me worried, there is no "timeout" now? What happens if this never happens, will you just sit and spin forever? What prevents that? thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-18 13:02 ` Greg Kroah-Hartman @ 2018-09-18 13:13 ` Phil Elwell 2018-09-18 13:26 ` Greg Kroah-Hartman 2018-09-18 13:37 ` Rogier Wolff 0 siblings, 2 replies; 9+ messages in thread From: Phil Elwell @ 2018-09-18 13:13 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren Hi Greg, On 18/09/2018 14:02, Greg Kroah-Hartman wrote: > On Wed, Sep 12, 2018 at 03:31:55PM +0100, Phil Elwell wrote: >> The SC16IS752 is a dual-channel device. The two channels are largely >> independent, but the IRQ signals are wired together as an open-drain, >> active low signal which will be driven low while either of the >> channels requires attention, which can be for significant periods of >> time until operations complete and the interrupt can be acknowledged. >> In that respect it is should be treated as a true level-sensitive IRQ. >> >> The kernel, however, needs to be able to exit interrupt context in >> order to use I2C or SPI to access the device registers (which may >> involve sleeping). Therefore the interrupt needs to be masked out or >> paused in some way. >> >> The usual way to manage sleeping from within an interrupt handler >> is to use a threaded interrupt handler - a regular interrupt routine >> does the minimum amount of work needed to triage the interrupt before >> waking the interrupt service thread. If the threaded IRQ is marked as >> IRQF_ONESHOT the kernel will automatically mask out the interrupt >> until the thread runs to completion. The sc16is7xx driver used to >> use a threaded IRQ, but a patch switched to using a kthread_worker >> in order to set realtime priorities on the handler thread and for >> other optimisations. The end result is non-threaded IRQ that >> schedules some work then returns IRQ_HANDLED, making the kernel >> think that all IRQ processing has completed. >> >> The work-around to prevent a constant stream of interrupts is to >> mark the interrupt as edge-sensitive rather than level-sensitive, >> but interpreting an active-low source as a falling-edge source >> requires care to prevent a total cessation of interrupts. Whereas >> an edge-triggering source will generate a new edge for every interrupt >> condition a level-triggering source will keep the signal at the >> interrupting level until it no longer requires attention; in other >> words, the host won't see another edge until all interrupt conditions >> are cleared. It is therefore vital that the interrupt handler does not >> exit with an outstanding interrupt condition, otherwise the kernel >> will not receive another interrupt unless some other operation causes >> the interrupt state on the device to be cleared. >> >> The existing sc16is7xx driver has a very simple interrupt "thread" >> (kthread_work job) that processes interrupts on each channel in turn >> until there are no more. If both channels are active and the first >> channel starts interrupting while the handler for the second channel >> is running then it will not be detected and an IRQ stall ensues. This >> could be handled easily if there was a shared IRQ status register, or >> a convenient way to determine if the IRQ had been deasserted for any >> length of time, but both appear to be lacking. >> >> Avoid this problem (or at least make it much less likely to happen) >> by reducing the granularity of per-channel interrupt processing >> to one condition per iteration, only exiting the overall loop when >> both channels are no longer interrupting. >> >> Signed-off-by: Phil Elwell <phil@raspberrypi.org> >> --- >> drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ >> 1 file changed, 13 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c >> index 243c960..47b4115 100644 >> --- a/drivers/tty/serial/sc16is7xx.c >> +++ b/drivers/tty/serial/sc16is7xx.c >> @@ -657,7 +657,7 @@ static void sc16is7xx_handle_tx(struct uart_port *port) >> uart_write_wakeup(port); >> } >> >> -static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) >> +static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) >> { >> struct uart_port *port = &s->p[portno].port; >> >> @@ -666,7 +666,7 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) >> >> iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG); >> if (iir & SC16IS7XX_IIR_NO_INT_BIT) >> - break; >> + return false; >> >> iir &= SC16IS7XX_IIR_ID_MASK; >> >> @@ -688,16 +688,23 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) >> port->line, iir); >> break; >> } >> - } while (1); >> + } while (0); >> + return true; >> } >> >> static void sc16is7xx_ist(struct kthread_work *ws) >> { >> struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work); >> - int i; >> >> - for (i = 0; i < s->devtype->nr_uart; ++i) >> - sc16is7xx_port_irq(s, i); >> + while (1) { >> + bool keep_polling = false; >> + int i; >> + >> + for (i = 0; i < s->devtype->nr_uart; ++i) >> + keep_polling |= sc16is7xx_port_irq(s, i); >> + if (!keep_polling) >> + break; > > This makes me worried, there is no "timeout" now? What happens if this > never happens, will you just sit and spin forever? What prevents that? The patch is keeping to the spirit of the original, which also has a potentially infinite loop (in sc16is7xx_port_irq) - this just moves it up one level. I could add a limit on the number of iterations, but if the limit is ever hit, leading to an early exit, the port is basically dead because it will never receive another interrupt. Phil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-18 13:13 ` Phil Elwell @ 2018-09-18 13:26 ` Greg Kroah-Hartman 2018-09-18 13:37 ` Rogier Wolff 1 sibling, 0 replies; 9+ messages in thread From: Greg Kroah-Hartman @ 2018-09-18 13:26 UTC (permalink / raw) To: Phil Elwell Cc: Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren On Tue, Sep 18, 2018 at 02:13:15PM +0100, Phil Elwell wrote: > Hi Greg, > > On 18/09/2018 14:02, Greg Kroah-Hartman wrote: > > On Wed, Sep 12, 2018 at 03:31:55PM +0100, Phil Elwell wrote: > >> The SC16IS752 is a dual-channel device. The two channels are largely > >> independent, but the IRQ signals are wired together as an open-drain, > >> active low signal which will be driven low while either of the > >> channels requires attention, which can be for significant periods of > >> time until operations complete and the interrupt can be acknowledged. > >> In that respect it is should be treated as a true level-sensitive IRQ. > >> > >> The kernel, however, needs to be able to exit interrupt context in > >> order to use I2C or SPI to access the device registers (which may > >> involve sleeping). Therefore the interrupt needs to be masked out or > >> paused in some way. > >> > >> The usual way to manage sleeping from within an interrupt handler > >> is to use a threaded interrupt handler - a regular interrupt routine > >> does the minimum amount of work needed to triage the interrupt before > >> waking the interrupt service thread. If the threaded IRQ is marked as > >> IRQF_ONESHOT the kernel will automatically mask out the interrupt > >> until the thread runs to completion. The sc16is7xx driver used to > >> use a threaded IRQ, but a patch switched to using a kthread_worker > >> in order to set realtime priorities on the handler thread and for > >> other optimisations. The end result is non-threaded IRQ that > >> schedules some work then returns IRQ_HANDLED, making the kernel > >> think that all IRQ processing has completed. > >> > >> The work-around to prevent a constant stream of interrupts is to > >> mark the interrupt as edge-sensitive rather than level-sensitive, > >> but interpreting an active-low source as a falling-edge source > >> requires care to prevent a total cessation of interrupts. Whereas > >> an edge-triggering source will generate a new edge for every interrupt > >> condition a level-triggering source will keep the signal at the > >> interrupting level until it no longer requires attention; in other > >> words, the host won't see another edge until all interrupt conditions > >> are cleared. It is therefore vital that the interrupt handler does not > >> exit with an outstanding interrupt condition, otherwise the kernel > >> will not receive another interrupt unless some other operation causes > >> the interrupt state on the device to be cleared. > >> > >> The existing sc16is7xx driver has a very simple interrupt "thread" > >> (kthread_work job) that processes interrupts on each channel in turn > >> until there are no more. If both channels are active and the first > >> channel starts interrupting while the handler for the second channel > >> is running then it will not be detected and an IRQ stall ensues. This > >> could be handled easily if there was a shared IRQ status register, or > >> a convenient way to determine if the IRQ had been deasserted for any > >> length of time, but both appear to be lacking. > >> > >> Avoid this problem (or at least make it much less likely to happen) > >> by reducing the granularity of per-channel interrupt processing > >> to one condition per iteration, only exiting the overall loop when > >> both channels are no longer interrupting. > >> > >> Signed-off-by: Phil Elwell <phil@raspberrypi.org> > >> --- > >> drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ > >> 1 file changed, 13 insertions(+), 6 deletions(-) > >> > >> diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c > >> index 243c960..47b4115 100644 > >> --- a/drivers/tty/serial/sc16is7xx.c > >> +++ b/drivers/tty/serial/sc16is7xx.c > >> @@ -657,7 +657,7 @@ static void sc16is7xx_handle_tx(struct uart_port *port) > >> uart_write_wakeup(port); > >> } > >> > >> -static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > >> +static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > >> { > >> struct uart_port *port = &s->p[portno].port; > >> > >> @@ -666,7 +666,7 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > >> > >> iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG); > >> if (iir & SC16IS7XX_IIR_NO_INT_BIT) > >> - break; > >> + return false; > >> > >> iir &= SC16IS7XX_IIR_ID_MASK; > >> > >> @@ -688,16 +688,23 @@ static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno) > >> port->line, iir); > >> break; > >> } > >> - } while (1); > >> + } while (0); > >> + return true; > >> } > >> > >> static void sc16is7xx_ist(struct kthread_work *ws) > >> { > >> struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work); > >> - int i; > >> > >> - for (i = 0; i < s->devtype->nr_uart; ++i) > >> - sc16is7xx_port_irq(s, i); > >> + while (1) { > >> + bool keep_polling = false; > >> + int i; > >> + > >> + for (i = 0; i < s->devtype->nr_uart; ++i) > >> + keep_polling |= sc16is7xx_port_irq(s, i); > >> + if (!keep_polling) > >> + break; > > > > This makes me worried, there is no "timeout" now? What happens if this > > never happens, will you just sit and spin forever? What prevents that? > > The patch is keeping to the spirit of the original, which also has a > potentially infinite loop (in sc16is7xx_port_irq) - this just moves it > up one level. > > I could add a limit on the number of iterations, but if the limit is ever hit, > leading to an early exit, the port is basically dead because it will never > receive another interrupt. Ok, it's your hardware, good luck with it! :) greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-18 13:13 ` Phil Elwell 2018-09-18 13:26 ` Greg Kroah-Hartman @ 2018-09-18 13:37 ` Rogier Wolff 1 sibling, 0 replies; 9+ messages in thread From: Rogier Wolff @ 2018-09-18 13:37 UTC (permalink / raw) To: Phil Elwell Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren On Tue, Sep 18, 2018 at 02:13:15PM +0100, Phil Elwell wrote: > I could add a limit on the number of iterations, but if the limit is ever hit, > leading to an early exit, the port is basically dead because it will never > receive another interrupt. Especially if you print something like: "<driver name>: Too many iterations with work-to-do bailing out" while bailing out, then hanging just one driver/piece of hardware as opposed to the whole system when somehow the hardware never indicates "all work done" would be preferable. Under normal circumstances you never expect to hit that number of iterations. But if the card keeps hitting the driver with "more work to do" then you'd hang the system. Better try and recover, and provide debug info for the user who knows where to look. Best would be to ignore the driver for say a second and start handling interrupts again a while later. Should the system be overloaded with work, (and a slow CPU?) then you can recover and just make things slow down a bit without hanging the system. Getting edge-triggered interrupts right is VERY difficult. In general I'd advise against them. It looks like a nice solution, but in reality the chances for difficult-to-debug race conditions is enormous. In those race conditions the card will get "new work to do" and (re-)assert the interrupt line when the driver is already on the "no more work to do" path. Roger. -- ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 ** ** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 ** *-- BitWizard writes Linux device drivers for any device you may have! --* The plan was simple, like my brother-in-law Phil. But unlike Phil, this plan just might work. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-12 14:31 ` [PATCH 1/2] sc16is7xx: Fix for multi-channel stall Phil Elwell 2018-09-18 13:02 ` Greg Kroah-Hartman @ 2018-09-29 14:04 ` Andreas Färber 2018-10-02 20:36 ` Greg Kroah-Hartman 1 sibling, 1 reply; 9+ messages in thread From: Andreas Färber @ 2018-09-29 14:04 UTC (permalink / raw) To: Phil Elwell, Greg Kroah-Hartman Cc: Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren, Mian Yousaf Kaukab, Matthias Brugger, Michael Allwright, Jakub Kicinski, Xue Liu Hi Phil and Greg, Am 12.09.18 um 16:31 schrieb Phil Elwell: > The SC16IS752 is a dual-channel device. The two channels are largely > independent, but the IRQ signals are wired together as an open-drain, > active low signal which will be driven low while either of the > channels requires attention, which can be for significant periods of > time until operations complete and the interrupt can be acknowledged. > In that respect it is should be treated as a true level-sensitive IRQ. > > The kernel, however, needs to be able to exit interrupt context in > order to use I2C or SPI to access the device registers (which may > involve sleeping). Therefore the interrupt needs to be masked out or > paused in some way. > > The usual way to manage sleeping from within an interrupt handler > is to use a threaded interrupt handler - a regular interrupt routine > does the minimum amount of work needed to triage the interrupt before > waking the interrupt service thread. If the threaded IRQ is marked as > IRQF_ONESHOT the kernel will automatically mask out the interrupt > until the thread runs to completion. The sc16is7xx driver used to > use a threaded IRQ, but a patch switched to using a kthread_worker > in order to set realtime priorities on the handler thread and for > other optimisations. The end result is non-threaded IRQ that > schedules some work then returns IRQ_HANDLED, making the kernel > think that all IRQ processing has completed. > > The work-around to prevent a constant stream of interrupts is to > mark the interrupt as edge-sensitive rather than level-sensitive, > but interpreting an active-low source as a falling-edge source > requires care to prevent a total cessation of interrupts. Whereas > an edge-triggering source will generate a new edge for every interrupt > condition a level-triggering source will keep the signal at the > interrupting level until it no longer requires attention; in other > words, the host won't see another edge until all interrupt conditions > are cleared. It is therefore vital that the interrupt handler does not > exit with an outstanding interrupt condition, otherwise the kernel > will not receive another interrupt unless some other operation causes > the interrupt state on the device to be cleared. > > The existing sc16is7xx driver has a very simple interrupt "thread" > (kthread_work job) that processes interrupts on each channel in turn > until there are no more. If both channels are active and the first > channel starts interrupting while the handler for the second channel > is running then it will not be detected and an IRQ stall ensues. This > could be handled easily if there was a shared IRQ status register, or > a convenient way to determine if the IRQ had been deasserted for any > length of time, but both appear to be lacking. > > Avoid this problem (or at least make it much less likely to happen) > by reducing the granularity of per-channel interrupt processing > to one condition per iteration, only exiting the overall loop when > both channels are no longer interrupting. > > Signed-off-by: Phil Elwell <phil@raspberrypi.org> > --- > drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) These two patches seem to be applied in linux-next tree, but are lacking a Fixes: header for backporting to affected stable trees. openSUSE Tumbleweed's 4.18 appears to be affected, and I didn't see it in linux.git for upcoming 4.19 either. Can the commit message still be updated to get this fixed everywhere? Which is the offending commit, this one from 2016? The only other 2018 change seems an unrelated error handling cleanup. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/tty/serial/sc16is7xx.c?id=04da73803c05dc1150ccc31cbf93e8cd56679c09 Also, the above commit message confuses me as to how after this commit the driver should be used: Should the interrupt still be falling-edge, as documented in the DT bindings, or do user/documentation need to switch to level-triggered now? Thanks, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall 2018-09-29 14:04 ` Andreas Färber @ 2018-10-02 20:36 ` Greg Kroah-Hartman 0 siblings, 0 replies; 9+ messages in thread From: Greg Kroah-Hartman @ 2018-10-02 20:36 UTC (permalink / raw) To: Andreas Färber Cc: Phil Elwell, Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren, Mian Yousaf Kaukab, Matthias Brugger, Michael Allwright, Jakub Kicinski, Xue Liu On Sat, Sep 29, 2018 at 04:04:48PM +0200, Andreas Färber wrote: > Hi Phil and Greg, > > Am 12.09.18 um 16:31 schrieb Phil Elwell: > > The SC16IS752 is a dual-channel device. The two channels are largely > > independent, but the IRQ signals are wired together as an open-drain, > > active low signal which will be driven low while either of the > > channels requires attention, which can be for significant periods of > > time until operations complete and the interrupt can be acknowledged. > > In that respect it is should be treated as a true level-sensitive IRQ. > > > > The kernel, however, needs to be able to exit interrupt context in > > order to use I2C or SPI to access the device registers (which may > > involve sleeping). Therefore the interrupt needs to be masked out or > > paused in some way. > > > > The usual way to manage sleeping from within an interrupt handler > > is to use a threaded interrupt handler - a regular interrupt routine > > does the minimum amount of work needed to triage the interrupt before > > waking the interrupt service thread. If the threaded IRQ is marked as > > IRQF_ONESHOT the kernel will automatically mask out the interrupt > > until the thread runs to completion. The sc16is7xx driver used to > > use a threaded IRQ, but a patch switched to using a kthread_worker > > in order to set realtime priorities on the handler thread and for > > other optimisations. The end result is non-threaded IRQ that > > schedules some work then returns IRQ_HANDLED, making the kernel > > think that all IRQ processing has completed. > > > > The work-around to prevent a constant stream of interrupts is to > > mark the interrupt as edge-sensitive rather than level-sensitive, > > but interpreting an active-low source as a falling-edge source > > requires care to prevent a total cessation of interrupts. Whereas > > an edge-triggering source will generate a new edge for every interrupt > > condition a level-triggering source will keep the signal at the > > interrupting level until it no longer requires attention; in other > > words, the host won't see another edge until all interrupt conditions > > are cleared. It is therefore vital that the interrupt handler does not > > exit with an outstanding interrupt condition, otherwise the kernel > > will not receive another interrupt unless some other operation causes > > the interrupt state on the device to be cleared. > > > > The existing sc16is7xx driver has a very simple interrupt "thread" > > (kthread_work job) that processes interrupts on each channel in turn > > until there are no more. If both channels are active and the first > > channel starts interrupting while the handler for the second channel > > is running then it will not be detected and an IRQ stall ensues. This > > could be handled easily if there was a shared IRQ status register, or > > a convenient way to determine if the IRQ had been deasserted for any > > length of time, but both appear to be lacking. > > > > Avoid this problem (or at least make it much less likely to happen) > > by reducing the granularity of per-channel interrupt processing > > to one condition per iteration, only exiting the overall loop when > > both channels are no longer interrupting. > > > > Signed-off-by: Phil Elwell <phil@raspberrypi.org> > > --- > > drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------ > > 1 file changed, 13 insertions(+), 6 deletions(-) > > These two patches seem to be applied in linux-next tree, but are lacking > a Fixes: header for backporting to affected stable trees. > > openSUSE Tumbleweed's 4.18 appears to be affected, and I didn't see it > in linux.git for upcoming 4.19 either. > > Can the commit message still be updated to get this fixed everywhere? I can't change the tree, sorry, I can not rebase. If you want these applied to the stable tree, please email stable@vger.kernel.org when they hit Linus's tree with the git commit ids and I will be glad to queue them up then. thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] sc16is7xx: Fix for "Unexpected interrupt: 8" 2018-09-12 14:31 [PATCH 0/2] sc16is7xx interrupt fixes Phil Elwell 2018-09-12 14:31 ` [PATCH 1/2] sc16is7xx: Fix for multi-channel stall Phil Elwell @ 2018-09-12 14:31 ` Phil Elwell 1 sibling, 0 replies; 9+ messages in thread From: Phil Elwell @ 2018-09-12 14:31 UTC (permalink / raw) To: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel, Alexander Graf, Stefan Wahren Cc: Phil Elwell The SC16IS752 has an Enhanced Feature Register which is aliased at the same address as the Interrupt Identification Register; accessing it requires that a magic value is written to the Line Configuration Register. If an interrupt is raised while the EFR is mapped in then the ISR won't be able to access the IIR, leading to the "Unexpected interrupt" error messages. Avoid the problem by claiming a mutex around accesses to the EFR register, also claiming the mutex in the interrupt handler work item (this is equivalent to disabling interrupts to interlock against a non-threaded interrupt handler). See: https://github.com/raspberrypi/linux/issues/2529 Signed-off-by: Phil Elwell <phil@raspberrypi.org> --- drivers/tty/serial/sc16is7xx.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index 47b4115..2680986 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -328,6 +328,7 @@ struct sc16is7xx_port { struct kthread_worker kworker; struct task_struct *kworker_task; struct kthread_work irq_work; + struct mutex efr_lock; struct sc16is7xx_one p[0]; }; @@ -499,6 +500,21 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud) div /= 4; } + /* In an amazing feat of design, the Enhanced Features Register shares + * the address of the Interrupt Identification Register, and is + * switched in by writing a magic value (0xbf) to the Line Control + * Register. Any interrupt firing during this time will see the EFR + * where it expects the IIR to be, leading to "Unexpected interrupt" + * messages. + * + * Prevent this possibility by claiming a mutex while accessing the + * EFR, and claiming the same mutex from within the interrupt handler. + * This is similar to disabling the interrupt, but that doesn't work + * because the bulk of the interrupt processing is run as a workqueue + * job in thread context. + */ + mutex_lock(&s->efr_lock); + lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG); /* Open the LCR divisors for configuration */ @@ -514,6 +530,8 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud) /* Put LCR back to the normal mode */ sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr); + mutex_unlock(&s->efr_lock); + sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_CLKSEL_BIT, prescaler); @@ -696,6 +714,8 @@ static void sc16is7xx_ist(struct kthread_work *ws) { struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work); + mutex_lock(&s->efr_lock); + while (1) { bool keep_polling = false; int i; @@ -705,6 +725,8 @@ static void sc16is7xx_ist(struct kthread_work *ws) if (!keep_polling) break; } + + mutex_unlock(&s->efr_lock); } static irqreturn_t sc16is7xx_irq(int irq, void *dev_id) @@ -899,6 +921,9 @@ static void sc16is7xx_set_termios(struct uart_port *port, if (!(termios->c_cflag & CREAD)) port->ignore_status_mask |= SC16IS7XX_LSR_BRK_ERROR_MASK; + /* As above, claim the mutex while accessing the EFR. */ + mutex_lock(&s->efr_lock); + sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, SC16IS7XX_LCR_CONF_MODE_B); @@ -920,6 +945,8 @@ static void sc16is7xx_set_termios(struct uart_port *port, /* Update LCR register */ sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr); + mutex_unlock(&s->efr_lock); + /* Get baud rate generator configuration */ baud = uart_get_baud_rate(port, termios, old, port->uartclk / 16 / 4 / 0xffff, @@ -1185,6 +1212,7 @@ static int sc16is7xx_probe(struct device *dev, s->regmap = regmap; s->devtype = devtype; dev_set_drvdata(dev, s); + mutex_init(&s->efr_lock); kthread_init_worker(&s->kworker); kthread_init_work(&s->irq_work, sc16is7xx_ist); -- 2.7.4 ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-10-02 20:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-09-12 14:31 [PATCH 0/2] sc16is7xx interrupt fixes Phil Elwell 2018-09-12 14:31 ` [PATCH 1/2] sc16is7xx: Fix for multi-channel stall Phil Elwell 2018-09-18 13:02 ` Greg Kroah-Hartman 2018-09-18 13:13 ` Phil Elwell 2018-09-18 13:26 ` Greg Kroah-Hartman 2018-09-18 13:37 ` Rogier Wolff 2018-09-29 14:04 ` Andreas Färber 2018-10-02 20:36 ` Greg Kroah-Hartman 2018-09-12 14:31 ` [PATCH 2/2] sc16is7xx: Fix for "Unexpected interrupt: 8" Phil Elwell
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.