* [PATCH v2 0/2] tty: serial: msm_serial: Fix lockup issues @ 2019-11-27 14:15 Leo Yan 2019-11-27 14:15 ` [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops Leo Yan 2019-11-27 14:15 ` [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output Leo Yan 0 siblings, 2 replies; 13+ messages in thread From: Leo Yan @ 2019-11-27 14:15 UTC (permalink / raw) To: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, Jeffrey Hugo, linux-arm-msm, linux-serial, linux-kernel Cc: Leo Yan This patch set is to address two msm serial driver's lockup issues. The first lockup issue is a well known and common issue which is caused by recursive locking between normal printing and the kernel debugging facilities (e.g. sysrq and oops). Patch 0001 follows up other drivers general approach to fix this lockup issue. The second lockup issue is related with msm serial driver's specific implementation. Since the serial driver invokes dma related operations after has acquired spinlock, then the dma functions might allocat dma descriptors and when the system has very less free memory the kernel tries to print out warning, this leads to recursive output and causes deadlock. Patch 0002 is used to resolve this deadlock issue. These two patches have been tested on DB410c with backported on 4.14.96, they also have been verified with mainline kernel for boot testing. Changes from v1: * Added 'Fixes' tags for two patches (Jeffrey Hugo). * Added cover letter for more clear context description (Jeffrey Hugo). Leo Yan (2): tty: serial: msm_serial: Fix lockup for sysrq and oops tty: serial: msm_serial: Fix deadlock caused by recursive output drivers/tty/serial/msm_serial.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops 2019-11-27 14:15 [PATCH v2 0/2] tty: serial: msm_serial: Fix lockup issues Leo Yan @ 2019-11-27 14:15 ` Leo Yan 2019-12-02 16:37 ` Jeffrey Hugo 2019-12-05 0:40 ` Doug Anderson 2019-11-27 14:15 ` [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output Leo Yan 1 sibling, 2 replies; 13+ messages in thread From: Leo Yan @ 2019-11-27 14:15 UTC (permalink / raw) To: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, Jeffrey Hugo, linux-arm-msm, linux-serial, linux-kernel Cc: Leo Yan As the commit 677fe555cbfb ("serial: imx: Fix recursive locking bug") has mentioned the uart driver might cause recursive locking between normal printing and the kernel debugging facilities (e.g. sysrq and oops). In the commit it gave out suggestion for fixing recursive locking issue: "The solution is to avoid locking in the sysrq case and trylock in the oops_in_progress case." This patch follows the suggestion (also used the exactly same code with other serial drivers, e.g. amba-pl011.c) to fix the recursive locking issue, this can avoid stuck caused by deadlock and print out log for sysrq and oops. Fixes: 04896a77a97b ("msm_serial: serial driver for MSM7K onboard serial peripheral.") Signed-off-by: Leo Yan <leo.yan@linaro.org> --- drivers/tty/serial/msm_serial.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c index 3657a24913fc..889538182e83 100644 --- a/drivers/tty/serial/msm_serial.c +++ b/drivers/tty/serial/msm_serial.c @@ -1576,6 +1576,7 @@ static void __msm_console_write(struct uart_port *port, const char *s, int num_newlines = 0; bool replaced = false; void __iomem *tf; + int locked = 1; if (is_uartdm) tf = port->membase + UARTDM_TF; @@ -1588,7 +1589,13 @@ static void __msm_console_write(struct uart_port *port, const char *s, num_newlines++; count += num_newlines; - spin_lock(&port->lock); + if (port->sysrq) + locked = 0; + else if (oops_in_progress) + locked = spin_trylock(&port->lock); + else + spin_lock(&port->lock); + if (is_uartdm) msm_reset_dm_count(port, count); @@ -1624,7 +1631,9 @@ static void __msm_console_write(struct uart_port *port, const char *s, iowrite32_rep(tf, buf, 1); i += num_chars; } - spin_unlock(&port->lock); + + if (locked) + spin_unlock(&port->lock); } static void msm_console_write(struct console *co, const char *s, -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops 2019-11-27 14:15 ` [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops Leo Yan @ 2019-12-02 16:37 ` Jeffrey Hugo 2019-12-05 0:40 ` Doug Anderson 1 sibling, 0 replies; 13+ messages in thread From: Jeffrey Hugo @ 2019-12-02 16:37 UTC (permalink / raw) To: Leo Yan Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Wed, Nov 27, 2019 at 7:16 AM Leo Yan <leo.yan@linaro.org> wrote: > > As the commit 677fe555cbfb ("serial: imx: Fix recursive locking bug") > has mentioned the uart driver might cause recursive locking between > normal printing and the kernel debugging facilities (e.g. sysrq and > oops). In the commit it gave out suggestion for fixing recursive > locking issue: "The solution is to avoid locking in the sysrq case > and trylock in the oops_in_progress case." > > This patch follows the suggestion (also used the exactly same code with > other serial drivers, e.g. amba-pl011.c) to fix the recursive locking > issue, this can avoid stuck caused by deadlock and print out log for > sysrq and oops. > > Fixes: 04896a77a97b ("msm_serial: serial driver for MSM7K onboard serial peripheral.") > Signed-off-by: Leo Yan <leo.yan@linaro.org> Looks sane to me Reviewed-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> > --- > drivers/tty/serial/msm_serial.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > index 3657a24913fc..889538182e83 100644 > --- a/drivers/tty/serial/msm_serial.c > +++ b/drivers/tty/serial/msm_serial.c > @@ -1576,6 +1576,7 @@ static void __msm_console_write(struct uart_port *port, const char *s, > int num_newlines = 0; > bool replaced = false; > void __iomem *tf; > + int locked = 1; > > if (is_uartdm) > tf = port->membase + UARTDM_TF; > @@ -1588,7 +1589,13 @@ static void __msm_console_write(struct uart_port *port, const char *s, > num_newlines++; > count += num_newlines; > > - spin_lock(&port->lock); > + if (port->sysrq) > + locked = 0; > + else if (oops_in_progress) > + locked = spin_trylock(&port->lock); > + else > + spin_lock(&port->lock); > + > if (is_uartdm) > msm_reset_dm_count(port, count); > > @@ -1624,7 +1631,9 @@ static void __msm_console_write(struct uart_port *port, const char *s, > iowrite32_rep(tf, buf, 1); > i += num_chars; > } > - spin_unlock(&port->lock); > + > + if (locked) > + spin_unlock(&port->lock); > } > > static void msm_console_write(struct console *co, const char *s, > -- > 2.17.1 > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops 2019-11-27 14:15 ` [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops Leo Yan 2019-12-02 16:37 ` Jeffrey Hugo @ 2019-12-05 0:40 ` Doug Anderson 2019-12-27 6:44 ` Leo Yan 1 sibling, 1 reply; 13+ messages in thread From: Doug Anderson @ 2019-12-05 0:40 UTC (permalink / raw) To: Leo Yan Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, Jeffrey Hugo, linux-arm-msm, linux-serial, LKML Hi, On Wed, Nov 27, 2019 at 10:16 PM Leo Yan <leo.yan@linaro.org> wrote: > > As the commit 677fe555cbfb ("serial: imx: Fix recursive locking bug") > has mentioned the uart driver might cause recursive locking between > normal printing and the kernel debugging facilities (e.g. sysrq and > oops). In the commit it gave out suggestion for fixing recursive > locking issue: "The solution is to avoid locking in the sysrq case > and trylock in the oops_in_progress case." > > This patch follows the suggestion (also used the exactly same code with > other serial drivers, e.g. amba-pl011.c) to fix the recursive locking > issue, this can avoid stuck caused by deadlock and print out log for > sysrq and oops. > > Fixes: 04896a77a97b ("msm_serial: serial driver for MSM7K onboard serial peripheral.") > Signed-off-by: Leo Yan <leo.yan@linaro.org> > --- > drivers/tty/serial/msm_serial.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > index 3657a24913fc..889538182e83 100644 > --- a/drivers/tty/serial/msm_serial.c > +++ b/drivers/tty/serial/msm_serial.c > @@ -1576,6 +1576,7 @@ static void __msm_console_write(struct uart_port *port, const char *s, > int num_newlines = 0; > bool replaced = false; > void __iomem *tf; > + int locked = 1; > > if (is_uartdm) > tf = port->membase + UARTDM_TF; > @@ -1588,7 +1589,13 @@ static void __msm_console_write(struct uart_port *port, const char *s, > num_newlines++; > count += num_newlines; > > - spin_lock(&port->lock); > + if (port->sysrq) > + locked = 0; > + else if (oops_in_progress) > + locked = spin_trylock(&port->lock); > + else > + spin_lock(&port->lock); I don't have tons of experience with the "msm" serial driver, but the above snippet tickled a memory in my brain for when I was looking at the "qcom_geni" serial driver, which is a close cousin. I seemed to remember that the "if (port->sysrq)" was something you didn't want. ...but maybe that's only if you do something like commit 336447b3298c ("serial: qcom_geni_serial: Process sysrq at port unlock time")? Any way you can try making a similar change to the msm driver and see if it allow you to remove the special case for "port->sysrq"? -Doug ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops 2019-12-05 0:40 ` Doug Anderson @ 2019-12-27 6:44 ` Leo Yan 0 siblings, 0 replies; 13+ messages in thread From: Leo Yan @ 2019-12-27 6:44 UTC (permalink / raw) To: Doug Anderson Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, Jeffrey Hugo, linux-arm-msm, linux-serial, LKML Hi Doug, On Thu, Dec 05, 2019 at 08:40:20AM +0800, Doug Anderson wrote: > Hi, > > On Wed, Nov 27, 2019 at 10:16 PM Leo Yan <leo.yan@linaro.org> wrote: > > > > As the commit 677fe555cbfb ("serial: imx: Fix recursive locking bug") > > has mentioned the uart driver might cause recursive locking between > > normal printing and the kernel debugging facilities (e.g. sysrq and > > oops). In the commit it gave out suggestion for fixing recursive > > locking issue: "The solution is to avoid locking in the sysrq case > > and trylock in the oops_in_progress case." > > > > This patch follows the suggestion (also used the exactly same code with > > other serial drivers, e.g. amba-pl011.c) to fix the recursive locking > > issue, this can avoid stuck caused by deadlock and print out log for > > sysrq and oops. > > > > Fixes: 04896a77a97b ("msm_serial: serial driver for MSM7K onboard serial peripheral.") > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > > --- > > drivers/tty/serial/msm_serial.c | 13 +++++++++++-- > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > > index 3657a24913fc..889538182e83 100644 > > --- a/drivers/tty/serial/msm_serial.c > > +++ b/drivers/tty/serial/msm_serial.c > > @@ -1576,6 +1576,7 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > int num_newlines = 0; > > bool replaced = false; > > void __iomem *tf; > > + int locked = 1; > > > > if (is_uartdm) > > tf = port->membase + UARTDM_TF; > > @@ -1588,7 +1589,13 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > num_newlines++; > > count += num_newlines; > > > > - spin_lock(&port->lock); > > + if (port->sysrq) > > + locked = 0; > > + else if (oops_in_progress) > > + locked = spin_trylock(&port->lock); > > + else > > + spin_lock(&port->lock); > > I don't have tons of experience with the "msm" serial driver, but the > above snippet tickled a memory in my brain for when I was looking at > the "qcom_geni" serial driver, which is a close cousin. Good point and thanks for sharing info. > I seemed to remember that the "if (port->sysrq)" was something you > didn't want. ...but maybe that's only if you do something like commit > 336447b3298c ("serial: qcom_geni_serial: Process sysrq at port unlock > time")? The patch you mentioned can allow to read sysrq chars without acquiring port lock. > Any way you can try making a similar change to the msm driver > and see if it allow you to remove the special case for "port->sysrq"? I was deliberately to add the case for "port->sysrq" in the function __msm_console_write() when I prepared this patch. Let's see the issue obersved: when the UART drive is deadlock by any reason, when sent break + h to test system is alive or not, sysrq tries to ouput log by calling __msm_console_write(), it tries to acquire console's spinlock but also run into deadlock; finally it doesn't have chance to output log for sysrq. P.s. Sorry my long late and didn't follow good practice to reply quickly due worked on other works. Thanks, Leo Yan ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-11-27 14:15 [PATCH v2 0/2] tty: serial: msm_serial: Fix lockup issues Leo Yan 2019-11-27 14:15 ` [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops Leo Yan @ 2019-11-27 14:15 ` Leo Yan 2019-12-02 16:40 ` Jeffrey Hugo 1 sibling, 1 reply; 13+ messages in thread From: Leo Yan @ 2019-11-27 14:15 UTC (permalink / raw) To: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, Jeffrey Hugo, linux-arm-msm, linux-serial, linux-kernel Cc: Leo Yan The uart driver might run into deadlock caused by recursive output; the basic flow is after uart driver has acquired the port spinlock, then it invokes some functions, e.g. dma engine operations and allocate dma descriptor with kzalloc(), if the system has very less free memory, the kernel will give out warning by printing logs, thus recursive output will happen and at the second time the attempting to acquire lock will cause deadlock. The detailed flow is shown as below: msm_uart_irq() spin_lock_irqsave(&port->lock, flags) => First time to acquire lock msm_handle_tx(port) msm_handle_tx_dma() dmaengine_prep_slave_single() bam_prep_slave_sg() kzalloc() __kmalloc() ___slab_alloc() alloc_pages_current() __alloc_pages_nodemask() warn_alloc() printk() msm_console_write() __msm_console_write() spin_lock(&port->lock) => Cause deadlock This patch fixes the deadlock issue for recursive output; it adds a variable 'curr_user' to indicate the uart port is used by which CPU, if the CPU has acquired spinlock and wants to execute recursive output, it will directly bail out. Here we don't choose to avoid locking and print out log, the reason is in this case we don't want to reset the uart port with function msm_reset_dm_count(); otherwise it can introduce confliction with other flows and results in uart port malfunction and later cannot output anymore. Fixes: 99693945013a ("tty: serial: msm: Add RX DMA support") Fixes: 3a878c430fd6 ("tty: serial: msm: Add TX DMA support") Signed-off-by: Leo Yan <leo.yan@linaro.org> --- drivers/tty/serial/msm_serial.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c index 889538182e83..06076cd2948f 100644 --- a/drivers/tty/serial/msm_serial.c +++ b/drivers/tty/serial/msm_serial.c @@ -182,6 +182,7 @@ struct msm_port { bool break_detected; struct msm_dma tx_dma; struct msm_dma rx_dma; + struct cpumask curr_user; }; #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) @@ -436,6 +437,7 @@ static void msm_complete_tx_dma(void *args) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); /* Already stopped */ if (!dma->count) @@ -470,6 +472,7 @@ static void msm_complete_tx_dma(void *args) msm_handle_tx(port); done: + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); } @@ -544,6 +547,7 @@ static void msm_complete_rx_dma(void *args) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); /* Already stopped */ if (!dma->count) @@ -590,6 +594,7 @@ static void msm_complete_rx_dma(void *args) msm_start_rx_dma(msm_port); done: + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); if (count) @@ -931,6 +936,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); misr = msm_read(port, UART_MISR); msm_write(port, 0, UART_IMR); /* disable interrupt */ @@ -962,6 +968,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) msm_handle_delta_cts(port); msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); return IRQ_HANDLED; @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) static void __msm_console_write(struct uart_port *port, const char *s, unsigned int count, bool is_uartdm) { + struct msm_port *msm_port = UART_TO_MSM(port); int i; int num_newlines = 0; bool replaced = false; @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, locked = 0; else if (oops_in_progress) locked = spin_trylock(&port->lock); + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) + return; else spin_lock(&port->lock); -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-11-27 14:15 ` [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output Leo Yan @ 2019-12-02 16:40 ` Jeffrey Hugo 2019-12-03 8:23 ` Leo Yan 0 siblings, 1 reply; 13+ messages in thread From: Jeffrey Hugo @ 2019-12-02 16:40 UTC (permalink / raw) To: Leo Yan Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Wed, Nov 27, 2019 at 7:16 AM Leo Yan <leo.yan@linaro.org> wrote: > > The uart driver might run into deadlock caused by recursive output; the > basic flow is after uart driver has acquired the port spinlock, then it > invokes some functions, e.g. dma engine operations and allocate dma > descriptor with kzalloc(), if the system has very less free memory, the > kernel will give out warning by printing logs, thus recursive output > will happen and at the second time the attempting to acquire lock will > cause deadlock. The detailed flow is shown as below: > > msm_uart_irq() > spin_lock_irqsave(&port->lock, flags) => First time to acquire lock > msm_handle_tx(port) > msm_handle_tx_dma() > dmaengine_prep_slave_single() > bam_prep_slave_sg() > kzalloc() > __kmalloc() > ___slab_alloc() > alloc_pages_current() > __alloc_pages_nodemask() > warn_alloc() > printk() > msm_console_write() > __msm_console_write() > spin_lock(&port->lock) => Cause deadlock > > This patch fixes the deadlock issue for recursive output; it adds a > variable 'curr_user' to indicate the uart port is used by which CPU, if > the CPU has acquired spinlock and wants to execute recursive output, > it will directly bail out. Here we don't choose to avoid locking and > print out log, the reason is in this case we don't want to reset the > uart port with function msm_reset_dm_count(); otherwise it can introduce > confliction with other flows and results in uart port malfunction and > later cannot output anymore. Is this not fixable? Sure, fixing the deadlock is an improvement, but dropping logs (particularly a memory warning like in your example) seems undesirable. > > Fixes: 99693945013a ("tty: serial: msm: Add RX DMA support") > Fixes: 3a878c430fd6 ("tty: serial: msm: Add TX DMA support") > Signed-off-by: Leo Yan <leo.yan@linaro.org> > --- > drivers/tty/serial/msm_serial.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > index 889538182e83..06076cd2948f 100644 > --- a/drivers/tty/serial/msm_serial.c > +++ b/drivers/tty/serial/msm_serial.c > @@ -182,6 +182,7 @@ struct msm_port { > bool break_detected; > struct msm_dma tx_dma; > struct msm_dma rx_dma; > + struct cpumask curr_user; > }; > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > @@ -436,6 +437,7 @@ static void msm_complete_tx_dma(void *args) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > /* Already stopped */ > if (!dma->count) > @@ -470,6 +472,7 @@ static void msm_complete_tx_dma(void *args) > > msm_handle_tx(port); > done: > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > } > > @@ -544,6 +547,7 @@ static void msm_complete_rx_dma(void *args) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > /* Already stopped */ > if (!dma->count) > @@ -590,6 +594,7 @@ static void msm_complete_rx_dma(void *args) > > msm_start_rx_dma(msm_port); > done: > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > > if (count) > @@ -931,6 +936,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > misr = msm_read(port, UART_MISR); > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > @@ -962,6 +968,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > msm_handle_delta_cts(port); > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > > return IRQ_HANDLED; > @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > static void __msm_console_write(struct uart_port *port, const char *s, > unsigned int count, bool is_uartdm) > { > + struct msm_port *msm_port = UART_TO_MSM(port); > int i; > int num_newlines = 0; > bool replaced = false; > @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, > locked = 0; > else if (oops_in_progress) > locked = spin_trylock(&port->lock); > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > + return; > else > spin_lock(&port->lock); > > -- > 2.17.1 > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-12-02 16:40 ` Jeffrey Hugo @ 2019-12-03 8:23 ` Leo Yan 2019-12-03 22:42 ` Jeffrey Hugo 0 siblings, 1 reply; 13+ messages in thread From: Leo Yan @ 2019-12-03 8:23 UTC (permalink / raw) To: Jeffrey Hugo Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Mon, Dec 02, 2019 at 09:40:55AM -0700, Jeffrey Hugo wrote: > On Wed, Nov 27, 2019 at 7:16 AM Leo Yan <leo.yan@linaro.org> wrote: > > > > The uart driver might run into deadlock caused by recursive output; the > > basic flow is after uart driver has acquired the port spinlock, then it > > invokes some functions, e.g. dma engine operations and allocate dma > > descriptor with kzalloc(), if the system has very less free memory, the > > kernel will give out warning by printing logs, thus recursive output > > will happen and at the second time the attempting to acquire lock will > > cause deadlock. The detailed flow is shown as below: > > > > msm_uart_irq() > > spin_lock_irqsave(&port->lock, flags) => First time to acquire lock > > msm_handle_tx(port) > > msm_handle_tx_dma() > > dmaengine_prep_slave_single() > > bam_prep_slave_sg() > > kzalloc() > > __kmalloc() > > ___slab_alloc() > > alloc_pages_current() > > __alloc_pages_nodemask() > > warn_alloc() > > printk() > > msm_console_write() > > __msm_console_write() > > spin_lock(&port->lock) => Cause deadlock > > > > This patch fixes the deadlock issue for recursive output; it adds a > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > the CPU has acquired spinlock and wants to execute recursive output, > > it will directly bail out. Here we don't choose to avoid locking and > > print out log, the reason is in this case we don't want to reset the > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > confliction with other flows and results in uart port malfunction and > > later cannot output anymore. > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > dropping logs (particularly a memory warning like in your example) > seems undesirable. Thanks a lot for your reviewing, Jeffrey. Agreed with you for the concern. To be honest, I am not familiar with the msm uart driver, so have no confidence which is the best way for uart port operations. I can think out one possible fixing is shown in below, if detects the lock is not acquired then it will force to reset UART port before exit the function __msm_console_write(). This approach is not tested yet and it looks too arbitrary; I will give a try for it. At the meantime, welcome any insight suggestion with proper register operations. @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) static void __msm_console_write(struct uart_port *port, const char *s, unsigned int count, bool is_uartdm) { + struct msm_port *msm_port = UART_TO_MSM(port); int i; int num_newlines = 0; bool replaced = false; @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, locked = 0; else if (oops_in_progress) locked = spin_trylock(&port->lock); + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) + return; else spin_lock(&port->lock); @@ -1632,6 +1642,9 @@ static void __msm_console_write(struct uart_port *port, const char *s, i += num_chars; } + if (!locked && is_uartdm) + msm_reset(port); + if (locked) spin_unlock(&port->lock); } Thanks, Leo > > > > Fixes: 99693945013a ("tty: serial: msm: Add RX DMA support") > > Fixes: 3a878c430fd6 ("tty: serial: msm: Add TX DMA support") > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > > --- > > drivers/tty/serial/msm_serial.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > > index 889538182e83..06076cd2948f 100644 > > --- a/drivers/tty/serial/msm_serial.c > > +++ b/drivers/tty/serial/msm_serial.c > > @@ -182,6 +182,7 @@ struct msm_port { > > bool break_detected; > > struct msm_dma tx_dma; > > struct msm_dma rx_dma; > > + struct cpumask curr_user; > > }; > > > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > > @@ -436,6 +437,7 @@ static void msm_complete_tx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -470,6 +472,7 @@ static void msm_complete_tx_dma(void *args) > > > > msm_handle_tx(port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > } > > > > @@ -544,6 +547,7 @@ static void msm_complete_rx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -590,6 +594,7 @@ static void msm_complete_rx_dma(void *args) > > > > msm_start_rx_dma(msm_port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > if (count) > > @@ -931,6 +936,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > misr = msm_read(port, UART_MISR); > > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > > > @@ -962,6 +968,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > msm_handle_delta_cts(port); > > > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > return IRQ_HANDLED; > > @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > > static void __msm_console_write(struct uart_port *port, const char *s, > > unsigned int count, bool is_uartdm) > > { > > + struct msm_port *msm_port = UART_TO_MSM(port); > > int i; > > int num_newlines = 0; > > bool replaced = false; > > @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > locked = 0; > > else if (oops_in_progress) > > locked = spin_trylock(&port->lock); > > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > > + return; > > else > > spin_lock(&port->lock); > > > > -- > > 2.17.1 > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-12-03 8:23 ` Leo Yan @ 2019-12-03 22:42 ` Jeffrey Hugo 2019-12-04 16:13 ` Leo Yan 0 siblings, 1 reply; 13+ messages in thread From: Jeffrey Hugo @ 2019-12-03 22:42 UTC (permalink / raw) To: Leo Yan Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Tue, Dec 3, 2019 at 1:23 AM Leo Yan <leo.yan@linaro.org> wrote: > > On Mon, Dec 02, 2019 at 09:40:55AM -0700, Jeffrey Hugo wrote: > > On Wed, Nov 27, 2019 at 7:16 AM Leo Yan <leo.yan@linaro.org> wrote: > > > > > > The uart driver might run into deadlock caused by recursive output; the > > > basic flow is after uart driver has acquired the port spinlock, then it > > > invokes some functions, e.g. dma engine operations and allocate dma > > > descriptor with kzalloc(), if the system has very less free memory, the > > > kernel will give out warning by printing logs, thus recursive output > > > will happen and at the second time the attempting to acquire lock will > > > cause deadlock. The detailed flow is shown as below: > > > > > > msm_uart_irq() > > > spin_lock_irqsave(&port->lock, flags) => First time to acquire lock > > > msm_handle_tx(port) > > > msm_handle_tx_dma() > > > dmaengine_prep_slave_single() > > > bam_prep_slave_sg() > > > kzalloc() > > > __kmalloc() > > > ___slab_alloc() > > > alloc_pages_current() > > > __alloc_pages_nodemask() > > > warn_alloc() > > > printk() > > > msm_console_write() > > > __msm_console_write() > > > spin_lock(&port->lock) => Cause deadlock > > > > > > This patch fixes the deadlock issue for recursive output; it adds a > > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > > the CPU has acquired spinlock and wants to execute recursive output, > > > it will directly bail out. Here we don't choose to avoid locking and > > > print out log, the reason is in this case we don't want to reset the > > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > > confliction with other flows and results in uart port malfunction and > > > later cannot output anymore. > > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > > dropping logs (particularly a memory warning like in your example) > > seems undesirable. > > Thanks a lot for your reviewing, Jeffrey. > > Agreed with you for the concern. > > To be honest, I am not familiar with the msm uart driver, so have no > confidence which is the best way for uart port operations. I can > think out one possible fixing is shown in below, if detects the lock > is not acquired then it will force to reset UART port before exit the > function __msm_console_write(). > > This approach is not tested yet and it looks too arbitrary; I will > give a try for it. At the meantime, welcome any insight suggestion > with proper register operations. According to the documentation, NCF_TX is only needed for SW transmit mode, where software is directly puttting characters in the fifo. Its not needed for BAM mode. According to your example, recursive console printing will only happen in BAM mode, and not in SW mode. Perhaps if we put the NCF_TX uses to just the SW mode, we avoid the issue and can allow recursive printing? > > @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > static void __msm_console_write(struct uart_port *port, const char *s, > unsigned int count, bool is_uartdm) > { > + struct msm_port *msm_port = UART_TO_MSM(port); > int i; > int num_newlines = 0; > bool replaced = false; > @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, > locked = 0; > else if (oops_in_progress) > locked = spin_trylock(&port->lock); > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > + return; > else > spin_lock(&port->lock); > > @@ -1632,6 +1642,9 @@ static void __msm_console_write(struct uart_port *port, const char *s, > i += num_chars; > } > > + if (!locked && is_uartdm) > + msm_reset(port); > + > if (locked) > spin_unlock(&port->lock); > } > > Thanks, > Leo > > > > > > > Fixes: 99693945013a ("tty: serial: msm: Add RX DMA support") > > > Fixes: 3a878c430fd6 ("tty: serial: msm: Add TX DMA support") > > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > > > --- > > > drivers/tty/serial/msm_serial.c | 10 ++++++++++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > > > index 889538182e83..06076cd2948f 100644 > > > --- a/drivers/tty/serial/msm_serial.c > > > +++ b/drivers/tty/serial/msm_serial.c > > > @@ -182,6 +182,7 @@ struct msm_port { > > > bool break_detected; > > > struct msm_dma tx_dma; > > > struct msm_dma rx_dma; > > > + struct cpumask curr_user; > > > }; > > > > > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > > > @@ -436,6 +437,7 @@ static void msm_complete_tx_dma(void *args) > > > u32 val; > > > > > > spin_lock_irqsave(&port->lock, flags); > > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > > > /* Already stopped */ > > > if (!dma->count) > > > @@ -470,6 +472,7 @@ static void msm_complete_tx_dma(void *args) > > > > > > msm_handle_tx(port); > > > done: > > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > > spin_unlock_irqrestore(&port->lock, flags); > > > } > > > > > > @@ -544,6 +547,7 @@ static void msm_complete_rx_dma(void *args) > > > u32 val; > > > > > > spin_lock_irqsave(&port->lock, flags); > > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > > > /* Already stopped */ > > > if (!dma->count) > > > @@ -590,6 +594,7 @@ static void msm_complete_rx_dma(void *args) > > > > > > msm_start_rx_dma(msm_port); > > > done: > > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > > spin_unlock_irqrestore(&port->lock, flags); > > > > > > if (count) > > > @@ -931,6 +936,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > > u32 val; > > > > > > spin_lock_irqsave(&port->lock, flags); > > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > misr = msm_read(port, UART_MISR); > > > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > > > > > @@ -962,6 +968,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > > msm_handle_delta_cts(port); > > > > > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > > spin_unlock_irqrestore(&port->lock, flags); > > > > > > return IRQ_HANDLED; > > > @@ -1572,6 +1579,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > > > static void __msm_console_write(struct uart_port *port, const char *s, > > > unsigned int count, bool is_uartdm) > > > { > > > + struct msm_port *msm_port = UART_TO_MSM(port); > > > int i; > > > int num_newlines = 0; > > > bool replaced = false; > > > @@ -1593,6 +1601,8 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > > locked = 0; > > > else if (oops_in_progress) > > > locked = spin_trylock(&port->lock); > > > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > > > + return; > > > else > > > spin_lock(&port->lock); > > > > > > -- > > > 2.17.1 > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-12-03 22:42 ` Jeffrey Hugo @ 2019-12-04 16:13 ` Leo Yan 2019-12-16 18:49 ` Jeffrey Hugo [not found] ` <CAD9cdQ6ROYf6B2PkQYJds_80-0dA=Jcew=TCNCSB=r+WEUNvdQ@mail.gmail.com> 0 siblings, 2 replies; 13+ messages in thread From: Leo Yan @ 2019-12-04 16:13 UTC (permalink / raw) To: Jeffrey Hugo Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Tue, Dec 03, 2019 at 03:42:31PM -0700, Jeffrey Hugo wrote: [...] > > > > This patch fixes the deadlock issue for recursive output; it adds a > > > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > > > the CPU has acquired spinlock and wants to execute recursive output, > > > > it will directly bail out. Here we don't choose to avoid locking and > > > > print out log, the reason is in this case we don't want to reset the > > > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > > > confliction with other flows and results in uart port malfunction and > > > > later cannot output anymore. > > > > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > > > dropping logs (particularly a memory warning like in your example) > > > seems undesirable. > > > > Thanks a lot for your reviewing, Jeffrey. > > > > Agreed with you for the concern. > > > > To be honest, I am not familiar with the msm uart driver, so have no > > confidence which is the best way for uart port operations. I can > > think out one possible fixing is shown in below, if detects the lock > > is not acquired then it will force to reset UART port before exit the > > function __msm_console_write(). > > > > This approach is not tested yet and it looks too arbitrary; I will > > give a try for it. At the meantime, welcome any insight suggestion > > with proper register operations. > > According to the documentation, NCF_TX is only needed for SW transmit > mode, where software is directly puttting characters in the fifo. Its > not needed for BAM mode. According to your example, recursive console > printing will only happen in BAM mode, and not in SW mode. Perhaps if > we put the NCF_TX uses to just the SW mode, we avoid the issue and can > allow recursive printing? Thanks for the suggestion! But based on the suggestion, I tried to change code as below, the console even cannot work when boot the kernel: static void msm_reset_dm_count(struct uart_port *port, int count) { + u32 val; + msm_wait_for_xmitr(port); - msm_write(port, count, UARTDM_NCF_TX); - msm_read(port, UARTDM_NCF_TX); + + val = msm_read(port, UARTDM_DMEN); + + /* + * NCF is only enabled for SW transmit mode and is + * skipped for BAM mode. + */ + if (!(val & UARTDM_DMEN_TX_BAM_ENABLE) && + !(val & UARTDM_DMEN_RX_BAM_ENABLE)) { + msm_write(port, count, UARTDM_NCF_TX); + msm_read(port, UARTDM_NCF_TX); + } } Alternatively, when exit from __msm_console_write() and if detect the case for without acquiring spinlock, invoke msm_wait_for_xmitr() to wait for transmit completion looks a good candidate solution. The updated patch is as below. Please let me know if this is doable? diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c index 1db79ee8a886..aa6a494c898d 100644 --- a/drivers/tty/serial/msm_serial.c +++ b/drivers/tty/serial/msm_serial.c @@ -190,6 +190,7 @@ struct msm_port { bool break_detected; struct msm_dma tx_dma; struct msm_dma rx_dma; + struct cpumask curr_user; }; #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) @@ -440,6 +441,7 @@ static void msm_complete_tx_dma(void *args) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); /* Already stopped */ if (!dma->count) @@ -474,6 +476,7 @@ static void msm_complete_tx_dma(void *args) msm_handle_tx(port); done: + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); } @@ -548,6 +551,7 @@ static void msm_complete_rx_dma(void *args) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); /* Already stopped */ if (!dma->count) @@ -594,6 +598,7 @@ static void msm_complete_rx_dma(void *args) msm_start_rx_dma(msm_port); done: + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); if (count) @@ -932,6 +937,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) u32 val; spin_lock_irqsave(&port->lock, flags); + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); misr = msm_read(port, UART_MISR); msm_write(port, 0, UART_IMR); /* disable interrupt */ @@ -963,6 +969,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) msm_handle_delta_cts(port); msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); spin_unlock_irqrestore(&port->lock, flags); return IRQ_HANDLED; @@ -1573,10 +1580,12 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) static void __msm_console_write(struct uart_port *port, const char *s, unsigned int count, bool is_uartdm) { + struct msm_port *msm_port = UART_TO_MSM(port); int i; int num_newlines = 0; bool replaced = false; void __iomem *tf; + int locked = 1; if (is_uartdm) tf = port->membase + UARTDM_TF; @@ -1589,7 +1598,15 @@ static void __msm_console_write(struct uart_port *port, const char *s, num_newlines++; count += num_newlines; - spin_lock(&port->lock); + if (port->sysrq) + locked = 0; + else if (oops_in_progress) + locked = spin_trylock(&port->lock); + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) + locked = 0; + else + spin_lock(&port->lock); + if (is_uartdm) msm_reset_dm_count(port, count); @@ -1625,7 +1642,12 @@ static void __msm_console_write(struct uart_port *port, const char *s, iowrite32_rep(tf, buf, 1); i += num_chars; } - spin_unlock(&port->lock); + + if (!locked) + msm_wait_for_xmitr(port); + + if (locked) + spin_unlock(&port->lock); } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-12-04 16:13 ` Leo Yan @ 2019-12-16 18:49 ` Jeffrey Hugo 2019-12-27 5:52 ` Leo Yan [not found] ` <CAD9cdQ6ROYf6B2PkQYJds_80-0dA=Jcew=TCNCSB=r+WEUNvdQ@mail.gmail.com> 1 sibling, 1 reply; 13+ messages in thread From: Jeffrey Hugo @ 2019-12-16 18:49 UTC (permalink / raw) To: Leo Yan Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Wed, Dec 4, 2019 at 9:13 AM Leo Yan <leo.yan@linaro.org> wrote: > > On Tue, Dec 03, 2019 at 03:42:31PM -0700, Jeffrey Hugo wrote: > > [...] > > > > > > This patch fixes the deadlock issue for recursive output; it adds a > > > > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > > > > the CPU has acquired spinlock and wants to execute recursive output, > > > > > it will directly bail out. Here we don't choose to avoid locking and > > > > > print out log, the reason is in this case we don't want to reset the > > > > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > > > > confliction with other flows and results in uart port malfunction and > > > > > later cannot output anymore. > > > > > > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > > > > dropping logs (particularly a memory warning like in your example) > > > > seems undesirable. > > > > > > Thanks a lot for your reviewing, Jeffrey. > > > > > > Agreed with you for the concern. > > > > > > To be honest, I am not familiar with the msm uart driver, so have no > > > confidence which is the best way for uart port operations. I can > > > think out one possible fixing is shown in below, if detects the lock > > > is not acquired then it will force to reset UART port before exit the > > > function __msm_console_write(). > > > > > > This approach is not tested yet and it looks too arbitrary; I will > > > give a try for it. At the meantime, welcome any insight suggestion > > > with proper register operations. > > > > According to the documentation, NCF_TX is only needed for SW transmit > > mode, where software is directly puttting characters in the fifo. Its > > not needed for BAM mode. According to your example, recursive console > > printing will only happen in BAM mode, and not in SW mode. Perhaps if > > we put the NCF_TX uses to just the SW mode, we avoid the issue and can > > allow recursive printing? > > Thanks for the suggestion! But based on the suggestion, I tried to > change code as below, the console even cannot work when boot the > kernel: > > static void msm_reset_dm_count(struct uart_port *port, int count) > { > + u32 val; > + > msm_wait_for_xmitr(port); > - msm_write(port, count, UARTDM_NCF_TX); > - msm_read(port, UARTDM_NCF_TX); > + > + val = msm_read(port, UARTDM_DMEN); > + > + /* > + * NCF is only enabled for SW transmit mode and is > + * skipped for BAM mode. > + */ > + if (!(val & UARTDM_DMEN_TX_BAM_ENABLE) && > + !(val & UARTDM_DMEN_RX_BAM_ENABLE)) { > + msm_write(port, count, UARTDM_NCF_TX); > + msm_read(port, UARTDM_NCF_TX); > + } > } > > > Alternatively, when exit from __msm_console_write() and if detect the > case for without acquiring spinlock, invoke msm_wait_for_xmitr() to wait > for transmit completion looks a good candidate solution. The updated > patch is as below. Please let me know if this is doable? > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > index 1db79ee8a886..aa6a494c898d 100644 > --- a/drivers/tty/serial/msm_serial.c > +++ b/drivers/tty/serial/msm_serial.c > @@ -190,6 +190,7 @@ struct msm_port { > bool break_detected; > struct msm_dma tx_dma; > struct msm_dma rx_dma; > + struct cpumask curr_user; > }; > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > @@ -440,6 +441,7 @@ static void msm_complete_tx_dma(void *args) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > /* Already stopped */ > if (!dma->count) > @@ -474,6 +476,7 @@ static void msm_complete_tx_dma(void *args) > > msm_handle_tx(port); > done: > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > } > > @@ -548,6 +551,7 @@ static void msm_complete_rx_dma(void *args) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > /* Already stopped */ > if (!dma->count) > @@ -594,6 +598,7 @@ static void msm_complete_rx_dma(void *args) > > msm_start_rx_dma(msm_port); > done: > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > > if (count) > @@ -932,6 +937,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > u32 val; > > spin_lock_irqsave(&port->lock, flags); > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > misr = msm_read(port, UART_MISR); > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > @@ -963,6 +969,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > msm_handle_delta_cts(port); > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > spin_unlock_irqrestore(&port->lock, flags); > > return IRQ_HANDLED; > @@ -1573,10 +1580,12 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > static void __msm_console_write(struct uart_port *port, const char *s, > unsigned int count, bool is_uartdm) > { > + struct msm_port *msm_port = UART_TO_MSM(port); > int i; > int num_newlines = 0; > bool replaced = false; > void __iomem *tf; > + int locked = 1; > > if (is_uartdm) > tf = port->membase + UARTDM_TF; > @@ -1589,7 +1598,15 @@ static void __msm_console_write(struct uart_port *port, const char *s, > num_newlines++; > count += num_newlines; > > - spin_lock(&port->lock); > + if (port->sysrq) > + locked = 0; > + else if (oops_in_progress) > + locked = spin_trylock(&port->lock); > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > + locked = 0; > + else > + spin_lock(&port->lock); > + > if (is_uartdm) > msm_reset_dm_count(port, count); > > @@ -1625,7 +1642,12 @@ static void __msm_console_write(struct uart_port *port, const char *s, > iowrite32_rep(tf, buf, 1); > i += num_chars; > } > - spin_unlock(&port->lock); > + > + if (!locked) > + msm_wait_for_xmitr(port); Sorry, catching up from some travel. I don't understand this. At this point, haven't we already called msm_reset_dm_count() and "corrupted" the state of the hardware? > + > + if (locked) > + spin_unlock(&port->lock); > } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output 2019-12-16 18:49 ` Jeffrey Hugo @ 2019-12-27 5:52 ` Leo Yan 0 siblings, 0 replies; 13+ messages in thread From: Leo Yan @ 2019-12-27 5:52 UTC (permalink / raw) To: Jeffrey Hugo Cc: Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml Hi Jeffrey, On Mon, Dec 16, 2019 at 11:49:52AM -0700, Jeffrey Hugo wrote: > On Wed, Dec 4, 2019 at 9:13 AM Leo Yan <leo.yan@linaro.org> wrote: > > > > On Tue, Dec 03, 2019 at 03:42:31PM -0700, Jeffrey Hugo wrote: > > > > [...] > > > > > > > > This patch fixes the deadlock issue for recursive output; it adds a > > > > > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > > > > > the CPU has acquired spinlock and wants to execute recursive output, > > > > > > it will directly bail out. Here we don't choose to avoid locking and > > > > > > print out log, the reason is in this case we don't want to reset the > > > > > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > > > > > confliction with other flows and results in uart port malfunction and > > > > > > later cannot output anymore. > > > > > > > > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > > > > > dropping logs (particularly a memory warning like in your example) > > > > > seems undesirable. > > > > > > > > Thanks a lot for your reviewing, Jeffrey. > > > > > > > > Agreed with you for the concern. > > > > > > > > To be honest, I am not familiar with the msm uart driver, so have no > > > > confidence which is the best way for uart port operations. I can > > > > think out one possible fixing is shown in below, if detects the lock > > > > is not acquired then it will force to reset UART port before exit the > > > > function __msm_console_write(). > > > > > > > > This approach is not tested yet and it looks too arbitrary; I will > > > > give a try for it. At the meantime, welcome any insight suggestion > > > > with proper register operations. > > > > > > According to the documentation, NCF_TX is only needed for SW transmit > > > mode, where software is directly puttting characters in the fifo. Its > > > not needed for BAM mode. According to your example, recursive console > > > printing will only happen in BAM mode, and not in SW mode. Perhaps if > > > we put the NCF_TX uses to just the SW mode, we avoid the issue and can > > > allow recursive printing? > > > > Thanks for the suggestion! But based on the suggestion, I tried to > > change code as below, the console even cannot work when boot the > > kernel: > > > > static void msm_reset_dm_count(struct uart_port *port, int count) > > { > > + u32 val; > > + > > msm_wait_for_xmitr(port); > > - msm_write(port, count, UARTDM_NCF_TX); > > - msm_read(port, UARTDM_NCF_TX); > > + > > + val = msm_read(port, UARTDM_DMEN); > > + > > + /* > > + * NCF is only enabled for SW transmit mode and is > > + * skipped for BAM mode. > > + */ > > + if (!(val & UARTDM_DMEN_TX_BAM_ENABLE) && > > + !(val & UARTDM_DMEN_RX_BAM_ENABLE)) { > > + msm_write(port, count, UARTDM_NCF_TX); > > + msm_read(port, UARTDM_NCF_TX); > > + } > > } > > > > > > Alternatively, when exit from __msm_console_write() and if detect the > > case for without acquiring spinlock, invoke msm_wait_for_xmitr() to wait > > for transmit completion looks a good candidate solution. The updated > > patch is as below. Please let me know if this is doable? > > > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > > index 1db79ee8a886..aa6a494c898d 100644 > > --- a/drivers/tty/serial/msm_serial.c > > +++ b/drivers/tty/serial/msm_serial.c > > @@ -190,6 +190,7 @@ struct msm_port { > > bool break_detected; > > struct msm_dma tx_dma; > > struct msm_dma rx_dma; > > + struct cpumask curr_user; > > }; > > > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > > @@ -440,6 +441,7 @@ static void msm_complete_tx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -474,6 +476,7 @@ static void msm_complete_tx_dma(void *args) > > > > msm_handle_tx(port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > } > > > > @@ -548,6 +551,7 @@ static void msm_complete_rx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -594,6 +598,7 @@ static void msm_complete_rx_dma(void *args) > > > > msm_start_rx_dma(msm_port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > if (count) > > @@ -932,6 +937,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > misr = msm_read(port, UART_MISR); > > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > > > @@ -963,6 +969,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > msm_handle_delta_cts(port); > > > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > return IRQ_HANDLED; > > @@ -1573,10 +1580,12 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > > static void __msm_console_write(struct uart_port *port, const char *s, > > unsigned int count, bool is_uartdm) > > { > > + struct msm_port *msm_port = UART_TO_MSM(port); > > int i; > > int num_newlines = 0; > > bool replaced = false; > > void __iomem *tf; > > + int locked = 1; > > > > if (is_uartdm) > > tf = port->membase + UARTDM_TF; > > @@ -1589,7 +1598,15 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > num_newlines++; > > count += num_newlines; > > > > - spin_lock(&port->lock); > > + if (port->sysrq) > > + locked = 0; > > + else if (oops_in_progress) > > + locked = spin_trylock(&port->lock); > > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > > + locked = 0; > > + else > > + spin_lock(&port->lock); > > + > > if (is_uartdm) > > msm_reset_dm_count(port, count); > > > > @@ -1625,7 +1642,12 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > iowrite32_rep(tf, buf, 1); > > i += num_chars; > > } > > - spin_unlock(&port->lock); > > + > > + if (!locked) > > + msm_wait_for_xmitr(port); > > Sorry, catching up from some travel. > > I don't understand this. At this point, haven't we already called > msm_reset_dm_count() and "corrupted" the state of the hardware? Yeah, at here msm_reset_dm_count() has been called. msm_wait_for_xmitr() is used to wait for completing transmition. So we can get flow as: msm_complete_tx_dma() kmalloc() fail __msm_console_write() msm_reset_dm_count() output logs msm_wait_for_xmitr() => ensure to not impact out flow My essential reason for adding msm_wait_for_xmitr() is to cleanup the "corrupted" state before return to out flow. Thanks, Leo Yan > > + > > + if (locked) > > + spin_unlock(&port->lock); > > } ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CAD9cdQ6ROYf6B2PkQYJds_80-0dA=Jcew=TCNCSB=r+WEUNvdQ@mail.gmail.com>]
* Re: [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output [not found] ` <CAD9cdQ6ROYf6B2PkQYJds_80-0dA=Jcew=TCNCSB=r+WEUNvdQ@mail.gmail.com> @ 2019-12-16 18:50 ` Jeffrey Hugo 0 siblings, 0 replies; 13+ messages in thread From: Jeffrey Hugo @ 2019-12-16 18:50 UTC (permalink / raw) To: Rainer Sickinger Cc: Leo Yan, Andy Gross, Greg Kroah-Hartman, Jiri Slaby, Bjorn Andersson, Stephen Boyd, Nicolas Dechesne, MSM, linux-serial, lkml On Wed, Dec 4, 2019 at 9:21 AM Rainer Sickinger <rainersickinger.official@gmail.com> wrote: > > Can't you just exit with System.exit()? Isn't System.exit() a Java thing, and we are in a C environment? > > Am Mi., 4. Dez. 2019 um 17:14 Uhr schrieb Leo Yan <leo.yan@linaro.org>: >> >> On Tue, Dec 03, 2019 at 03:42:31PM -0700, Jeffrey Hugo wrote: >> >> [...] >> >> > > > > This patch fixes the deadlock issue for recursive output; it adds a >> > > > > variable 'curr_user' to indicate the uart port is used by which CPU, if >> > > > > the CPU has acquired spinlock and wants to execute recursive output, >> > > > > it will directly bail out. Here we don't choose to avoid locking and >> > > > > print out log, the reason is in this case we don't want to reset the >> > > > > uart port with function msm_reset_dm_count(); otherwise it can introduce >> > > > > confliction with other flows and results in uart port malfunction and >> > > > > later cannot output anymore. >> > > > >> > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but >> > > > dropping logs (particularly a memory warning like in your example) >> > > > seems undesirable. >> > > >> > > Thanks a lot for your reviewing, Jeffrey. >> > > >> > > Agreed with you for the concern. >> > > >> > > To be honest, I am not familiar with the msm uart driver, so have no >> > > confidence which is the best way for uart port operations. I can >> > > think out one possible fixing is shown in below, if detects the lock >> > > is not acquired then it will force to reset UART port before exit the >> > > function __msm_console_write(). >> > > >> > > This approach is not tested yet and it looks too arbitrary; I will >> > > give a try for it. At the meantime, welcome any insight suggestion >> > > with proper register operations. >> > >> > According to the documentation, NCF_TX is only needed for SW transmit >> > mode, where software is directly puttting characters in the fifo. Its >> > not needed for BAM mode. According to your example, recursive console >> > printing will only happen in BAM mode, and not in SW mode. Perhaps if >> > we put the NCF_TX uses to just the SW mode, we avoid the issue and can >> > allow recursive printing? >> >> Thanks for the suggestion! But based on the suggestion, I tried to >> change code as below, the console even cannot work when boot the >> kernel: >> >> static void msm_reset_dm_count(struct uart_port *port, int count) >> { >> + u32 val; >> + >> msm_wait_for_xmitr(port); >> - msm_write(port, count, UARTDM_NCF_TX); >> - msm_read(port, UARTDM_NCF_TX); >> + >> + val = msm_read(port, UARTDM_DMEN); >> + >> + /* >> + * NCF is only enabled for SW transmit mode and is >> + * skipped for BAM mode. >> + */ >> + if (!(val & UARTDM_DMEN_TX_BAM_ENABLE) && >> + !(val & UARTDM_DMEN_RX_BAM_ENABLE)) { >> + msm_write(port, count, UARTDM_NCF_TX); >> + msm_read(port, UARTDM_NCF_TX); >> + } >> } >> >> >> Alternatively, when exit from __msm_console_write() and if detect the >> case for without acquiring spinlock, invoke msm_wait_for_xmitr() to wait >> for transmit completion looks a good candidate solution. The updated >> patch is as below. Please let me know if this is doable? >> >> diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c >> index 1db79ee8a886..aa6a494c898d 100644 >> --- a/drivers/tty/serial/msm_serial.c >> +++ b/drivers/tty/serial/msm_serial.c >> @@ -190,6 +190,7 @@ struct msm_port { >> bool break_detected; >> struct msm_dma tx_dma; >> struct msm_dma rx_dma; >> + struct cpumask curr_user; >> }; >> >> #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) >> @@ -440,6 +441,7 @@ static void msm_complete_tx_dma(void *args) >> u32 val; >> >> spin_lock_irqsave(&port->lock, flags); >> + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); >> >> /* Already stopped */ >> if (!dma->count) >> @@ -474,6 +476,7 @@ static void msm_complete_tx_dma(void *args) >> >> msm_handle_tx(port); >> done: >> + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); >> spin_unlock_irqrestore(&port->lock, flags); >> } >> >> @@ -548,6 +551,7 @@ static void msm_complete_rx_dma(void *args) >> u32 val; >> >> spin_lock_irqsave(&port->lock, flags); >> + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); >> >> /* Already stopped */ >> if (!dma->count) >> @@ -594,6 +598,7 @@ static void msm_complete_rx_dma(void *args) >> >> msm_start_rx_dma(msm_port); >> done: >> + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); >> spin_unlock_irqrestore(&port->lock, flags); >> >> if (count) >> @@ -932,6 +937,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) >> u32 val; >> >> spin_lock_irqsave(&port->lock, flags); >> + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); >> misr = msm_read(port, UART_MISR); >> msm_write(port, 0, UART_IMR); /* disable interrupt */ >> >> @@ -963,6 +969,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) >> msm_handle_delta_cts(port); >> >> msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ >> + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); >> spin_unlock_irqrestore(&port->lock, flags); >> >> return IRQ_HANDLED; >> @@ -1573,10 +1580,12 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) >> static void __msm_console_write(struct uart_port *port, const char *s, >> unsigned int count, bool is_uartdm) >> { >> + struct msm_port *msm_port = UART_TO_MSM(port); >> int i; >> int num_newlines = 0; >> bool replaced = false; >> void __iomem *tf; >> + int locked = 1; >> >> if (is_uartdm) >> tf = port->membase + UARTDM_TF; >> @@ -1589,7 +1598,15 @@ static void __msm_console_write(struct uart_port *port, const char *s, >> num_newlines++; >> count += num_newlines; >> >> - spin_lock(&port->lock); >> + if (port->sysrq) >> + locked = 0; >> + else if (oops_in_progress) >> + locked = spin_trylock(&port->lock); >> + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) >> + locked = 0; >> + else >> + spin_lock(&port->lock); >> + >> if (is_uartdm) >> msm_reset_dm_count(port, count); >> >> @@ -1625,7 +1642,12 @@ static void __msm_console_write(struct uart_port *port, const char *s, >> iowrite32_rep(tf, buf, 1); >> i += num_chars; >> } >> - spin_unlock(&port->lock); >> + >> + if (!locked) >> + msm_wait_for_xmitr(port); >> + >> + if (locked) >> + spin_unlock(&port->lock); >> } ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-12-27 6:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-27 14:15 [PATCH v2 0/2] tty: serial: msm_serial: Fix lockup issues Leo Yan 2019-11-27 14:15 ` [PATCH v2 1/2] tty: serial: msm_serial: Fix lockup for sysrq and oops Leo Yan 2019-12-02 16:37 ` Jeffrey Hugo 2019-12-05 0:40 ` Doug Anderson 2019-12-27 6:44 ` Leo Yan 2019-11-27 14:15 ` [PATCH v2 2/2] tty: serial: msm_serial: Fix deadlock caused by recursive output Leo Yan 2019-12-02 16:40 ` Jeffrey Hugo 2019-12-03 8:23 ` Leo Yan 2019-12-03 22:42 ` Jeffrey Hugo 2019-12-04 16:13 ` Leo Yan 2019-12-16 18:49 ` Jeffrey Hugo 2019-12-27 5:52 ` Leo Yan [not found] ` <CAD9cdQ6ROYf6B2PkQYJds_80-0dA=Jcew=TCNCSB=r+WEUNvdQ@mail.gmail.com> 2019-12-16 18:50 ` Jeffrey Hugo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).