linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
@ 2020-02-11 10:13 satya priya
  2020-02-12 22:49 ` Stephen Boyd
  0 siblings, 1 reply; 5+ messages in thread
From: satya priya @ 2020-02-11 10:13 UTC (permalink / raw)
  To: gregkh
  Cc: swboyd, mgautam, linux-arm-msm, linux-serial, akashast, satya priya

RX cancel command fails when BT is switched on and off multiple times.

To handle this, poll for the cancel bit in SE_GENI_S_IRQ_STATUS register
instead of SE_GENI_S_CMD_CTRL_REG.

As per the HPG update, handle the RX last bit after cancel command
and flush out the RX FIFO buffer.

Signed-off-by: satya priya <skakit@codeaurora.org>
---
 drivers/tty/serial/qcom_geni_serial.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 191abb1..0bd1684 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -129,6 +129,7 @@ static int handle_rx_console(struct uart_port *uport, u32 bytes, bool drop);
 static int handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop);
 static unsigned int qcom_geni_serial_tx_empty(struct uart_port *port);
 static void qcom_geni_serial_stop_rx(struct uart_port *uport);
+static void qcom_geni_serial_handle_rx(struct uart_port *uport, bool drop);
 
 static const unsigned long root_freq[] = {7372800, 14745600, 19200000, 29491200,
 					32000000, 48000000, 64000000, 80000000,
@@ -599,7 +600,7 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
 	u32 irq_en;
 	u32 status;
 	struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
-	u32 irq_clear = S_CMD_DONE_EN;
+	u32 s_irq_status;
 
 	irq_en = readl(uport->membase + SE_GENI_S_IRQ_EN);
 	irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
@@ -615,10 +616,19 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
 		return;
 
 	geni_se_cancel_s_cmd(&port->se);
-	qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
-					S_GENI_CMD_CANCEL, false);
+	qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS,
+					S_CMD_CANCEL_EN, true);
+	/*
+	 * If timeout occurs secondary engine remains active
+	 * and Abort sequence is executed.
+	 */
+	s_irq_status = readl(uport->membase + SE_GENI_S_IRQ_STATUS);
+	/* Flush the Rx buffer */
+	if (s_irq_status & S_RX_FIFO_LAST_EN)
+		qcom_geni_serial_handle_rx(uport, true);
+	writel(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
+
 	status = readl(uport->membase + SE_GENI_STATUS);
-	writel(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
 	if (status & S_GENI_CMD_ACTIVE)
 		qcom_geni_serial_abort_rx(uport);
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
  2020-02-11 10:13 [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure satya priya
@ 2020-02-12 22:49 ` Stephen Boyd
  2020-02-14 13:17   ` skakit
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2020-02-12 22:49 UTC (permalink / raw)
  To: gregkh, satya priya
  Cc: mgautam, linux-arm-msm, linux-serial, akashast, satya priya

Quoting satya priya (2020-02-11 02:13:02)
> RX cancel command fails when BT is switched on and off multiple times.
> 
> To handle this, poll for the cancel bit in SE_GENI_S_IRQ_STATUS register
> instead of SE_GENI_S_CMD_CTRL_REG.
> 
> As per the HPG update, handle the RX last bit after cancel command
> and flush out the RX FIFO buffer.
> 
> Signed-off-by: satya priya <skakit@codeaurora.org>
> ---

I'm pretty sure I saw an oops with this patch.

   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
   Mem abort info:
     ESR = 0x96000046
     EC = 0x25: DABT (current EL), IL = 32 bits
   Bluetooth: hci_ldisc.c:hci_uart_register_proto() HCI UART protocol QCA registered
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
   Data abort info:
     ISV = 0, ISS = 0x00000046
     CM = 0, WnR = 1
   user pgtable: 4k pages, 39-bit VAs, pgdp=00000001b4645000
   [0000000000000000] pgd=00000001ac579003, pud=00000001ac579003, pmd=0000000000000000
   Internal error: Oops: 96000046 [#1] PREEMPT SMP
   Modules linked in: hci_uart(+) qcom_spmi_temp_alarm btqca qcom_spmi_adc5 qcom_vadc_common bluetooth ecdh_generic ecc venus_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common ipa qcom_q6v5_mss qcom_q6v5_ipa_notify qcom_common qcom_q6v5 xt_MASQUERADE fuse rmtfs_mem iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf cros_ec_sensorsupport ath10k_snoc qmi_helpers ath10k_core ath lzo_rle lzo_compress mac80211 zram cfg80211 joydev
   CPU: 7 PID: 258 Comm: kworker/u16:3 Tainted: G S      W         5.4.17 #19
   Workqueue: events_unbound async_run_entry_fn
   pstate: 60c00009 (nZCv daif +PAN +UAO)
   pc : handle_rx_uart+0x64/0x278
   lr : qcom_geni_serial_handle_rx+0x84/0x90
   sp : ffffff814348f960
   x29: ffffff814348f960 x28: ffffffd01ac24288
   x27: 0000000000000018 x26: 0000000000000002
   x25: 0000000000000001 x24: ffffff8146341348
   x23: ffffff8146341000 x22: ffffffd01accc978
   x21: ffffff8146341000 x20: 0000000000000001
   x19: 0000000000000001 x18: ffffffd01b22d000
   x17: 0000000000008000 x16: 00000000000000b0
   x15: ffffffd01afdbdd0 x14: ffffffd01b3edde0
   x13: ffffffd01b7fb000 x12: 0000000000000001
   x11: 0000000000000000 x10: 0000000000000000
   x9 : ffffffd010344780 x8 : 0000000000000000
   x7 : ffffffd019d8e768 x6 : 0000000000000000
   x5 : ffffffd01adbb000 x4 : 0000000000008004
   x3 : 0000000000000001 x2 : 0000000000000001
   x1 : 0000000000000001 x0 : ffffffd01accc978
   Call trace:
    handle_rx_uart+0x64/0x278
    qcom_geni_serial_handle_rx+0x84/0x90
    qcom_geni_serial_stop_rx+0x110/0x180
    qcom_geni_serial_port_setup+0x68/0x1b0
    qcom_geni_serial_startup+0x24/0x70
    uart_startup+0x164/0x28c
    uart_port_activate+0x6c/0xbc
    tty_port_open+0xa8/0x114
    uart_open+0x28/0x38
    ttyport_open+0x7c/0x164
    serdev_device_open+0x38/0xe4
    hci_uart_register_device+0x54/0x2e8 [hci_uart]
    qca_serdev_probe+0x1c4/0x374 [hci_uart]
    serdev_drv_probe+0x3c/0x64
    really_probe+0x144/0x3f8
    driver_probe_device+0x70/0x140
    __driver_attach_async_helper+0x7c/0xa8
    async_run_entry_fn+0x60/0x178
    process_one_work+0x33c/0x640
    worker_thread+0x2a0/0x470
    kthread+0x128/0x138
    ret_from_fork+0x10/0x18
   Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)

> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> index 191abb1..0bd1684 100644
> --- a/drivers/tty/serial/qcom_geni_serial.c
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -599,7 +600,7 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
>         u32 irq_en;
>         u32 status;
>         struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> -       u32 irq_clear = S_CMD_DONE_EN;
> +       u32 s_irq_status;
>  
>         irq_en = readl(uport->membase + SE_GENI_S_IRQ_EN);
>         irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
> @@ -615,10 +616,19 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
>                 return;
>  
>         geni_se_cancel_s_cmd(&port->se);
> -       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
> -                                       S_GENI_CMD_CANCEL, false);
> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS,
> +                                       S_CMD_CANCEL_EN, true);
> +       /*
> +        * If timeout occurs secondary engine remains active
> +        * and Abort sequence is executed.
> +        */
> +       s_irq_status = readl(uport->membase + SE_GENI_S_IRQ_STATUS);
> +       /* Flush the Rx buffer */
> +       if (s_irq_status & S_RX_FIFO_LAST_EN)
> +               qcom_geni_serial_handle_rx(uport, true);

This seems to be the problematic line. We didn't call handle_rx() from
the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
is called from qcom_geni_serial_startup(), but most importantly, we call
into this function from startup before we allocate memory for the
port->rx_fifo member (see the devm_kcalloc() later in
qcom_geni_serial_port_setup() and see how it's after we stop rx).

Why do we need to flush the rx buffer by reading it into the software
buffer? Can't we simply ack any outstanding RX interrupts in the
hardware when we're stopping receive?

> +       writel(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
> +
>         status = readl(uport->membase + SE_GENI_STATUS);
> -       writel(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
>         if (status & S_GENI_CMD_ACTIVE)
>                 qcom_geni_serial_abort_rx(uport);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
  2020-02-12 22:49 ` Stephen Boyd
@ 2020-02-14 13:17   ` skakit
  2020-02-19 17:50     ` Stephen Boyd
  0 siblings, 1 reply; 5+ messages in thread
From: skakit @ 2020-02-14 13:17 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: gregkh, mgautam, linux-arm-msm, linux-serial, akashast, rojay

On 2020-02-13 04:19, Stephen Boyd wrote:
> Quoting satya priya (2020-02-11 02:13:02)
>> RX cancel command fails when BT is switched on and off multiple times.
>> 
>> To handle this, poll for the cancel bit in SE_GENI_S_IRQ_STATUS 
>> register
>> instead of SE_GENI_S_CMD_CTRL_REG.
>> 
>> As per the HPG update, handle the RX last bit after cancel command
>> and flush out the RX FIFO buffer.
>> 
>> Signed-off-by: satya priya <skakit@codeaurora.org>
>> ---
> 
> I'm pretty sure I saw an oops with this patch.
> 
>    Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000000
>    Mem abort info:
>      ESR = 0x96000046
>      EC = 0x25: DABT (current EL), IL = 32 bits
>    Bluetooth: hci_ldisc.c:hci_uart_register_proto() HCI UART protocol
> QCA registered
>      SET = 0, FnV = 0
>      EA = 0, S1PTW = 0
>    Data abort info:
>      ISV = 0, ISS = 0x00000046
>      CM = 0, WnR = 1
>    user pgtable: 4k pages, 39-bit VAs, pgdp=00000001b4645000
>    [0000000000000000] pgd=00000001ac579003, pud=00000001ac579003,
> pmd=0000000000000000
>    Internal error: Oops: 96000046 [#1] PREEMPT SMP
>    Modules linked in: hci_uart(+) qcom_spmi_temp_alarm btqca
> qcom_spmi_adc5 qcom_vadc_common bluetooth ecdh_generic ecc venus_core
> v4l2_mem2mem videobuf2_v4l2 videobuf2_common ipa qcom_q6v5_mss
> qcom_q6v5_ipa_notify qcom_common qcom_q6v5 xt_MASQUERADE fuse
> rmtfs_mem iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors
> cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf
> cros_ec_sensorsupport ath10k_snoc qmi_helpers ath10k_core ath lzo_rle
> lzo_compress mac80211 zram cfg80211 joydev
>    CPU: 7 PID: 258 Comm: kworker/u16:3 Tainted: G S      W         
> 5.4.17 #19
>    Workqueue: events_unbound async_run_entry_fn
>    pstate: 60c00009 (nZCv daif +PAN +UAO)
>    pc : handle_rx_uart+0x64/0x278
>    lr : qcom_geni_serial_handle_rx+0x84/0x90
>    sp : ffffff814348f960
>    x29: ffffff814348f960 x28: ffffffd01ac24288
>    x27: 0000000000000018 x26: 0000000000000002
>    x25: 0000000000000001 x24: ffffff8146341348
>    x23: ffffff8146341000 x22: ffffffd01accc978
>    x21: ffffff8146341000 x20: 0000000000000001
>    x19: 0000000000000001 x18: ffffffd01b22d000
>    x17: 0000000000008000 x16: 00000000000000b0
>    x15: ffffffd01afdbdd0 x14: ffffffd01b3edde0
>    x13: ffffffd01b7fb000 x12: 0000000000000001
>    x11: 0000000000000000 x10: 0000000000000000
>    x9 : ffffffd010344780 x8 : 0000000000000000
>    x7 : ffffffd019d8e768 x6 : 0000000000000000
>    x5 : ffffffd01adbb000 x4 : 0000000000008004
>    x3 : 0000000000000001 x2 : 0000000000000001
>    x1 : 0000000000000001 x0 : ffffffd01accc978
>    Call trace:
>     handle_rx_uart+0x64/0x278
>     qcom_geni_serial_handle_rx+0x84/0x90
>     qcom_geni_serial_stop_rx+0x110/0x180
>     qcom_geni_serial_port_setup+0x68/0x1b0
>     qcom_geni_serial_startup+0x24/0x70
>     uart_startup+0x164/0x28c
>     uart_port_activate+0x6c/0xbc
>     tty_port_open+0xa8/0x114
>     uart_open+0x28/0x38
>     ttyport_open+0x7c/0x164
>     serdev_device_open+0x38/0xe4
>     hci_uart_register_device+0x54/0x2e8 [hci_uart]
>     qca_serdev_probe+0x1c4/0x374 [hci_uart]
>     serdev_drv_probe+0x3c/0x64
>     really_probe+0x144/0x3f8
>     driver_probe_device+0x70/0x140
>     __driver_attach_async_helper+0x7c/0xa8
>     async_run_entry_fn+0x60/0x178
>     process_one_work+0x33c/0x640
>     worker_thread+0x2a0/0x470
>     kthread+0x128/0x138
>     ret_from_fork+0x10/0x18
>    Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)
I think the most probable explanation of the crash is, set_termios call 
is starting RX engine and RX engine is sampling some garbage data from 
line, and by the time startup is called, we have few data to read.
How frequently you are able to see this crash? because internally we are 
unable to reproduce it.
> 
>> diff --git a/drivers/tty/serial/qcom_geni_serial.c 
>> b/drivers/tty/serial/qcom_geni_serial.c
>> index 191abb1..0bd1684 100644
>> --- a/drivers/tty/serial/qcom_geni_serial.c
>> +++ b/drivers/tty/serial/qcom_geni_serial.c
>> @@ -599,7 +600,7 @@ static void qcom_geni_serial_stop_rx(struct 
>> uart_port *uport)
>>         u32 irq_en;
>>         u32 status;
>>         struct qcom_geni_serial_port *port = to_dev_port(uport, 
>> uport);
>> -       u32 irq_clear = S_CMD_DONE_EN;
>> +       u32 s_irq_status;
>> 
>>         irq_en = readl(uport->membase + SE_GENI_S_IRQ_EN);
>>         irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
>> @@ -615,10 +616,19 @@ static void qcom_geni_serial_stop_rx(struct 
>> uart_port *uport)
>>                 return;
>> 
>>         geni_se_cancel_s_cmd(&port->se);
>> -       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
>> -                                       S_GENI_CMD_CANCEL, false);
>> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS,
>> +                                       S_CMD_CANCEL_EN, true);
>> +       /*
>> +        * If timeout occurs secondary engine remains active
>> +        * and Abort sequence is executed.
>> +        */
>> +       s_irq_status = readl(uport->membase + SE_GENI_S_IRQ_STATUS);
>> +       /* Flush the Rx buffer */
>> +       if (s_irq_status & S_RX_FIFO_LAST_EN)
>> +               qcom_geni_serial_handle_rx(uport, true);
> 
> This seems to be the problematic line. We didn't call handle_rx() from
> the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
> is called from qcom_geni_serial_startup(), but most importantly, we 
> call
> into this function from startup before we allocate memory for the
> port->rx_fifo member (see the devm_kcalloc() later in
> qcom_geni_serial_port_setup() and see how it's after we stop rx).
> 
> Why do we need to flush the rx buffer by reading it into the software
> buffer? Can't we simply ack any outstanding RX interrupts in the
> hardware when we're stopping receive?
We can't simply ack RX_LAST interrupt, there is a sticky bit that get 
set on HW level(not exposed to SW) with RX last interrupt. The only way 
to clear it is flush out RX FIFO HW buffer. The sticky bit can create 
problem for future transfers if remained uncleared.
How about we allocate buffer to port->rx_fifo in probe itself?
> 
>> +       writel(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
>> +
>>         status = readl(uport->membase + SE_GENI_STATUS);
>> -       writel(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
>>         if (status & S_GENI_CMD_ACTIVE)
>>                 qcom_geni_serial_abort_rx(uport);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
  2020-02-14 13:17   ` skakit
@ 2020-02-19 17:50     ` Stephen Boyd
  2020-02-24 14:47       ` skakit
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2020-02-19 17:50 UTC (permalink / raw)
  To: skakit; +Cc: gregkh, mgautam, linux-arm-msm, linux-serial, akashast, rojay

Quoting skakit@codeaurora.org (2020-02-14 05:17:01)
> On 2020-02-13 04:19, Stephen Boyd wrote:
> >     driver_probe_device+0x70/0x140
> >     __driver_attach_async_helper+0x7c/0xa8
> >     async_run_entry_fn+0x60/0x178
> >     process_one_work+0x33c/0x640
> >     worker_thread+0x2a0/0x470
> >     kthread+0x128/0x138
> >     ret_from_fork+0x10/0x18
> >    Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)
> I think the most probable explanation of the crash is, set_termios call 
> is starting RX engine and RX engine is sampling some garbage data from 
> line, and by the time startup is called, we have few data to read.
> How frequently you are able to see this crash? because internally we are 
> unable to reproduce it.

How is set_termios involved? Is that starting the RX side before
uart_startup() is called? Sorry I haven't looked into the code flow very
deeply here.

It seems to happen when the bluetooth driver probes so maybe constantly
adding and removing the hci_uart module will cause this to happen for
you? I also run the kernel with many debug options enabled, so maybe try
enabling all the debug stuff? I see it randomly right now so I'm not
sure.

> > 
> > 
> > This seems to be the problematic line. We didn't call handle_rx() from
> > the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
> > is called from qcom_geni_serial_startup(), but most importantly, we 
> > call
> > into this function from startup before we allocate memory for the
> > port->rx_fifo member (see the devm_kcalloc() later in
> > qcom_geni_serial_port_setup() and see how it's after we stop rx).
> > 
> > Why do we need to flush the rx buffer by reading it into the software
> > buffer? Can't we simply ack any outstanding RX interrupts in the
> > hardware when we're stopping receive?
> We can't simply ack RX_LAST interrupt, there is a sticky bit that get 
> set on HW level(not exposed to SW) with RX last interrupt. The only way 
> to clear it is flush out RX FIFO HW buffer. The sticky bit can create 
> problem for future transfers if remained uncleared.
> How about we allocate buffer to port->rx_fifo in probe itself?

Ok. If we have to read the rx fifo to flush the buffer then perhaps
write another function that qcom_geni_serial_stop_rx() can use to
indicate that it wants to throw away whatever is in the rx fifo? Or
adjust handle_rx() to check for a NULL fifo pointer and throw it away in
that case? When we're setting up this device I don't understand why we
would want to read anything out of the rx fifo that was there before the
driver started.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
  2020-02-19 17:50     ` Stephen Boyd
@ 2020-02-24 14:47       ` skakit
  0 siblings, 0 replies; 5+ messages in thread
From: skakit @ 2020-02-24 14:47 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: gregkh, mgautam, linux-arm-msm, linux-serial, akashast, rojay, msavaliy

On 2020-02-19 23:20, Stephen Boyd wrote:
> Quoting skakit@codeaurora.org (2020-02-14 05:17:01)
>> On 2020-02-13 04:19, Stephen Boyd wrote:
>> >     driver_probe_device+0x70/0x140
>> >     __driver_attach_async_helper+0x7c/0xa8
>> >     async_run_entry_fn+0x60/0x178
>> >     process_one_work+0x33c/0x640
>> >     worker_thread+0x2a0/0x470
>> >     kthread+0x128/0x138
>> >     ret_from_fork+0x10/0x18
>> >    Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)
>> I think the most probable explanation of the crash is, set_termios 
>> call
>> is starting RX engine and RX engine is sampling some garbage data from
>> line, and by the time startup is called, we have few data to read.
>> How frequently you are able to see this crash? because internally we 
>> are
>> unable to reproduce it.
> 
> How is set_termios involved? Is that starting the RX side before
> uart_startup() is called? Sorry I haven't looked into the code flow 
> very
> deeply here.
> 
> It seems to happen when the bluetooth driver probes so maybe constantly
> adding and removing the hci_uart module will cause this to happen for
> you? I also run the kernel with many debug options enabled, so maybe 
> try
> enabling all the debug stuff? I see it randomly right now so I'm not
> sure.
> 

In general uart_startup() is called before set_termios() ,but as per the 
crash logs shared, it seems RX engine is active(which can only happen 
from set_termios) before startup() is called.

>> >
>> >
>> > This seems to be the problematic line. We didn't call handle_rx() from
>> > the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
>> > is called from qcom_geni_serial_startup(), but most importantly, we
>> > call
>> > into this function from startup before we allocate memory for the
>> > port->rx_fifo member (see the devm_kcalloc() later in
>> > qcom_geni_serial_port_setup() and see how it's after we stop rx).
>> >
>> > Why do we need to flush the rx buffer by reading it into the software
>> > buffer? Can't we simply ack any outstanding RX interrupts in the
>> > hardware when we're stopping receive?
>> We can't simply ack RX_LAST interrupt, there is a sticky bit that get
>> set on HW level(not exposed to SW) with RX last interrupt. The only 
>> way
>> to clear it is flush out RX FIFO HW buffer. The sticky bit can create
>> problem for future transfers if remained uncleared.
>> How about we allocate buffer to port->rx_fifo in probe itself?
> 
> Ok. If we have to read the rx fifo to flush the buffer then perhaps
> write another function that qcom_geni_serial_stop_rx() can use to
> indicate that it wants to throw away whatever is in the rx fifo? Or
> adjust handle_rx() to check for a NULL fifo pointer and throw it away 
> in
> that case? When we're setting up this device I don't understand why we
> would want to read anything out of the rx fifo that was there before 
> the
> driver started.

We cannot adjust handle_rx() to check for a NULL fifo pointer as reading 
from RX FIFO is mandatory to clear the sticky bit set.  We are passing 
"true" to handle_rx function so it will read and discard whatever data 
present in RX FIFO, it won't send to upper layers. We are planning to 
post a patch which has rx_fifo buffer allocated in probe itself, in 
order to avoid the NULL dereference.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-02-24 14:47 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11 10:13 [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure satya priya
2020-02-12 22:49 ` Stephen Boyd
2020-02-14 13:17   ` skakit
2020-02-19 17:50     ` Stephen Boyd
2020-02-24 14:47       ` skakit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).